Why is ComfyUI faster? You will have to learn Stable Diffusion more deeply, though. In addition to being faster at under 30 seconds per 1MP Flux image (once the model is in memory -- you must have enough to cache it), it also has more VRAM available. It's efficient as software for expert users, since it hides nothing. There's also an arrow for each toggle that will hyperjump right to the group. Progressively, it seemed to get a bit slower, but negligibly. I've been wanting to try Fooocus and ComfyUI for AI image generation. First, I've been using webui for over a year and tried ComfyUI for about two weeks around six months ago. I'm into it. The answer is that it's painfully slow, taking several minutes for a single image. The big current advantage of ComfyUI over Automatic1111 is that it appears to handle VRAM much better. it's the perfect tool to explore generative ai. SDXL running on ComfyUI at 1. But with the GPU memory loading and image saving overhead it was more like 50% faster on my 4090. In general, image generation on MPS is slow, even on an M2 Max. Just installed A1111 to compare with ComfyUI and EasyDiffusion. I've played around with different upscale models in both applications as well as settings. It's not. Filter and sort by their properties (right-click on the node and select "Node Help" for more info). I accidentally tested ComfyUI for the first time about 20 min ago and noticed I clicked on the CPU bat file (my bad 🤦‍♂️). I am a big fan of both A1111 and ComfyUI. Introducing "Fast Creator v1.4" - Free Workflow for ComfyUI. I, on the other hand, am on an RTX 3090 Ti, and inference for me is 4 to 6 times slower than in Automatic's. It is not as fast but is more reliable. Feb 2, 2025 · For me it is the amount of VRAM available on my 4090: Windows takes 4GB of it just to show me my desktop. comfyUI takes 1:30s, auto1111 is taking over 2:05s. So, wanted to ask: is it just me, or are others facing the same performance issues with auto1111?
what comfyui devs say and what people do with custom nodes are different things. The CPP version overheats my computer MUCH faster than A1111 or ComfyUI. ComfyUI is also trivial to extend with custom nodes. 0 with refiner. ComfyUI weights prompts differently than A1111. But it seems that when tested on slower machines, Windows 10, and even Windows 7 or XP, run faster. I really like the extensions library and ecosystem that already exists around A1111 - in particular stuff like 'OneButtonPrompt', which is great for inspiration on styles, etc. I would typically get around 1. if I need a few ideas. Have used FaceEnhancer/DDetailer with ComfyUI and done manual retouching of images produced by EasyDiffusion, but would like to get it working. But the speed difference is far more noticeable on lower-VRAM setups, as ComfyUI is way more efficient when it comes to using RAM and VRAM. Faster to start up, faster to load models, faster to gen, faster to change things; it's a real eye-opener after the snail-paced A1111. I have a 1060 with 6GB VRAM. And you may need to do some fiddling to get certain models to work, but copying them over works if you are super duper uper lazy. There was a loader for diffusers models, but it's no longer in development; that's why people are having trouble using lcm in comfy now, and also the new 60% faster sdxl (both only support diffusers). The main problem is that moving large files from and to an ssd repeatedly is going to wear it out pretty fast. The big difference is that looking at Task Manager (on different runs so as not to influence results), my CPU usage is at 100% with CPP with low RAM usage, while in the others my CPU usage is very low with very high RAM usage. When ComfyUI just starts, the first image generation will always be fast (1 minute is the best), but the second generation (no changes to settings and parameters) and so on will always be slower, almost 1 minute slower.
So far the images look pretty good, except I'm sure they could be a lot better. It's not a waste of time to point out an optimization. you define the complexity of what you build. ComfyUI also uses xformers by default, which is non-deterministic. I just put my setup on a Sandisk SSD connected to my laptop with a USB-C to USB-C connector. So, as long as you don't expect comfyui not to break occasionally, sure, give it a go. I recommend you install the ComfyUI Manager extension; with it you can grab some other custom nodes available. I want to switch from a1111 to comfyui. I'd ask about command-line args, but I get the impression ComfyUI sets them automatically, somehow, based on the type of GPU. Nodes can be grouped into modules to simplify the workflow. 1) in A1111. on Linux 140MB (with the browser for ComfyUI on another host to save even more VRAM). The comfyui target audience are mainly engineer-minded, high-tech people (heck, I've been dealing with PCs for almost 24 years and I scratched my head multiple times on some workflows). This is advertised like it's targeted to families and kids. The main reason is of course just how much faster Comfy is. this breaks the composition a little bit, because the mapped face is most of the time too clean or has slightly different lighting, etc. Only things I have changed are: --medvram (which shouldn't speed up generations afaik). Feb 23, 2023 · A friend of mine, for example, is doing this on a GTX 960 (what a madman) and he's experiencing up to 3 times the speed when doing inference in ComfyUI over Automatic's. But it is fast, for whatever that counts for. 5. Take it easy! 👍 I had previously used ComfyUI with SDXL 0. Unless cost is not a constraint and you have enough space to back up your files, move everything to an ssd. But yeah, it goes fast in ComfyUI. ComfyUI was made by the creator so that he could understand SD better. Save up for an Nvidia card, and it doesn't have to be the 4090 one.
despite the complex look, it's actually very The floating-point precision of fp16 is very poor for very small decimals. Here's the thing: ComfyUI is very intimidating at first, so I completely understand why people are put off by it. About knowing what nodes do, this is the hard thing about ComfyUI, but there's a wiki created by the dev (comfyanonymous) that will help to understand many things. I use WSL on Windows for all kinds of stuff, but not for stable diffusion. I think the things most related to performance are a) loading speed of the model (which is faster without wsl), b) gpu speed for the calculations (which should be comparable between wsl and native windows), and c) cpu speed for certain nodes (like image upscale/downscale without models), which could also be comparable. There's a discussion on the github page about it, but I can't find it ATM. I might as well try running XP on a memory disk. Sure, my paintbrush never crashed after an update, but then comfyui doesn't get crimped in my bag, my loras don't need cleaning, and a png is quite a bit cheaper than canvas. 2) and just gives weird results. I also use CTRL+B and CTRL+M on various nodes to toggle what controlnet nodes are applying to my clip (using fast bypass and fast mute nodes connected to them to quickly toggle individual node state!). everything ai is changing left and right, so a flexible approach is the best imho. Feb 16, 2025 · If I understand correctly, your workflow is faster with PyTorch ver 2. I have both shark and 1111 installed; they can work simultaneously. That's pip install xformers (when a prebuilt wheel is available; it takes a few days in general after a pytorch update). If you do simple t2i or i2i you don't need xformers anymore, pytorch attention is enough.
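The fp16 limitation mentioned above is easy to check numerically. A minimal sketch using numpy's float16 (bfloat16 isn't a stock numpy dtype, so it is only described in a comment):

```python
import numpy as np

# float16 has a 5-bit exponent and 10-bit significand, so very small
# decimals underflow and integer spacing grows past 2048.
assert np.float16(1e-8) == 0.0               # below the subnormal range: flushes to zero
assert np.float16(2049) == np.float16(2048)  # adjacent representable integers are 2 apart here

# bfloat16, by contrast, keeps float32's 8-bit exponent, so a magnitude
# like 1e-8 survives there -- just with fewer significand bits.
```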
Feb 23, 2023 · Why are there such big speed differences when generating between ComfyUI, Automatic1111, and other solutions? And why is it so different for each GPU? A friend of mine, for example, is doing this on a GTX 960 (what a madman) and he's experiencing up to 3 times the speed when doing inference in ComfyUI over Automatic's. Oh dear, I had mistakenly thought Windows 11 was fast because it only boots on fast machines. This update includes new features and improvements to make your image creation process faster and more efficient. I think the noise is also generated differently, where A1111 uses GPU by default and ComfyUI uses CPU by default, which makes using the same seed give different results. Also "octane" might invoke "fast render" instead of "octane style". It's just the nature of how the gpu works that makes it so much faster. I'll try it in ComfyUI later, once I set up the refiner workflow, which I've yet to do. In ComfyUI using Juggernaut XL, it would usually take 30 seconds to a minute to run a batch of 4 images. On my rig, it's about 50% faster, so I tend to mass-generate images on ComfyUI, then bring any images I need to fine-tune over to A1111 for inpainting and the like. and don't get scared by the noodle forests you see on some screenshots. once you get comfy with comfy you don't want to go back. 2 seconds, with TensorRT. I'm always on a budget, so I stored all my models on an hdd. VFX artists are also typically very familiar with node-based UIs, as they are very common in that space. And since it's node-based, it's inherently non-destructive and procedural. I won't use comfyUI either, because there are too many options for workflows and it leads to "the paradox of choice", which causes me to hate using it. I've been scheduling prompts on hundreds of images for animatediff for a long time, with giant batches of 1000+ frames.
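The seed point can be illustrated without a GPU: feed the same seed to two different noise generators and you get two different noise tensors, which is why a seed from A1111 (GPU noise) won't reproduce in ComfyUI (CPU noise). A toy sketch, with numpy's two bit generators standing in for the two backends:

```python
import numpy as np

# Same seed, two different generator implementations: the streams differ,
# just as CPU- and GPU-generated latent noise differ for the same seed.
seed = 42
noise_a = np.random.Generator(np.random.PCG64(seed)).standard_normal(4)
noise_b = np.random.Generator(np.random.MT19937(seed)).standard_normal(4)
assert not np.allclose(noise_a, noise_b)
```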
For instance (word:1. As stated in the title, I have the scroll speed at 3 lines, and the video shows how that performs on said settings in ComfyUI vs. The weights are also interpreted differently. Healthy competition, even between direct rivals, is good for both parties. Also, if this is new and exciting to you, feel free to post. Turns out that with the exponential scheduler this happens at 43% of the steps. My laptop is kinda slow with this because it only has 4gb of dedicated gpu vram. However, you can also run any workflow online; the GPUs are abstracted so you don't have to rent any GPU manually, and since the site is in beta right now, running workflows online is free. And, unlike simply running ComfyUI on some arbitrary cloud GPU, our cloud sets up everything automatically so that there are no missing files/custom nodes. Welcome to the unofficial ComfyUI subreddit. The other file is run_cpu.bat. I have the same card; Shark is easier to set up and way faster, although no extensions; lora is still available. I can link to the paper discussing why the sampler was created and why it's so much faster if you would like to read it. In comparison, the Mac is about 50% the speed of a 3090 RTX, and 15-20% of a 4090 RTX, which I rent with a Vast.ai account. You think I'm here worried about a few extra seconds, but your assumption turned out to be wrong. For example, SD and MJ are pushing themselves ahead faster and further because of each other. Hello, for more consistent faces I sample an image using the ipadapter node (so that the sampled image has a similar face), then I latent upscale the image and use the reactor node to map the same face used in the ipadapter onto the latent upscaled image.
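That 43% figure can be reproduced with back-of-the-envelope arithmetic. A sketch of an exponential (log-linear) sigma schedule like ComfyUI's; the SD-typical bounds sigma_max=14.6 and sigma_min=0.03 are assumptions here:

```python
import math

# Exponential scheduler: sigmas spaced evenly in log space.
def exponential_sigmas(steps, sigma_max=14.6, sigma_min=0.03):
    lo, hi = math.log(sigma_min), math.log(sigma_max)
    return [math.exp(hi + (lo - hi) * i / (steps - 1)) for i in range(steps)]

sigmas = exponential_sigmas(100)
# First step where sigma drops to ~1, i.e. where guidance could be switched off.
cutoff = next(i for i, s in enumerate(sigmas) if s <= 1.0)
print(f"{cutoff}% of the steps")  # -> 43% of the steps
```

With these bounds the sigma-1 crossing lands at step 43 of 100, matching the figure quoted above.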
The best thing about ComfyUI, for someone who is not a savant, is that you can literally drag a png produced by someone else onto your own ComfyUI screen and it will instantly replicate the entire workflow used to produce that image, which you can then customize and save as a json. Then I tested my previous loras with comfyui; they sucked also. Comfyui wasn't designed for Animatediff and long batches, yet it's the best platform for it thanks to the community. Here are some examples I did generate using comfyUI + SDXL 1. In the process, we learned that many people found it hard to locally install & run the workflows that were on the site, due to hardware requirements, not having the right custom nodes, model checkpoints, etc. Fooocus is convenient, it very much sped up my workflow, and the inpaint engine works like wonders. run K-sampler, feed that into FaceDetailer, then… Faster service, less operational overhead, for a workflow optimization. It's a waste of time to not optimize this workflow. Takes a minute to load. Workflows are much more easily reproducible and versionable. Some of the ones with 16gb vram are pretty cheap now. When you start ComfyUI there are 2 files in the main folder to start it up: run_nvidia_gpu.bat. I believe I got fast ram, which might explain it. In short: Turning off the guidance makes the steps go twice as fast. so if your image is 512x512, and then you upscale to 2048x2048, then run facedetailer, it's going to render the face at the same resolution as the original render, not the upscale, and then just basic-scale it to fit the dimensions of the final image. This KSampler uses the exact same prompt, model, and image that has been generated by the previous one, so why? For me it seems like adding more steps to the previous sampler would achieve similar results. 5-2it/s, with A1111 opened aside it's 10-12it/s
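The drag-and-drop trick works because ComfyUI writes the workflow JSON into the PNG's metadata (tEXt chunks, typically under the "workflow" and "prompt" keywords). A stdlib-only sketch that builds a tiny synthetic PNG carrying a 'workflow' chunk and reads it back; the helper names are mine, not ComfyUI's:

```python
import json
import struct
import zlib

def png_text_chunks(data: bytes) -> dict:
    """Collect tEXt chunks (keyword -> text) from a PNG byte stream."""
    assert data[:8] == b"\x89PNG\r\n\x1a\n", "not a PNG"
    out, pos = {}, 8
    while pos < len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            key, _, text = body.partition(b"\x00")
            out[key.decode("latin-1")] = text.decode("latin-1")
        pos += 12 + length  # 4 (length) + 4 (type) + body + 4 (CRC)
        if ctype == b"IEND":
            break
    return out

def chunk(ctype: bytes, body: bytes) -> bytes:
    # length + type + body + CRC over (type + body), per the PNG spec
    return (struct.pack(">I", len(body)) + ctype + body
            + struct.pack(">I", zlib.crc32(ctype + body)))

# Synthetic 1x1 grayscale PNG header plus a 'workflow' tEXt chunk.
workflow_json = json.dumps({"nodes": []})
png = (b"\x89PNG\r\n\x1a\n"
       + chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
       + chunk(b"tEXt", b"workflow\x00" + workflow_json.encode("latin-1"))
       + chunk(b"IEND", b""))

assert json.loads(png_text_chunks(png)["workflow"]) == {"nodes": []}
```

Dragging a real ComfyUI render onto the canvas does essentially this read step, then rebuilds the node graph from the recovered JSON.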
I read that Forge increases speed for my gpu by 70%. I had this problem up until today. - I have an RTX 2070 + 16GB RAM, and it seems like ComfyUI has been working fine. But today when generating images, after a few generations ComfyUI seems to slow down from about 15 seconds to generate an image to 1 minute and a half. 1) in ComfyUI is much stronger than (word:1. Comfyui is much better suited for studio use than other GUIs available now. The original workflow doesn't use lcm as sampler, I just use it to make the generation faster. Hope I didn't crush your dreams. This might seem like a dumb question, but I've started trying to run SDXL locally to see what my computer was able to achieve. On my 12GB 3060, A1111 can't generate a single SDXL 1024x1024 image without using RAM for VRAM at some point near the end of generation, even with --medvram set. 4". After all, the more tools there are in the SD ecosystem, the better for SAI, even if ComfyUI and its core library is the official code base for SAI nowadays. A1111 is like ComfyUI with prebuilt workflows and a GUI for easier usage. 0+cu126. “(Composition) will be different between comfyui and a1111 due to various reasons”. SHARK is SUPER fast. Tested failed loras with a1111; they were great. I have a 4090 rig, and I can 4x the exact same images at least 30x faster than using ComfyUI workflows. This doesn't sound normal, as ComfyUI is usually faster. You can lose the top 4 nodes as they are just duplicates; you can link them back to the original ones. But I still need to fix automatic1111, might have to re-install. (If I'm wrong, remember I said I don't know much about ComfyUI.) While FooocusUI is super simple to use, stupid easy, it suffers from the fact that the creator made it too easy to use; it's so dumbed down that it's been stripped of all the things that make A1111 That said, Upscayl is SIGNIFICANTLY faster for me. Just Google shark stable diffusion and you'll get a link to the github, just follow the guide from there.
With comfy you can optimize your stuff how you want. Seems unlikely, given ComfyUI's generally superior handling of VRAM, but it's something to consider, I suppose. You can also easily upload & share your own ComfyUI workflows, so that others can build on top of them! :) Why I built this: I just started learning ComfyUI, and really like how it saves the workflow info within each image it generates. Hi there, I've been racking my brain as to why my 7zip can't extract a zip file in a reasonable amount of time. Hi guys, quick question. and nothing gets close to comfyui here. So while reading it off an external SSD may not be as fast, in the grand scheme of things I can't tell much of a difference. For some reason a1111 started to perform much better with sdxl today. That could easily be why things are going so fast; I'll have to test it out and see if that's an issue with generation quality. a regular website. Also the limit on batch size means that xformers catch up for larger batches. EasyDiffusion is my favorite interface, but it lacks ADetailer. This is something I posted just last week on GitHub: When I started using ComfyUI with Pytorch nightly for macOS, at the beginning of August, the generation speed on my M2 Max with 96GB RAM was on par with A1111/SD.Next. CUI can do a batch of 4 and stay within the 12 GB. UPDATE: In Automatic1111, my 3060 (12GB) can generate a 20 base-step, 10 refiner-step 1024x1024 Euler a image in just a few seconds over a minute. 6 seconds in ComfyUI) and I cannot get TensorRT to work in ComfyUI, as the installation is pretty complicated and I don't have 3 hours to burn doing it. While comfyUI is better than default A1111, TensorRT is supported on A1111, uses much less vram, and image generation is 2-3X faster. If you want to keep using Automatic1111 on an 8gb card, you have to be very mindful of your vram usage and pretty much close everything that uses even a little bit of it, such as other browsers and Electron-based applications.
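For a sense of what a batch of 4 actually adds on top of the model: the latents themselves are tiny, as quick arithmetic shows (standard SDXL latent shape assumed -- 4 channels, 8x VAE downsampling, fp16):

```python
# A 1024x1024 SDXL image is denoised as a 4-channel latent downsampled 8x;
# fp16 means 2 bytes per element.
batch, channels = 4, 4
h = w = 1024 // 8
latent_bytes = batch * channels * h * w * 2
print(latent_bytes / 1024**2, "MiB")  # 0.5 MiB -- negligible; it's the UNet
# weights and attention activations that actually fill a 12 GB card.
```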
Give it a try, and you may find that ComfyUI runs faster and more efficiently than ever before. No matter what, UPSCAYL is a speed demon in comparison. Sounds like it might be running on CPU only. (Same image takes 5. I am running ComfyUI on a machine with 2xRTX4090 and am trying to use the ComfyUI_NetDist custom node to run multiple copies of the ComfyUI server, each using a separate GPU, to speed up batch generation. I think for me, at least for now with my current laptop, using comfyUI is the way to go. I don't find ComfyUI faster, I can make an SDXL image in Automatic 1111 in 4 . Does comfyui work in a similar way and give me a boost against a1111? Thanks. Yeah, looks like it's just my Automatic1111 that has a problem; ComfyUI is working fast. There's less overhead with ComfyUI (as you only load in the things you want to use). 1+cu124 than the later Pytorch ver 2. run_cpu.bat, which only uses the CPU but will be very slow. about a month ago, we built a site for people to upload & share ComfyUI workflows with each other: comfyworkflows.com. Personally, I got about the same performance in A1111, but it did require command line flags, whereas Comfy worked out of the box. ai account and a Jupyter Notebook for when I'm trying out new things, want/need to work fast, and for img2img batch iterative upscaling. When the path changes, the "deactivated" path now gets a small blank image as the input; that path processes faster as a result. Are you sure it's not the other way around - meaning it/s in Comfy and s/it in A1111? I'm not super familiar with Comfy, but I'd say try a fresh workflow, as the model loader node should apply the best settings for your GPU. If you’ve been struggling with slowdowns, adjusting the CUDA-System Fallback Policy could be the solution you’ve been looking for. So yea, like people say on here, your negatives are just too basic. 07it/s in ComfyUI.
Auto's memory management is quite messy compared to Comfy, and that's why Comfy is much faster with 8gb. I have found the workflows by Searge to be extremely useful. I think the main advancement won't be in a better UI but rather in better models that don't require a noodle salad to get desirable results. That would be really cool actually, as the internet is super not accessible for the visually impaired, especially pictures; this could be used to generate descriptions of pictures, compared to the traditional approach of the image descriptions websites are supposed to implement but most of the time half-ass or don't bother with at all. I am still confused as to why Windows runs faster if both my Pytorch versions are the same, on Windows and Linux - if the only difference is the driver, perhaps Windows has better backward compatibility for older GPUs like my 1070. The one thing I wonder about is why the same level of scrutiny isn't really being applied when the opposite is asserted in favor of ComfyUI. If I restart the app, then it will be faster again, but again, the second generation and so on will be slower again. Fast Groups Muter & Fast Groups Bypasser: like their "Fast Muter" and "Fast Bypasser" counterparts, but collecting groups automatically in your workflow. A few weeks ago I did a "spring-cleaning" on my PC and completely wiped my Anaconda environments, packages, etc. It's still 30 seconds slower than comfyUI with the same 1366x768 resolution and 105 steps. You don't need to switch to one or the other. It can be done without any loss in quality when the sigmas are low enough (~1). which is why it looks blurry and crappy. 0. But those structures it has prebuilt for you aren't optimized for low-end hardware specifically.
I ignored it for a while when it first came out. Note the flipping of s/it to it/s. 9 and it was quite fast on my 8GB VRAM GPU (RTX 3070 Laptop). Point the install path in the automatic 1111 settings to the comfyUI folder inside your comfy ui install folder, which is probably something like comfyui_portable\comfyUI or something like that. I have tried it (a) with one copy of SDXL running on each GPU and (b) with two copies of SDXL running per GPU. Quick test: Batch generation on a 4090 (seconds, batches x images). Using ComfyUI was a better experience; the images took around 1:50mns to 2:25mns, 1024x1024 / 1024x768, all with the refiner. While kohya samples were very good, comfyui tests were awful. 6. Results using it are practically always worse than nearly every other sampler available. run_nvidia_gpu.bat is the one you want to use to start, so it will run using the GPU. Apparently the issue here are Logitech drivers and/or chromium, because ComfyUI works flawlessly on Firefox. Not to say that you're in that camp. 😀 Sorry to say that it won't be much faster, even if you overclock the cpu. But you can achieve this faster in A1111, considering the workflow of comfy ui. And I don't understand why, because even when A1111 is not being used, the simple fact that it's open slows down my comfyUI (SDXL) generations by 500 to 600%. To troubleshoot, I'd restart comfy, then load a default workflow with minimal nodes and see if it is still slow. Bf16 is capable of much better representation for very small decimals. Aug 27, 2024 · Optimizing your Nvidia 3D settings is a quick and easy way to enhance ComfyUI’s performance. But once you get the hang of it, you understand its power and how much more you can do in it. It's high quality, and easy to control the amount of detail added, using control scale and restore cfg, but it slows down at higher scales faster than ultimate SD upscale does.
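Since the UIs flip between s/it and it/s depending on which is larger, comparing numbers across tools means converting one into the other. The conversion is just a reciprocal:

```python
# s/it and it/s are reciprocals; the UIs display whichever reads >= 1,
# which makes casual comparisons easy to get backwards.
def to_it_per_s(s_per_it: float) -> float:
    return 1.0 / s_per_it

assert round(to_it_per_s(1.30), 2) == 0.77  # 1.30 s/it is under 1 it/s
assert round(to_it_per_s(0.93), 2) == 1.08  # 0.93 s/it is a bit over 1 it/s
```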
All it takes is taking a little time to compile the specific model with the resolution settings you plan to use. With my 8GB RX 6600, with which I was only able to run sdxl in sd-next (out of memory after 1-2 runs, and on default 1024x1024), I was able to use this in comfyui, BUT only with 512x512 or 768x512 - 512x768 (memory errors even with these from time to time). Curiously, it is like 25% faster running an SD 1.5 checkpoint on the same pc, BUT the quality - at least comparing a few prompts for testing - I spent all night last night playing with SUPIR. Everything that has to do with diffusers is pretty much deprecated in comfy rn. I've found ComfyUI to be quicker on my card (1060 6GB). Offering good advice is not a waste of time. If it's fast again, you'll have to try to isolate the troublesome node. Here are my Pro and Contra so far for ComfyUI: Pro: standalone, portable, almost no requirements/setup, starts very fast, SDXL support, shows the technical relationships of the individual modules. Contra: complex UI that can be confusing; without advanced knowledge about AI/ML, hard to use or create workflows. Also, Fooocus still runs SDXL so much faster and smoother than A1111 WebUI; even Forge (by the same author as Fooocus) wasn't really as good as Fooocus. A1111 is fairly bloated (though, it always has been). I expect it will be faster. If it allowed more control then more people would be interested, but it just replaces dropdown menus and windows with nodes. It has now taken upwards of 10 minutes to do seemingly the same run. Hey everyone! I'm excited to share the latest update to my free workflow for ComfyUI, "Fast Creator v1.4". I initially used ComfyUI when SDXL was first released but didn't feel the need to continue, as I could use SDXL comfortably with webui.
Asked reddit wtf is going on; everyone blindly copy-pasted the same thing over and over. It also seems like ComfyUI is way too intense on using heavier weights on (words:1.2) and just gives weird results. With Automatic1111, it does seem like there are more built-in tools, perhaps, that are helping process the image that may not be on for ComfyUI? Speed - generation of single images is really fast, peaking at twice the it/s of xformers. CUI is also faster. Download & drop any image from the website into ComfyUI, and ComfyUI will load that image's entire workflow.
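The weighting difference has a concrete shape: A1111 rescales the conditioning after applying emphasis so its overall magnitude stays roughly constant, while ComfyUI by default applies the weight more directly, so the same (word:1.2) bites harder. A toy numpy illustration -- the embeddings and the exact rescaling rule here are simplified assumptions, not either codebase's real implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))             # toy token embeddings
weights = np.array([1.0, 1.0, 1.2, 1.0])  # third token weighted 1.2

# Direct weighting (ComfyUI-style, simplified): just scale the token embedding.
direct = emb * weights[:, None]

# A1111-style (simplified): weight, then rescale so the mean magnitude of
# the whole conditioning is preserved, softening each weight's effect.
weighted = emb * weights[:, None]
rescaled = weighted * (np.abs(emb).mean() / np.abs(weighted).mean())

# The weighted token moves further from its unweighted embedding under
# direct weighting than under the rescaled scheme.
assert np.abs(direct[2] - emb[2]).sum() > np.abs(rescaled[2] - emb[2]).sum()
```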