What are the Basic Nodes in ComfyUI?

ComfyUI is a node-based user interface for Stable Diffusion, widely popular for its high customizability and flexibility. At its core, you build image generation workflows by connecting various functional nodes. Here are some commonly used components (nodes) in ComfyUI. I will explain their purposes in plain language to help you get started quickly!
1. Load Checkpoint
Function: This is the foundational node of a workflow, used to load Stable Diffusion models (such as SD 1.5 or SDXL). It loads three parts of the model at once: the U-Net (generation core), CLIP (text understanding), and the VAE (image encoding/decoding).
Plain Explanation: It’s like preparing brushes and paints for a painter; this node loads the “drawing brain”.
Common Usage: Select a .ckpt or .safetensors file to start the entire generation process.
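To see what this looks like under the hood, here is a minimal sketch of the node in ComfyUI's API-format workflow (a JSON document, shown here as a Python dict). The node ID "4" and the checkpoint filename are placeholders:

```python
# Load Checkpoint in API format; its output slots are MODEL (0), CLIP (1), VAE (2).
load_checkpoint = {
    "4": {
        "class_type": "CheckpointLoaderSimple",
        "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"},  # placeholder filename
    }
}
```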
2. CLIP Text Encode
Function: Converts your text prompts into “digital language” (embedding vectors) that the model can understand. Prompts are divided into positive and negative.
Plain Explanation: Equivalent to translating your description for the painter, such as “draw a cat” or “don’t make it too blurry”.
Common Usage: Connect to the CLIP output of Load Checkpoint; use one node for desired content and one for content to avoid.
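Continuing the API-format sketch (node IDs are placeholders), a typical setup has two of these nodes, both fed from the checkpoint's CLIP output:

```python
# ["4", 1] means "output slot 1 of node 4", i.e. the CLIP output of Load Checkpoint.
prompts = {
    "6": {"class_type": "CLIPTextEncode",   # positive prompt
          "inputs": {"text": "a cat in the sunlight", "clip": ["4", 1]}},
    "7": {"class_type": "CLIPTextEncode",   # negative prompt
          "inputs": {"text": "blurry, low quality", "clip": ["4", 1]}},
}
```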
3. KSampler
Function: Controls the process of generating images from noise. You can choose different sampling methods (such as Euler or DPM++) and the number of steps.
Plain Explanation: Like a painter deciding how many strokes to use to finish a work. More steps mean more detail but slower generation; fewer steps mean faster but rougher results.
Common Usage: Connect the U-Net model and the CLIP-encoded prompts, then adjust the steps (20-50 is common) and CFG (guidance scale, 7-12 is common).
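In the same API-format sketch, the KSampler ties the model, both prompts, and a latent image together. Node IDs are placeholders; node "5" is the Empty Latent Image described below:

```python
ksampler = {
    "3": {
        "class_type": "KSampler",
        "inputs": {
            "model": ["4", 0],         # MODEL output of Load Checkpoint
            "positive": ["6", 0],      # positive conditioning
            "negative": ["7", 0],      # negative conditioning
            "latent_image": ["5", 0],  # Empty Latent Image
            "seed": 42,
            "steps": 30,               # 20-50 is a common range
            "cfg": 8.0,                # guidance scale; 7-12 is common
            "sampler_name": "euler",
            "scheduler": "normal",
            "denoise": 1.0,            # 1.0 for text-to-image; lower for img2img
        },
    }
}
```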
4. VAE Decode
Function: Decodes the “latent image” (compressed data) generated by the KSampler into the final pixel image.
Plain Explanation: Equivalent to turning the painter’s draft into a finished painting.
Common Usage: Connect the output of the KSampler and the VAE output of Load Checkpoint to produce a visible image.
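In the API-format sketch, the decode step just wires two outputs together:

```python
# Latent from the KSampler (node "3"), VAE from Load Checkpoint (slot 2 of node "4").
vae_decode = {
    "8": {
        "class_type": "VAEDecode",
        "inputs": {"samples": ["3", 0], "vae": ["4", 2]},
    }
}
```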
5. Save Image
Function: Saves the generated image to the hard drive.
Plain Explanation: Like framing the finished painting and archiving it.
Common Usage: Connect the output of VAE Decode and specify a filename prefix; images are written under ComfyUI’s output folder.
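A sketch of the node in API format (the prefix is a placeholder):

```python
# Save Image writes numbered PNGs under ComfyUI's output folder using this prefix.
save_image = {
    "9": {
        "class_type": "SaveImage",
        "inputs": {"images": ["8", 0], "filename_prefix": "my_cat"},
    }
}
```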
6. Empty Latent Image
Function: Creates a blank “canvas” (an empty latent-space image) at a specified size (such as 512x512 or 1024x1024); the KSampler then fills it with noise and denoises it into a picture.
Plain Explanation: Giving the painter a blank sheet of paper to start drawing from scratch.
Common Usage: Serves as the input starting point for the KSampler; dimensions should match the model’s native resolution (512x512 for SD 1.5, 1024x1024 for SDXL).
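The corresponding API-format entry is tiny; only the canvas size and batch size are set:

```python
# A blank 1024x1024 latent for SDXL; use 512x512 for SD 1.5 models.
empty_latent = {
    "5": {
        "class_type": "EmptyLatentImage",
        "inputs": {"width": 1024, "height": 1024, "batch_size": 1},
    }
}
```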
7. Preview Image
Function: Displays the generated image directly in the interface without saving it.
Plain Explanation: Letting the painter show you the finished product first; save it if you like it.
Common Usage: Connect the output of VAE Decode; handy for debugging a workflow.
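It is a drop-in replacement for Save Image in the API-format sketch:

```python
# Preview Image takes the same wiring as Save Image but writes nothing permanent.
preview_image = {
    "9": {
        "class_type": "PreviewImage",
        "inputs": {"images": ["8", 0]},
    }
}
```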
8. Conditioning (Set Area)
Function: Adds area restrictions to prompts, such as “draw a cat on the left, draw a dog on the right”.
Plain Explanation: Telling the painter what to draw in which part of the canvas.
Common Usage: Combined with CLIP Text Encode for local control of the generated content.
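As a hedged sketch (field names follow the stock ConditioningSetArea node; the values are illustrative), restricting a prompt to the left half of a 1024x1024 canvas could look like this:

```python
area_prompt = {
    "10": {
        "class_type": "ConditioningSetArea",
        "inputs": {
            "conditioning": ["6", 0],  # e.g. the "a cat" prompt
            "width": 512, "height": 1024,
            "x": 0, "y": 0,            # top-left corner of the area
            "strength": 1.0,
        },
    }
}
```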
9. LoRA Loader
Function: Loads LoRA models to fine-tune the style or features of the main model (such as an anime style or a specific character).
Plain Explanation: Adding a “style filter” to the painter to make the drawing more personalized.
Common Usage: Connect the MODEL and CLIP outputs of Load Checkpoint and adjust the LoRA strength (usually 0.5-1.0).
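In the API-format sketch, the LoRA Loader sits between the checkpoint and everything downstream; nodes that previously read from node "4" now read from this node instead (the filename is a placeholder):

```python
lora = {
    "11": {
        "class_type": "LoraLoader",
        "inputs": {
            "model": ["4", 0],
            "clip": ["4", 1],
            "lora_name": "my_style.safetensors",  # placeholder filename
            "strength_model": 0.8,                # 0.5-1.0 is typical
            "strength_clip": 0.8,
        },
    }
}
```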
10. ControlNet
Function: Controls the structure of the generated image through an additional reference image (such as line art, an edge map, or a pose map).
Plain Explanation: Giving the painter a sketch to follow for the details.
Common Usage: Requires a ControlNet model file; it is applied to the conditioning (the encoded prompt) that feeds the KSampler, together with the reference image.
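A hedged two-node sketch (filenames and node IDs are placeholders; node "12" would be a Load Image node providing the reference picture):

```python
controlnet = {
    "13": {"class_type": "ControlNetLoader",
           "inputs": {"control_net_name": "control_canny.safetensors"}},
    "14": {"class_type": "ControlNetApply",
           "inputs": {"conditioning": ["6", 0],  # the positive prompt
                      "control_net": ["13", 0],
                      "image": ["12", 0],        # the reference image
                      "strength": 1.0}},
}
```

The KSampler's positive input then reads from ["14", 0] instead of the raw prompt.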
11. VAE Encode
Function: Encodes a normal image into its latent-space representation, used for image-to-image (img2img) generation.
Plain Explanation: Handing an old painting to the painter and asking them to modify it.
Common Usage: Input an existing image and connect the result to the KSampler to start modifying it.
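A minimal img2img sketch (the image filename is a placeholder and must exist in ComfyUI's input folder):

```python
img2img = {
    "12": {"class_type": "LoadImage",
           "inputs": {"image": "old_painting.png"}},  # placeholder filename
    "15": {"class_type": "VAEEncode",
           "inputs": {"pixels": ["12", 0], "vae": ["4", 2]}},
}
# Feed ["15", 0] into the KSampler's latent_image and set denoise to roughly
# 0.5-0.8 so the original picture is only partially repainted.
```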
12. Upscale Model
Function: Loads super-resolution models (such as ESRGAN or SwinIR) to enlarge images.
Plain Explanation: Giving the painting a magnifying glass to make it clearer.
Common Usage: Load the upscale model, then feed it and the generated image into an “Upscale Image (using Model)” node to raise the resolution.
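A hedged sketch of the two-node pattern (the model filename is a placeholder):

```python
upscale = {
    "16": {"class_type": "UpscaleModelLoader",
           "inputs": {"model_name": "4x-ESRGAN.pth"}},  # placeholder filename
    "17": {"class_type": "ImageUpscaleWithModel",
           "inputs": {"upscale_model": ["16", 0],
                      "image": ["8", 0]}},  # the decoded image from VAE Decode
}
```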
Example of a Simple Workflow
A basic text-to-image workflow might look like this (a runnable sketch of the whole thing follows the list):
Load Checkpoint: Load SDXL model.
CLIP Text Encode: Input “A cat in the sunlight”.
Empty Latent Image: Set a 1024x1024 canvas.
KSampler: Generate using 30 steps and Euler method.
VAE Decode: Decode the result into an image.
Preview Image: Take a look.
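To make this concrete, here is a minimal runnable sketch of the same six steps in ComfyUI's API format, queued over the local HTTP API. It assumes a ComfyUI server on the default port 8188 and uses a placeholder checkpoint filename; the node IDs are arbitrary labels:

```python
import json
import urllib.request

workflow = {
    "4": {"class_type": "CheckpointLoaderSimple",            # 1. Load Checkpoint
          "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"}},
    "6": {"class_type": "CLIPTextEncode",                    # 2. positive prompt
          "inputs": {"text": "A cat in the sunlight", "clip": ["4", 1]}},
    "7": {"class_type": "CLIPTextEncode",                    # 2. negative prompt
          "inputs": {"text": "blurry, low quality", "clip": ["4", 1]}},
    "5": {"class_type": "EmptyLatentImage",                  # 3. 1024x1024 canvas
          "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
    "3": {"class_type": "KSampler",                          # 4. 30 steps, Euler
          "inputs": {"model": ["4", 0], "positive": ["6", 0], "negative": ["7", 0],
                     "latent_image": ["5", 0], "seed": 42, "steps": 30, "cfg": 8.0,
                     "sampler_name": "euler", "scheduler": "normal", "denoise": 1.0}},
    "8": {"class_type": "VAEDecode",                         # 5. decode to pixels
          "inputs": {"samples": ["3", 0], "vae": ["4", 2]}},
    "9": {"class_type": "PreviewImage",                      # 6. take a look
          "inputs": {"images": ["8", 0]}},  # swap in SaveImage to keep the result
}

# Queue the workflow; ComfyUI replies with a prompt ID for tracking progress.
req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=json.dumps({"prompt": workflow}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
print(urllib.request.urlopen(req).read().decode("utf-8"))
```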
Summary
These components are the “basic toolbox” of ComfyUI. Mastering them allows you to build simple generation flows. As your needs grow, you might reach for more advanced nodes, such as:
Latent Upscale
Inpaint
AnimateDiff (animation generation)
The charm of ComfyUI lies in its modular design, allowing you to freely combine these nodes as needed. If you are a beginner, it is recommended to start with the default workflow and gradually try adding features like LoRA or ControlNet. If you have any specific components you want to know more about, feel free to ask!