Keep learning and keep publishing — continuous output updates your knowledge system! Today we're trying to generate a large image with a single node. For example, this picture was generated directly by ComfyUI. But as we commonly understand it, SD1.5 models are generally trained on 512*512 images and SDXL models on 1024*1024 images — yet the size of my image is 2048*960!
Anyone who has played with ComfyUI knows how unusual it is for a picture like this to come straight out of the sampler. So how did we do it?
UnetLoader: Kwai-Kolors/Kolors at main (huggingface.co)
Text_encode: Kwai-Kolors/Kolors-diffusers at main (huggingface.co)
1. Set up the Kolors text-to-image workflow
After building the workflow, we try to generate a 2048*960 image — and of course both horizontal and vertical compositions work!
2. Load the PatchModelAddDownscale node
As shown in the figure above, only one node is actually needed: the PatchModelAddDownscale node. Roughly speaking, it adjusts the internal scale of the model to achieve a "Deep Shrink" effect.
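To make the idea concrete, here is a conceptual sketch in plain Python (my own simplification, not ComfyUI's actual implementation): the input reaching one chosen block is downscaled, the block runs at the lower resolution, and its output is upscaled back afterwards, so the rest of the model never notices.

```python
# Conceptual sketch of a "Deep Shrink"-style patch (my own illustration,
# not ComfyUI's real code): shrink the data entering one block, run the
# block at low resolution, then restore the original resolution.

def downscale(grid, factor):
    """Nearest-neighbour downscale of a 2D grid (list of lists)."""
    return [row[::factor] for row in grid[::factor]]

def upscale(grid, factor):
    """Nearest-neighbour upscale back to the original size."""
    return [[v for v in row for _ in range(factor)]
            for row in grid for _ in range(factor)]

def patched_block(block_fn, grid, factor=2):
    """Run `block_fn` on a shrunken grid, then restore the resolution."""
    small = downscale(grid, factor)
    processed = block_fn(small)        # the block sees a smaller input
    return upscale(processed, factor)  # output matches the original size

# Example: an identity "block" on a 4x4 grid still returns a 4x4 grid.
grid = [[r * 4 + c for c in range(4)] for r in range(4)]
out = patched_block(lambda g: g, grid, factor=2)
print(len(out), len(out[0]))  # 4 4
```

The real node does this with tensor interpolation inside the UNet, but the shape of the trick is the same: only one block works at reduced resolution, and the final image size is unchanged.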
To better understand this node's parameters, let me explain them in a very simple way:
Imagine the model is like a big sandwich, and each “block” is a layer in the sandwich. When you try to make a picture using the model, different layers (or blocks) help in making different parts of the picture look better.
- Block 1: If you use this, the picture might have a lot of random stuff (noise) that doesn’t match what you want.
- Block 2: A bit better, but still some randomness.
- Block 3: Starts to look like what you asked for (the picture you want).
- Block 4, 5, 6, 7: As you go higher in the blocks, the picture gets closer and closer to what you asked for, with fewer random parts.
So, the higher the block number, the more it focuses on what you asked for, like “5 beauties.” You can play around with these block numbers to see which one makes the best picture for you!
Think of it like adjusting the layers of a sandwich to make it taste just right!
2.1 block_number
This sets which block of the model the operation applies to. In deep-learning models, a block is typically a subnetwork of multiple convolutional layers; for example, block_number = 3 means the operation is applied to the 3rd block.
In the actual tests (effect images shown above):
Block 1 effect:
Block 2 effect:
Block 3 (block_number = 3) effect:
Block 4 effect:
Block 5 effect:
Block 6 effect:
Block 7 effect:
In fact, for the illustrations in this article my prompt was "5 beauties", and I found that the larger the block value, the closer the result is to the prompt, while values below 3 produce a lot of noise. This may not hold in every case — my guess is that each block of the model encodes different information. What do you think? Personally, I feel that once you produce an image you like, you can lock the seed and adjust the block value to generate an even better result.
2.2 downscale_factor
When downscale_factor is 2, the specified block of the model works at half resolution. The downscaling only affects the resolution at which that internal layer computes; it does not directly affect the size of the final output image.
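A quick arithmetic sketch (my own illustration, assuming an SD-style VAE that compresses by 8x per side) shows why this helps: with downscale_factor = 2, the patched block of the UNet effectively sees our 2048*960 request at a resolution close to the ~1024px regime the model was trained on.

```python
# Rough arithmetic sketch (my own, not from the article): a 2048x960
# image becomes an 8x-smaller latent, and Deep Shrink halves that latent
# again inside the patched block.

def latent_size(width, height, vae_factor=8):
    """Latent resolution for a given image size (SD-style VAE, /8)."""
    return width // vae_factor, height // vae_factor

def shrunk_size(width, height, downscale_factor=2, vae_factor=8):
    """Resolution the patched block actually computes at."""
    lw, lh = latent_size(width, height, vae_factor)
    return lw // downscale_factor, lh // downscale_factor

print(latent_size(2048, 960))  # (256, 120)
print(shrunk_size(2048, 960))  # (128, 60) -> like a 1024x480 image
```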
2.3 start_percent and end_percent
These set when the downscale operation is active: start_percent and end_percent specify the portion of the denoising schedule (from 0.0 to 1.0) during which the chosen block is downscaled. Outside that window, the model runs at full resolution.
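To get a feel for the window, here is a simplified mapping from the two percentages onto evenly spaced sampler steps (my own illustration; ComfyUI actually compares sigmas on the noise schedule rather than raw step indices, but for evenly spaced steps the idea is the same):

```python
# Hypothetical illustration (not from the article): which sampler steps
# fall inside the [start_percent, end_percent) window.

def active_steps(num_steps, start_percent=0.0, end_percent=0.35):
    """Return 0-based step indices where the downscale patch applies.

    Simplification: assumes evenly spaced steps; the real node works on
    the sigma schedule, so the exact cutoff step can differ slightly.
    """
    return [i for i in range(num_steps)
            if start_percent <= i / num_steps < end_percent]

print(active_steps(20))  # [0, 1, 2, 3, 4, 5, 6]
```

In other words, with these example values the shrink is only applied early in sampling, when the overall composition is being decided, and the later, detail-refining steps run at full resolution.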
2.4 downscale_method and upscale_method
The methods used to shrink and re-enlarge the latent inside the patched block (bicubic is a smoothing method).
Finally, to sum up, the disadvantages of the PatchModelAddDownscale node are:
1. Loss of detail: downscaling can discard some image detail, because the latent is shrunk during processing and subtle features may be blurred. The resulting image can lack high-frequency details (e.g., hair, texture).
2. Blur effect: downscaling may make some areas of the image look smoother or blurrier, because the model works at a lower resolution, and the re-enlarged result may not fully recover its original sharpness when finally decoded.
3. Style-dependent impact: for images that need to emphasize detail (surrealist styles, high-definition landscapes, etc.), downscaling may not be suitable because it reduces detail. For styles that don't require much detail (comic style, stick figures, etc.), it may have no negative impact on the result.
Let’s take a look at the other effects of this workflow:
Effect 1: 3 kittens
Effect 2: Cyber Hello World
Well, that’s all for today, thanks for watching!