SkyReels V2 AI: Infinite Video Generation ComfyUI Workflow

I’ve been running SkyReels V2 AI through its paces in ComfyUI this past week, and let’s be real – the infinite video generation claims sounded too good to be true at first. But after pushing it with some test workflows, I’m seeing some genuinely impressive results, especially with the 14B model.

Here’s the thing about the “diffusion forcing” approach – it’s not just stitching frames together. The way it handles frame overlap actually maintains motion consistency better than I expected. I threw a 97-frame wolf animation at it, and the 14B model kept the paw movements looking natural through the entire sequence. The 1.3B version worked too, though I noticed some fur details getting lost in longer generations.

Getting SkyReels V2 Running in ComfyUI

Installation wasn’t completely smooth – you’ll need to grab Kijai’s WanVideoWrapper manually since it’s not in ComfyUI Manager by default.

git clone https://github.com/kijai/ComfyUI-WanVideoWrapper.git
cd ComfyUI-WanVideoWrapper
pip install -r requirements.txt

I just cloned the repo from Kijai’s GitHub into ComfyUI’s custom_nodes folder and ran the requirements install.

Models you’ll need:

For hardware, I tested both the 1.3B and 14B models. The smaller one runs on my 6GB VRAM setup without too much trouble, though you’ll want to stick with the FP8 version unless you’ve got serious GPU power. The full workflow template includes a text-to-video setup, but I modified mine to focus on video extensions.
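
To see why precision matters so much on a small card, here is a rough back-of-the-envelope sketch in Python. It only counts the diffusion model's weights (parameters times bytes per parameter) and ignores activations, the text encoder, and the VAE, so real usage will be higher. The function name and the precision table are my own, not anything from the wrapper.

```python
# Weight-only memory estimate: parameters x bytes per parameter.
# Assumption: ignores activations, text encoder, VAE, and any offloading.
BYTES_PER_PARAM = {"fp8": 1, "fp16": 2, "fp32": 4}


def weight_gib(params_billions: float, precision: str) -> float:
    """Approximate size of the model weights in GiB."""
    total_bytes = params_billions * 1e9 * BYTES_PER_PARAM[precision]
    return total_bytes / 2**30


if __name__ == "__main__":
    for size in (1.3, 14.0):
        for precision in ("fp8", "fp16", "fp32"):
            gib = weight_gib(size, precision)
            print(f"{size}B @ {precision}: ~{gib:.1f} GiB of weights")
```

By this estimate the 1.3B weights stay well under 6GB at lower precision, while the 14B weights alone are larger than a 6GB card can hold at any of these precisions without offloading.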

One node that made a big difference was the Diffusion Forcing Sampler – it’s specifically tuned for these long generations and handles the segment transitions much better than the standard sampler. The prefix sampling option seems to be what keeps motions flowing naturally between chunks.
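
To make the chunking idea concrete, here is a small Python sketch of how a long video could be split into overlapping segments, where the tail of each segment is reused as the prefix of the next. This is my own illustration, not the sampler's actual code: the function name is hypothetical, the 97-frame chunk length just mirrors my wolf test, and the 17-frame overlap is an assumed value.

```python
def plan_chunks(total_frames: int, chunk_len: int = 97, overlap: int = 17):
    """Return (start, end) frame ranges, end exclusive.

    Each chunk after the first begins `overlap` frames before the previous
    chunk ended, so those shared frames can act as the conditioning prefix.
    """
    if total_frames < 1:
        raise ValueError("total_frames must be at least 1")
    if not 0 <= overlap < chunk_len:
        raise ValueError("overlap must be >= 0 and smaller than chunk_len")

    chunks = []
    start = 0
    while True:
        end = min(start + chunk_len, total_frames)
        chunks.append((start, end))
        if end >= total_frames:
            break
        # Step back by the overlap so the next chunk shares frames with this one.
        start = end - overlap
    return chunks


if __name__ == "__main__":
    for start, end in plan_chunks(257):
        print(f"frames {start}..{end - 1}")
```

With these assumed numbers, each extra chunk only adds 80 new frames (97 minus the 17 shared ones), which is the trade-off: more overlap means smoother transitions but more sampling passes for the same video length.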

The Good and Not-So-Good

What surprised me most was how well it maintained consistency in backgrounds. Normally when you extend videos, you get that telltale flickering or warping, but SkyReels V2 keeps things stable. Close-up facial details are still tricky though – the 14B handles them better, but even then, I wouldn’t call it perfect.

The 1.3B model works in a pinch for simpler scenes, but you’ll notice quality differences side by side. Texture details start to degrade after a while, and complex motions can get a bit jittery. For most of my test cases though, it held up better than I thought it would.

Download Workflows

🤖 Hey, welcome! Thanks for visiting comfyuiblog.com

Text to Video