FramePack isn’t just another AI video tool. Developed by Lvmin Zhang (ControlNet legend) and Maneesh Agrawala, this framework lets you generate *2-minute videos on a gaming laptop* without melting your GPU into a puddle of regret. When I first tested it on my RTX 3060 (yes, the “budget” one), I expected a crash. Instead, I got a buttery 30fps clip of a cyberpunk cat playing guitar.
My hot take: Most “AI video” guides oversell speed and ignore stability. FramePack actually delivers both.
TL;DR for the impatient:
- What it is: FramePack AI, a *constant-length context compression* system for long-form video generation.
- Why it matters: Runs on 6GB VRAM (yes, really), fixes “drifting” artifacts, and uses bidirectional sampling for smoother motion.
- Key perk: Processes 120-second videos without slowing down, unlike traditional models that choke after 5 seconds.
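If “constant-length context” sounds like marketing, here’s a toy sketch of the idea. It assumes each older frame is squeezed by a fixed factor relative to the next newer one, so the per-frame token counts form a geometric series that never grows past a fixed ceiling. The numbers (`tokens_per_frame=1536`, `ratio=2.0`) and both function names are my own illustrative assumptions, not values pulled from FramePack’s code, and the real compression schedule is likely more nuanced.

```python
def context_tokens(num_frames: int, tokens_per_frame: int = 1536, ratio: float = 2.0) -> float:
    """Total context size when each older frame keeps 1/ratio of the tokens of the next newer one."""
    total = 0.0
    term = float(tokens_per_frame)  # the newest frame is kept at full size
    for _ in range(num_frames):
        total += term
        term /= ratio  # every step back in time shrinks the frame's share
    return total


def context_upper_bound(tokens_per_frame: int = 1536, ratio: float = 2.0) -> float:
    """Limit of the geometric series: tokens_per_frame * ratio / (ratio - 1)."""
    return tokens_per_frame * ratio / (ratio - 1)


if __name__ == "__main__":
    for n in (1, 10, 100, 3600):
        print(f"{n:>5} frames -> {context_tokens(n):.2f} tokens")
    print(f"ceiling      -> {context_upper_bound():.2f} tokens")
```

Under these assumptions the context for 3600 frames is no bigger than the ceiling for 100 frames, which is the intuition behind “doesn’t slow down as the video gets longer.”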
Here’s the *idiot-proof* guide to running FramePack two ways: the easy Gradio UI for normies and the ComfyUI power-user route.
Step 1: Download & Unzip (Without the Usual Despair)
- Windows Users: Grab `FramePack-Standalone-Windows.7z` (2GB). Unzip it to find:
  - `run.bat` (Your new best friend)
  - `update.bat` (For future you who forgets everything)
  - `webui/` (Where the magic hides)
  - `README.md` (The doc you’ll ignore until something breaks)
- First Launch: Double-click `run.bat`. It’ll:
  - Auto-download 15GB of models (go make coffee).
  - Launch the Gradio UI at http://127.0.0.1:7860.
  - Not crash (a miracle, honestly).
⚠️ Warning: If your firewall pops up, say “YES” like you’re defusing a bomb. Blocking it = instant regret.
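Not sure whether the UI actually came up (or whether the firewall ate it)? Here’s a small sketch that pokes the local address from the step above using only the standard library. The helper name `ui_is_up` is mine, and the check only tells you that *something* answers HTTP on that port, not that it’s FramePack specifically.

```python
import urllib.error
import urllib.request


def ui_is_up(url: str = "http://127.0.0.1:7860", timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at `url` with a non-error status."""
    # Skip any system proxy settings, since this is a localhost check.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status all count as "not up".
        return False


if __name__ == "__main__":
    print("UI reachable" if ui_is_up() else "UI not reachable yet")
```

If it keeps saying “not reachable” long after the model download finished, the firewall prompt is the first suspect.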
Step 2: Generate Your First Video (Without the GPU Fireworks)
Here’s how I created “Cyberpunk Cat Guitarist” (RIP my dignity):
- Drop an image (e.g., a cat wearing sunglasses).
- Prompt: “A neon cat shredding an electric guitar, holographic stage, crowd cheering, cyberpunk vibe”.
- Settings:
  - Video Length: 30 sec (FramePack’s sweet spot).
  - Steps: 25 (More = sharper but slower).
  - CFG Scale: 10 (Higher = stricter prompt obedience).
  - VRAM: 6GB (Or max out your GPU if you’re fancy).
  - UNCHECK “TC” (Unless you want your cat to grow a third eye).
“But why 30 seconds?” Because 5-second clips are for amateurs, and 120 seconds will test your sanity. Start small.
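To put the “start small” advice in numbers, here’s the back-of-the-envelope frame math. It assumes 30 fps output (the frame rate of my test clip); `frame_count` is just a helper I made up for the arithmetic.

```python
def frame_count(seconds: float, fps: int = 30) -> int:
    """Number of frames needed for a clip of the given length at the given frame rate."""
    return round(seconds * fps)


# Compare the amateur clip, the sweet spot, and the sanity test.
for seconds in (5, 30, 120):
    print(f"{seconds:>3} s -> {frame_count(seconds)} frames")
```

At that frame rate a 30-second clip is already six times the frames of a 5-second one, and 120 seconds is four times that again, so the wait scales accordingly.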
Step 3: ComfyUI Setup (For the Nerds Who Like Pain)
If Gradio’s too “normie” for you, here’s the ComfyUI method:
- Install ComfyUI (I assume you’ve done this; if not, Google is your therapist).
- Drag in FramePack’s custom nodes (GitHub has the JSON).
- Patchify Like a Pro: Adjust the kernel’s “attention budget” to prioritize foreground details (e.g., the cat’s face) over background fluff.
Download Links
- Custom nodes: https://github.com/kijai/ComfyUI-FramePackWrapper/tree/main?tab=readme-ov-file
- Or grab the model as a single file and place it in `ComfyUI\models\diffusion_models`:
  - https://huggingface.co/Kijai/HunyuanVideo_comfy/blob/main/FramePackI2V_HY_fp8_e4m3fn.safetensors
  - https://huggingface.co/Kijai/HunyuanVideo_comfy/blob/main/FramePackI2V_HY_bf16.safetensors
- Vision encoder: https://huggingface.co/Comfy-Org/sigclip_vision_384/tree/main
- VAE: https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/tree/main/split_files/vae
- Text encoders: https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/tree/main/split_files/text_encoders
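Before you fire up a workflow and get a wall of red nodes, it’s worth checking that the downloads landed where ComfyUI looks for them. The sketch below is a rough sanity check, not an official tool: only `diffusion_models` comes from the instructions above, while the `clip_vision`, `vae`, and `text_encoders` folder names and the “any `.safetensors` file will do” rule are my assumptions based on a stock ComfyUI layout. The model file names are taken from the links above, written with the `.safetensors` extension.

```python
from pathlib import Path

# Folder -> acceptable file names. An empty tuple means "any .safetensors file".
# Folder names other than diffusion_models are assumptions (stock ComfyUI layout).
EXPECTED = {
    "diffusion_models": (
        "FramePackI2V_HY_fp8_e4m3fn.safetensors",
        "FramePackI2V_HY_bf16.safetensors",
    ),
    "clip_vision": (),
    "vae": (),
    "text_encoders": (),
}


def missing_models(comfy_root: str) -> list[str]:
    """Return a list of problems; an empty list means every folder looks populated."""
    problems = []
    models = Path(comfy_root) / "models"
    for folder, any_of in EXPECTED.items():
        directory = models / folder
        if not directory.is_dir():
            problems.append(f"{folder}: folder not found")
        elif any_of and not any((directory / name).is_file() for name in any_of):
            problems.append(f"{folder}: none of {', '.join(any_of)}")
        elif not any_of and not any(directory.glob("*.safetensors")):
            problems.append(f"{folder}: no .safetensors files")
    return problems


if __name__ == "__main__":
    # "ComfyUI" is a placeholder; point it at your own install directory.
    for problem in missing_models("ComfyUI") or ["all model folders look populated"]:
        print(problem)
```

You only need one of the two FramePack model files (fp8 or bf16), which is why the check passes as soon as either is present.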