Hey there! Today, I’m sharing my experience setting up the new LTX Video Model in Comfy UI. If you want to create videos quickly on your local machine without relying on cloud services, this guide is for you. Let’s jump right in!
What You’ll Need
- Comfy UI: Make sure it’s updated to the latest version.
- Custom Nodes: Install the “Video Helper” node (I’ll show you how).
- Models: Download the LTX model (v0.95) and T5 text encoder.
Step 1: Install the Video Helper Node
- Open Comfy UI and go to Manager > Custom Nodes Manager.
- Search for “Video Helper” and click Install.
- Restart Comfy UI after installation.
Pro Tip: If nodes don’t show up, use the “Update All” button in the Manager to refresh everything.
Step 2: Update Comfy UI
- In the Manager, click “Update All” to get the latest features and fixes.
- After updating, check that the version and date shown in the Manager match the latest release.
Step 3: Download & Load Models
- LTX Model:
  - Download from the link in the workflow: ltx-video-2b-v0.9.5.safetensors
  - Place it in models/checkpoints.
- T5 Text Encoder:
  - Download t5xxl_fp16.safetensors or t5xxl_fp8_e4m3fn.safetensors
  - Choose the larger fp16 version if you have plenty of VRAM, or the smaller fp8 version for lower-memory GPUs.
  - Save it in models/text_encoders.
- Refresh Comfy UI (Edit > Refresh) to detect new models.
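If a model doesn't appear in the dropdowns after refreshing, it usually means the file landed in the wrong folder. This small Python sketch (not part of Comfy UI; the install location is an assumption you should adjust) checks that both files are where Comfy UI expects them:

```python
from pathlib import Path

def expected_model_path(comfy_root: str, subdir: str, filename: str) -> Path:
    """Build the path where Comfy UI looks for a given model file."""
    return Path(comfy_root) / subdir / filename

# Assumed install location -- change this to match your setup.
COMFY_ROOT = "ComfyUI"

checks = [
    ("models/checkpoints", "ltx-video-2b-v0.9.5.safetensors"),
    # Swap in t5xxl_fp8_e4m3fn.safetensors if you chose the smaller encoder.
    ("models/text_encoders", "t5xxl_fp16.safetensors"),
]

for subdir, name in checks:
    p = expected_model_path(COMFY_ROOT, subdir, name)
    print(f"{'OK' if p.exists() else 'MISSING'}: {p}")
```

Run it from the folder that contains your Comfy UI install; any MISSING line tells you which file to move.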
Step 4: Run the Text-to-Video Workflow
- Load Models:
- In Load Checkpoint, select the LTX model.
- In Load CLIP, pick the T5 encoder.
- Set Prompts:
- Add a descriptive prompt (e.g., “a cat playing in snow”).
- Include a negative prompt to exclude unwanted elements.
- Adjust Settings:
- Resolution: Start with 768×512 for testing. Ensure dimensions are divisible by 32.
- Frames: For a 4-second video at 24 FPS, set frames to 97 (24×4 + 1).
- Click Run! On my RTX 4090, videos generate in under 20 seconds.
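The two settings above follow simple arithmetic, sketched here as two tiny helpers (hypothetical names, not Comfy UI nodes): frame count is fps × seconds + 1, and each dimension must be a multiple of 32.

```python
def ltx_frame_count(seconds: float, fps: int = 24) -> int:
    """Frame count for the workflow: fps * seconds + 1 (the +1 is the starting frame)."""
    return int(fps * seconds) + 1

def snap_to_32(value: int) -> int:
    """Round a resolution dimension down to the nearest multiple of 32."""
    return (value // 32) * 32

print(ltx_frame_count(4))                 # 97 frames for a 4-second clip
print(snap_to_32(768), snap_to_32(512))   # 768 512 -- already valid
print(snap_to_32(1080))                   # 1056 -- 1080 is not divisible by 32
```

Handy when you want a non-standard clip length or need to trim an odd resolution down to something the model accepts.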
What to Expect: The model isn’t perfect for realism (especially humans/animals), but it’s lightning-fast for experimenting.
Step 5: Try the Image-to-Video Workflow
- Load an Image: Use the Load Image node. Match the image size to your target video resolution.
- Reduce Compression: Lower the default compression value (I use 5) for smoother results.
- Set Frame Count: For a 5-second video, set 121 frames (24 × 5 + 1).
- Hit Run! This workflow often delivers better consistency than text-to-video.
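Matching the input image to the video resolution is easiest with a scale-then-crop: scale the image until it covers the target frame, then center-crop. This sketch (my own helper, not a Comfy UI node) computes the intermediate resize size:

```python
def fit_to_video_size(img_w: int, img_h: int,
                      target_w: int = 768, target_h: int = 512):
    """Return the (width, height) to resize an image to so it fully covers
    the target video frame; center-crop to target_w x target_h afterwards."""
    scale = max(target_w / img_w, target_h / img_h)
    return round(img_w * scale), round(img_h * scale)

# A 1024x1024 source scales to 768x768, then gets center-cropped to 768x512.
print(fit_to_video_size(1024, 1024))  # (768, 768)
```

Any image editor (or an Image Resize node) can do the actual resize and crop once you know the numbers.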
Advanced: Control with Multiple Frames
- Start & End Frames: Load two images (e.g., a zoomed-out scene and a close-up). The model blends them.
- Add Mid-Frames: Insert a third image (like a character jumping) to guide transitions.
- Tweak prompts and seeds for smoother motion.
Tips for Better Results
- Detailed Prompts: Describe motion explicitly. Example: “A waterfall flowing rapidly with mist.”
- Seed Values: Experiment with seeds to find better outputs.
- Upscale Later: Use tools like Topaz Video AI to enhance resolution post-generation.
Final Thoughts
The LTX model is a fun, fast way to experiment with AI-generated videos. While it’s not flawless, the speed lets you test ideas quickly. Give it a try, and share your results on Discord—I’d love to see what you create!