Hunyuan Video: First Try with ComfyUI’s Native Workflow
I finally got around to testing Hunyuan’s text-to-video model in ComfyUI, and honestly, it’s way simpler than I expected. No complicated API keys or external tools—just drag, load, and run. Here’s how it went.
I started with the pre-built workflow from the ComfyUI examples page. No setup needed: I downloaded the example image and dragged it onto my canvas (ComfyUI embeds the workflow in the image's metadata). The workflow auto-prompted me to download the required models, which was a nice touch. If you're doing this manually, you'll need four files (a download sketch follows the list):
- hunyuan_video_vae_bf16.safetensors (for the VAE loader)
- hunyuan_video_t2v_720p_bf16.safetensors (diffusion model)
- clip_l.safetensors and llava_llama3_fp8_scaled.safetensors (text encoders)
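If you'd rather script the downloads, here's a minimal sketch using huggingface_hub. The repo id and sub-paths are my assumptions based on the repackaged Comfy-Org uploads linked from the examples page, so verify them before running:

```python
import pathlib
import shutil

from huggingface_hub import hf_hub_download

COMFY = pathlib.Path("ComfyUI")  # adjust to your install path
REPO = "Comfy-Org/HunyuanVideo_repackaged"  # assumed repo id; verify on Hugging Face

# remote path in the repo -> ComfyUI model subfolder (paths are assumptions)
FILES = {
    "split_files/vae/hunyuan_video_vae_bf16.safetensors": "models/vae",
    "split_files/diffusion_models/hunyuan_video_t2v_720p_bf16.safetensors": "models/diffusion_models",
    "split_files/text_encoders/clip_l.safetensors": "models/text_encoders",
    "split_files/text_encoders/llava_llama3_fp8_scaled.safetensors": "models/text_encoders",
}

for remote, subdir in FILES.items():
    cached = hf_hub_download(repo_id=REPO, filename=remote)  # lands in the HF cache
    dest = COMFY / subdir / pathlib.Path(remote).name
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy(cached, dest)
    print("placed", dest)
```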
Loading the Models
The workflow uses a DualCLIPLoader node for the text encoders. I dropped clip_l.safetensors and llava_llama3_fp8_scaled.safetensors into the ComfyUI/models/text_encoders folder, and ComfyUI detected them right away. The VAE lives in models/vae and the diffusion model in models/diffusion_models; once everything was in place, I just selected the files in their respective loader nodes. The quick check below confirms the layout.
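A tiny script is enough to verify the files ended up where the loader nodes look (note: older ComfyUI installs used models/unet instead of models/diffusion_models):

```python
import pathlib

COMFY = pathlib.Path("ComfyUI")  # adjust to your install path
expected = [
    "models/vae/hunyuan_video_vae_bf16.safetensors",
    "models/diffusion_models/hunyuan_video_t2v_720p_bf16.safetensors",
    "models/text_encoders/clip_l.safetensors",
    "models/text_encoders/llava_llama3_fp8_scaled.safetensors",
]
for rel in expected:
    status = "ok" if (COMFY / rel).exists() else "MISSING"
    print(f"{status:8} {rel}")
```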
Running the Workflow
I kept most settings at their defaults but tweaked two things (both are easy to script; see the sketch after the list):
- Resolution: The EmptyHunyuanLatentVideo node defaults to 720p. You can lower it to 480p if you're tight on VRAM.
- Sampling steps: I tried 6 steps for a quick test (took ~4 minutes on my 4090), but bumping it to 20 gave noticeably smoother motion.
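Both tweaks can be applied programmatically if you export the workflow via Save (API Format) and POST it to the local server. A sketch, assuming a hypothetical export named hunyuan_t2v_api.json and the default server port; depending on the workflow, the steps setting may live on a KSampler or a BasicScheduler node:

```python
import json
import urllib.request

# "hunyuan_t2v_api.json" is a hypothetical filename for the API-format export
with open("hunyuan_t2v_api.json") as f:
    wf = json.load(f)

for node in wf.values():
    ctype = node.get("class_type")
    if ctype == "EmptyHunyuanLatentVideo":
        # roughly 480p; dimensions should stay multiples of 16
        node["inputs"].update(width=848, height=480)
    elif ctype in ("KSampler", "BasicScheduler"):
        node["inputs"]["steps"] = 20  # 6 renders fast but motion looks rough

req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",  # default ComfyUI address/port
    data=json.dumps({"prompt": wf}).encode(),
    headers={"Content-Type": "application/json"},
)
print(urllib.request.urlopen(req).read().decode())
```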
Unexpected Wins
- The model handles bilingual prompts (Chinese/English) without extra config.
- Setting the length to "1" in EmptyHunyuanLatentVideo generates a static image, which is handy for testing compositions (sketched below).
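The single-frame trick is scriptable the same way. A sketch against the same hypothetical hunyuan_t2v_api.json export, assuming the prompt goes through a standard CLIPTextEncode node (this overwrites every text-encode node, which is fine for the single-prompt Hunyuan graph):

```python
import json

with open("hunyuan_t2v_api.json") as f:  # same hypothetical export as above
    wf = json.load(f)

for node in wf.values():
    ctype = node.get("class_type")
    if ctype == "EmptyHunyuanLatentVideo":
        node["inputs"]["length"] = 1  # one frame -> static image
    elif ctype == "CLIPTextEncode":
        # mixed Chinese/English prompt, no extra config needed
        node["inputs"]["text"] = "一只猫在花园里散步, a cat strolling through a garden"

# queue it with the same POST to /prompt as in the previous sketch
```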
Annoyances
- The VAE is heavy. Even at 720p, I hit VRAM limits until I switched the diffusion model's weight dtype to FP8 in its loader node (sketched below).
- Outputs sometimes ignore the prompt for the first few frames. Adding "--v 2" to the prompt helped, but it's hit-or-miss.
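For reference, here's roughly what the FP8 switch looks like against the exported workflow. The weight_dtype input sits on the diffusion model loader (class_type UNETLoader in the API-format JSON); the exact option name is my assumption, so check the node's dropdown in your build:

```python
import json

with open("hunyuan_t2v_api.json") as f:  # same hypothetical export as above
    wf = json.load(f)

for node in wf.values():
    if node.get("class_type") == "UNETLoader":
        # assumed option name; check the weight_dtype dropdown in your build
        node["inputs"]["weight_dtype"] = "fp8_e4m3fn"
```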
For more details, the Hunyuan Video GitHub repo breaks down the architecture. Or just grab the workflow and tweak it—it’s surprisingly flexible.
Download Files
- VAE: hunyuan_video_vae_bf16.safetensors
- Diffusion Model: hunyuan_video_t2v_720p_bf16.safetensors
- Text Encoder: llava_llama3_fp8_scaled.safetensors
- CLIP L: clip_l.safetensors