With Hunyuan text-to-video technology, you can now generate videos in just six simple steps. Yes, you read that right: only six!
Getting Started: Setting Up the Workflow
Here’s how you can create your own Hunyuan video in just a few steps:
- Download the Workflow
  Visit the ComfyUI Blog and download the pre-built workflow tailored for Hunyuan video creation.
- Load ComfyUI
  Launch ComfyUI on your system and load the downloaded workflow. For reference, this tutorial uses a system with 12GB of VRAM.
- Gather Required Files
  You'll need the following files:
  - pytorch_model.pt (save this in the VAE folder of ComfyUI)
  - mp_rank_00_model_states.pt (place this in the Diffusion Model folder)
  - CLIP-L of Flux and Llava Llama3 FP8 (save both in the Text Encoder folder)
- Load the Files into ComfyUI
  - In the VAE loader, select pytorch_model.pt.
  - In the Diffusion Model loader, select mp_rank_00_model_states.pt. If you encounter memory issues, switch the Weight Dtype to FP8.
  - In the Dual CLIP loader, select CLIP-L of Flux and Llava Llama3 FP8, and set the type to Hunyuan Video.
- Configure the Workflow
  - Add your text prompt in the CLIP Text Encoder, set Flux Guidance to 10, and set Model Sampling to 7.
  - In the Empty Latent section, set the resolution to 848×480 for standard output or 720×1224 for HD.
  - In the Basic Scheduler, select the Simple Scheduler and set the steps to 6 for quick results, or up to 20 for enhanced quality.
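The configuration values above can be collected in one place so they are easy to tweak. The dictionary keys below are this sketch's own illustrative names, not actual ComfyUI node fields; the pixel comparison just shows how much more work the HD preset asks of your GPU per frame.

```python
# Illustrative summary of the workflow settings; key names are
# this sketch's own, not ComfyUI node fields.
settings = {
    "flux_guidance": 10,
    "model_sampling": 7,
    "resolution": (848, 480),   # swap in (720, 1224) for HD
    "scheduler": "simple",
    "steps": 6,                 # raise toward 20 for higher quality
}

# Rough cost comparison: pixels per frame for the two presets.
standard_px = 848 * 480
hd_px = 720 * 1224
print(f"HD renders {hd_px / standard_px:.2f}x the pixels of standard")
```

As the ratio shows, the HD preset is more than twice as heavy per frame, which is why the 12GB-VRAM setup in this tutorial defaults to 848×480.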
Download Files
- VAE: pytorch_model.pt
- Diffusion Model: mp_rank_00_model_states.pt
- Text Encoder: llava_llama3_fp8_scaled.safetensors
- CLIP-L: clip_l.safetensors
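To keep the downloads organized, moving the four files into their folders can be sketched as a small script. The ComfyUI base path and the models/* subfolder names are assumptions from a default install (older installs may use models/clip instead of models/text_encoders), so adjust the paths to match your setup.

```python
import os
import shutil

# Assumed default ComfyUI folder layout; adjust to your install.
COMFYUI = "ComfyUI"
DESTINATIONS = {
    "pytorch_model.pt": "models/vae",
    "mp_rank_00_model_states.pt": "models/diffusion_models",
    "clip_l.safetensors": "models/text_encoders",
    "llava_llama3_fp8_scaled.safetensors": "models/text_encoders",
}

for filename, subdir in DESTINATIONS.items():
    target_dir = os.path.join(COMFYUI, subdir)
    os.makedirs(target_dir, exist_ok=True)
    if os.path.exists(filename):  # move only files already downloaded
        shutil.move(filename, os.path.join(target_dir, filename))
        print(f"moved {filename} -> {target_dir}/")
```

Run it from the directory where you saved the downloads; files that are missing are simply skipped.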