LTX-2.3 Dev Audio+Image To Video (GGUF)
About this model
Base workflow for Audio+Image to Video with the Dev model, set up to use as little VRAM as possible.
It can also generate text to video with an audio reference (switch the red boolean node to TRUE).
I suggest leaving the prompt alone unless you want to prompt for a specific motion or action to occur.
prompt:
"Transform this static image into a high-quality video with realistic facial expressions and realistic motion.
Perfect lip-sync to the attached audio."
FILES:
OPTIONAL: Kijai's fp8 Scaled (requires the load diffusion model node instead of the unet loader node, and replaces the GGUF entirely)
DEV GGUF (distilled GGUFs are in the repo as well)
Gemma 3_12B FP4 text encoder
Audio VAE
Video VAE
Text Projection text encoder
Distill LoRA
Upscaler
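Before loading the workflow, it can help to check that every file in the list above is where the loader nodes will look for it. The sketch below is only an illustration: the subfolder names follow common ComfyUI conventions and are assumptions, and every file name is a placeholder rather than the actual release name, so edit `EXPECTED` to match what you downloaded.

```python
from pathlib import Path

# Hypothetical layout. Folder names are assumed ComfyUI conventions and the
# file names are placeholders; replace both with your real paths.
EXPECTED = {
    "DEV GGUF": "models/unet/DEV_GGUF_PLACEHOLDER.gguf",
    "Gemma text encoder": "models/text_encoders/GEMMA_PLACEHOLDER.safetensors",
    "Text Projection": "models/text_encoders/TEXT_PROJECTION_PLACEHOLDER.safetensors",
    "Audio VAE": "models/vae/AUDIO_VAE_PLACEHOLDER.safetensors",
    "Video VAE": "models/vae/VIDEO_VAE_PLACEHOLDER.safetensors",
    "Distill LoRA": "models/loras/DISTILL_LORA_PLACEHOLDER.safetensors",
    "Upscaler": "models/latent_upscale_models/UPSCALER_PLACEHOLDER.safetensors",
}


def missing_files(comfy_root, expected=EXPECTED):
    """Return {label: relative_path} for every expected file not found on disk."""
    root = Path(comfy_root)
    return {
        label: rel
        for label, rel in expected.items()
        if not (root / rel).is_file()
    }


if __name__ == "__main__":
    # "ComfyUI" is an assumed install directory name.
    for label, rel in missing_files("ComfyUI").items():
        print(f"missing: {label} -> {rel}")
```

If you use the optional fp8 Scaled model instead of the GGUF, swap the "DEV GGUF" entry for that file and the folder your load diffusion model node reads from.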