IndexTTS-2: single-person or two-person TTS
About this model
As of December 2025, this is arguably the best open-source text-to-speech (TTS) project.
The workflows cover single-person and two-person TTS.
It should need about 10 GB of GPU memory.
First, install the custom nodes via ComfyUI:
Then download the models:
V2 model download: manually download the models into the folders listed here, under ComfyUI\models\TTS:
- bigvgan_v2_22khz_80band_256x
    bigvgan_generator.pt
    config.json
- campplus
    campplus_cn_common.bin
- IndexTTS-2
  │ .gitattributes
  │ bpe.model
  │ config.yaml
  │ feat1.pt
  │ feat2.pt
  │ gpt.pth
  │ README.md
  │ s2mel.pth
  │ wav2vec2bert_stats.pt
  │
  └─ qwen0.6bemo4-merge
       added_tokens.json
       chat_template.jinja
       config.json
       generation_config.json
       merges.txt
       model.safetensors
       Modelfile
       special_tokens_map.json
       tokenizer.json
       tokenizer_config.json
       vocab.json
- MaskGCT
    semantic_codec
      model.safetensors
- w2v-bert-2.0
    .gitattributes
    config.json
    conformer_shaw.pt
    model.safetensors
    preprocessor_config.json
    README.md
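If you want to double-check the download before loading the workflow, a small script can look for the main files. This is only a sketch: the folder nesting is inferred from the listing above, the default path is a placeholder, and `EXPECTED_FILES` and `find_missing` are names made up for this example, not part of the custom nodes.

```python
from pathlib import Path

# Expected files, relative to ComfyUI/models/TTS. The nesting is inferred
# from the listing above and is an assumption, not an official manifest.
EXPECTED_FILES = [
    "bigvgan_v2_22khz_80band_256x/bigvgan_generator.pt",
    "bigvgan_v2_22khz_80band_256x/config.json",
    "campplus/campplus_cn_common.bin",
    "IndexTTS-2/bpe.model",
    "IndexTTS-2/config.yaml",
    "IndexTTS-2/feat1.pt",
    "IndexTTS-2/feat2.pt",
    "IndexTTS-2/gpt.pth",
    "IndexTTS-2/s2mel.pth",
    "IndexTTS-2/wav2vec2bert_stats.pt",
    "IndexTTS-2/qwen0.6bemo4-merge/config.json",
    "IndexTTS-2/qwen0.6bemo4-merge/model.safetensors",
    "IndexTTS-2/qwen0.6bemo4-merge/tokenizer.json",
    "MaskGCT/semantic_codec/model.safetensors",
    "w2v-bert-2.0/config.json",
    "w2v-bert-2.0/conformer_shaw.pt",
    "w2v-bert-2.0/model.safetensors",
    "w2v-bert-2.0/preprocessor_config.json",
]


def find_missing(tts_root):
    """Return the expected files that do not exist under tts_root."""
    root = Path(tts_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    # Placeholder path; change it to your own ComfyUI install.
    missing = find_missing("ComfyUI/models/TTS")
    for rel in missing:
        print("missing:", rel)
    if not missing:
        print("all expected files found")
```

Run it with the Python that ComfyUI uses. Any line starting with "missing:" points at a file that still has to be downloaded or moved.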
Note that Windows users may need to install extra wheels, such as Triton and SageAttention.
If ComfyUI prompts you to install them, you can do the following.
Install Triton by running this command:
pip install -U "triton-windows<3.6"
Install SageAttention by finding the wheel that matches your Torch and CUDA versions and running, for example, the following command:
pip install sageattention-2.2.0+cu130torch2.9.0andhigher.post4-cp39-abi3-win_amd64.whl
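To see which wheel matches your setup, you can print the Torch and CUDA versions of the Python environment that ComfyUI uses. The sketch below assumes the `cu130torch2.9.0` part of the example filename follows a "CUDA tag + torch version" pattern; `wheel_local_tag` is a helper name made up for this example.

```python
import sys


def wheel_local_tag(torch_version, cuda_version):
    """Build a tag like 'cu130torch2.9.0' from torch and CUDA version strings."""
    torch_release = torch_version.split("+")[0]       # "2.9.0+cu130" -> "2.9.0"
    cuda_tag = "cu" + cuda_version.replace(".", "")   # "13.0" -> "cu130"
    return cuda_tag + "torch" + torch_release


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this Python environment")
    else:
        if torch.version.cuda is None:
            print("this torch build has no CUDA support")
        else:
            tag = wheel_local_tag(torch.__version__, torch.version.cuda)
            print("look for a wheel whose name contains something like:", tag)
    # The Python tag matters too, e.g. cp39-abi3 wheels target Python 3.9+.
    print("Python tag: cp%d%d" % sys.version_info[:2])
```

The printed tag is only a hint. Wheels marked "andhigher" are meant to cover later Torch versions as well, so pick the closest match rather than expecting an exact one.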
Other:
# 1. First, install pynini using conda (pre-compiled version, to avoid compilation issues on Windows).
conda install -c conda-forge pynini -y
# 2. Then install WeTextProcessing without its dependencies.
pip install WeTextProcessing --no-deps
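After installing, you can check from Python whether the optional packages are visible to ComfyUI's environment. This is a rough sketch. The import names are assumptions: `triton-windows` is imported as `triton`, and WeTextProcessing is assumed to provide a top-level `tn` package. `missing_modules` is a helper name made up for this example.

```python
import importlib.util

# Import names are assumptions: the triton-windows package is imported as
# "triton", and WeTextProcessing is assumed to provide the "tn" package.
OPTIONAL_MODULES = ["triton", "sageattention", "pynini", "tn"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(OPTIONAL_MODULES)
    print("missing:", ", ".join(missing) if missing else "none")
```

Only install what is reported as missing and what ComfyUI actually asks for.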