IndexTTS2: Vocal and Emotional Transfer | Two-Person Dialogue + Single-Person Speaking Workflow
About this model
You can try the workflow directly via the link below. If the results are good, you can deploy it locally.
Fan benefits: register to get 1000 points, plus 100 points for each daily login, and run on a 4090! Experience the power of 48G.
IndexTTS2 is a workflow for cloning human voices and emotions. It can generate emotional audio for single-person speech or a two-person conversation. It works better than previous models, which produced stiff vocals, and is strongly recommended. Deploying it in ComfyUI is relatively difficult. First, the transformers version needs to be 4.51.0; also make sure the json5 module is installed.
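The two requirements above can be checked before starting ComfyUI. The following is a minimal sketch, not part of the workflow itself. It assumes that "needs to be 4.51.0" means an exact version match and that the packages are installed under the names `transformers` and `json5`. The helper name `find_problems` is hypothetical.

```python
from importlib import metadata
from importlib.util import find_spec

# Version stated in the description above; treated here as an exact pin (assumption).
REQUIRED_TRANSFORMERS = "4.51.0"


def find_problems(transformers_version, json5_available, required=REQUIRED_TRANSFORMERS):
    """Return a list of human-readable problems; an empty list means both checks pass."""
    problems = []
    if transformers_version is None:
        problems.append("transformers is not installed")
    elif transformers_version != required:
        problems.append(
            f"transformers {transformers_version} found, {required} expected"
        )
    if not json5_available:
        problems.append("json5 module is not installed")
    return problems


def installed_transformers_version():
    """Look up the installed transformers version, or None if it is missing."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    issues = find_problems(
        installed_transformers_version(),
        find_spec("json5") is not None,
    )
    for issue in issues:
        print("Problem:", issue)
    if not issues:
        print("Environment looks fine for this workflow.")
```

Run it with the same Python interpreter that ComfyUI uses, since a system-wide Python may have different packages installed.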
Project page:
Model download link:
Model placement structure:
- bigvgan_v2_22khz_80band_256x
    bigvgan_generator.pt
    config.json
- campplus
    campplus_cn_common.bin
- IndexTTS-2
  │  .gitattributes
  │  bpe.model
  │  config.yaml
  │  feat1.pt
  │  feat2.pt
  │  gpt.pth
  │  README.md
  │  s2mel.pth
  │  wav2vec2bert_stats.pt
  │
  └─ qwen0.6bemo4-merge
        added_tokens.json
        chat_template.jinja
        config.json
        generation_config.json
        merges.txt
        model.safetensors
        Modelfile
        special_tokens_map.json
        tokenizer.json
        tokenizer_config.json
        vocab.json
- MaskGCT
    semantic_codec
        model.safetensors
- w2v-bert-2.0
    .gitattributes
    config.json
    conformer_shaw.pt
    model.safetensors
    preprocessor_config.json
    README.md
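Because a misplaced file is an easy mistake with this many folders, the placement can be checked with a short script. This is a minimal sketch under stated assumptions. The parent directory that holds these folders is not given here, so it is passed in as `models_root`. The nesting follows the listing above (for example, `model.safetensors` inside `MaskGCT/semantic_codec`). The `README.md` and `.gitattributes` files are skipped on the assumption that they are not needed at load time. The names `EXPECTED_FILES` and `missing_files` are hypothetical.

```python
import sys
from pathlib import Path

# Relative paths taken from the placement listing above
# (README.md and .gitattributes are intentionally left out).
EXPECTED_FILES = [
    "bigvgan_v2_22khz_80band_256x/bigvgan_generator.pt",
    "bigvgan_v2_22khz_80band_256x/config.json",
    "campplus/campplus_cn_common.bin",
    "IndexTTS-2/bpe.model",
    "IndexTTS-2/config.yaml",
    "IndexTTS-2/feat1.pt",
    "IndexTTS-2/feat2.pt",
    "IndexTTS-2/gpt.pth",
    "IndexTTS-2/s2mel.pth",
    "IndexTTS-2/wav2vec2bert_stats.pt",
    "IndexTTS-2/qwen0.6bemo4-merge/added_tokens.json",
    "IndexTTS-2/qwen0.6bemo4-merge/chat_template.jinja",
    "IndexTTS-2/qwen0.6bemo4-merge/config.json",
    "IndexTTS-2/qwen0.6bemo4-merge/generation_config.json",
    "IndexTTS-2/qwen0.6bemo4-merge/merges.txt",
    "IndexTTS-2/qwen0.6bemo4-merge/model.safetensors",
    "IndexTTS-2/qwen0.6bemo4-merge/Modelfile",
    "IndexTTS-2/qwen0.6bemo4-merge/special_tokens_map.json",
    "IndexTTS-2/qwen0.6bemo4-merge/tokenizer.json",
    "IndexTTS-2/qwen0.6bemo4-merge/tokenizer_config.json",
    "IndexTTS-2/qwen0.6bemo4-merge/vocab.json",
    "MaskGCT/semantic_codec/model.safetensors",
    "w2v-bert-2.0/config.json",
    "w2v-bert-2.0/conformer_shaw.pt",
    "w2v-bert-2.0/model.safetensors",
    "w2v-bert-2.0/preprocessor_config.json",
]


def missing_files(models_root):
    """Return the expected relative paths that are not regular files under models_root."""
    root = Path(models_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    # The root directory is an assumption supplied by the user, e.g. as the first argument.
    root_arg = sys.argv[1] if len(sys.argv) > 1 else "."
    for rel in missing_files(root_arg):
        print("Missing:", rel)
```

Pass the folder that contains `IndexTTS-2`, `MaskGCT`, and the other directories as the first argument. Every path printed is a file that still needs to be downloaded or moved.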