ACE-workflow
About this model
🧠 ACE-Step Model Resources
🎨 ComfyUI Integration (FP16)
Place the single, combined checkpoint (including CLIP & VAE) in your checkpoints/ folder under ComfyUI/models:
📂 ComfyUI/
└── 📂 models/
    └── 📂 checkpoints/
        └── ace_step_v1_3.5b.safetensors
Download the checkpoint here: ace_step_v1_3.5b.safetensors
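A quick way to confirm the file landed in the right place is a small path check. The sketch below is a minimal illustration, not part of ComfyUI or ACE-Step: the helper names `expected_checkpoint_path` and `checkpoint_is_installed` are hypothetical, and it assumes your ComfyUI install uses the default `models/checkpoints/` layout shown above.

```python
from pathlib import Path

# File name taken from the folder layout above.
CHECKPOINT_NAME = "ace_step_v1_3.5b.safetensors"


def expected_checkpoint_path(comfyui_root):
    """Return where the checkpoint should sit under a ComfyUI root (hypothetical helper)."""
    return Path(comfyui_root) / "models" / "checkpoints" / CHECKPOINT_NAME


def checkpoint_is_installed(comfyui_root):
    """True if the checkpoint file exists at the expected location."""
    return expected_checkpoint_path(comfyui_root).is_file()


if __name__ == "__main__":
    # "ComfyUI" is an assumed install directory; replace it with your own path.
    root = "ComfyUI"
    status = "found" if checkpoint_is_installed(root) else "missing"
    print(f"{expected_checkpoint_path(root)}: {status}")
```

If the script reports the file as missing, check that the download was not saved into a nested subfolder or renamed by the browser.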
🎛️ Test Drive
Try ACE-Step interactively in your browser:
🔗 Hugging Face Space – ACE-Step
Model Description
ACE-Step is a novel open-source foundation model for music generation that overcomes key limitations of existing approaches through a holistic architectural design. It integrates diffusion-based generation with Sana's Deep Compression AutoEncoder (DCAE) and a lightweight linear transformer, achieving state-of-the-art performance in generation speed, musical coherence, and controllability.
Key Features
15× faster than LLM-based baselines (about 20 s to generate a 4‑minute track on an A100)
Superior musical coherence across melody, harmony, and rhythm
Full‑song generation, duration control, and natural-language prompting
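To put the speed figure in perspective, the numbers quoted above can be turned into a real-time factor. The sketch below is only back-of-the-envelope arithmetic: the function names are hypothetical, and the implied baseline time assumes the 15× speedup refers to the same 4-minute, A100 task, which the description does not state explicitly.

```python
# Figures quoted in the feature list above.
GENERATION_SECONDS = 20.0      # time to generate on an A100
AUDIO_SECONDS = 4 * 60.0       # 4 minutes of music
SPEEDUP_VS_BASELINE = 15.0     # claimed speedup over LLM-based baselines


def real_time_factor(audio_seconds, generation_seconds):
    """Seconds of audio produced per second of compute."""
    return audio_seconds / generation_seconds


def implied_baseline_seconds(generation_seconds, speedup):
    """Baseline generation time, ASSUMING the speedup applies to the same task."""
    return generation_seconds * speedup


if __name__ == "__main__":
    rtf = real_time_factor(AUDIO_SECONDS, GENERATION_SECONDS)
    baseline = implied_baseline_seconds(GENERATION_SECONDS, SPEEDUP_VS_BASELINE)
    print(f"Real-time factor: {rtf:.0f}x")                   # 240 s / 20 s
    print(f"Implied baseline time: {baseline:.0f} s")        # 20 s * 15
```

Under that assumption, a baseline would need roughly five minutes for the same track, slightly longer than the track itself.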
Uses
Direct Use
Generating original music from text descriptions
Music remixing and style transfer
Editing song lyrics
Downstream Use
Voice cloning applications
Specialized music generation (rap, jazz, etc.)
Music production tools
Creative AI assistants
Out‑of‑Scope Use
Generating copyrighted content without permission
Creating harmful or offensive content
Misrepresenting AI‑generated music as human‑created
🔗 More details & source checkpoint:
👨‍💻 Developer Information
This guide was created by Abdallah Al-Swaiti:
Hugging Face
GitHub
LinkedIn
ComfyUI-OllamaGemini
For additional tools and updates, check out my other repositories.