svdq-int4_r64-ernie-image

# ERNIE Image Turbo — Nunchaku W4A4 Quantized Inference

[中文](#简介) | [English](#introduction)

---

### Introduction

This fork adds **W4A4 quantized inference** (4-bit weights, 4-bit activations) for ERNIE Image Turbo to **Nunchaku**. In the reference benchmark under Performance it runs 1.74x faster than the BF16 baseline, with reduced memory use and minimal quality loss.

Built on Nunchaku. We gratefully acknowledge the Nunchaku authors' excellent work on efficient diffusion model inference.
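To give a rough sense of where the memory reduction comes from, here is a back-of-the-envelope sketch for the weights of a single linear layer. It assumes the `svdq` / `r64` naming refers to an SVDQuant-style layout: a 16-bit low-rank branch of rank 64 plus a 4-bit quantized residual with one 16-bit scale per group of 64 weights. The layer size, group size, and scale width are illustrative assumptions, not values taken from this repository.

```python
def bf16_weight_bytes(d_in, d_out):
    """Bytes needed to store a d_in x d_out weight matrix in BF16."""
    return 2 * d_in * d_out


def svdq_w4_weight_bytes(d_in, d_out, rank=64, group_size=64, scale_bytes=2):
    """Estimated bytes for the same matrix under the assumed 4-bit layout."""
    packed = d_in * d_out // 2                 # 4-bit residual, two weights per byte
    low_rank = 2 * rank * (d_in + d_out)       # two 16-bit low-rank factors
    scales = (d_in * d_out // group_size) * scale_bytes  # per-group scales
    return packed + low_rank + scales


# Hypothetical 4096 x 4096 layer, chosen only for illustration.
full = bf16_weight_bytes(4096, 4096)
quantized = svdq_w4_weight_bytes(4096, 4096)
print(f"BF16: {full / 2**20:.1f} MiB, W4 + rank-64 branch: {quantized / 2**20:.1f} MiB")
print(f"ratio: {full / quantized:.2f}x")
```

This counts transformer weights only. Activations and the other pipeline components are not covered, so end-to-end savings will be smaller than the per-layer ratio.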


### Installation

```bash
# This fork adds ERNIE Image support to Nunchaku
git clone 
cd nunchaku
git submodule update --init --recursive

pip install build
NUNCHAKU_BUILD_WHEELS=1 python -m build --wheel --no-isolation
pip install dist/nunchaku-*.whl
```
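After the wheel is installed, a quick environment check can confirm that the packages used in the Quick Start are visible to the interpreter. This is a minimal sketch using only the standard library; `missing_modules` is a hypothetical helper name, and it checks only that the modules can be found, not that the build includes ERNIE Image support.

```python
import importlib.util


def missing_modules(names=("torch", "diffusers", "nunchaku")):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules()
print("all modules found" if not missing else f"missing: {', '.join(missing)}")
```

If nothing is reported missing, the imports at the top of the Quick Start are the real test that the fork's ERNIE Image classes are available.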

### Quick Start

```python
import torch
from diffusers.pipelines.ernie_image.pipeline_ernie_image import ErnieImagePipeline
from nunchaku import NunchakuErnieImageTransformer2DModel
from nunchaku.utils import get_precision

precision = get_precision()  # auto-detect: "int4" or "fp4"
rank = 64

transformer = NunchakuErnieImageTransformer2DModel.from_pretrained(
    f"ZJMuYun97/ERNIE-Image-Nunchaku/svdq-{precision}_r{rank}-ernie-image.safetensors",
    torch_dtype=torch.bfloat16,
    device="cuda",
)

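pipe = ErnieImagePipeline.from_pretrained(
    "baidu/ERNIE-Image-Turbo",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
    pe=None, pe_tokenizer=None,  # passing None skips loading these components
).to("cuda")  # place the remaining components on the same GPU as the transformer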

image = pipe(
    prompt="a cute orange cat sitting on a sunlit windowsill",
    height=1024, width=1024,
    num_inference_steps=8,
    guidance_scale=1.0,
    generator=torch.Generator().manual_seed(42),
).images[0]
image.save("ernie-image.png")
```
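The checkpoint path in the Quick Start is assembled from the detected precision and the rank. The sketch below wraps that f-string in a small helper that rejects unexpected precision values early. `checkpoint_path` and `VALID_PRECISIONS` are hypothetical names, and which precision and rank combinations are actually published in the repository is not stated here, so treat anything other than the `int4` / rank 64 file as unverified.

```python
VALID_PRECISIONS = ("int4", "fp4")  # the two values named in the Quick Start comment


def checkpoint_path(precision, rank=64, repo="ZJMuYun97/ERNIE-Image-Nunchaku"):
    """Build the checkpoint path using the same pattern as the Quick Start."""
    if precision not in VALID_PRECISIONS:
        raise ValueError(f"unexpected precision {precision!r}, expected one of {VALID_PRECISIONS}")
    return f"{repo}/svdq-{precision}_r{rank}-ernie-image.safetensors"


print(checkpoint_path("int4"))
```

In the Quick Start, the result of `checkpoint_path(get_precision(), rank)` would be passed to `NunchakuErnieImageTransformer2DModel.from_pretrained` in place of the inline f-string.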

### Performance (Reference)

Tested on a single A800 GPU, 1024×1024 resolution, 8 inference steps:

| Model | Avg Latency | Speedup |
|-------|-------------|---------|
| Original BF16 | 4.89s | 1.0x |
| **Nunchaku W4A4** | **2.81s** | **1.74x** |
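The exact benchmarking procedure behind these numbers is not described, so the following is only a sketch of how an average latency and the speedup column could be reproduced. `average_latency` and `speedup` are hypothetical helpers; the warmup and run counts are arbitrary defaults.

```python
import statistics
import time


def average_latency(fn, warmup=2, runs=5, sync=None):
    """Mean wall-clock time of fn() in seconds, measured after warmup calls."""
    for _ in range(warmup):
        fn()
    timings = []
    for _ in range(runs):
        if sync is not None:
            sync()  # drain pending GPU work before starting the clock
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait for the GPU to finish before stopping the clock
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings)


def speedup(baseline_seconds, optimized_seconds):
    """Ratio of baseline latency to optimized latency."""
    return baseline_seconds / optimized_seconds


# Latencies from the table above: BF16 baseline vs. Nunchaku W4A4.
print(f"{speedup(4.89, 2.81):.2f}x")  # prints 1.74x
```

When timing a GPU pipeline, pass `sync=torch.cuda.synchronize` so that asynchronous kernel launches do not make a run look faster than it is.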

### Notes

- Only `batch_size=1` is supported, which matches the typical single-image inference use case.
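Because batched calls are not supported, several prompts can be handled by calling the pipeline once per prompt. The sketch below shows one way to do that; `generate_one_by_one` and `generator_for` are hypothetical names, and `pipe` is assumed to be the pipeline object built in the Quick Start.

```python
def generate_one_by_one(pipe, prompts, generator_for=None, **pipe_kwargs):
    """Run the pipeline once per prompt so each call stays at batch size 1."""
    images = []
    for index, prompt in enumerate(prompts):
        kwargs = dict(pipe_kwargs)
        if generator_for is not None:
            # Build a fresh generator per prompt for reproducible results.
            kwargs["generator"] = generator_for(index)
        # A single string prompt (not a list) keeps the call unbatched.
        images.append(pipe(prompt=prompt, **kwargs).images[0])
    return images


# Example usage with the Quick Start settings (requires torch and a loaded pipe):
# images = generate_one_by_one(
#     pipe,
#     ["a cute orange cat sitting on a sunlit windowsill", "a red bicycle in the rain"],
#     generator_for=lambda i: torch.Generator().manual_seed(42 + i),
#     height=1024, width=1024, num_inference_steps=8, guidance_scale=1.0,
# )
```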
---


### 简介

本 fork 为 **Nunchaku** 添加了对 ERNIE Image Turbo 的 **W4A4 量化推理**支持，在保持图像质量的前提下显著提升推理速度、降低显存占用。

本实现基于 Nunchaku。

### 安装

```bash
# 本 fork 基于 Nunchaku 添加了对 ERNIE Image 的支持
git clone 
cd nunchaku
git submodule update --init --recursive

pip install build
NUNCHAKU_BUILD_WHEELS=1 python -m build --wheel --no-isolation
pip install dist/nunchaku-*.whl
```

### 快速开始

```python
import torch
from diffusers.pipelines.ernie_image.pipeline_ernie_image import ErnieImagePipeline
from nunchaku import NunchakuErnieImageTransformer2DModel
from nunchaku.utils import get_precision

precision = get_precision()  # 自动检测：int4 或 fp4
rank = 64

transformer = NunchakuErnieImageTransformer2DModel.from_pretrained(
    f"ZJMuYun97/ERNIE-Image-Nunchaku/svdq-{precision}_r{rank}-ernie-image.safetensors",
    torch_dtype=torch.bfloat16,
    device="cuda",
)

pipe = ErnieImagePipeline.from_pretrained(
    "baidu/ERNIE-Image-Turbo",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
    pe=None, pe_tokenizer=None,  # 传入 None 表示不加载这些组件
).to("cuda")  # 将其余组件放到与 transformer 相同的 GPU 上

image = pipe(
    prompt="一只可爱的橘色猫咪坐在阳光照射的窗台上，旁边放着一盆绿色植物",
    height=1024, width=1024,
    num_inference_steps=8,
    guidance_scale=1.0,
    generator=torch.Generator().manual_seed(42),
).images[0]
image.save("ernie-image.png")
```


### 性能参考

A800 单卡测试，1024×1024 分辨率，8 步推理：

| 模型 | 平均延迟 | 加速比 |
|------|---------|--------|
| 原始 BF16 | 4.89s | 1.0x |
| **Nunchaku W4A4** | **2.81s** | **1.74x** |

### 注意事项

- 仅支持 `batch_size=1`（符合常见推理场景）。
