RDBT - NTYM - ComfyForge

Recalibrated distribution

This model is part of a series of experiments testing theories for improving diffusion models.

Trained from NTYM4 with ~70k images.

Aiming for

Also 2x faster.

Guide

Prompt: Basically the same as NetaYume, except:

A style prompt is required. This model has no default style. The default TV anime style in NetaYume has been nuked.
Put "Digital anime art style by @xxxx." at the end of the prompt to keep Gemma 2 from paying too much, and incorrect, attention to the artist name.
Quality tags are not needed. The dataset is higher quality than the average "masterpiece".
You don't need tons of tags to describe a character. Just use the most unique ones, e.g. "elf girl frieren, fox girl tamamo \(fate\)". See: img.
Prefer simple natural language at the start and tags at the end.
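
The prompt rules above can be sketched as a small helper. This is only an illustration: build_prompt, its parameters, and the exact separators between the parts are assumptions, not something the model requires. It puts the natural language first, the tags next, and the style sentence last, and escapes parentheses in tags the same way as "tamamo \(fate\)".

```python
import re


def build_prompt(description: str, tags: list[str], artist: str) -> str:
    """Compose a prompt: natural language first, tags next, style sentence last.

    Hypothetical helper; the separators used here are an assumption.
    """
    # Escape parentheses that are not already escaped, e.g. "(fate)" -> "\(fate\)".
    escaped = [re.sub(r"(?<!\\)([()])", r"\\\1", tag.strip()) for tag in tags]

    parts = [description.strip()]
    if escaped:
        parts.append(", ".join(escaped) + ".")
    # The style prompt is required and goes at the very end.
    parts.append(f"Digital anime art style by @{artist}.")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_prompt(
        "Two girls are having tea in a garden.",
        ["elf girl frieren", "fox girl tamamo (fate)"],
        "xxxx",  # placeholder artist name, as in the guide
    ))
```

Note that there are no quality tags in the example, and each character is described with only its most distinctive tag.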

Settings:

About CFG distilled model:

You can't control the CFG scale or the negative prompt. Both are trained into the model.
CFG scale = 1 is a special value. It disables CFG and the negative prompt.
Because no forward pass is needed for the negative prompt, generation is about 2x faster.
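
To see where the 2x comes from, here is a toy sketch of a sampling loop that counts forward passes. It assumes the standard CFG formula, uncond + scale * (cond - uncond), and a sampler that skips the negative-prompt pass when the scale is 1. CountingModel and denoise_step are made-up stand-ins, not ComfyUI code or the real model.

```python
class CountingModel:
    """Dummy stand-in for the diffusion model that counts forward passes."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, x: list[float], prompt: str) -> list[float]:
        self.calls += 1
        # Arbitrary toy computation; a real model would predict noise here.
        return [0.5 * xi + 0.01 * len(prompt) for xi in x]


def denoise_step(model, x, prompt, negative_prompt, cfg_scale):
    cond = model(x, prompt)
    if cfg_scale == 1.0:
        # uncond + 1 * (cond - uncond) == cond, so the second pass is skipped.
        return cond
    uncond = model(x, negative_prompt)
    return [u + cfg_scale * (c - u) for c, u in zip(cond, uncond)]


def count_passes(cfg_scale: float, steps: int = 20) -> int:
    model = CountingModel()
    x = [1.0, -1.0]
    for _ in range(steps):
        x = denoise_step(model, x, "a prompt", "a negative prompt", cfg_scale)
    return model.calls


if __name__ == "__main__":
    print("passes with CFG scale 4:", count_passes(4.0))
    print("passes with CFG scale 1:", count_passes(1.0))
```

With the same number of steps, the CFG scale = 1 run makes half as many model calls, which is the whole speedup.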

Some training details

The full dataset contains ~70k images, not equally weighted.

Only layers.[2:25] were trained.
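
A rough sketch of what "layers.[2:25]" could mean in practice, assuming Python slice semantics (blocks 2 through 24, end exclusive) and parameter names of the form "layers.<index>.<...>". Both are assumptions, and the names below are invented for illustration; the real checkpoint may be laid out differently.

```python
import re

# Matches "layers.<index>." at the start of a name or after a dot.
_LAYER_RE = re.compile(r"(?:^|\.)layers\.(\d+)\.")


def is_trainable(param_name: str, start: int = 2, stop: int = 25) -> bool:
    """Return True if the parameter belongs to layers[start:stop] (stop exclusive)."""
    match = _LAYER_RE.search(param_name)
    if match is None:
        return False  # embeddings, final layer, etc. stay frozen
    return start <= int(match.group(1)) < stop


if __name__ == "__main__":
    # Hypothetical parameter names, not taken from the actual model.
    names = [
        "x_embedder.weight",
        "layers.1.attention.qkv.weight",
        "layers.2.attention.qkv.weight",
        "layers.24.feed_forward.w1.weight",
        "layers.25.feed_forward.w1.weight",
        "final_layer.linear.weight",
    ]
    for name in names:
        print(f"{name}: {'train' if is_trainable(name) else 'frozen'}")
```

In a PyTorch training script, the same predicate could be used to set requires_grad on each named parameter.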

Captions are mainly from Gemini. Natural language only, no tags.

Not a LoRA this time?

Multi-stage training. No LoRA.

Versions

v0.1 cfg distilled: bf16 full model.

v0.1 cd tcfp8: CFG distilled, tensorcorefp8 version for ComfyUI. (Has issues, do not download. Will be deleted soon.)