vLLM Prompt Node for ComfyUI

A ComfyUI custom node that generates Stable Diffusion prompts using a locally running vLLM server. Supports wildcard expansion and a fixed prefix for quality tags or style anchors.

Installation

Clone or copy this folder into your ComfyUI/custom_nodes/ directory:

cd ComfyUI/custom_nodes
git clone <repository-url>

Then restart ComfyUI.

Requirements
A running vLLM server (see vLLM docs)
Python package: requests (pip install requests)
ComfyUI
Setup
Start your local vLLM server. The node automatically detects whichever model is currently loaded, so you do not need to specify it in the node.
Example launch:
vllm serve ./models/Qwen2.5-3B \
--host 0.0.0.0 \
--port 8765 \
--served-model-name Qwen2.5-3B
Note: The node queries /v1/models on each generation and uses the first model returned. To switch models, restart your vLLM server with the new model; the node picks it up on the next generation.
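
The model lookup described in the note can be sketched roughly as follows. This is a minimal illustration, not the node's actual code: the function names `first_model_id` and `detect_model` are hypothetical, and it uses the standard library's urllib rather than requests so it runs without extra packages. It assumes the OpenAI-compatible response shape that vLLM returns from /v1/models (a `data` list whose entries carry an `id`).

```python
import json
import urllib.request


def first_model_id(models_payload: dict) -> str:
    """Return the id of the first model in an OpenAI-style /v1/models payload."""
    models = models_payload.get("data") or []
    if not models:
        raise RuntimeError("vLLM server reported no loaded models")
    return models[0]["id"]


def detect_model(host: str = "localhost", port: int = 8765, timeout: float = 5.0) -> str:
    """Ask the server which models it serves and pick the first one.

    Hypothetical helper: host and port mirror the node's defaults,
    the timeout value is an assumption.
    """
    url = f"http://{host}:{port}/v1/models"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return first_model_id(json.load(response))
```

With the example launch command above, `detect_model()` would be expected to return the name passed to --served-model-name.
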
Node Inputs

| Input | Type | Default | Description |
|---|---|---|---|
| prompt | STRING | | Generation instruction. Supports {wild\|card} syntax. |
| prefix | STRING | masterpiece, best quality, highres | Fixed tags prepended to the output. Not sent to the model. |
| host | STRING | localhost | vLLM server host. |
| port | INT | 8765 | vLLM server port. |
| max_tokens | INT | 128 | Maximum tokens to generate. |
| temperature | FLOAT | 0.7 | Sampling temperature. Higher = more creative. |
| retries | INT | 3 | How many times to retry on empty or failed responses. |

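
The retries input can be pictured with a small loop like the one below. This is only a sketch under stated assumptions: `generate_with_retries` is a hypothetical name, `retries` is treated as the total number of attempts, and an empty string is returned when every attempt comes back empty. The node itself may count attempts or report failures differently.

```python
def generate_with_retries(generate, retries=3):
    """Call generate() until it returns non-empty text, at most `retries` times.

    Assumption: `retries` is the total number of attempts, not extra attempts
    after the first one.
    """
    last_error = None
    for _ in range(max(1, retries)):
        try:
            text = generate().strip()
        except Exception as exc:  # broad on purpose: any failure counts as a failed response
            last_error = exc
            continue
        last_error = None
        if text:
            return text
    if last_error is not None:
        # The final attempt failed outright, so surface that error.
        raise RuntimeError("generation failed after retries") from last_error
    # Every attempt completed but produced only whitespace.
    return ""
```

Here `generate` stands for any zero-argument callable that performs one request and returns the raw generated text.
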
Node Output

| Output | Type | Description |
|---|---|---|
| combined_prompt | STRING | prefix + generated text, ready to wire into CLIPTextEncode |

The node displays a live preview after each generation showing:
Prefix
Raw generated text
Final combined string
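
As a rough illustration of how the final combined string could be assembled from the prefix and the raw generated text, consider the sketch below. The helper name `combine_prompt` and the ", " separator are assumptions; the README only states that the output is the prefix followed by the generated text.

```python
def combine_prompt(prefix: str, generated: str) -> str:
    """Join prefix and generated tags into one comma-separated string.

    Assumption: parts are joined with ", " and stray leading or trailing
    commas and whitespace are trimmed first.
    """
    cleaned = [part.strip(" ,\t\n") for part in (prefix, generated)]
    return ", ".join(part for part in cleaned if part)
```

An empty prefix or an empty generation simply drops out of the result, so the output never starts or ends with a dangling comma.
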
Wildcard Syntax
Use {option1|option2|option3} anywhere in your prompt. One option is chosen at random each run. Multiple wildcards are resolved independently.
A {red|blue|green} dragon, {breathing fire into the sky|coiled around a mountain peak in a storm|diving into a glowing ocean abyss|rearing up against a blood moon}
Wildcards are expanded before the prompt is sent to the model, so the model always receives a fully resolved string.
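
The expansion step can be sketched in a few lines of Python. This is an illustrative version, not necessarily the node's implementation: `expand_wildcards` is a hypothetical name, and nested braces are assumed to be unsupported.

```python
import random
import re

# Matches one {a|b|c} group; nested braces are not handled (assumption).
WILDCARD = re.compile(r"\{([^{}]*)\}")


def expand_wildcards(prompt, rng=None):
    """Replace every {a|b|c} group with one randomly chosen option.

    Each group is resolved independently. Pass a seeded random.Random
    as `rng` for reproducible results.
    """
    rng = rng or random
    return WILDCARD.sub(lambda match: rng.choice(match.group(1).split("|")), prompt)
```

For example, `expand_wildcards("A {red|blue|green} dragon")` returns one of "A red dragon", "A blue dragon", or "A green dragon", and text without braces passes through unchanged.
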
Example Workflow
VLLMPromptNode ──→ CLIPTextEncode (positive) ──→ KSampler
                                                    ↑
                   CLIPTextEncode (negative) ───────┘
Prompt Format
The node uses the completions endpoint with a structured format that steers the model toward returning comma-separated tags only:
### Stable Diffusion prompt tags (comma separated, no sentences):

Input: <your expanded prompt>

Output:
Generation stops at the first newline, preventing extra text.
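
To make the format concrete, here is a minimal sketch of how a request body for the OpenAI-compatible /v1/completions endpoint could be built from this template. The names `PROMPT_TEMPLATE` and `build_completion_request` are hypothetical, and the exact whitespace around "Output:" is an assumption. The `stop` entry is what ends generation at the first newline.

```python
# Template mirroring the format shown above; exact whitespace is an assumption.
PROMPT_TEMPLATE = (
    "### Stable Diffusion prompt tags (comma separated, no sentences):\n"
    "\n"
    "Input: {instruction}\n"
    "\n"
    "Output:"
)


def build_completion_request(instruction, model, max_tokens=128, temperature=0.7):
    """Build the JSON body for a completions call.

    `instruction` is the prompt after wildcard expansion; the prefix is
    not included because it is never sent to the model.
    """
    return {
        "model": model,
        "prompt": PROMPT_TEMPLATE.format(instruction=instruction),
        "max_tokens": max_tokens,
        "temperature": temperature,
        "stop": ["\n"],  # stop at the first newline
    }
```

The resulting dictionary would be sent as JSON in a POST to http://host:port/v1/completions, and the generated tags read from the first choice's text field.
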
If conversational output appears:
Lower temperature to 0.3–0.5
Use a larger model (≥ 1.5B recommended)
Reduce max_tokens
Model Recommendations

| Model | Quality | Notes |
|---|---|---|
| Qwen2.5-0.5B | ⚠️ Unreliable | Too small for consistent instruction following |
| Qwen2.5-1.5B | ✓ Usable | Occasional filler, mostly clean |
| Qwen2.5-3B | ✓✓ Recommended | Clean output, follows format reliably |
| Qwen2.5-32B | ✓✓✓ Best | Overkill but flawless |

Tested With
vLLM 0.4+
Qwen2.5-0.5B, Qwen2.5-1.5B, Qwen2.5-3B
ComfyUI (latest)