S3-DiT Architecture
Z-Image-Turbo's Scalable Single-Stream DiT (S3-DiT) unifies text tokens, visual semantic tokens, and image VAE latent tokens into a single input stream processed by shared transformer blocks. Compared with dual-stream approaches, which dedicate separate weights to each modality, this design improves parameter efficiency, enabling the 6B model to run on consumer hardware.
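The single-stream idea can be sketched in a few lines of PyTorch: project each modality to a shared width, concatenate into one token sequence, and run the same transformer block over everything. The dimensions and block structure below are toy assumptions for illustration, not the published 6B configuration.

```python
import torch
import torch.nn as nn

# Toy dimensions (assumptions for illustration, not the real model config).
D = 64                          # shared hidden size
N_TEXT, N_SEM, N_VAE = 8, 4, 16 # token counts per modality

class SingleStreamBlock(nn.Module):
    """One transformer block whose weights are shared by all modalities."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )

    def forward(self, x):
        h = self.norm1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]
        return x + self.mlp(self.norm2(x))

# One input stream: text, visual semantic, and VAE latent tokens concatenated.
text = torch.randn(1, N_TEXT, D)  # text encoder tokens
sem  = torch.randn(1, N_SEM, D)   # visual semantic tokens
vae  = torch.randn(1, N_VAE, D)   # image VAE latent tokens
stream = torch.cat([text, sem, vae], dim=1)

block = SingleStreamBlock(D)
out = block(stream)               # same parameters attend across all modalities
```

Because every token lives in the same sequence, cross-modal interaction comes for free from self-attention rather than from dedicated cross-attention branches.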
Decoupled-DMD Distillation
The core distillation algorithm separates CFG Augmentation (the primary acceleration engine) from Distribution Matching (the quality stabilizer). This decoupling enables Z-Image-Turbo's 8-step generation without sacrificing image fidelity.
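A toy sketch of how the two terms decouple is shown below. The linear "models", guidance scale, and loss weighting are illustrative assumptions; the real method trains full diffusion networks, and the DMD term here follows the generic score-difference formulation rather than Z-Image-Turbo's exact objective.

```python
import torch

torch.manual_seed(0)

# Dummy stand-ins for illustration only: each "model" is a linear map on
# 8-dim latents instead of a full diffusion network.
teacher_cond   = torch.nn.Linear(8, 8)   # teacher, conditional branch
teacher_uncond = torch.nn.Linear(8, 8)   # teacher, unconditional branch
fake_score     = torch.nn.Linear(8, 8)   # score of the student's distribution
student        = torch.nn.Linear(8, 8)

x = torch.randn(4, 8)   # noisy latents
w = 3.0                 # guidance scale (assumed value)

# (1) CFG Augmentation: regress the student onto the teacher's CFG-guided
#     prediction -- the primary acceleration engine.
with torch.no_grad():
    guided = teacher_uncond(x) + w * (teacher_cond(x) - teacher_uncond(x))
pred = student(x)
loss_cfg = torch.nn.functional.mse_loss(pred, guided)

# (2) Distribution Matching: a detached score-difference gradient that pushes
#     student outputs toward the teacher's distribution -- the stabilizer.
with torch.no_grad():
    grad = fake_score(pred) - teacher_cond(pred)
loss_dmd = (pred * grad).sum() / pred.numel()

loss = loss_cfg + 0.5 * loss_dmd   # relative weighting is an assumption
loss.backward()
```

Keeping the two losses separate is what lets the acceleration term be tuned aggressively while the matching term guards fidelity.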
DMDR Framework
Z-Image-Turbo's post-training method combines Distribution Matching Distillation with Reinforcement Learning. DMDR enhances semantic alignment, aesthetic quality, structural consistency, and high-frequency detail richness.
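One common way to fold an RL-style signal into distillation is direct reward backpropagation through a differentiable reward model; whether DMDR uses exactly this mechanism is not stated here, so treat the sketch below, and all of its toy networks and weights, as assumptions.

```python
import torch

torch.manual_seed(0)

# Toy stand-ins (illustrative assumptions): a one-step "generator", a
# differentiable reward model scoring e.g. aesthetics/alignment, and two
# score networks for the distribution-matching term.
generator    = torch.nn.Linear(8, 8)
reward_model = torch.nn.Linear(8, 1)
fake_score   = torch.nn.Linear(8, 8)
real_score   = torch.nn.Linear(8, 8)

z = torch.randn(4, 8)
img = generator(z)

# Distribution-matching term (detached score difference, as in DMD).
with torch.no_grad():
    grad = fake_score(img) - real_score(img)
loss_dmd = (img * grad).sum() / img.numel()

# RL-style term: maximize the reward by descending its negative.
loss_rl = -reward_model(img).mean()

beta = 0.1                        # weighting is an assumption
loss = loss_dmd + beta * loss_rl
loss.backward()
```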
Prompt Enhancement
The built-in Prompt Enhancer adds reasoning capabilities to Z-Image-Turbo, enabling the model to understand context beyond literal descriptions. This feature improves instruction adherence and creative interpretation.
Flash Attention Support
Z-Image-Turbo integrates with Flash Attention for optimized memory usage and faster inference. Enable model compilation for additional speed improvements on compatible hardware.
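The source does not show the model's attention code, but in PyTorch the usual route to FlashAttention is `scaled_dot_product_attention`, which dispatches to a flash kernel automatically when one is available (CUDA, half precision, suitable shapes) and falls back to the math implementation otherwise. A generic sketch:

```python
import torch
import torch.nn.functional as F

# (batch, heads, seq_len, head_dim) -- SDPA's expected layout.
q = torch.randn(1, 4, 32, 64)
k = torch.randn(1, 4, 32, 64)
v = torch.randn(1, 4, 32, 64)

# Dispatches to a FlashAttention kernel when available; math fallback on CPU.
out = F.scaled_dot_product_attention(q, k, v)

# On CUDA you can force the flash backend to confirm it serves your inputs
# (this context manager raises if flash cannot handle them):
if torch.cuda.is_available():
    from torch.nn.attention import sdpa_kernel, SDPBackend
    with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
        qh, kh, vh = (t.cuda().half() for t in (q, k, v))
        _ = F.scaled_dot_product_attention(qh, kh, vh)

# For the compilation speedup mentioned above, wrap the model once:
#   model = torch.compile(model)
```

`sdpa_kernel` lives in `torch.nn.attention` on recent PyTorch releases; older versions expose a similar context under `torch.backends.cuda`.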
Flexible Deployment
Deploy Z-Image-Turbo via PyTorch-native inference or Hugging Face Diffusers. CPU offloading is supported for memory-constrained environments, and hosted API access is available at $0.005 per megapixel through multiple providers.
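The quoted per-megapixel pricing is easy to sanity-check with a small helper, and a Diffusers loading sketch with CPU offload is included alongside it. The Hub model id in `load_pipeline` is an assumption (check the official model card), and the function is not executed here because it downloads several gigabytes of weights.

```python
import torch

def api_cost_usd(width: int, height: int, rate_per_mp: float = 0.005) -> float:
    """Cost of one generation at the quoted $0.005-per-megapixel rate."""
    return (width * height) / 1_000_000 * rate_per_mp

# A 1024x1024 image is ~1.05 megapixels, so roughly half a cent per image.
cost = api_cost_usd(1024, 1024)

def load_pipeline():
    """Diffusers loading sketch (not run here; pulls multi-GB weights)."""
    from diffusers import DiffusionPipeline
    pipe = DiffusionPipeline.from_pretrained(
        "Tongyi-MAI/Z-Image-Turbo",   # assumed hub id -- verify on the card
        torch_dtype=torch.bfloat16,
    )
    # Keep idle submodules on CPU; moves each to GPU only while it runs.
    pipe.enable_model_cpu_offload()
    return pipe
```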
