flux 2 klein: Compact Real-Time Image Generation Model

Black Forest Labs has released flux 2 klein, a compact image generation model available in 4B and 9B parameter versions. The 4B model is open source under Apache 2.0, and the 9B model ships as open weights under a non-commercial license. The release is a significant step toward making high-quality AI image generation practical on consumer hardware.

What Makes flux 2 klein Special?

flux 2 klein is designed for applications that need real-time image generation without sacrificing quality. The 4B model needs roughly 13GB of VRAM, so it runs on consumer-grade GPUs such as the RTX 3090 or RTX 4070. The architecture is compact and unified: a single model handles both image generation and image editing. End-to-end inference takes under 1 second while keeping generation quality competitive with much larger models.

A major highlight of this release is that Black Forest Labs has published not only the distilled models but also the full-strength base models. The base versions retain all training signals, which makes them more flexible and adaptable. They are particularly suitable for fine-tuning, LoRA training, research exploration, and customized generation pipelines where controllability matters more than inference speed. Compared to the distilled models, the base versions also produce more diverse outputs.

Why Choose flux 2 klein?

Visual intelligence is entering a new era. As AI agents become more capable, their requirements for visual generation are changing: they need models that respond in real time, iterate quickly, and run efficiently on readily available hardware. The name klein is German for "small", reflecting both the model's compact size and its very low inference latency. But flux 2 klein is far from limited in capability. The model series performs strongly in text-to-image generation, image editing, and multi-reference image generation, capabilities typically found only in much larger models.

Key Features of flux 2 klein

flux 2 klein offers the following capabilities:

Sub-second Inference: On modern hardware, image generation or editing completes in under 0.5 seconds
High Realism & Diversity: Excellent photorealistic quality and higher generation diversity, especially in the Base version
Unified Generation & Editing: A single model supports text-to-image, image editing, and multi-reference generation while maintaining cutting-edge performance
Consumer GPU Compatible: The 4B model requires approximately 13GB VRAM and runs on RTX 3090 / 4070 and above
Developer-Friendly & Accessible: The 4B model uses the Apache 2.0 license; the 9B model provides open weights; full open weights support deep customization and fine-tuning
API + Open Weights: Available through both a production-grade API and local deployment with full weights

The flux 2 klein Model Family

The flux 2 klein series includes both 4B and 9B versions, with each version offering a base model and a step-distilled model (requiring only 4-step inference).

flux 2 klein 9B - The Flagship

flux 2 klein 9B is the flagship compact model. In text-to-image, single-reference image editing, and multi-reference generation, it sets a new benchmark on the quality-latency Pareto frontier: it matches or exceeds the quality of models 5× larger while generating in under 0.5 seconds. The model pairs a 9B flow model with an 8B Qwen3 text encoder and uses step distillation to compress inference to just 4 steps, balancing speed and quality. You can blend multiple input images, mix different concepts, and iterate on complex compositions at sub-second speeds. Note that the 9B model is released under the FLUX Non-Commercial License and does not permit commercial use.
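To make "4 steps" concrete, here is a toy sketch of few-step sampling for a flow model: start from noise and follow a velocity field with a fixed number of Euler steps. This is not the flux 2 klein implementation. `toy_velocity` is a hypothetical stand-in for the learned network, chosen so that it always points straight at a fixed target, and `sample` is an illustrative name.

```python
import random


def toy_velocity(x, t, target):
    # Hypothetical stand-in for the learned network (an assumption for
    # illustration): it points from the current state straight at a fixed
    # target, scaled by the time remaining until t = 1.
    return [(g - xi) / (1.0 - t) for xi, g in zip(x, target)]


def sample(target, num_steps=4, seed=0):
    """Integrate the toy velocity field from t = 0 to t = 1 with Euler steps."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # pure noise at t = 0
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt  # last step starts at t = 1 - dt, so 1 - t is never zero
        v = toy_velocity(x, t, target)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


if __name__ == "__main__":
    print(sample([1.0, -2.0, 0.5], num_steps=4))
```

Because this toy field follows a straight path, four Euler steps already land on the target. A real network is evaluated once per step, so the step count directly drives latency, which is why compressing inference to 4 steps matters.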

flux 2 klein 4B - Fully Open Source

flux 2 klein 4B is completely open source (Apache 2.0). It is the most accessible model in the family, designed for local development and edge deployment on consumer GPUs like the RTX 3090 / 4070. Despite its compact size, it is highly capable: it supports text-to-image (T2I), image-to-image (I2I), and multi-reference generation, with quality far exceeding similar-sized models.

In addition to the step-distilled versions, this release includes the full-strength base models: flux 2 klein Base 9B / 4B. These base models are not distilled. They offer higher output diversity and retain all training signals, which makes them ideal for further fine-tuning.
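The family described above can be summarized as a small lookup table. The sketch below is illustrative only: the variant names, the `Variant` class, and the `pick` helper are hypothetical, not official identifiers. It also assumes that each base model carries the same license as the distilled model of the same size, and it leaves the base models' step count as `None` because the text does not state one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Variant:
    name: str             # hypothetical label, not an official model ID
    params_b: int         # parameter count in billions
    distilled: bool       # True for the step-distilled models
    steps: Optional[int]  # 4 for distilled; not stated for base models
    license: str


VARIANTS = [
    Variant("klein-4b", 4, True, 4, "Apache 2.0"),
    Variant("klein-4b-base", 4, False, None, "Apache 2.0"),
    Variant("klein-9b", 9, True, 4, "FLUX Non-Commercial License"),
    Variant("klein-9b-base", 9, False, None, "FLUX Non-Commercial License"),
]


def pick(commercial: bool, fine_tune: bool) -> List[Variant]:
    """Filter variants by intended use.

    fine_tune=True selects the base models (described as ideal for
    fine-tuning); otherwise the faster distilled models are selected.
    commercial=True keeps only Apache 2.0 variants.
    """
    result = [v for v in VARIANTS if v.distilled != fine_tune]
    if commercial:
        result = [v for v in result if v.license == "Apache 2.0"]
    return result


if __name__ == "__main__":
    for v in pick(commercial=True, fine_tune=False):
        print(v.name, v.license)
```

For a commercial, speed-focused deployment this filter leaves only the distilled 4B model. For non-commercial fine-tuning it returns both base models.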

Quantized Versions for Enhanced Performance

This release also includes FP8 and NVFP4 quantized versions of all flux 2 klein variants. Developed in collaboration with NVIDIA and optimized for RTX GPUs, these versions maintain the same capabilities while being smaller and faster, supporting a wider range of hardware:

FP8: Up to 1.6× faster with up to 40% less VRAM
NVFP4: Up to 2.7× faster with up to 55% less VRAM
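As a rough worked example, the "up to" figures above can be applied to the approximately 13GB quoted for the 4B model. The sketch below is back-of-the-envelope arithmetic, not a measurement. The quoted factors are best-case values, so actual savings may be smaller. Reusing the 0.5 second figure as the latency baseline is an assumption made for illustration, and `best_case` is a hypothetical helper name.

```python
# Best-case factors quoted for the quantized versions ("up to" values).
QUANT = {
    "FP8": {"speedup": 1.6, "vram_saving": 0.40},
    "NVFP4": {"speedup": 2.7, "vram_saving": 0.55},
}


def best_case(baseline_vram_gb, baseline_latency_s):
    """Return {format: (vram_gb, latency_s)} under the best-case factors."""
    out = {}
    for name, q in QUANT.items():
        vram = baseline_vram_gb * (1.0 - q["vram_saving"])
        latency = baseline_latency_s / q["speedup"]
        out[name] = (round(vram, 2), round(latency, 3))
    return out


if __name__ == "__main__":
    # 13 GB is the approximate figure for the 4B model; 0.5 s is an
    # assumed latency baseline used only for illustration.
    for name, (vram, latency) in best_case(13.0, 0.5).items():
        print(f"{name}: ~{vram} GB VRAM, ~{latency} s")
```

Under these assumptions, the 13GB footprint drops to about 7.8GB with FP8 and about 5.85GB with NVFP4 in the best case.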

flux 2 klein represents a significant advancement in making high-quality image generation accessible to everyone. Whether you're a researcher, developer, or creative professional, flux 2 klein offers the perfect balance of speed, quality, and accessibility.
