Krea 2 Turbo: MLX (quantized turnkey)

On-device, Apple-MLX-ready repack of krea/Krea-2-Turbo, the few-step text-to-image checkpoint from Krea.ai, Inc. This repository is a Derivative prepared for mlx-gen (and the SceneWorks worker that embeds it): the weights are group-wise-affine quantized and repacked from the original bf16 diffusers checkpoint so the model loads and runs natively on Apple Silicon with no Python/PyTorch sidecar.

This is not the original checkpoint. For the reference model, training details, and the canonical diffusers / SGLang inference paths, see the upstream card: https://huggingface.co/krea/Krea-2-Turbo.

Attribution

  • Original model: Krea 2 Turbo, © Krea.ai, Inc., released 2026-06-22.
  • Base model: krea/Krea-2-Turbo (itself fine-tuned/distilled from krea/Krea-2-Raw).
  • This Derivative: quantized + MLX-repacked by the SceneWorks / mlx-gen project. No retraining or fine-tuning was performed; the only changes are numerical quantization and on-disk re-layout.

License

Use of these weights is governed by the Krea 2 Community License Agreement and the Krea Acceptable Use Policy, exactly as for the original model. A copy of the license is included in this repository as LICENSE.pdf (also at https://huggingface.co/krea/Krea-2-Turbo/blob/main/LICENSE.pdf). In the event of any conflict, the Krea Acceptable Use Policy and Krea 2 Community License control.

Deployer obligation (content filtering). The Krea 2 Community License requires anyone who deploys the model to implement content-filtering measures or equivalent review processes appropriate to their use case, to prevent the generation or distribution of unlawful or policy-violating content. If you serve this model, you are responsible for those safeguards. Report harmful, illegal, or policy-violating outputs to safety@krea.ai (potential CSAM is escalated to NCMEC as required by law).

Krea does not claim copyright over generated outputs; users are solely responsible for their inputs and any use of the outputs.

What changed vs. the upstream checkpoint

The conversion is lossy only through quantization: the architecture is unchanged, and the tokenizer, scheduler config, and VAE are byte-for-byte copies of the originals.

  • Transformer (DiT) and Qwen3-VL-4B text encoder: the linear projection weights are quantized to group-wise affine Q8 / Q4 (group size 64) and repacked into a single .safetensors per stack. Norms, embeddings, modulation tables, and the text-encoder vision tower stay dense.
  • VAE (AutoencoderKLQwenImage): copied unchanged (f32).
  • tokenizer/, scheduler/, model_index.json: copied unchanged.
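
To make "group-wise affine" concrete, here is a minimal pure-Python sketch of the general technique: each run of 64 weights gets its own scale and bias, and every weight is stored as a small integer code. It illustrates the idea only and is not the mlx-gen conversion code; the function names are hypothetical, and the 16-bit scale and 16-bit bias per group used in `bits_per_weight` are an assumption about storage, not something this card states.

```python
def quantize_group(values, bits):
    """Affine-quantize one group so that w ~= scale * q + bias, with q in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    lo, hi = min(values), max(values)
    # A constant group has zero range; fall back to scale 1.0 to avoid dividing by zero.
    scale = (hi - lo) / levels or 1.0
    codes = [min(levels, max(0, round((v - lo) / scale))) for v in values]
    return codes, scale, lo


def dequantize_group(codes, scale, bias):
    """Reconstruct approximate weights from integer codes."""
    return [scale * q + bias for q in codes]


def quantize_row(row, bits=8, group_size=64):
    """Split one weight row into fixed-size groups and quantize each independently."""
    if len(row) % group_size:
        raise ValueError("row length must be a multiple of group_size")
    return [
        quantize_group(row[i:i + group_size], bits)
        for i in range(0, len(row), group_size)
    ]


def bits_per_weight(bits, group_size=64, meta_bits=32):
    """Effective storage cost per weight.

    ASSUMPTION: each group stores a 16-bit scale and a 16-bit bias (32 bits total).
    """
    return bits + meta_bits / group_size
```

Under that storage assumption, `bits_per_weight(8)` is 8.5 and `bits_per_weight(4)` is 4.5. The reconstruction error of any weight is bounded by half the scale of its group, which is why smaller groups and more bits both reduce quantization loss.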

Repository layout

Each quant is a complete, self-contained snapshot you can load directly:

| Path | Quantization       | On-disk size | Notes                                             |
|------|--------------------|--------------|---------------------------------------------------|
| q8/  | Q8 (group size 64) | ~20.6 GB     | Default. Near-lossless; needs a 48 GB-class Mac.  |
| q4/  | Q4 (group size 64) | ~12.5 GB     | Lighter footprint; mild quality trade-off.        |

krea-2-turbo-mlx/
├── LICENSE.pdf
├── README.md
├── q8/   { transformer/  text_encoder/  vae/  tokenizer/  scheduler/  model_index.json }
└── q4/   { transformer/  text_encoder/  vae/  tokenizer/  scheduler/  model_index.json }
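
Because each quant directory is meant to be self-contained, a quick check that a local copy has every entry from the layout above can catch a partial download before loading. The sketch below uses only the standard library; the helper name `missing_entries` is hypothetical, and the check looks at top-level names only, not at the weight files inside each folder.

```python
from pathlib import Path

# Top-level entries each quant directory (q8/ or q4/) contains, per the layout above.
EXPECTED_DIRS = ("transformer", "text_encoder", "vae", "tokenizer", "scheduler")
EXPECTED_FILES = ("model_index.json",)


def missing_entries(snapshot: Path) -> list[str]:
    """Return the expected entries that are absent from a quant directory."""
    missing = [f"{name}/" for name in EXPECTED_DIRS if not (snapshot / name).is_dir()]
    missing += [name for name in EXPECTED_FILES if not (snapshot / name).is_file()]
    return missing
```

For example, `missing_entries(Path("krea-2-turbo-mlx/q8"))` returns an empty list when the directory is complete, and a list of the absent names otherwise.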

Usage

Built for Apple-Silicon inference through mlx-gen's krea_2_turbo engine. Point a loader at the q8/ (or q4/) subdirectory; it auto-detects the packed weights. Krea 2 Turbo is CFG-free: run ~8 steps with guidance 0 and no negative prompt, at resolutions up to 2048×2048.
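
This card does not show mlx-gen's loader API, so the following sketch covers only the step before it: fetching a single quant directory and collecting the sampling settings stated above. It assumes the `huggingface_hub` package is installed and that the repository id is `SceneWorks/krea-2-turbo-mlx`, as shown on the model page. The names `fetch_quant`, `allow_patterns`, and `TURBO_SETTINGS` are hypothetical, and the setting keys are illustrative rather than mlx-gen parameter names.

```python
from pathlib import Path

REPO_ID = "SceneWorks/krea-2-turbo-mlx"  # assumed from the model page

# Sampling settings stated in this card: few-step, CFG-free, no negative prompt.
# The key names are illustrative, not mlx-gen's actual parameter names.
TURBO_SETTINGS = {"steps": 8, "guidance": 0.0, "negative_prompt": None}
MAX_SIDE = 2048  # largest supported side length, in pixels


def allow_patterns(quant: str) -> list[str]:
    """Glob patterns that select one quant directory plus the license file."""
    if quant not in ("q8", "q4"):
        raise ValueError(f"unknown quant {quant!r}; expected 'q8' or 'q4'")
    return [f"{quant}/*", "LICENSE.pdf"]


def fetch_quant(quant: str = "q8", local_dir: str = "krea-2-turbo-mlx") -> Path:
    """Download one quant directory and return its local path."""
    # Imported lazily so the helpers above work without the third-party package.
    from huggingface_hub import snapshot_download

    root = snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=allow_patterns(quant),
        local_dir=local_dir,
    )
    return Path(root) / quant
```

Calling `fetch_quant("q4")` downloads roughly 12.5 GB and returns the path to hand to the loader. Restricting the download with `allow_patterns` avoids pulling both quants when only one is needed.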

Model details

See the upstream card for the full model overview, capabilities, intended/out-of-scope uses, training-data summary, safety measures, and risk/limitation disclosures: https://huggingface.co/krea/Krea-2-Turbo.
