Krea 2 Turbo — MLX (quantized turnkey)

On-device, Apple-MLX-ready repack of krea/Krea-2-Turbo, the few-step text-to-image checkpoint from Krea.ai, Inc. This repository is a Derivative prepared for mlx-gen (and the SceneWorks worker that embeds it): the weights are group-wise-affine quantized and repacked from the original bf16 diffusers checkpoint so the model loads and runs natively on Apple Silicon with no Python/PyTorch sidecar.

This is not the original checkpoint. For the reference model, training details, and the canonical diffusers / SGLang inference paths, see the upstream card: https://huggingface.co/krea/Krea-2-Turbo.

Attribution

Original model: Krea 2 Turbo — © Krea.ai, Inc., released 2026-06-22.
Base model: krea/Krea-2-Turbo (itself fine-tuned/distilled from krea/Krea-2-Raw).
This Derivative: quantized + MLX-repacked by the SceneWorks / mlx-gen project. No retraining or fine-tuning was performed — only numerical quantization and on-disk re-layout.

License

Use of these weights is governed by the Krea 2 Community License Agreement and the Krea Acceptable Use Policy, exactly as for the original model. A copy of the license is included in this repository as LICENSE.pdf (also at https://huggingface.co/krea/Krea-2-Turbo/blob/main/LICENSE.pdf). In the event of any conflict, the Krea Acceptable Use Policy and Krea 2 Community License control.

Deployer obligation (content filtering). The Krea 2 Community License requires anyone who deploys the model to implement content-filtering measures or equivalent review processes appropriate to their use case, to prevent the generation or distribution of unlawful or policy-violating content. If you serve this model, you are responsible for those safeguards. Report harmful, illegal, or policy-violating outputs to safety@krea.ai (potential CSAM is escalated to NCMEC as required by law).

Krea does not claim copyright over generated outputs; users are solely responsible for their inputs and any use of the outputs.

What changed vs. the upstream checkpoint

The conversion is lossy only through quantization. The architecture is unchanged, and the tokenizer, scheduler config, and VAE are byte-for-byte copies of the originals.

Transformer (DiT) and Qwen3-VL-4B text encoder: the linear projection weights are quantized to group-wise affine Q8 / Q4 (group size 64) and repacked into a single .safetensors per stack. Norms, embeddings, modulation tables, and the text-encoder vision tower stay dense.
VAE (AutoencoderKLQwenImage): copied unchanged (f32).
tokenizer/, scheduler/, model_index.json: copied unchanged.
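
For readers unfamiliar with the term, group-wise affine quantization can be sketched as follows. This is a minimal pure-Python illustration, not the mlx-gen implementation: the function names are hypothetical, and the min/max choice of scale and bias is an assumption about one common way to fit each group. Every run of 64 consecutive weights shares one scale and one bias, and each weight is stored as an unsigned 8-bit or 4-bit code.

```python
import math

# Illustrative sketch only; names and the min/max fitting rule are assumptions.

def quantize_group(values, bits):
    """Map one group of floats to unsigned `bits`-bit codes sharing a scale and bias."""
    levels = (1 << bits) - 1            # 255 for Q8, 15 for Q4
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels or 1.0   # fall back to 1.0 for a constant group
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo             # the bias is the group minimum


def dequantize_group(codes, scale, bias):
    """Reconstruct approximate floats: w ≈ code * scale + bias."""
    return [c * scale + bias for c in codes]


def quantize_row(row, bits, group_size=64):
    """Split a weight row into groups of `group_size` and quantize each one."""
    if len(row) % group_size != 0:
        raise ValueError("row length must be a multiple of group_size")
    return [quantize_group(row[i:i + group_size], bits)
            for i in range(0, len(row), group_size)]


def max_abs_error(row, bits, group_size=64):
    """Largest round-trip error over the whole row."""
    restored = []
    for codes, scale, bias in quantize_row(row, bits, group_size):
        restored.extend(dequantize_group(codes, scale, bias))
    return max(abs(a - b) for a, b in zip(row, restored))


if __name__ == "__main__":
    row = [math.sin(0.1 * i) for i in range(128)]   # stand-in for real weights
    print("Q8 max error:", max_abs_error(row, 8))
    print("Q4 max error:", max_abs_error(row, 4))
```

Under this scheme the round-trip error of any weight is bounded by half of its group's scale, which is why Q4 (15 levels per group) trades more quality than Q8 (255 levels per group).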

Repository layout

Each quant is a complete, self-contained snapshot you can load directly:

| Path | Quantization | On-disk size | Notes |
|------|--------------|--------------|-------|
| `q8/` | Q8 (group size 64) | ~20.6 GB | Default. Near-lossless; needs a 48 GB-class Mac. |
| `q4/` | Q4 (group size 64) | ~12.5 GB | Lighter footprint; mild quality trade-off. |
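
The Q4 snapshot is not half the size of the Q8 one, and a short calculation shows part of the reason. The sketch below assumes each group of 64 weights stores one 16-bit scale and one 16-bit bias next to its codes; the card does not state the storage dtype of those values, so treat the 16-bit figures as an assumption, and the function names as hypothetical.

```python
def effective_bits_per_weight(bits, group_size=64, scale_bits=16, bias_bits=16):
    """Bits stored per quantized weight, including the per-group scale and bias.

    scale_bits and bias_bits are assumptions, not values taken from the card.
    """
    return bits + (scale_bits + bias_bits) / group_size


def compression_vs_bf16(bits, group_size=64):
    """Size ratio of a bf16 tensor to its quantized form under the same assumptions."""
    return 16 / effective_bits_per_weight(bits, group_size)


if __name__ == "__main__":
    for bits in (8, 4):
        print(f"Q{bits}: {effective_bits_per_weight(bits)} bits/weight, "
              f"{compression_vs_bf16(bits):.2f}x smaller than bf16")
```

Under these assumptions the quantized linear layers cost 8.5 and 4.5 bits per weight. The tensors that stay dense (norms, embeddings, modulation tables, the vision tower) and the f32 VAE are the same size in both snapshots, which narrows the gap further.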

krea-2-turbo-mlx/
├── LICENSE.pdf
├── README.md
├── q8/   { transformer/  text_encoder/  vae/  tokenizer/  scheduler/  model_index.json }
└── q4/   { transformer/  text_encoder/  vae/  tokenizer/  scheduler/  model_index.json }
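
Because each quant directory is meant to be self-contained, a quick sanity check before loading is to confirm that the entries listed above are present. The helper below is a hypothetical sketch using only the standard library; it checks names only, not file contents or checksums.

```python
from pathlib import Path

# Entries listed in the repository layout for each quant directory.
EXPECTED_ENTRIES = (
    "transformer",
    "text_encoder",
    "vae",
    "tokenizer",
    "scheduler",
    "model_index.json",
)


def missing_entries(snapshot_dir):
    """Return the expected entries that are absent from a q8/ or q4/ directory."""
    root = Path(snapshot_dir)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    # The path is an example; point it at wherever the snapshot was downloaded.
    missing = missing_entries("krea-2-turbo-mlx/q8")
    print("snapshot looks complete" if not missing else f"missing: {missing}")
```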

Usage

Built for Apple-Silicon inference through mlx-gen's krea_2_turbo engine. Point a loader at the q8/ (or q4/) subdirectory; it auto-detects the packed weights. Krea 2 Turbo does not use classifier-free guidance (CFG): run about 8 steps with guidance 0 and no negative prompt, at resolutions up to 2048×2048.
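
Since a loader only needs one quant subdirectory, it can be enough to fetch just that directory. The sketch below uses `snapshot_download` from the `huggingface_hub` package with an `allow_patterns` filter. It assumes `SceneWorks/krea-2-turbo-mlx` is this repository's Hub id and that `huggingface_hub` is installed; the helper names are hypothetical, and the mlx-gen loader call itself is not shown because its API is not documented here.

```python
from pathlib import Path

REPO_ID = "SceneWorks/krea-2-turbo-mlx"   # assumed Hub id of this repository
QUANTS = ("q8", "q4")


def allow_patterns_for(quant):
    """Glob patterns selecting one quant directory plus the license file."""
    if quant not in QUANTS:
        raise ValueError(f"unknown quant {quant!r}; expected one of {QUANTS}")
    return [f"{quant}/*", "LICENSE.pdf"]


def download_quant(quant="q8"):
    """Download one quant and return the directory to hand to a loader."""
    # Imported lazily so the helper above works without the package installed.
    from huggingface_hub import snapshot_download

    local_root = snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=allow_patterns_for(quant),
    )
    return Path(local_root) / quant


if __name__ == "__main__":
    # Fetches roughly 12.5 GB for q4 according to the table in this card.
    print(download_quant("q4"))
```

The returned path is the q8/ or q4/ directory described above, which is what the loader should be pointed at.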

Model details

See the upstream card for the full model overview, capabilities, intended/out-of-scope uses, training-data summary, safety measures, and risk/limitation disclosures: https://huggingface.co/krea/Krea-2-Turbo.
