Ornith 1.0 9B

Ornith 1.0 9B, quantized to MLX 6-bit by Atomic Chat for Apple Silicon. Built straight from DeepReinforce's original weights. Runs fully offline on your Mac.

Highlights

  • A self-improving open-source family for agentic coding from DeepReinforce, built for tool-calling and terminal-based coding agents.
  • The smallest, fastest member of the Ornith 1.0 lineup, post-trained on top of Gemma 4 and Qwen 3.5.
  • Strong agentic coding scores for its size: 69.4 on SWE-bench Verified and 43.1 on Terminal-Bench 2.1 (Terminus-2).
  • Dense architecture, 32 layers, qwen3_5 model type with a hidden_size of 4096.
  • 262,144-token native context for long files and multi-step agent traces.
  • Pure open: MIT licensed, globally accessible with no regional limits.
  • Full quant ladder with an importance matrix on every quant over calibration_datav3.

This is the MLX 6-bit build for Apple Silicon (M-series). For llama.cpp, Ollama, or CPU inference, use the GGUF repo.

Model Overview

Property | Value
Base model | deepreinforce-ai/Ornith-1.0-9B
Total parameters | ~9B (from the model name; the card gives no exact figure)
Layers | 32
Context length | 262,144 tokens
Architecture | qwen3_5 dense causal LM, post-trained on Gemma 4 and Qwen 3.5
This repo | MLX 6-bit quant for Apple Silicon (~7.3 GB), built from the original weights

Ornith 1.0 9B benchmarks

Scores are DeepReinforce's published results for the full-precision base deepreinforce-ai/Ornith-1.0-9B. MLX quants run the same model locally; lower bit-widths trade a little accuracy for size/speed.

MLX quants in this series

4-bit · 5-bit · 6-bit ← this · 8-bit
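To compare the rungs of this series, the sketch below estimates on-disk size per bit-width. It rests on assumptions that are not stated in the card: a nominal 9e9 parameters, every weight quantized, and one 16-bit scale plus one 16-bit bias stored per group of 64 weights (the group size used for this build). Real files will differ somewhat, since some tensors may be kept at higher precision.

```python
# Rough size estimate for group-wise quantized weights.
# Assumptions (not from the model card): 9e9 parameters, all of them quantized,
# and a 16-bit scale plus a 16-bit bias stored for every group of weights.

N_PARAMS = 9e9        # nominal count taken from the "9B" in the model name
GROUP_SIZE = 64       # matches --q-group-size 64 used for this build


def effective_bits_per_weight(q_bits, group_size, scale_bits=16, bias_bits=16):
    """Bits per weight once the per-group scale and bias are amortized."""
    return q_bits + (scale_bits + bias_bits) / group_size


def estimated_size_gb(n_params, q_bits, group_size):
    """Estimated size in decimal gigabytes (1 GB = 1e9 bytes)."""
    total_bits = n_params * effective_bits_per_weight(q_bits, group_size)
    return total_bits / 8 / 1e9


for q_bits in (4, 5, 6, 8):
    size = estimated_size_gb(N_PARAMS, q_bits, GROUP_SIZE)
    print(f"{q_bits}-bit: ~{size:.1f} GB")
```

Under these assumptions the 6-bit rung works out to 6.5 effective bits per weight and roughly 7.3 GB, which is in line with the size quoted for this repo.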

Run on Apple Silicon

# Install mlx-lm and generate from the command line
pip install mlx-lm
mlx_lm.generate --model AtomicChat/ornith-9b-MLX-6bit --prompt "Write a quicksort in Python" --max-tokens 512

# Or call it from Python
from mlx_lm import load, generate
model, tokenizer = load("AtomicChat/ornith-9b-MLX-6bit")
msg = [{"role": "user", "content": "Write a quicksort in Python"}]
prompt = tokenizer.apply_chat_template(msg, add_generation_prompt=True)
# verbose=True already streams the text to stdout, so the result is not printed again
text = generate(model, tokenizer, prompt=prompt, max_tokens=512, verbose=True)

Or open it in Atomic Chat: search AtomicChat/ornith-9b-MLX-6bit and hit Use this model.

Recommended sampling

Parameter | Value
temperature | 0.6
top_p | 0.95
top_k | 20

DeepReinforce's recommended sampling parameters. The card notes that temperature=1.0 reproduces the reported benchmark setup.
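A minimal sketch of how these values might be passed to mlx-lm from Python. It assumes a recent mlx-lm release in which `make_sampler` (from `mlx_lm.sample_utils`) accepts `temp`, `top_p`, and `top_k`, and in which `generate` takes a `sampler` argument. The `generate_with` helper and the two preset names are hypothetical, and keeping `top_p` and `top_k` unchanged in the benchmark preset is an assumption, since the card only mentions the temperature.

```python
# Sampling presets expressed as keyword arguments for mlx-lm's make_sampler.
RECOMMENDED = {"temp": 0.6, "top_p": 0.95, "top_k": 20}

# Assumption: only the temperature changes for the benchmark setup.
BENCHMARK = {**RECOMMENDED, "temp": 1.0}


def generate_with(sampling, user_text, max_tokens=512):
    """Hypothetical helper: run one chat turn with the given sampling preset."""
    # Imported lazily so this file can be read or imported without MLX installed.
    from mlx_lm import load, generate
    from mlx_lm.sample_utils import make_sampler

    model, tokenizer = load("AtomicChat/ornith-9b-MLX-6bit")
    messages = [{"role": "user", "content": user_text}]
    prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
    sampler = make_sampler(**sampling)
    return generate(
        model, tokenizer, prompt=prompt, max_tokens=max_tokens, sampler=sampler
    )


# Example call (loads the model, so it is left commented out here):
# print(generate_with(RECOMMENDED, "Write a quicksort in Python"))
```

If the installed mlx-lm version rejects `top_k`, dropping that key from the preset is the simplest fallback.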

How this was made

  1. Download deepreinforce-ai/Ornith-1.0-9B (original weights).
  2. Convert + quantize to MLX with mlx_lm.convert -q --q-bits 6 --q-group-size 64.
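The two steps above can also be sketched through mlx-lm's Python API. This assumes `mlx_lm.convert` accepts `hf_path`, `mlx_path`, `quantize`, `q_bits`, and `q_group_size`, mirroring the CLI flags. The output directory name and the `convert_to_mlx` helper are hypothetical, and this is not necessarily the exact script Atomic Chat ran.

```python
# Keyword arguments mirroring: mlx_lm.convert -q --q-bits 6 --q-group-size 64
CONVERT_KWARGS = {
    "hf_path": "deepreinforce-ai/Ornith-1.0-9B",  # original weights (step 1)
    "mlx_path": "ornith-9b-MLX-6bit",             # hypothetical local output directory
    "quantize": True,                             # -q
    "q_bits": 6,                                  # --q-bits 6
    "q_group_size": 64,                           # --q-group-size 64
}


def convert_to_mlx(**overrides):
    """Hypothetical helper: download, convert, and quantize in one call."""
    # Imported lazily so this file can be read or imported without MLX installed.
    from mlx_lm import convert

    convert(**{**CONVERT_KWARGS, **overrides})


# Example call (downloads the full-precision weights, so it is commented out):
# convert_to_mlx()
# convert_to_mlx(q_bits=4, mlx_path="ornith-9b-MLX-4bit")  # another rung of the series
```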

License

Released by DeepReinforce under the MIT license, globally accessible with no regional limits. Quantized to MLX by Atomic Chat.
