Ornith-1.0-35B · MXFP8

Official OsaurusAI MXFP8 build of deepreinforce-ai/Ornith-1.0-35B (MIT) — a vision-language MoE on a Qwen3.5 hybrid backbone. Near-lossless 8-bit microscaled FP; runs on Apple Silicon via Osaurus / mlx.

~34 GB bundle (down from ~70 GB bf16).
MXFP8: microscaled FP8 (group-size 32) on the language-model linear weights and routed experts; the vision tower is preserved at fp16, short-conv kernels and norms kept fp16.
Vision-language (image + text → text).

Architecture


Family	`qwen3_5_moe` (hybrid)
Text layers	40 — 30 Gated-DeltaNet (linear-attention) + 10 full-attention
Experts	256 routed (stacked `switch_mlp`) · hidden 2048 · untied lm_head
Vision	ViT tower (`model.visual`) preserved fp16
Cache	hybrid (GDN state + KV for attention layers)

Usage

# text
python -m mlx_lm generate --model OsaurusAI/Ornith-1.0-35B-MXFP8 --prompt "Explain a hash map in two sentences."

For image+text, load in Osaurus or an MLX-VLM runtime that supports qwen3_5 vision.

Provenance

Base: deepreinforce-ai/Ornith-1.0-35B © DeepReinforce — MIT (Qwen3.5-based)
Quantization: Osaurus · MXFP8 (microscaled FP8, group-size 32; vision tower fp16) · eric@osaurus.ai

Downloads last month: -

Safetensors

Model size

10B params

Tensor type

U32

F16

MLX

Hardware compatibility

Quantized

Model tree for OsaurusAI/Ornith-1.0-35B-MXFP8

Base model

deepreinforce-ai/Ornith-1.0-35B

Finetuned

(7)

this model