Ornith-1.0-35B · MXFP4

Official OsaurusAI MXFP4 build of deepreinforce-ai/Ornith-1.0-35B (MIT) — a vision-language MoE on a Qwen3.5 hybrid backbone. Near-lossless 8-bit microscaled FP; runs on Apple Silicon via Osaurus / mlx.

~18 GB (from ~70 GB bf16) bundle.
MXFP8: microscaled FP4 (group-size 32, 4-bit) on the language-model linear weights and routed experts; the vision tower is preserved at fp16, short-conv kernels and norms kept fp16.
Vision-language (image + text → text).

Architecture


Family	`qwen3_5_moe` (hybrid)
Text layers	40 — 30 Gated-DeltaNet (linear-attention) + 10 full-attention
Experts	256 routed (stacked `switch_mlp`) · hidden 2048 · untied lm_head
Vision	ViT tower (`model.visual`) preserved fp16
Cache	hybrid (GDN state + KV for attention layers)

Usage

# text
python -m mlx_lm generate --model OsaurusAI/Ornith-1.0-35B-MXFP4 --prompt "Explain a hash map in two sentences."

For image+text, load in Osaurus or an MLX-VLM runtime that supports qwen3_5 vision.

Provenance

Base: deepreinforce-ai/Ornith-1.0-35B © DeepReinforce — MIT (Qwen3.5-based)
Quantization: Osaurus · MXFP4 (microscaled FP4, group-size 32; vision tower fp16) · eric@osaurus.ai

Downloads last month: -

Safetensors

Model size

6B params

Tensor type

U32

F16

MLX

Hardware compatibility

Quantized

Model tree for OsaurusAI/Ornith-1.0-35B-MXFP4

Base model

deepreinforce-ai/Ornith-1.0-35B

Finetuned

(6)

this model