Osaurus

Ornith-1.0-35B · MXFP4

Official OsaurusAI MXFP4 build of deepreinforce-ai/Ornith-1.0-35B (MIT) — a vision-language MoE on a Qwen3.5 hybrid backbone. Near-lossless 8-bit microscaled FP; runs on Apple Silicon via Osaurus / mlx.

  • ~18 GB (from ~70 GB bf16) bundle.
  • MXFP8: microscaled FP4 (group-size 32, 4-bit) on the language-model linear weights and routed experts; the vision tower is preserved at fp16, short-conv kernels and norms kept fp16.
  • Vision-language (image + text → text).

Architecture

Family qwen3_5_moe (hybrid)
Text layers 40 — 30 Gated-DeltaNet (linear-attention) + 10 full-attention
Experts 256 routed (stacked switch_mlp) · hidden 2048 · untied lm_head
Vision ViT tower (model.visual) preserved fp16
Cache hybrid (GDN state + KV for attention layers)

Usage

# text
python -m mlx_lm generate --model OsaurusAI/Ornith-1.0-35B-MXFP4 --prompt "Explain a hash map in two sentences."

For image+text, load in Osaurus or an MLX-VLM runtime that supports qwen3_5 vision.

Provenance

Downloads last month
-
Safetensors
Model size
6B params
Tensor type
U32
·
F16
·
U8
·
MLX
Hardware compatibility
Log In to add your hardware

Quantized

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for OsaurusAI/Ornith-1.0-35B-MXFP4

Finetuned
(6)
this model