NeoToi Coder v3.1 – 4B

A Rust / Dioxus 0.7 specialist fine-tuned from Qwen3-4B (4.0B parameters, 3.6B non-embedding, tied input/output embeddings) using RAFT (Retrieval-Augmented Fine-Tuning). Optimized for production-quality Dioxus 0.7 components with Tailwind v4 and WCAG 2.2 AAA accessibility.

This is the 4B variant of the v3.1 release: about half the size of the 8B variant (rockypod/neotoi-coder-8b), roughly 40% faster at generation, and graded marginally lower on the spec exam. The legacy 14B lives at rockypod/neotoi-coder, and the family hub linking all three models is on the same page.

Exam Results – 103-Question Dioxus 0.7 Spec Exam

Re-graded 2026-04-26 with the patched grader (run_grade_v31.py, accepts LANG()/THEME() GlobalSignal accessor calls on Q87).
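For context on the Q87 patch: a Dioxus GlobalSignal can be read either through explicit accessor methods or by calling the signal like a function, and the patched grader accepts the call form. A minimal sketch, with a hypothetical LANG signal standing in for the exam's:

```rust
use dioxus::prelude::*;

// Hypothetical global signal mirroring the exam's LANG/THEME pattern.
static LANG: GlobalSignal<&'static str> = Signal::global(|| "en");

fn current_lang() -> &'static str {
    // Accessor-call form (LANG()) that the patched grader now accepts;
    // it clones the current value, equivalent to LANG.cloned().
    LANG()
}
```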

| Tier | Name | Qs | Correct | Weighted | Max | Rate | Floor | Status |
|------|------|----|---------|----------|-----|------|-------|--------|
| T1 | Fundamentals | 12 | 11 | 11.0 | 12.0 | 91.7% | 82% | ✅ |
| T2 | RSX Syntax | 12 | 12 | 12.0 | 12.0 | 100.0% | 82% | ✅ |
| T3 | Signal Hygiene | 12 | 12 | 12.0 | 12.0 | 100.0% | 82% | ✅ |
| T4 | WCAG / ARIA | 14 | 14 | 21.0 | 21.0 | 100.0% | 82% | ✅ |
| T5 | use_resource | 8 | 8 | 12.0 | 12.0 | 100.0% | 82% | ✅ |
| T6 | Hard Reasoning | 10 | 10 | 20.0 | 20.0 | 100.0% | 88% | ✅ |
| T7 | Primitives + CSS | 12 | 12 | 18.0 | 18.0 | 100.0% | 82% | ✅ |
| T8 | GlobalSignal / i18n | 8 | 8 | 12.0 | 12.0 | 100.0% | 82% | ✅ |
| T9 | Static Navigator | 6 | 6 | 9.0 | 9.0 | 100.0% | 82% | ✅ |
| T10 | Dioxus 0.7.4 | 6 | 6 | 12.0 | 12.0 | 100.0% | 88% | ✅ |
| T11 | Server Functions | 3 | 3 | 4.5 | 4.5 | 100.0% | 82% | ✅ |
| Overall | | 103 | 102 | 143.5 | 144.5 | 99.31% | – | ✅ PASS |
  • Publication bar (90%): PASS
  • Release bar (95%): PASS
  • Tier floors: PASS

Single failure: Q8 (T1 RSX conversion) – generation truncated mid-<think> block, so no answer body was produced. Real model failure, not a grader artifact.

Version History

| Version | Base (params) | Score | Exam | Dataset size | Status |
|---------|---------------|-------|------|--------------|--------|
| v1.0 | Qwen3-Coder-14B (14.8B) | 51/60 (85.0%) | 60Q standard | – | Published |
| v2.0 | Qwen3-Coder-14B (14.8B) | 135.5/140 (96.8%) | 100Q weighted | 4,185 | Published |
| v3.0 | Qwen3-Coder-14B (14.8B) | 124.0/144.5 (85.8%) | 103Q weighted | 4,535 | Published |
| v3.1 | Qwen3-Coder-14B (14.8B) | 137.0/144.5 (94.81%) | 103Q weighted | 4,880 | Published |
| v3.1 | Qwen3-8B (8.2B) | 144.5/144.5 (100.00%) | 103Q weighted | 4,880 | Published |
| v3.1 | Qwen3-4B (4.0B) | 143.5/144.5 (99.31%) | 103Q weighted | 4,880 | This release |

Model Details

  • Base model: Qwen/Qwen3-4B (4.0B parameters total, 3.6B non-embedding, tied input/output embeddings)
  • Method: RAFT (Retrieval-Augmented Fine-Tuning) with LoRA adapters
  • Dataset: 4,880 curated Dioxus 0.7 examples across 43 topics
  • Scope: Rust + Dioxus 0.7 + Tailwind v4 + WCAG 2.2 AAA
  • Quantization: Q4_K_M (~2.33 GB)
  • Thinking tokens: patched (qwen3.thinking = true)
  • Author: Kevin Miller, Jr.

Training

| Field | Value |
|-------|-------|
| Steps | 2,440 |
| Epochs | 4 |
| Wall time | ~1h 49m |
| Final train loss | 0.4724 |
| LoRA rank | 16 (alpha 32, dropout 0) |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Sequence length | 2048 |
| Precision | bf16 + 4-bit base |
| Hardware | RTX 3090 Ti (24 GB) |

Files

  • neotoi-coder-v3.1-4b-q4_k_m.gguf – Q4_K_M quant (~2.33 GB)
  • neotoi-coder-v3.1-4b-q4_k_m_patched.gguf – same quant with the qwen3.thinking=true metadata patch (recommended for Ollama / LM Studio)

Enabling Thinking Mode

This model emits Qwen3 native <think>...</think> blocks. Thinking is on by default with the patched GGUF on inference backends that honor qwen3.thinking.

Ollama

Save the following as Modelfile:

```
FROM neotoi-coder-v3.1-4b-q4_k_m_patched.gguf

PARAMETER temperature 0.2
PARAMETER num_predict 2000
PARAMETER num_ctx 8192
PARAMETER repeat_penalty 1.1
PARAMETER stop "<|im_end|>"

TEMPLATE """<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
<think>
"""

SYSTEM You are NeoToi, an expert Rust and Dioxus 0.7 developer specialized in Tailwind v4 and WCAG 2.2 AAA accessibility. Always think step-by-step before answering.
```

Then create and run the model:

```
ollama create neotoi-coder:4b -f Modelfile
ollama run neotoi-coder:4b
```

LM Studio

Set the prompt-format fields as follows:

  • Before System: <|im_start|>system
  • After System: <|im_end|>
  • Before User: <|im_start|>user
  • After User: <|im_end|>
  • Before Assistant: <|im_start|>assistant\n<think>
  • After Assistant: <|im_end|>

llama.cpp

```
./llama-cli \
  -m neotoi-coder-v3.1-4b-q4_k_m_patched.gguf \
  -ngl 99 \
  --temp 0.2 \
  -p "<|im_start|>user\nYour question here<|im_end|>\n<|im_start|>assistant\n<think>"
```

What It Knows

  • Dioxus 0.7 RSX brace syntax – never function-call style (see the sketch after this list)
  • use_signal, use_resource with correct three-arm match
  • r#for on label elements only, never inputs
  • WCAG 2.2 AAA: aria_labelledby, aria_describedby, role="alert", role="dialog", live regions
  • dioxus-primitives – no manual ARIA on managed components
  • styles!() macro for CSS modules
  • Tailwind v4 utility classes
  • GlobalSignal patterns (LANG / THEME), i18n, dark-mode toggling
  • Dioxus 0.7.4 APIs: WritableResultExt, WebSocket Stream+Sink, server-fn extractors
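
To ground a few of these patterns, here is a minimal sketch of a component in the style the model targets. It assumes the Dioxus 0.6-style component API carries over to 0.7; the component name, input field, and Tailwind classes are illustrative, not taken from the training set:

```rust
use dioxus::prelude::*;

// Illustrative global signal in the LANG/THEME style listed above.
static THEME: GlobalSignal<&'static str> = Signal::global(|| "light");

#[component]
fn Greeting() -> Element {
    // Local reactive state via use_signal.
    let mut name = use_signal(String::new);

    // Async value via use_resource; a stand-in computation instead of a real
    // fetch, returning Result so the three-arm match below has all its arms.
    let greeting = use_resource(move || async move {
        Ok::<String, String>(format!("Hello, {}!", name()))
    });

    rsx! {
        // RSX brace syntax throughout; never function-call style.
        div { class: "p-4",
            // r#for on the label only (for is a Rust keyword), paired with the input id.
            label { r#for: "name-input", "Your name" }
            input {
                id: "name-input",
                class: "border rounded px-2 py-1",
                value: "{name}",
                oninput: move |evt| name.set(evt.value()),
            }
            // The three-arm match: resolved Ok, resolved Err, still pending.
            {match &*greeting.read_unchecked() {
                Some(Ok(msg)) => rsx! { p { "{msg}" } },
                Some(Err(e)) => rsx! { p { role: "alert", "Error: {e}" } },
                None => rsx! { p { aria_busy: "true", "Loading…" } },
            }}
            p { "Theme: {THEME}" }
        }
    }
}
```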

What It Does Not Know

  • Playwright / E2E testing (out of scope)
  • Non-Dioxus web frameworks
  • Backends or databases beyond what server functions cover
  • Occasional generation truncation on simple RSX-conversion prompts when the <think> block runs long (observed once on Q8 of the spec exam)

4B vs 8B

| Metric | 4B | 8B |
|--------|----|----|
| Base model parameters | 4.0B (3.6B non-embed) | 8.2B (6.95B non-embed) |
| Q4_K_M file size | ~2.33 GB | ~4.68 GB |
| Exam score | 102 / 103 | 103 / 103 |
| Weighted score | 143.5 / 144.5 (99.31%) | 144.5 / 144.5 (100%) |
| Exam wall time | 6.9 min | 10.3 min |
| Generation throughput | ~184 t/s | ~132 t/s |

The 4B is the recommended pick when disk or RAM is tight; the 8B is the safer default if you need every last fundamentals point.

Transparency

Per-question model outputs and the patched grader source are published alongside the weights.

The training dataset itself is not redistributed; see the GitHub repo for the data-generation pipeline.

License & Attribution

Fine-tuned weights and dataset: licensed under the Neotoi Coder Community License v1.0 (see LICENSE). Commercial use of model outputs is permitted. Weight redistribution is prohibited. Mental-health deployment requires written permission.

Upstream models: the base model and teacher model are licensed under the Apache License, Version 2.0 (see LICENSE-APACHE and NOTICE):

The Neotoi Coder 4B weights are a derivative work of Qwen3-4B, fine-tuned via LoRA adapters on the Neotoi Coder RAFT dataset and then merged + quantized to GGUF.

Credits

  • Unsloth – 2× faster fine-tuning
  • TRL – SFTTrainer
  • Qwen3-4B – base model
  • Dioxus – the framework this model specializes in
  • Claude Code – dataset pipeline and training infrastructure