
gemma-26b-firefly-reasoning

An experimental, security-tuned Gemma model. While it excels at one-shot classification, its performance regresses in multi-step tasks.

Training recipe

  • Base: mlx-community/gemma-4-26b-a4b-it-bf16 (30 layers, 128 experts × top-k 8, moe_intermediate_size 704)
  • Corpus: trevon/lowlevel-securitysft/multitask_v12_train.jsonl (10,177 conversational rows, 30.6% clean / 69.4% vuln)
  • LoRA: r=8, α=2, targets q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj. The MoE SwitchLinear expert layers are wrapped with a per-expert LoRA: (num_experts, in, rank) and (num_experts, rank, out) tensors gathered per token via mx.gather_mm (see the per-expert LoRA sketch after this list).
  • Trainable: 333.7 M params (3.78% of model), dominated by per-expert MoE deltas.
  • LR: 1e-4 with a declared 40-step warmup followed by cosine decay (note: due to a known mlx-vlm grad-accum × LR-schedule bug, the schedule was effectively constant, the same as v9 firefly, so the two runs remain directly comparable). The declared schedule is sketched after this list.
  • 400 iters, batch size 1, grad accum 4, train-on-completions (assistant_id 77091); see the loss-masking sketch after this list.
  • Selected step: 150 (best of 16 saved checkpoints by single-gate voting eval).
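
The per-expert LoRA wrapping can be pictured roughly as follows. This is a minimal sketch, not the training code: the class name, initialization, and α/r scaling convention are assumptions; only the tensor shapes and the use of mx.gather_mm come from the recipe above.

```python
import math
import mlx.core as mx
import mlx.nn as nn

class PerExpertLoRA(nn.Module):
    """Low-rank delta for a SwitchLinear-style MoE layer: one (A, B) pair
    per expert, applied only to the tokens routed to that expert."""

    def __init__(self, base, num_experts, input_dims, output_dims, r=8, alpha=2.0):
        super().__init__()
        self.base = base              # frozen SwitchLinear-like expert module
        self.scale = alpha / r        # assumed scaling convention
        bound = 1.0 / math.sqrt(input_dims)
        # Shapes as stated in the recipe:
        #   A: (num_experts, in, rank), B: (num_experts, rank, out)
        self.lora_a = mx.random.uniform(-bound, bound, (num_experts, input_dims, r))
        self.lora_b = mx.zeros((num_experts, r, output_dims))

    def __call__(self, x, indices):
        # Frozen expert output for the routed tokens.
        y = self.base(x, indices)
        # Low-rank update, gathering each token's expert-specific A and B by index.
        z = mx.gather_mm(x, self.lora_a, rhs_indices=indices)
        z = mx.gather_mm(z, self.lora_b, rhs_indices=indices)
        return y + (self.scale * z).astype(x.dtype)
```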
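
The declared learning-rate schedule (40-step warmup, then cosine decay) could be assembled in MLX as below; per the note above, the grad-accum bug meant the run effectively saw a constant 1e-4. The decay horizon and the optimizer choice are assumptions.

```python
import mlx.optimizers as optim

# Declared schedule: linear warmup to 1e-4 over 40 steps, then cosine decay.
# Decaying over the remaining 360 of the 400 iterations is an assumption.
warmup = optim.linear_schedule(0.0, 1e-4, steps=40)
decay = optim.cosine_decay(1e-4, decay_steps=360)
lr_schedule = optim.join_schedules([warmup, decay], [40])

optimizer = optim.Adam(learning_rate=lr_schedule)  # optimizer choice is an assumption
```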
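
Train-on-completions means the loss is taken only over assistant-turn tokens (marked here via assistant_id 77091). A minimal masking sketch; the function name and mask convention are assumptions:

```python
import mlx.nn as nn

def completion_loss(model, tokens, assistant_mask):
    """Cross-entropy over assistant (completion) positions only.

    `assistant_mask` is 1 where the *target* token belongs to an assistant turn,
    0 for prompt/system tokens, so prompt positions contribute no gradient."""
    logits = model(tokens[:, :-1])
    targets = tokens[:, 1:]
    mask = assistant_mask[:, 1:].astype(logits.dtype)
    ce = nn.losses.cross_entropy(logits, targets, reduction="none")
    return (ce * mask).sum() / mask.sum()
```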
