Mistral-Mamba3-7B (Alpha)

Cross-architecture Subsuminator heist: mistralai/Mistral-7B-Instruct-v0.3 → Mamba3-7B SSM body.

CE gate (measured)

Metric	Value
Protocol	subsuminator/subsume.py CE gate (random tokens, shifted CE)
Source CE	13.3000
Heisted CE	10.4125
log(vocab) baseline	10.3972
CE ratio vs source	0.7829×

Random-token next-step CE on mistralai/Mistral-7B-Instruct-v0.3 vs this checkpoint (seed=42, n=5, seq_len=32).
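The protocol above can be sketched roughly as follows. This is not the code in subsuminator/subsume.py, which is not shown here. It is a minimal NumPy sketch that assumes the gate draws uniformly random token ids and scores the logits at position t against the token at position t + 1. The names `shifted_ce`, `random_token_ce`, and `uniform_logits` are hypothetical, as is the `logits_fn` callback standing in for a model forward pass.

```python
import numpy as np


def shifted_ce(logits, tokens):
    """Mean next-token cross-entropy in nats.

    logits: float array of shape (batch, seq_len, vocab)
    tokens: int array of shape (batch, seq_len)
    """
    logits = logits[:, :-1, :].astype(np.float64)  # last position has no target
    targets = tokens[:, 1:]                        # token t + 1 is the label for position t
    # Numerically stable log-softmax over the vocabulary axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    picked = np.take_along_axis(log_probs, targets[..., None], axis=-1)
    return float(-picked.mean())


def random_token_ce(logits_fn, vocab_size, n=5, seq_len=32, seed=42):
    """Score a model (hypothetical logits_fn callback) on uniformly random token ids."""
    rng = np.random.default_rng(seed)
    tokens = rng.integers(0, vocab_size, size=(n, seq_len))
    return shifted_ce(logits_fn(tokens), tokens)


def uniform_logits(tokens, vocab_size):
    """Stand-in 'model' that assigns equal probability to every token."""
    return np.zeros(tokens.shape + (vocab_size,))


if __name__ == "__main__":
    vocab = 1000  # small illustrative vocabulary, not the real tokenizer size
    ce = random_token_ce(lambda t: uniform_logits(t, vocab), vocab)
    print(f"uniform-model CE: {ce:.4f}  log(vocab): {np.log(vocab):.4f}")
```

A model that predicts every token with equal probability scores exactly log(vocab) under this gate, which is the baseline the table compares against.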

Interpretation: Heisted CE ≈ log(vocab) (10.41 vs 10.40), so on uniformly random tokens the fresh SSM body predicts at chance level, as an untrained model would. The 0.7829× ratio vs source (10.4125 / 13.3000) falls below 1.0 only because Mistral's trained transformer scores worse than uniform on garbage random inputs, which pushes its CE above the log(vocab) baseline. A ratio below 1.0 here is not evidence of capability preservation. For comparison, the Mamba2→Mamba3 structural port (trained→trained) achieved 1.0016× with CE near source.
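The arithmetic behind the table can be checked in a few lines. The vocabulary size of 32768 is an assumption (it is not stated in this card), chosen because it reproduces the reported log(vocab) baseline. The CE values are copied from the table.

```python
import math

VOCAB_SIZE = 32768   # assumption: tokenizer vocabulary size, not stated in the card
SOURCE_CE = 13.3000  # from the table
HEISTED_CE = 10.4125 # from the table

uniform_ce = math.log(VOCAB_SIZE)      # CE of a uniform predictor, in nats
ratio = HEISTED_CE / SOURCE_CE         # "CE ratio vs source"
gap_to_uniform = HEISTED_CE - uniform_ce

print(f"log(vocab) baseline: {uniform_ce:.4f}")
print(f"CE ratio vs source:  {ratio:.4f}")
print(f"heisted - uniform:   {gap_to_uniform:.4f} nats")
```

The heisted checkpoint sits about 0.015 nats above the uniform baseline, while the source model sits almost 3 nats above it on the same random input.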

What transferred

Token embeddings, final norm, lm_head, per-layer input norms from Mistral-7B

What did not (fresh orthogonal init)

All Mamba3 SSM mixer weights (in_proj, out_proj, dt_bias, state dynamics)
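The split between transferred and freshly initialised tensors could look something like the sketch below. The parameter-name fragments are hypothetical, since the card does not list the checkpoint's actual key names. The QR-based routine is one common way to build an orthogonal matrix and is not necessarily what the Subsuminator uses. It also applies only to 2-D weights such as in_proj and out_proj.

```python
import numpy as np

# Hypothetical name fragments for tensors copied from the source model.
TRANSFERRED_FRAGMENTS = ("embed_tokens", "input_layernorm", "final_norm", "lm_head")


def is_transferred(name):
    """True if a parameter name matches one of the copied-tensor fragments."""
    return any(fragment in name for fragment in TRANSFERRED_FRAGMENTS)


def plan_transfer(names):
    """Split parameter names into copied tensors and freshly initialised ones."""
    plan = {"copy": [], "fresh": []}
    for name in names:
        plan["copy" if is_transferred(name) else "fresh"].append(name)
    return plan


def orthogonal_init(rows, cols, seed=0):
    """Return a (rows, cols) matrix with orthonormal columns (or rows, if rows < cols)."""
    rng = np.random.default_rng(seed)
    gaussian = rng.standard_normal((max(rows, cols), min(rows, cols)))
    q, r = np.linalg.qr(gaussian)   # q has orthonormal columns
    q = q * np.sign(np.diag(r))     # sign fix so the factorisation is unique
    return q if rows >= cols else q.T


if __name__ == "__main__":
    names = [
        "model.embed_tokens.weight",
        "model.layers.0.input_layernorm.weight",
        "model.layers.0.mixer.in_proj.weight",
        "model.layers.0.mixer.out_proj.weight",
        "lm_head.weight",
    ]
    print(plan_transfer(names))
    w = orthogonal_init(8, 4)
    print(np.allclose(w.T @ w, np.eye(4)))
```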

Alpha checkpoint — fine-tune before production use.

Run it (Avocado)

Sovereign local inference for mamba2 + mamba3 only:

rideitlikeyoustoleit — static Avocado binaries (--yeehaw, --arnie, --giddyup)
Download a release build and point it at this checkpoint. Use trust_remote_code for the HF load path, or use the Avocado native splat path.

./avocado run --model RtaForge/Mistral-Mamba3-7B --prompt "Come with me if you want to live."

Usage (trust_remote_code required)

from transformers import AutoModelForCausalLM, AutoTokenizer

# trust_remote_code=True lets transformers run the custom modeling code shipped in the model repo.
model = AutoModelForCausalLM.from_pretrained("RtaForge/Mistral-Mamba3-7B", trust_remote_code=True)
tok = AutoTokenizer.from_pretrained("RtaForge/Mistral-Mamba3-7B")

Safetensors: model size 4B params, tensor type BF16

Model tree for RtaForge/Mistral-Mamba3-7B

mistralai/Mistral-7B-v0.3 (base model) → mistralai/Mistral-7B-Instruct-v0.3 (finetuned) → this model (finetuned)