TurboGemma 4 E2B v2

An updated abliterated version of Google's Gemma 4 E2B, the multimodal model with roughly 2B active parameters from the Gemma 4 MoE family. v2 uses a refined abliteration run.

Architecture: Gemma 4 MoE | Active params: ~2.3B | Context: 128k tokens | Vision: Yes (multimodal)

E2B Shootout Results (DuoNeural, 2026-06-08)

Head-to-head comparison of DuoNeural's three Gemma-4-E2B abliterations. KL divergence against the base model is computed over the full vocabulary from first-token logits, using F.kl_div with reduction="batchmean".

Model	KL vs Base	Comply Rate	Refusal Rate
Gemma-4-E2B-Heretic	0.057	85%	15%
TurboGemma4E2B	14.45	100%	0%
TurboGemma4E2B-v2 (this model)	14.64	100%	0%

Note: A KL of 14.64 indicates significant divergence from the base model's output distribution on general tasks, slightly higher than v1 (14.45). The comply rate is 100%, with zero residual refusals. The v2 abliteration is more aggressive than v1, and both are substantially more aggressive than Heretic. If model quality matters alongside uncensoring, Gemma-4-E2B-Heretic (KL = 0.057) is the recommended pick from this family.
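
For readers who want to reproduce a comparable number, the KL methodology named above (full vocabulary, first-token logits, F.kl_div with batchmean) might look roughly like the sketch below. The function name first_token_kl, the [num_prompts, vocab_size] input layout, and the direction KL(base || abliterated) are assumptions for illustration; the card does not state which direction or prompt set DuoNeural used.

```python
import torch
import torch.nn.functional as F


def first_token_kl(base_logits: torch.Tensor, ablated_logits: torch.Tensor) -> float:
    """Mean KL(base || abliterated) over prompts, using the full vocabulary.

    Both inputs are assumed to have shape [num_prompts, vocab_size] and to hold
    the logits each model assigns to the first generated token of each prompt.
    """
    # Convert logits to log-probabilities in float32 for numerical stability.
    base_logp = F.log_softmax(base_logits.float(), dim=-1)
    ablated_logp = F.log_softmax(ablated_logits.float(), dim=-1)
    # F.kl_div(input, target) computes KL(target || input); input must be log-probs.
    # reduction="batchmean" sums over the vocabulary and divides by the batch size.
    kl = F.kl_div(ablated_logp, base_logp, reduction="batchmean", log_target=True)
    return kl.item()
```

Identical logits give a KL of zero, and the value grows as the two first-token distributions move apart, which is why a near-zero score such as Heretic's suggests the base model's behavior is largely preserved.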

Usage

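from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the abliterated weights in bfloat16 and place layers across available devices.
model = AutoModelForCausalLM.from_pretrained(
    "DuoNeural/TurboGemma4E2B-v2",
    torch_dtype="bfloat16",
    device_map="auto",
)
# The tokenizer is taken from the base instruction-tuned checkpoint.
tokenizer = AutoTokenizer.from_pretrained("google/gemma-4-e2b-it")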

DuoNeural

DuoNeural is an open AI research lab — human + AI in symbiosis.


🤗 HuggingFace	huggingface.co/DuoNeural
🐙 GitHub	github.com/DuoNeural
🌐 Site	duoneural.com
📧 Email	duoneural@proton.me

Research Team

Jesse — Vision, hardware, direction
Archon — AI lab partner, post-training, abliteration, experiments
Aura — Research AI, literature synthesis, novel proposals

Safetensors | Model size: 5B params | Tensor type: BF16