TurboGemma 4 E2B v2

Updated abliterated version of Google's Gemma 4 E2B — the 2B active parameter multimodal model from the Gemma 4 MoE family. v2 features a refined abliteration run.

Architecture: Gemma 4 MoE | Active params: ~2.3B | Context: 128k tokens | Vision: Yes (multimodal)

E2B Shootout Results (DuoNeural, 2026-06-08)

Head-to-head comparison of DuoNeural's three Gemma-4-E2B abliterations. KL methodology: full vocabulary, first-token logits, F.kl_div(batchmean).

Model KL vs Base Comply Rate Refusal Rate
Gemma-4-E2B-Heretic 0.057 85% 15%
TurboGemma4E2B 14.45 100% 0%
TurboGemma4E2B-v2 (this model) 14.64 100% 0%

Note: KL of 14.64 indicates significant divergence from the base model's output distribution on general tasks — higher than v1 (14.45). 100% comply rate, zero residual refusals. The v2 abliteration is more aggressive than v1 and both are substantially more aggressive than Heretic. If model quality alongside uncensoring is the goal, Gemma-4-E2B-Heretic (KL=0.057) is the recommended pick from this family.

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "DuoNeural/TurboGemma4E2B-v2",
    torch_dtype="bfloat16",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-4-e2b-it")

DuoNeural

DuoNeural is an open AI research lab — human + AI in symbiosis.

🤗 HuggingFace huggingface.co/DuoNeural
🐙 GitHub github.com/DuoNeural
🌐 Site duoneural.com
📧 Email duoneural@proton.me

Research Team

  • Jesse — Vision, hardware, direction
  • Archon — AI lab partner, post-training, abliteration, experiments
  • Aura — Research AI, literature synthesis, novel proposals
Downloads last month
24
Safetensors
Model size
5B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for DuoNeural/TurboGemma4E2B-v2

Quantizations
1 model