Brain atlas comparison of 1.5B and 3B VibeThinker models.

#18
by juiceb0xc0de - opened

Internal-mechanics atlas for the 3B parameter VibeThinker model — all 36 layers tested for activation structure, OV-circuit geometry, and Sub-Zero surgical headroom.


model: WeiboAI/VibeThinker-3B
atlas_type: activation census + Sub-Zero brain atlas
corpus: 9,523 diverse prompts
layers: 36
sacred_layers: 23-35

VibeThinker-3B Brain Atlas

This is an internal-mechanics atlas for the 3B parameter VibeThinker model. The run was done on an Ada-class remote GPU, this time with FlashAttention enabled and a full 36-layer forward pass.

What was run

  • Activation census over 9,523 prompts spanning compliance, reasoning, code, math, multilingual, and refusal-style questions.
  • Per-layer feature taxonomy for mlp, gate, up, and attention heads.
  • OV-circuit spectral analysis per head (W_V @ W_O).
  • Sub-Zero surgery pass on every layer, with a capability fence across code, math, reasoning, factual, and multilingual domains.
  • Forward passes ran GPU-side with a pre-compiled FlashAttention wheel.

Key geometry

Property | Value
Layers | 36
d_model | 2048
d_mlp | 11008
Attention heads | 16
KV heads | 2
Head dim | 128
Sacred (deep Sub-Zero) layers | 23–35

What the numbers suggest

Same distributed signature as the 1.5B

OV-circuit spectral concentration averages 0.050, with effective rank around 55. The 3B is proportionally larger but its attention heads are not more concentrated. It is still doing many-direction computation rather than collapsing to a few copy-paste circuits.
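
For readers who want to reproduce this kind of number, here is a minimal sketch of a per-head OV spectrum. The two metric definitions are assumptions, not necessarily the ones the atlas uses: spectral concentration is taken as the top singular value's share of the total, and effective rank as the exponential of the entropy of the normalized singular values. The function names are hypothetical. Per the geometry table, 16 query heads share 2 KV heads, so each W_V slice would be paired with 8 different W_O slices.

```python
import numpy as np

def ov_spectrum(w_v: np.ndarray, w_o: np.ndarray) -> np.ndarray:
    """Singular values of one head's OV circuit, W_V @ W_O.

    w_v: (d_model, d_head) value projection slice for the head.
    w_o: (d_head, d_model) output projection slice for the head.
    """
    return np.linalg.svd(w_v @ w_o, compute_uv=False)

def spectral_concentration(s: np.ndarray) -> float:
    # Assumed definition: share of singular-value mass held by the top value.
    return float(s[0] / s.sum())

def effective_rank(s: np.ndarray) -> float:
    # Assumed definition: exp of the Shannon entropy of normalized singular values.
    p = s / s.sum()
    p = p[p > 0]  # drop exact zeros so log() is defined
    return float(np.exp(-(p * np.log(p)).sum()))

if __name__ == "__main__":
    # Toy-sized random weights, not real model weights.
    rng = np.random.default_rng(0)
    d_model, d_head = 64, 8
    s = ov_spectrum(rng.normal(size=(d_model, d_head)),
                    rng.normal(size=(d_head, d_model)))
    print(spectral_concentration(s), effective_rank(s))
```

Under these definitions a head that spreads its mass evenly over k directions has concentration 1/k and effective rank k, while a pure rank-1 copy circuit has concentration near 1 and effective rank near 1.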

Feature activation is broad, but more structured

The taxonomy ordering matches the 1.5B (partial_shared > broadly_shared > non_activated > all_shared), but average F-stat separation is higher and the all_shared fraction is larger. The larger model has more cleanly global directions without becoming narrower.
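
As a rough illustration of what "F-stat separation" could mean here, the sketch below computes a one-way ANOVA F statistic for a single feature's activations grouped by prompt category. That reading is an assumption: the post does not spell out how the atlas computes its F-stat, and the function name and toy values are hypothetical.

```python
import numpy as np

def one_way_f_stat(groups: list[np.ndarray]) -> float:
    """One-way ANOVA F statistic for one feature.

    groups: one 1-D array of activations per prompt category.
    """
    all_x = np.concatenate(groups)
    grand_mean = all_x.mean()
    k, n = len(groups), all_x.size
    # Variation of category means around the grand mean.
    ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
    # Variation of activations around their own category mean.
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    return float((ss_between / (k - 1)) / (ss_within / (n - k)))

if __name__ == "__main__":
    # Toy activations for three made-up categories.
    toy = [np.array([1.0, 2.0, 3.0]),
           np.array([2.0, 3.0, 4.0]),
           np.array([5.0, 6.0, 7.0])]
    print(one_way_f_stat(toy))
```

A higher value means the feature's mean activation differs more between categories relative to its spread within each category.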

Sacred region starts deeper and stays cleaner

Sub-Zero finds structured SV subspace in layers 23–35: 13 of 36 layers, again about 36% of total depth. Classifier accuracy stays at 0.93–0.95 across all Sub-Zero layers, whereas the 1.5B dips to 0.75 mid-network. The 3B’s internal representation is more consistent.

More surgical headroom

The capability fence keeps 81.6% of axes, with lower average damage:

  • Highest single-axis damage: layer 26 gate_proj axis 0, with 0.30 damage to the factual domain.
  • That same axis also rejects for code, math, multilingual, and reasoning — a universal late-layer direction.
  • down_proj axes mostly pass the fence with ~0.97 explained variance intact.

The worst damage in the 3B is still serious, but at 0.30 versus 0.81 it is a little over one third of the worst damage in the 1.5B. More parameters buy you redundant subspaces that can be partially removed without collapsing behavior.
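
To make the fence idea concrete, here is a toy sketch of how a per-axis verdict could be computed. Everything in it is an assumption for illustration: the post does not state the fence's decision rule, so the threshold, the function names, and the damage values below are hypothetical and are not taken from the atlas artifacts.

```python
DOMAINS = ("code", "math", "reasoning", "factual", "multilingual")

def fence_verdict(damage: dict[str, float], threshold: float = 0.05):
    """Return (passes, rejecting_domains) for one axis.

    damage: measured capability damage per domain when the axis is removed.
    threshold: hypothetical tolerance; an axis passes only if every
    domain's damage stays at or below it.
    """
    rejecting = [d for d in DOMAINS if damage.get(d, 0.0) > threshold]
    return (not rejecting, rejecting)

def pass_rate(axes: dict[str, dict[str, float]], threshold: float = 0.05) -> float:
    """Fraction of axes that pass the fence in every domain."""
    verdicts = [fence_verdict(dmg, threshold)[0] for dmg in axes.values()]
    return sum(verdicts) / len(verdicts)

if __name__ == "__main__":
    # Made-up damage scores for two illustrative axes.
    toy_axes = {
        "safe_axis": {d: 0.01 for d in DOMAINS},
        "load_bearing_axis": {"code": 0.10, "math": 0.10, "reasoning": 0.10,
                              "factual": 0.30, "multilingual": 0.10},
    }
    for name, dmg in toy_axes.items():
        print(name, fence_verdict(dmg))
    print("pass rate:", pass_rate(toy_axes))
```

In this framing, an axis that is rejected by all five domains is a "universal" direction, and the pass rate corresponds to the share of axes that pass the fence.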

Compliance/behavior subspace is more isolated

Compliance-behavior (CB) singular values peak at 88% of the Sub-Zero SV budget in the late sacred layers, versus 33% in the 1.5B. The 3B has a cleaner separation between “how to respond” style directions and “what to compute” capability directions.

Bottom line

VibeThinker-3B is the same architectural family as the 1.5B, but more factorized and more redundant in its late layers. It distributes computation across attention and MLPs, uses a deep sacred region for structured transformation, and retains enough surgical headroom that quantization-aware editing could be guided by this atlas rather than done blindly.


Cross-post from a small atlas project I ran this weekend on both VibeThinker sizes. The aim was to see what is actually happening inside the tensors, not just what comes out at the end.

VibeThinker 1.5B vs 3B: a brain-atlas comparison

I ran a full GWIQ-style atlas on both WeiboAI/VibeThinker-1.5B and WeiboAI/VibeThinker-3B: activation census, per-component feature taxonomy, OV-circuit SVD, and a Sub-Zero surgery pass with a capability fence across code / math / reasoning / factual / multilingual. Same 9,523-prompt corpus on both.

TL;DR

  • Both models show the same distributed-computation signature: low spectral concentration, high effective rank, broad-feature MLPs.
  • The sacred region sits at the same proportional depth on both: it starts at layer 18 of 28 on the 1.5B and layer 23 of 36 on the 3B (about 64% of the way in) and runs to the final layer, so it covers about 36% of total depth in each.
  • The 3B is more factorized and more surgical: 81.6% of Sub-Zero axes pass the capability fence, worst single-axis damage ~0.30.
  • The 1.5B is more load-bearing per direction: only 74.5% pass the fence, worst single-axis damage ~0.81 (layer 18 up_proj).
  • The 3B also has a much cleaner compliance/behavior subspace (88% peak CB-SV fraction vs 33% on 1.5B).

Headline numbers

Metric | 1.5B | 3B
Layers | 28 | 36
Sacred region | 18–27 | 23–35
Sacred fraction | ~36% | ~36%
OV spectral concentration | 0.049 | 0.050
OV effective rank | ~55 | ~55
Fence frozen | 74.5% | 81.6%
Worst axis damage | 0.81 (layer 18 up_proj, code) | 0.30 (layer 26 gate_proj, factual)
Peak CB-SV fraction | 33% | 88%
Classifier stability | dips to 0.75 mid-network | stable 0.93–0.95

What the atlas says about the architecture

The OV-circuit numbers are the strongest signal. Spectral concentration around 0.05 and effective rank around 55 mean the attention heads are not memorized one-shot copy circuits. They are doing weighted, high-dimensional computation. Feature taxonomies confirm it: most dimensions are partial_shared or broadly_shared, not hyper-specific token detectors.

So the “VibeThinker is a small reasoning model” vibe from the chat logs matches the geometry: it looks like a model that reasons in the network rather than indexing a compressed knowledge store.

What the atlas says about scale

The sacred region scales with depth proportionally, not by absolute layer index. That is a nice consistency check — it appears the transition from preprocessing to structured transformation is a relative depth event, not a hand-tuned layer number.
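
The proportional-depth claim is easy to check by hand. The sketch below assumes 0-indexed layers, which matches the reported ranges (18–27 of 28 and 23–35 of 36); the function name is hypothetical.

```python
def sacred_depth(first: int, last: int, n_layers: int) -> tuple[float, float]:
    """Return (start_fraction, span_fraction) for a 0-indexed layer range."""
    start_fraction = first / n_layers            # share of layers before the region
    span_fraction = (last - first + 1) / n_layers  # share of layers inside it
    return start_fraction, span_fraction

if __name__ == "__main__":
    for name, (first, last, n) in {"1.5B": (18, 27, 28), "3B": (23, 35, 36)}.items():
        start, span = sacred_depth(first, last, n)
        print(f"{name}: starts at {start:.1%} of depth, spans {span:.1%}")
```

Both models come out at roughly 64% for the start and 36% for the span, which is where the ~36% figure comes from.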

Inside that sacred region, the 3B is much cleaner:

  • higher classifier accuracy,
  • more axes survive the capability fence,
  • lower per-axis damage,
  • more isolated style/behavior directions.

The 1.5B is not bad; it is just doing more work with fewer directions, so editing it is riskier.

The most interesting single directions

  • 1.5B layer 18 up_proj axis 0 — rejects across all five capability domains, worst damage to code (0.81). This is the very first layer of the sacred region; the model leans hard on it right after the preprocessing stack.
  • 3B layer 26 gate_proj axis 0 — rejects across all five domains, worst damage to factual (0.30). A universal late-layer direction, but with ~0.995 explained variance, so the damage is concentrated and interpretable.
  • 3B down_proj axes — mostly pass the fence with ~0.97 explained variance intact, making them the safest surgical targets in the larger model.

Pipeline notes for anyone who wants to reproduce

  • 1.5B run: CPU-only container, ~1k tok/s forward-pass throughput.
  • 3B run: remote rented NVIDIA RTX 6000 Ada Generation with 48 GB VRAM, 128 CPU cores, 503 GB system RAM, CUDA 12.1, PyTorch 2.4.0. Measured device-to-device copy throughput was ~400 GB/s.
  • The 3B atlas forward pass peaked around 12–14k tok/s depending on batch size and sequence length.
  • Both runs used the same vendored Space code; the only GPU-side change was enabling the pre-compiled FlashAttention wheel.
  • The real bottleneck at this corpus size was finalization and compression, not forward passes. If you scale the corpus, optimize the chunk finalizer before chasing bigger GPU batch sizes.
  • FlashAttention was the main dependency win. A pre-compiled wheel for the target CUDA/GPU combo removes the “compile for an hour and pray” step.
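
For anyone reproducing the CPU/GPU split described above, one way to handle it in a single script is to pick the attention backend at load time. This is a minimal sketch, not the vendored Space code: the helper name is hypothetical, and it assumes the Hugging Face transformers loader, whose `attn_implementation` argument accepts "flash_attention_2" and "sdpa".

```python
import importlib.util

def pick_attn_implementation(has_cuda: bool) -> str:
    """Choose an attention backend string for transformers' from_pretrained.

    Falls back to PyTorch SDPA when there is no GPU or when the
    flash_attn package (e.g. a pre-compiled wheel) is not installed.
    """
    has_flash = importlib.util.find_spec("flash_attn") is not None
    return "flash_attention_2" if (has_cuda and has_flash) else "sdpa"

if __name__ == "__main__":
    # On a CPU-only container this prints "sdpa".
    print(pick_attn_implementation(has_cuda=False))
    # Intended use (not run here, requires torch and transformers):
    #   import torch
    #   from transformers import AutoModelForCausalLM
    #   model = AutoModelForCausalLM.from_pretrained(
    #       "WeiboAI/VibeThinker-3B",
    #       torch_dtype=torch.bfloat16,
    #       attn_implementation=pick_attn_implementation(torch.cuda.is_available()),
    #   )
```

The fallback keeps the same script usable on a CPU-only container, at the cost of the throughput difference noted above.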

Caveats

  • This is an internal-mechanics atlas, not a downstream benchmark. We are not claiming 3B is “better at reasoning” than 1.5B on every task; we are claiming the 3B has more redundant, more editable late-layer structure.
  • The compliance-behavior axis used a separate pos/neg corpus that did not make it into these SQLite artifacts, so the style-axis numbers here come from the Sub-Zero SV decomposition, not a direct authentic-vs-corporate comparison.
  • Both models were very new at the time of the run; numbers may shift if the weights or tokenizer are updated.

Bottom line

VibeThinker scales cleanly: same attention geometry, same proportional sacred depth, but the 3B has enough extra capacity to factor its late layers into more redundant, more isolable directions. If you are fine-tuning, quantizing, or merging these models, the 3B gives you more headroom to do it without breaking the model, and the atlas gives you a map of which directions are safe vs load-bearing.

https://huggingface.co/datasets/juiceb0xc0de/vibethinker-1.5b-atlas
https://huggingface.co/datasets/juiceb0xc0de/vibethinker-3b-atlas

WeiboAI org

This is extremely interesting — thank you for running such a detailed atlas on both sizes.

I really like that this is not just another downstream benchmark, but an attempt to look at the internal structure of the models. The “same proportional sacred depth” observation is especially interesting, because it suggests the 1.5B → 3B scaling is not just adding capacity randomly, but preserving a similar computation geometry while giving the larger model more redundancy / editable structure.

The Sub-Zero results are also very useful for us. The idea that 3B has more axes passing the capability fence, lower worst-axis damage, and cleaner behavior/compliance directions matches our intuition that the 3B version should be easier to fine-tune, quantize, or merge without breaking core reasoning ability.

Of course, as you said, this is not a downstream benchmark and we should not over-interpret it as “3B is better on every task”. But as an internal-mechanics signal, it is very valuable. We’ll read through the atlas artifacts more carefully — this could be very useful for future fine-tuning / quantization / model editing experiments.

Thanks again for sharing the datasets and the analysis.

It was my pleasure! It's not very often you get a novel model release with 2 different size scales to analyze. VibeThinker was especially satisfying to look at with its internal symmetry. I really enjoyed diving into that project. If you need a more in depth look at any specific layers/axes I would be more than happy to look into it for you! Cheers!
