arxiv:2604.15706

Target-Oriented Pretraining Data Selection via Neuron-Activated Graph

Published on Apr 17
· Submitted by Zijun Wang on Apr 22
Abstract

A novel target-oriented language model pretraining framework uses neuron activation graphs to select informative data without additional training, demonstrating superior performance across multiple benchmarks.

AI-generated summary

Everyday tasks come with a target, and pretraining models around this target is what turns them into experts. In this paper, we study target-oriented language model (LM) pretraining by introducing Neuron-Activated Graph Ranking (NAG-based Ranking), a training-free and interpretable framework for target pretraining data selection. Rather than using black-box representations, our approach directly characterizes each target input by a sparse set of high-impact neurons in any off-the-shelf LLMs. Concretely, we quantify neuron impact and select the most influential neurons across layers into a compact Neuron-Activated Graph (NAG), and rank candidate data by NAG similarity to target examples. We conduct experiments across six benchmarks, where our NAG-based Ranking improves target-oriented pretraining by 4.9% on average over random sampling, and also outperforms state-of-the-art baselines by 5.3% accuracy on HellaSwag. It also remains effective under a more applicable multi-target setting, where our best setup surpasses two baselines by 1.1% and 4.1%, respectively. Furthermore, we provide a comprehensive analysis on why and how our NAG works, e.g., deactivating NAG-selected neurons (only 0.12% of all) causes a 23.5% performance collapse, and restricting NAG to the final layer incurs a 4.1% average drop, indicating that NAG captures a sparse "functional backbone" for learning target features. We release the code at https://github.com/asillycat/NAG.
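The abstract outlines a three-step pipeline: quantify neuron impact, collect the most influential neurons across layers into a compact NAG, and rank candidate data by NAG similarity. The exact impact measure is not reproduced in this summary, so the sketch below is a minimal illustration under stated assumptions: impact is approximated by activation magnitude, and a NAG is represented simply as a sparse set of (layer, neuron) nodes. `build_nag` is a hypothetical helper, not the authors' released code.

```python
import numpy as np

def build_nag(activations, top_k=8):
    """Sketch of NAG construction: keep the top_k highest-magnitude
    neurons in each layer and collect them as (layer, neuron) nodes.

    `activations` is a list of 1-D arrays, one per layer, e.g. the
    hidden activations an off-the-shelf LLM produces for one target
    input. Magnitude-as-impact is an assumption for illustration.
    """
    nodes = set()
    for layer, acts in enumerate(activations):
        top = np.argsort(np.abs(acts))[-top_k:]  # indices of largest |activation|
        nodes.update((layer, int(n)) for n in top)
    return nodes
```

Note how sparse the result is: with `top_k=8` over a model with thousands of neurons per layer, the NAG keeps well under 1% of all neurons, consistent with the paper's observation that only 0.12% of neurons carry most of the target-relevant signal.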

Community

Paper submitter

🧠 Target-Oriented Pretraining Data Selection via Neuron-Activated Graph

TL;DR: We introduce NAG-based Ranking, a training-free, interpretable framework for selecting pretraining data aligned to a downstream target, by characterizing each target input through a sparse set of high-impact neurons in any off-the-shelf LLM.

๐Ÿ” Key ideas

  • Skip black-box embeddings: represent each target example by the neurons it most strongly activates across layers
  • Aggregate these into a compact Neuron-Activated Graph (NAG)
  • Rank candidate pretraining data by NAG similarity to the target set, fully training-free
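The ranking step above can be sketched as follows. The similarity metric is an assumption (Jaccard overlap between neuron-node sets), since the post doesn't specify it; both functions are illustrative stand-ins for the released code at the repo linked below.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of (layer, neuron) nodes."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def rank_candidates(candidate_nags, target_nags):
    """Score each candidate document by its mean NAG similarity to the
    target examples' NAGs; return candidate indices, best first.
    Training-free: only set operations over precomputed NAGs."""
    scores = [
        sum(jaccard(cand, tgt) for tgt in target_nags) / len(target_nags)
        for cand in candidate_nags
    ]
    return sorted(range(len(scores)), key=lambda i: -scores[i])
```

Because each NAG is just a small node set, this ranking is cheap and inspectable: for any selected document you can list exactly which shared neurons drove its score.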

📈 Results

  • +4.9% average gain over random sampling across 6 benchmarks
  • +5.3% accuracy on HellaSwag over prior SOTA data-selection baselines
  • Interpretable by design: you can inspect which neurons drive the selection

🔗 Paper: https://arxiv.org/abs/2604.15706

Happy to discuss; feedback and questions welcome! 🙌

