Datasets used in the paper "A Critical Look at Targeted Instruction Selection: Disentangling What Matters (and What Doesn’t)"
AI & ML interests
Data-Centric ML
Papers
A Critical Look at Targeted Instruction Selection: Disentangling What Matters (and What Doesn't)
Boomerang Distillation Enables Zero-Shot Model Size Interpolation
Datasets (31)
Harvard-DCML/tis-dolci-random-unbalanced
Viewer • Updated • 30k • 58
Harvard-DCML/tis-dolci-subset-datasets-gtr-t5-base
Viewer • Updated • 40k • 16
Harvard-DCML/tis-dolci-subset-datasets-Olmo-3-1025-7B
Viewer • Updated • 300k • 21
Harvard-DCML/tis-dolci-subset-datasets-SmolLM3-3B-Base
Viewer • Updated • 130k • 21
Harvard-DCML/tis-dolci-subset-datasets-Qwen3-4B-Base
Viewer • Updated • 300k • 18
Harvard-DCML/tis-dolci-subset-datasets-Llama-3.2-3B
Viewer • Updated • 300k • 28
Harvard-DCML/tis-dolci-subset-datasets-Llama-2-7b-hf
Viewer • Updated • 300k • 20
Harvard-DCML/tis-dolci-quantile-datasets-gtr-t5-base
Updated • 8
Harvard-DCML/tis-dolci-quantile-datasets-Olmo-3-1025-7B
Updated • 9
Harvard-DCML/tis-dolci-quantile-datasets-SmolLM3-3B-Base
Updated • 9