Hugging Face
Batuhan S
Ba2han
24 followers · 19 following
AI & ML interests
None yet
Recent Activity
reacted to SeaWolf-AI's post with ❤️ 1 day ago
🧬 Darwin Family: Zero Gradient Steps, GPQA Diamond 88.89%

How far can we push LLM reasoning *without* training? Our team at VIDRAFT submitted this paper to Daily Papers yesterday, and it's currently #3. Huge thanks to everyone who upvoted; sharing the core ideas below.

Paper: https://huggingface.co/papers/2605.14386
arXiv: https://arxiv.org/abs/2605.14386
Model: https://huggingface.co/FINAL-Bench/Darwin-28B-REASON
Model: https://huggingface.co/FINAL-Bench/Darwin-28B-Opus

---

TL;DR

Darwin Family is a training-free evolutionary merging framework. By recombining the weight spaces of existing LLM checkpoints, with zero gradient-based training, it reaches frontier-level reasoning.

- Darwin-28B-Opus: GPQA Diamond 88.89%
- Zero gradient steps: not a single B200 or H200 hour needed
- Consistent gains across 4B to 35B scale
- Cross-architecture breeding between Transformer and Mamba families
- Stable recursive multi-generation evolution

#Three Core Mechanisms

① 14-dim Adaptive Merge Genome: fine-grained recombination at both component level (Attention / FFN / MLP / LayerNorm / Embedding) and block level, expanding the prior evolutionary-merge search space.

② MRI-Trust Fusion: we diagnose each layer's reasoning contribution via an **MRI (Model Reasoning Importance)** signal and fuse it with evolutionary search through a **learnable trust parameter**. Trust the diagnostic too much and search collapses; ignore it and search becomes inefficient. Darwin learns the balance from data.

③ Architecture Mapper: weight-space breeding across heterogeneous families. Attention × SSM crossover actually works.

Why It Matters

> Diagnose latent capabilities already encoded in open checkpoints,
> and recombine them, no gradients required.

Replies and critiques welcome
updated a model 2 days ago: Ba2han/experimental_auto
liked a model 2 days ago: HiDream-ai/HiDream-O1-Image
Organizations
None yet
Ba2han's datasets (121)
Sort: Recently updated
Ba2han/finetranslations-TR_filtered • Viewer • Updated Jan 15 • 9.72M • 7
Ba2han/pt-1501-tokenized • Viewer • Updated Jan 15 • 6.46M • 4
Ba2han/Sciknoweval-mcqa_Turkish • Viewer • Updated Jan 7 • 13.9k • 7
Ba2han/fineweb-2-turkish_categorized-5m • Updated Jan 3 • 2
Ba2han/vngrs-web-filtered-short-v2 • Preview • Updated Jan 3 • 2
Ba2han/mixed_curated_pre-sft • Viewer • Updated Dec 31, 2025 • 2.22M • 4
Ba2han/mixed_curated • Viewer • Updated Dec 30, 2025 • 2.21M • 2
Ba2han/cosmos-filtered-2 • Viewer • Updated Dec 30, 2025 • 1.81M • 2
Ba2han/tokenized_nemotron_science • Viewer • Updated Dec 28, 2025 • 1.31M • 9
Ba2han/hq_pt_mix_2712 • Viewer • Updated Dec 28, 2025 • 3.83M • 106
Ba2han/chunked-textbooks • Viewer • Updated Dec 22, 2025 • 80.1k • 3
Ba2han/merged_sft_mix • Viewer • Updated Dec 21, 2025 • 2.97M • 3
Ba2han/tokenized-18-12_hq • Viewer • Updated Dec 19, 2025 • 3.95M • 2
Ba2han/hq_turkish-1612 • Viewer • Updated Dec 16, 2025 • 57.6k • 4
Ba2han/merged-datasets_11-12 • Viewer • Updated Dec 12, 2025 • 19M • 112
Ba2han/synth-tr • Viewer • Updated Dec 11, 2025 • 195k • 1
Ba2han/synth-2m-v2 • Viewer • Updated Dec 8, 2025 • 2.1M • 100
Ba2han/oscar-filtered • Updated Dec 7, 2025 • 2
Ba2han/filtered-cosmos • Viewer • Updated Dec 7, 2025 • 785k • 4
Ba2han/mixed-tokenized_2811 • Viewer • Updated Nov 29, 2025 • 3.92M • 26
Ba2han/tokenized-mix • Viewer • Updated Nov 27, 2025 • 63.6M • 383
Ba2han/tokenized-mix-2611 • Updated Nov 26, 2025 • 2
Ba2han/tr-reasoning-traces • Updated Nov 26, 2025 • 3
Ba2han/synth-2M • Viewer • Updated Nov 24, 2025 • 2M • 2
Ba2han/tokenized_short_2311 • Viewer • Updated Nov 23, 2025 • 17.9M • 5
Ba2han/fineweb2-filtered-tr • Viewer • Updated Nov 23, 2025 • 4.85M • 6
Ba2han/translation_dataset-short • Viewer • Updated Nov 23, 2025 • 3.38M • 4
Ba2han/finePDF-filtered-tr • Viewer • Updated Nov 23, 2025 • 124k • 3
Ba2han/vngrs-web-filtered • Viewer • Updated Nov 23, 2025 • 2.81M • 15
Ba2han/c4_tr_fineweb-filtered • Viewer • Updated Nov 23, 2025 • 374k • 196