Jadon

jadodev

phase

AI & ML interests

Machine Learning, Programming Language Theory, Category Theory, Quantum Computing

Organizations

None yet

Recent Activity

liked a model 24 days ago

nvidia/Gemma-4-31B-IT-NVFP4

Text Generation • 21B • Updated 9 days ago • 1.8M • 427

liked a model 27 days ago

tencent/Sequential-Hidden-Decoding-8B-n8-Instruct

Text Generation • 13B • Updated 27 days ago • 125 • 8

upvoted a paper 28 days ago

Virtual Width Networks

Paper • 2511.11238 • Published Nov 14, 2025 • 39

liked a model 29 days ago

ByteDance/Ouro-1.4B

Text Generation • Updated Jan 18 • 38.6k • 86

liked a Space 29 days ago

The Smol Training Playbook

📚

3.13k

The secrets to building world-class LLMs

liked a model 29 days ago

HuggingFaceTB/FineMath-Llama-3B

3B • Updated Nov 27, 2025 • 49 • 22

liked a dataset 29 days ago

HuggingFaceTB/finemath

Viewer • Updated Feb 6, 2025 • 48.3M • 15.7k • 359

liked 2 datasets about 1 month ago

tiiuae/falcon-refinedweb

Viewer • Updated Jun 20, 2023 • 968M • 40.9k • 905

allenai/c4

Viewer • Updated Jan 9, 2024 • 10.4B • 724k • 557

upvoted a paper 5 months ago

Nemotron-CC-Math: A 133 Billion-Token-Scale High Quality Math Pretraining Dataset

Paper • 2508.15096 • Published Aug 20, 2025 • 8

liked a model about 1 year ago

deepseek-ai/DeepSeek-V3-0324

Text Generation • 685B • Updated Mar 27, 2025 • 651k • 3.11k

updated a collection about 2 years ago

transformer

Collection

2 items • Updated Apr 7, 2024

upvoted a paper about 2 years ago

Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

Paper • 2404.02258 • Published Apr 2, 2024 • 108

liked 2 models about 2 years ago

mlabonne/phixtral-4x2_8

Text Generation • Updated Jan 15, 2024 • 137 • 209

NousResearch/Nous-Hermes-2-Mistral-7B-DPO

Text Generation • 7B • Updated Apr 30, 2024 • 1.43k • 218

upvoted a paper about 2 years ago

Teaching Large Language Models to Reason with Reinforcement Learning

Paper • 2403.04642 • Published Mar 7, 2024 • 48

updated a collection about 2 years ago

transformer

Collection

2 items • Updated Apr 7, 2024

upvoted a paper about 2 years ago

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Paper • 2403.03507 • Published Mar 6, 2024 • 190

liked a model about 2 years ago

HuggingFaceH4/zephyr-7b-alpha

Text Generation • 7B • Updated Oct 16, 2024 • 5.36k • 1.12k
