Models
Datasets
Spaces
Buckets new
Docs
Enterprise
Pricing
- Website
- Community
- Solutions
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2606.07082

WTF GENIUS PAPERS

Papers that made me appreciate my major and my life a little more. obs=Observation, innov=Innovation. Most papers are abt improving tiny models.

about 2 hours ago

Continuous Latent Diffusion Language Model

Paper • 2605.06548 • Published May 7 • 85
Scaling Latent Reasoning via Looped Language Models

Paper • 2510.25741 • Published Oct 29, 2025 • 233
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Paper • 2502.05171 • Published Feb 7, 2025 • 158
Pretraining Language Models to Ponder in Continuous Space

Paper • 2505.20674 • Published May 27, 2025 • 3

On the Geometry of On-Policy Distillation

Paper • 2606.07082 • Published 27 days ago • 75

02. Prompt engineering

Verifiable Rewards Beyond Math and Code: Lightweight Corpus-Grounded Process Supervision for Factual Question Answering

Paper • 2605.29648 • Published May 28 • 10
Why Larger Models Learn More: Effects of Capacity, Interference, and Rare-Task Retention

Paper • 2605.29548 • Published May 28 • 12
Towards Verifiable Multimodal Deep Research: A Multi-Agent Harness for Interleaved Report Generation

Paper • 2605.29861 • Published May 28 • 16
COLLEAGUE.SKILL: Automated AI Skill Generation via Expert Knowledge Distillation

Paper • 2605.31264 • Published May 29 • 123

about 13 hours ago

Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models

Paper • 2506.19697 • Published Jun 24, 2025 • 45
Winning the Pruning Gamble: A Unified Approach to Joint Sample and Token Pruning for Efficient Supervised Fine-Tuning

Paper • 2509.23873 • Published Sep 28, 2025 • 68
Efficient Multi-modal Large Language Models via Progressive Consistency Distillation

Paper • 2510.00515 • Published Oct 1, 2025 • 41
SINQ: Sinkhorn-Normalized Quantization for Calibration-Free Low-Precision LLM Weights

Paper • 2509.22944 • Published Sep 26, 2025 • 80

Reinforcement learning

Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning

Paper • 2407.20798 • Published Jul 30, 2024 • 24
Offline Reinforcement Learning for LLM Multi-Step Reasoning

Paper • 2412.16145 • Published Dec 20, 2024 • 38
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published Jan 4, 2025 • 104
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution

Paper • 2502.18449 • Published Feb 25, 2025 • 75

Insights circa Solstice 2026

CORE: Contrastive Reflection Enables Rapid Improvements in Reasoning

Paper • 2605.28742 • Published May 27 • 4
Reinforcement Learning from Rich Feedback with Distributional DAgger

Paper • 2606.05152 • Published 29 days ago • 3
Entropy as a Structural Prior: How a Log-Barrier on DiT Belief Space Drives Musical Diversity and Development

Paper • 2606.07207 • Published 27 days ago • 4
Bayesian-Agent: Posterior-Guided Skill Evolution for LLM Agent Harnesses

Paper • 2606.08348 • Published 26 days ago • 14

about 7 hours ago

Why Fine-Tuning Encourages Hallucinations and How to Fix It

Paper • 2604.15574 • Published Apr 16 • 25
Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation

Paper • 2604.24763 • Published Apr 27 • 71
Programming with Data: Test-Driven Data Engineering for Self-Improving LLMs from Raw Corpora

Paper • 2604.24819 • Published Apr 27 • 91
GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents

Paper • 2604.26752 • Published Apr 29 • 112

about 5 hours ago

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8, 2025 • 735 • 99
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12, 2025 • 41
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30, 2025 • 98
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23, 2025 • 89

WTF GENIUS PAPERS

Papers that made me appreciate my major and my life a little more. obs=Observation, innov=Innovation. Most papers are abt improving tiny models.

about 2 hours ago

Continuous Latent Diffusion Language Model

Paper • 2605.06548 • Published May 7 • 85
Scaling Latent Reasoning via Looped Language Models

Paper • 2510.25741 • Published Oct 29, 2025 • 233
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Paper • 2502.05171 • Published Feb 7, 2025 • 158
Pretraining Language Models to Ponder in Continuous Space

Paper • 2505.20674 • Published May 27, 2025 • 3

Reinforcement learning

Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning

Paper • 2407.20798 • Published Jul 30, 2024 • 24
Offline Reinforcement Learning for LLM Multi-Step Reasoning

Paper • 2412.16145 • Published Dec 20, 2024 • 38
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published Jan 4, 2025 • 104
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution

Paper • 2502.18449 • Published Feb 25, 2025 • 75

On the Geometry of On-Policy Distillation

Paper • 2606.07082 • Published 27 days ago • 75

Insights circa Solstice 2026

CORE: Contrastive Reflection Enables Rapid Improvements in Reasoning

Paper • 2605.28742 • Published May 27 • 4
Reinforcement Learning from Rich Feedback with Distributional DAgger

Paper • 2606.05152 • Published 29 days ago • 3
Entropy as a Structural Prior: How a Log-Barrier on DiT Belief Space Drives Musical Diversity and Development

Paper • 2606.07207 • Published 27 days ago • 4
Bayesian-Agent: Posterior-Guided Skill Evolution for LLM Agent Harnesses

Paper • 2606.08348 • Published 26 days ago • 14

02. Prompt engineering

Verifiable Rewards Beyond Math and Code: Lightweight Corpus-Grounded Process Supervision for Factual Question Answering

Paper • 2605.29648 • Published May 28 • 10
Why Larger Models Learn More: Effects of Capacity, Interference, and Rare-Task Retention

Paper • 2605.29548 • Published May 28 • 12
Towards Verifiable Multimodal Deep Research: A Multi-Agent Harness for Interleaved Report Generation

Paper • 2605.29861 • Published May 28 • 16
COLLEAGUE.SKILL: Automated AI Skill Generation via Expert Knowledge Distillation

Paper • 2605.31264 • Published May 29 • 123

about 7 hours ago

Why Fine-Tuning Encourages Hallucinations and How to Fix It

Paper • 2604.15574 • Published Apr 16 • 25
Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation

Paper • 2604.24763 • Published Apr 27 • 71
Programming with Data: Test-Driven Data Engineering for Self-Improving LLMs from Raw Corpora

Paper • 2604.24819 • Published Apr 27 • 91
GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents

Paper • 2604.26752 • Published Apr 29 • 112

about 13 hours ago

Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models

Paper • 2506.19697 • Published Jun 24, 2025 • 45
Winning the Pruning Gamble: A Unified Approach to Joint Sample and Token Pruning for Efficient Supervised Fine-Tuning

Paper • 2509.23873 • Published Sep 28, 2025 • 68
Efficient Multi-modal Large Language Models via Progressive Consistency Distillation

Paper • 2510.00515 • Published Oct 1, 2025 • 41
SINQ: Sinkhorn-Normalized Quantization for Calibration-Free Low-Precision LLM Weights

Paper • 2509.22944 • Published Sep 26, 2025 • 80

about 5 hours ago

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8, 2025 • 735 • 99
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12, 2025 • 41
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30, 2025 • 98
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23, 2025 • 89

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs