"I didn't Make the Micro Decisions": Measuring, Inducing, and Exposing Goal-Level AI Contributions in Collaboration Paper • 2605.21363 • Published 6 days ago • 3
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information Paper • 2605.11609 • Published 14 days ago • 190
OSCAR: Offline Spectral Covariance-Aware Rotation for 2-bit KV Cache Quantization Paper • 2605.17757 • Published 8 days ago • 62
SkillsVote: Lifecycle Governance of Agent Skills from Collection, Recommendation to Evolution Paper • 2605.18401 • Published 8 days ago • 124
CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence Paper • 2605.12882 • Published 13 days ago • 264
RoboMemArena: A Comprehensive and Challenging Robotic Memory Benchmark Paper • 2605.10921 • Published 15 days ago • 4
stefanocarrera/autophagycode_D_he_train-mercury_Qwen3-8B_strategy_trust_t1.5_g6_run0_metrics Viewer • Updated 11 days ago • 164 • 76 • 1
Counting as a minimal probe of language model reliability Paper • 2605.02028 • Published 23 days ago • 4
From Context to Skills: Can Language Models Learn from Context Skillfully? Paper • 2604.27660 • Published 23 days ago • 162
Discovering Agentic Safety Specifications from 1-Bit Danger Signals Paper • 2604.23210 • Published about 1 month ago • 4
LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Paper • 2604.20796 • Published Apr 22 • 242