Refusal in Language Models Is Mediated by a Single Direction Paper • 2406.11717 • Published Jun 17, 2024 • 13
MemLens: Benchmarking Multimodal Long-Term Memory in Large Vision-Language Models Paper • 2605.14906 • Published 8 days ago • 73
MemEye: A Visual-Centric Evaluation Framework for Multimodal Agent Memory Paper • 2605.15128 • Published 8 days ago • 60
WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluation Paper • 2605.10912 • Published 11 days ago • 45
Beyond Individual Intelligence: Surveying Collaboration, Failure Attribution, and Self-Evolution in LLM-based Multi-Agent Systems Paper • 2605.14892 • Published 8 days ago • 47
Darwin Family: MRI-Trust-Weighted Evolutionary Merging for Training-Free Scaling of Language-Model Reasoning Paper • 2605.14386 • Published 8 days ago • 59
STALE: Can LLM Agents Know When Their Memories Are No Longer Valid? Paper • 2605.06527 • Published 15 days ago • 44
RouteProfile: Elucidating the Design Space of LLM Profiles for Routing Paper • 2605.00180 • Published 22 days ago • 30
EvolveMem: Self-Evolving Memory Architecture via AutoResearch for LLM Agents Paper • 2605.13941 • Published 9 days ago • 24
ATLAS: Agentic or Latent Visual Reasoning? One Word is Enough for Both Paper • 2605.15198 • Published 8 days ago • 19
Learning to Communicate Locally for Large-Scale Multi-Agent Pathfinding Paper • 2605.07637 • Published 10 days ago • 19
Does Synthetic Layered Design Data Benefit Layered Design Decomposition? Paper • 2605.15167 • Published 8 days ago • 8
CurveBench: A Benchmark for Exact Topological Reasoning over Nested Jordan Curves Paper • 2605.14068 • Published 9 days ago • 8
WildTableBench: Benchmarking Multimodal Foundation Models on Table Understanding In the Wild Paper • 2605.01018 • Published 21 days ago • 9
Learning to Build the Environment: Self-Evolving Reasoning RL via Verifiable Environment Synthesis Paper • 2605.14392 • Published 8 days ago • 8
PRISM: Prior Rectification and Uncertainty-Aware Structure Modeling for Diffusion-Based Text Image Super-Resolution Paper • 2605.13027 • Published 9 days ago • 8
MolmoAct2: Action Reasoning Models for Real-world Deployment Paper • 2605.02881 • Published 18 days ago • 333
EMO: Pretraining Mixture of Experts for Emergent Modularity Paper • 2605.06663 • Published 15 days ago • 12
Adaptive Teacher Exposure for Self-Distillation in LLM Reasoning Paper • 2605.11458 • Published 10 days ago • 7