From Pixels to Words -- Towards Native One-Vision Models at Scale Paper • 2605.28820 • Published 1 day ago • 52
WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluation Paper • 2605.10912 • Published 18 days ago • 46
SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture Paper • 2605.12500 • Published 17 days ago • 190
Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling Paper • 2604.28185 • Published 29 days ago • 90
SenseNova-SI Collection Scaling Spatial Intelligence with Multimodal Foundation Models • 16 items • Updated 16 days ago • 22
SenseNova-U1 Collection SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture • 9 items • Updated about 10 hours ago • 67
OpenMobile: Building Open Mobile Agents with Task and Trajectory Synthesis Paper • 2604.15093 • Published Apr 16 • 30
Seedance 2.0: Advancing Video Generation for World Complexity Paper • 2604.14148 • Published Apr 15 • 163
EVA: Efficient Reinforcement Learning for End-to-End Video Agent Paper • 2603.22918 • Published Mar 24 • 44