Sommelier: Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language Models Paper • 2603.25750 • Published Mar 20 • 36
Video Generation Leaderboard 📊 Text to Video and Image to Video Arena & Leaderboard Space • Running • 203
InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion Paper • 2512.17504 • Published Dec 19, 2025 • 99
Vector Prism: Animating Vector Graphics by Stratifying Semantic Structure Paper • 2512.14336 • Published Dec 16, 2025 • 32
EgoX: Egocentric Video Generation from a Single Exocentric Video Paper • 2512.08269 • Published Dec 9, 2025 • 123