World-R1: Reinforcing 3D Constraints for Text-to-Video Generation • Paper • 2604.24764 • Published 4 days ago • 111
Exploring Spatial Intelligence from a Generative Perspective • Paper • 2604.20570 • Published 9 days ago • 21
OmniJigsaw: Enhancing Omni-Modal Reasoning via Modality-Orchestrated Reordering • Paper • 2604.08209 • Published 22 days ago • 25
Alleviating Sparse Rewards by Modeling Step-Wise and Long-Term Sampling Effects in Flow-Based GRPO • Paper • 2602.06422 • Published Feb 6 • 47
Beyond Hard Masks: Progressive Token Evolution for Diffusion Language Models • Paper • 2601.07351 • Published Jan 12 • 26
SOP: A Scalable Online Post-Training System for Vision-Language-Action Models • Paper • 2601.03044 • Published Jan 6 • 28
Preserving Source Video Realism: High-Fidelity Face Swapping for Cinematic Quality • Paper • 2512.07951 • Published Dec 8, 2025 • 51
Inferix: A Block-Diffusion based Next-Generation Inference Engine for World Simulation • Paper • 2511.20714 • Published Nov 25, 2025 • 50
Emu3.5: Native Multimodal Models are World Learners • Paper • 2510.26583 • Published Oct 30, 2025 • 115
OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling • Paper • 2509.12201 • Published Sep 15, 2025 • 107