InteractWeb-Bench: Can Multimodal Agent Escape Blind Execution in Interactive Website Generation? Paper • 2604.27419 • Published 13 days ago • 13
jackf857/qwen3-8b-base-new-dpo-hh-harmless-4xh200-batch-64-q_t-0.45-s_star-0.4-eta-8 Text Generation • 8B • Updated 11 days ago • 18 • 1
DiPO: Disentangled Perplexity Policy Optimization for Fine-grained Exploration-Exploitation Trade-Off Paper • 2604.13902 • Published 28 days ago • 62
Pseudo-Unification: Entropy Probing Reveals Divergent Information Patterns in Unified Multimodal Models Paper • 2604.10949 • Published 30 days ago • 40
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published Apr 8 • 324
AvatarPointillist: AutoRegressive 4D Gaussian Avatarization Paper • 2604.04787 • Published Apr 6 • 12
SQuTR: A Robustness Benchmark for Spoken Query to Text Retrieval under Acoustic Noise Paper • 2602.12783 • Published Feb 13 • 216
CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence Paper • 2603.28032 • Published Mar 30 • 341
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization Paper • 2603.19835 • Published Mar 20 • 350