Victoria Jones

isaacperez2

AI & ML interests

None yet

Recent Activity

upvoted a paper about 11 hours ago

VisualThink-VLA: Visual Intermediate Reasoning for Effective and Low-Latency Vision-Language-Action Policies

upvoted a paper 1 day ago

The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement

liked a model 2 days ago

Muapi/makima-chainsaw-man-flux-lora

Organizations

None yet

upvoted a paper about 11 hours ago

VisualThink-VLA: Visual Intermediate Reasoning for Effective and Low-Latency Vision-Language-Action Policies

Paper • 2605.30011 • Published 8 days ago • 10

upvoted a paper 1 day ago

The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement

Paper • 2605.30888 • Published 7 days ago • 9

upvoted a paper 12 days ago

DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards

Paper • 2605.21467 • Published 16 days ago • 204

upvoted a paper 14 days ago

The Unlearnability Phenomenon in RLVR for Language Models

Paper • 2605.16787 • Published 20 days ago • 6

upvoted 3 papers about 1 month ago

From Context to Skills: Can Language Models Learn from Context Skillfully?

Paper • 2604.27660 • Published May 3 • 166

Recursive Multi-Agent Systems

Paper • 2604.25917 • Published Apr 28 • 274

LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model

Paper • 2604.20796 • Published Apr 22 • 243

upvoted 5 papers about 2 months ago

RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time

Paper • 2604.11626 • Published Apr 13 • 102

SkillClaw: Let Skills Evolve Collectively with Agentic Evolver

Paper • 2604.08377 • Published Apr 9 • 291

Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability

Paper • 2604.06628 • Published Apr 8 • 327

MegaStyle: Constructing Diverse and Scalable Style Dataset via Consistent Text-to-Image Style Mapping

Paper • 2604.08364 • Published Apr 9 • 101

SQuTR: A Robustness Benchmark for Spoken Query to Text Retrieval under Acoustic Noise

Paper • 2602.12783 • Published Feb 13 • 246