Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond
Abstract
World models are categorized into three capability levels and four law regimes to better understand and develop predictive environment models for AI agents across diverse domains.
As AI systems move from generating text to accomplishing goals through sustained interaction, the ability to model environment dynamics becomes a central bottleneck. Agents that manipulate objects, navigate software, coordinate with others, or design experiments require predictive environment models, yet the term world model carries different meanings across research communities. We introduce a "levels × laws" taxonomy organized along two axes. The first defines three capability levels: L1 Predictor, which learns one-step local transition operators; L2 Simulator, which composes them into multi-step, action-conditioned rollouts that respect domain laws; and L3 Evolver, which autonomously revises its own model when predictions fail against new evidence. The second identifies four governing-law regimes: physical, digital, social, and scientific. These regimes determine what constraints a world model must satisfy and where it is most likely to fail. Using this framework, we synthesize over 400 works and summarize more than 100 representative systems spanning model-based reinforcement learning, video generation, web and GUI agents, multi-agent social simulation, and AI-driven scientific discovery. We analyze methods, failure modes, and evaluation practices across level-regime pairs, propose decision-centric evaluation principles and a minimal reproducible evaluation package, and outline architectural guidance, open problems, and governance challenges. The resulting roadmap connects previously isolated communities and charts a path from passive next-step prediction toward world models that can simulate, and ultimately reshape, the environments in which agents operate.
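The L1/L2 relationship the abstract describes (a one-step transition operator composed into multi-step, action-conditioned rollouts) can be sketched in a few lines. This is a minimal illustration of the taxonomy, not code from the paper; the interfaces (`L1Predictor`, `L2Simulator`, the toy counter dynamics) are all hypothetical:

```python
from dataclasses import dataclass
from typing import Callable, List

State = dict    # hypothetical: environment state as a feature dict
Action = str    # hypothetical: discrete action label

@dataclass
class L1Predictor:
    """L1: a learned one-step local transition operator s' = f(s, a)."""
    step_fn: Callable[[State, Action], State]

    def predict(self, state: State, action: Action) -> State:
        return self.step_fn(state, action)

class L2Simulator:
    """L2: composes an L1 predictor into a multi-step,
    action-conditioned rollout."""
    def __init__(self, predictor: L1Predictor):
        self.predictor = predictor

    def rollout(self, state: State, actions: List[Action]) -> List[State]:
        trajectory = [state]
        for a in actions:
            state = self.predictor.predict(state, a)
            trajectory.append(state)
        return trajectory

# Toy "digital regime" dynamics: a counter incremented or reset by actions.
dyn = L1Predictor(lambda s, a: {"x": 0} if a == "reset" else {"x": s["x"] + 1})
sim = L2Simulator(dyn)
traj = sim.rollout({"x": 0}, ["inc", "inc", "reset", "inc"])
print([s["x"] for s in traj])  # → [0, 1, 2, 0, 1]
```

An L3 Evolver, by contrast, would additionally rewrite `step_fn` itself when rollouts diverge from observed outcomes.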
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- A Subgoal-driven Framework for Improving Long-Horizon LLM Agents (2026)
- World Reasoning Arena (2026)
- Describe-Then-Act: Proactive Agent Steering via Distilled Language-Action World Models (2026)
- Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence (2026)
The L3 Evolver concept is the crux here: a world-model stack that autonomously revises itself when predictions fail, which could genuinely propel agentic AI forward. My worry is the lack of concrete safeguards for the self-revision loop, especially in the social and scientific regimes, where feedback signals can be misleading or biased. Without explicit mechanisms to prevent self-delusion (external validation checks, uncertainty-aware revision, or auditing constraints), the improvements may chase brittle local truths and hurt generalization. Btw, the arxivlens breakdown helped me parse the method details, and there's a solid walkthrough here that covers this approach well: https://arxivlens.com/PaperView/Details/agentic-world-modeling-foundations-capabilities-laws-and-beyond-4053-678cc671
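The safeguard the comment above asks for can be made concrete with one simple pattern: gate every self-revision behind an external validation check on held-out data, and reject updates that don't clear a margin. A minimal sketch (all names here, `gated_revision`, `heldout_error`, and the demo dynamics, are hypothetical and not from the paper):

```python
from typing import Callable, List, Tuple

Transition = Tuple[float, float]   # hypothetical: (input, observed next value)

def heldout_error(model: Callable[[float], float],
                  data: List[Transition]) -> float:
    """External validation: mean squared prediction error on held-out data."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

def gated_revision(model, propose, heldout, margin=1e-3):
    """Accept a proposed self-revision only if it beats the current model
    on held-out transitions by at least `margin`; otherwise keep the old
    model. The margin guards against chasing noise in the feedback signal."""
    candidate = propose(model)
    if heldout_error(candidate, heldout) + margin < heldout_error(model, heldout):
        return candidate, True
    return model, False

# Demo: true dynamics y = 2x; the current model underestimates the slope.
true_data = [(x, 2.0 * x) for x in range(1, 6)]
current = lambda x: 1.5 * x
better = lambda m: (lambda x: 1.9 * x)   # revision closer to the truth
worse = lambda m: (lambda x: 0.5 * x)    # degenerate revision (self-delusion)

m1, ok1 = gated_revision(current, better, true_data)
print(ok1)  # → True: the revision passes external validation
m2, ok2 = gated_revision(m1, worse, true_data)
print(ok2)  # → False: the degenerate revision is rejected
```

In the social and scientific regimes the hard part is that `heldout` itself can be biased, so in practice this gate would need independently sourced validation data, not data generated by the same loop being revised.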