Thinking by Doing: Building Efficient World Model Reasoning in LLMs via Multi-turn Interaction
By: Bao Shu, Yan Cai, Jianjian Sun, and more
Potential Business Impact:
Lets computers learn faster by trying things.
Developing robust world model reasoning is crucial for large language model (LLM) agents to plan and interact in complex environments. While multi-turn interaction offers a superior understanding of environmental dynamics through authentic feedback, current approaches often impose a rigid reasoning process that constrains the model's active learning and ultimately hinders efficient world model reasoning. To address these issues, we explore world-model internalization through efficient interaction and active reasoning (WMAct). WMAct liberates the model from structured reasoning, allowing it to shape its thinking directly through its doing, and achieves effective and efficient world model reasoning with two key mechanisms: (1) a reward rescaling mechanism that adjusts the outcome reward based on action efficacy, incentivizing redundancy reduction and purposeful interaction; and (2) an interaction frequency annealing strategy that progressively reduces the maximum allowed interaction turns, compelling the model to condense its learning and internalize environmental dynamics rather than over-rely on environmental cues. Our experiments on Sokoban, Maze, and Taxi show that WMAct yields effective world model reasoning capable of resolving tasks in a single turn that previously required multiple interactions, and it fosters strong transferability to complex environments, improving performance on a suite of reasoning benchmarks.
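The two mechanisms in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the linear annealing schedule, and the default constants (initial turns, annealing horizon) are all hypothetical, and the paper does not specify how action efficacy is computed.

```python
def rescale_reward(outcome_reward: float, effective_actions: int,
                   total_actions: int) -> float:
    """Reward rescaling sketch: scale the outcome reward by action
    efficacy so redundant actions lower the reward, incentivizing
    purposeful interaction. The efficacy ratio here is an assumption."""
    if total_actions == 0:
        return 0.0
    efficacy = effective_actions / total_actions  # fraction of useful actions
    return outcome_reward * efficacy


def max_turns_schedule(step: int, initial_turns: int = 8,
                       final_turns: int = 1,
                       anneal_steps: int = 1000) -> int:
    """Interaction frequency annealing sketch: progressively reduce the
    maximum allowed interaction turns over training, so the model must
    internalize environment dynamics rather than rely on feedback.
    A linear schedule is assumed for illustration."""
    frac = min(step / anneal_steps, 1.0)
    turns = initial_turns - frac * (initial_turns - final_turns)
    return max(final_turns, round(turns))
```

Under this sketch, a trajectory where only half the actions were effective keeps half its outcome reward, and the turn budget shrinks from 8 to 1 over the annealing horizon, ending with single-turn resolution as described in the abstract.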
Similar Papers
Active Confusion Expression in Large Language Models: Leveraging World Models toward Better Social Reasoning
Computation and Language
Helps AI understand people's thoughts and feelings.
CoBel-World: Harnessing LLM Reasoning to Build a Collaborative Belief World for Optimizing Embodied Multi-Agent Collaboration
Artificial Intelligence
Helps AI teams work together better by guessing what others think.
WorldLLM: Improving LLMs' world modeling using curiosity-driven theory-making
Artificial Intelligence
Helps AI understand and predict game worlds better.