Test-Time Mixture of World Models for Embodied Agents in Dynamic Environments
By: Jinwoo Jang, Minjong Yoo, Sihyung Yoon, and more
Potential Business Impact:
Helps robots learn new tasks without retraining.
Language model (LM)-based embodied agents are increasingly deployed in real-world settings. Yet their adaptability remains limited in dynamic environments, where constructing accurate and flexible world models is crucial for effective reasoning and decision-making. To address this challenge, we extend the Mixture-of-Experts (MoE) paradigm to embodied agents. Conventional MoE architectures modularize knowledge into expert components with pre-trained routing, but they remain rigid once deployed, making them less effective at adapting to unseen domains in dynamic environments. We therefore propose Test-time Mixture of World Models (TMoW), a framework that enhances adaptability to unseen and evolving domains. Unlike conventional MoE, whose routing function is fixed after training, TMoW updates its routing function over world models at test time, enabling agents to recombine existing models and integrate new ones for continual adaptation. It achieves this through (i) multi-granular prototype-based routing, which adapts mixtures based on similarities ranging from the object level to the scene level, (ii) test-time refinement, which aligns unseen-domain features with prototypes during inference, and (iii) distilled mixture-based augmentation, which efficiently constructs new models from few-shot data and existing prototypes. We evaluate TMoW on the VirtualHome, ALFWorld, and RLBench benchmarks, demonstrating strong performance in both zero-shot adaptation and few-shot expansion scenarios, and showing that it enables embodied agents to operate effectively in dynamic environments.
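As a rough illustration of the routing-plus-refinement idea, here is a minimal sketch of prototype-based routing over world models with a test-time prototype update. All names here (TMoWRouter, route, refine), the cosine-similarity scoring, the alpha blend between object- and scene-level granularities, and the EMA-style refinement are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def softmax(x, temp=1.0):
    # Numerically stable softmax over expert scores.
    z = (x - x.max()) / temp
    e = np.exp(z)
    return e / e.sum()

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)

class TMoWRouter:
    """Hypothetical sketch: routes over per-domain world models using
    object- and scene-level prototypes, refined at test time."""

    def __init__(self, obj_protos, scene_protos, alpha=0.5, ema=0.05):
        # obj_protos, scene_protos: (num_models, dim) prototype matrices.
        self.obj_protos = obj_protos
        self.scene_protos = scene_protos
        self.alpha = alpha  # blend between granularities (assumed)
        self.ema = ema      # test-time refinement rate (assumed)

    def route(self, obj_feat, scene_feat):
        # Multi-granular routing: blend object- and scene-level
        # similarities, then normalize into mixture weights.
        obj_sim = np.array([cosine(obj_feat, p) for p in self.obj_protos])
        scene_sim = np.array([cosine(scene_feat, p) for p in self.scene_protos])
        return softmax(self.alpha * obj_sim + (1 - self.alpha) * scene_sim)

    def refine(self, obj_feat, scene_feat, weights):
        # Test-time refinement stand-in: pull each prototype toward the
        # unseen domain's features, proportionally to its routing weight.
        for i, w in enumerate(weights):
            self.obj_protos[i] += self.ema * w * (obj_feat - self.obj_protos[i])
            self.scene_protos[i] += self.ema * w * (scene_feat - self.scene_protos[i])

# Usage with random stand-in features (3 world models, 16-dim features):
rng = np.random.default_rng(0)
router = TMoWRouter(rng.normal(size=(3, 16)), rng.normal(size=(3, 16)))
obs_obj, obs_scene = rng.normal(size=16), rng.normal(size=16)
w = router.route(obs_obj, obs_scene)   # mixture weights over world models
router.refine(obs_obj, obs_scene, w)   # adapt prototypes to the new domain
# prediction = sum(w[i] * world_models[i].predict(obs) for i in range(len(w)))
```

In the full framework, refinement would operate on learned feature representations rather than raw vectors, and new world models for few-shot expansion would be built via the distilled mixture-based augmentation described above.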
Similar Papers
World Model Implanting for Test-time Adaptation of Embodied Agents
Artificial Intelligence
Lets robots learn new tasks without retraining.
Aligning Agentic World Models via Knowledgeable Experience Learning
Computation and Language
Teaches AI to follow real-world rules.
MoMoE: A Mixture of Expert Agent Model for Financial Sentiment Analysis
Computational Engineering, Finance, and Science
Makes AI smarter by letting many AI parts work together.