Modeling Unseen Environments with Language-guided Composable Causal Components in Reinforcement Learning
By: Xinyue Wang, Biwei Huang
Potential Business Impact:
Teaches robots to solve new problems by combining old skills.
Generalization in reinforcement learning (RL) remains a significant challenge, especially when agents encounter novel environments with unseen dynamics. Drawing inspiration from human compositional reasoning -- where known components are reconfigured to handle new situations -- we introduce World Modeling with Compositional Causal Components (WM3C), a framework that improves RL generalization by learning and leveraging composable causal components. Unlike previous approaches, which focus on invariant representation learning or meta-learning, WM3C identifies and exploits causal dynamics among composable elements, enabling robust adaptation to new tasks. Our approach integrates language as a compositional modality to decompose the latent space into meaningful components, and we provide theoretical guarantees for their unique identification under mild assumptions. The practical implementation uses a masked autoencoder with mutual information constraints and adaptive sparsity regularization to capture high-level semantic information and effectively disentangle transition dynamics. Experiments on numerical simulations and real-world robotic manipulation tasks show that WM3C significantly outperforms existing methods in identifying latent processes, improving policy learning, and generalizing to unseen tasks.
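To make the masked-autoencoder idea concrete, here is a minimal numpy sketch of one ingredient the abstract mentions: reconstructing an observation from component-wise masked views of the latent code, with an L1-style sparsity penalty encouraging each component to use only a few latent dimensions. All names, dimensions, and the loss weighting are illustrative assumptions, not the authors' actual implementation (which also involves mutual information constraints and language conditioning, omitted here).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumptions for illustration only)
x_dim, z_dim, n_components = 8, 6, 3

# Hypothetical parameters: a linear encoder/decoder pair and soft mask
# logits assigning latent dimensions to causal components.
W_enc = rng.normal(size=(z_dim, x_dim)) * 0.1
W_dec = rng.normal(size=(x_dim, z_dim)) * 0.1
mask_logits = rng.normal(size=(n_components, z_dim))

def soft_mask(logits):
    """Sigmoid gate deciding which latent dims each component may use."""
    return 1.0 / (1.0 + np.exp(-logits))

def loss(x):
    z = W_enc @ x                      # encode the observation
    m = soft_mask(mask_logits)         # component-to-latent assignment
    # Reconstruct from each component's masked latent view
    recons = np.stack([W_dec @ (m[k] * z) for k in range(n_components)])
    recon_loss = np.mean((recons - x) ** 2)
    # Adaptive sparsity would anneal this weight; here it is fixed at 0.1
    sparsity = np.mean(np.abs(m))
    return recon_loss + 0.1 * sparsity

x = rng.normal(size=x_dim)
print(loss(x))
```

In a full training loop these parameters would be optimized jointly, so the sparsity term trades off against reconstruction error and the mask converges toward a disentangled assignment of latent dimensions to components.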
Similar Papers
Inter-environmental world modeling for continuous and compositional dynamics
Machine Learning (CS)
Teaches robots to learn new actions from videos.
Better Decisions through the Right Causal World Model
Artificial Intelligence
Teaches robots to learn from real-world causes.
Language-conditioned world model improves policy generalization by reading environmental descriptions
Computation and Language
Teaches robots to learn new games from words.