
MetaWorld: Skill Transfer and Composition in a Hierarchical World Model for Grounding High-Level Instructions

Published: January 24, 2026 | arXiv ID: 2601.17507v1

By: Yutong Shen, Hangxu Liu, Kailin Pei and more

Potential Business Impact:

Robots learn to walk and grab things better.

Business Areas:

Semantic Web, Internet Services

Humanoid robot loco-manipulation remains constrained by the semantic-physical gap. Current methods face three limitations: low sample efficiency in reinforcement learning, poor generalization in imitation learning, and physical inconsistency in vision-language models (VLMs). We propose MetaWorld, a hierarchical world model that integrates semantic planning and physical control via expert policy transfer. The framework decouples tasks into a VLM-driven semantic layer and a latent dynamics model operating in a compact state space. Our dynamic expert selection and motion prior fusion mechanism leverages a pre-trained multi-expert policy library as transferable knowledge, enabling efficient online adaptation via a two-stage framework. VLMs serve as semantic interfaces, mapping instructions to executable skills and bypassing symbol grounding. Experiments on Humanoid-Bench show that MetaWorld outperforms world model-based RL in task completion and motion coherence. Our code will be available at https://anonymous.4open.science/r/metaworld-2BF4/
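
The abstract does not say how expert selection and motion prior fusion are computed. As a purely illustrative sketch, and not the authors' method, one simple way to combine a library of expert policies is to turn per-expert relevance scores into softmax weights and blend the experts' proposed actions. Every name here (`Expert`, `fuse_motion_priors`, `relevance`, the toy experts) is hypothetical, and the relevance scores are assumed to come from some upstream component such as a semantic layer.

```python
import math
from typing import Callable, Dict, List, Sequence

# Hypothetical type: an expert policy maps a state vector to an action vector.
Expert = Callable[[Sequence[float]], List[float]]


def softmax(scores: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax over relevance scores."""
    peak = max(scores)
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_motion_priors(
    state: Sequence[float],
    experts: Dict[str, Expert],
    relevance: Dict[str, float],
    temperature: float = 1.0,
) -> List[float]:
    """Blend expert actions with softmax weights (an assumed fusion rule)."""
    names = list(experts)
    weights = softmax([relevance[n] for n in names], temperature)
    actions = [experts[n](state) for n in names]
    dim = len(actions[0])
    # Weighted sum of the experts' actions, one action dimension at a time.
    return [sum(w * a[i] for w, a in zip(weights, actions)) for i in range(dim)]


# Toy experts standing in for pre-trained policies (assumptions for the demo).
def walk_expert(state: Sequence[float]) -> List[float]:
    return [1.0, 0.0]


def reach_expert(state: Sequence[float]) -> List[float]:
    return [0.0, 1.0]


experts = {"walk": walk_expert, "reach": reach_expert}

# Equal relevance gives each expert half of the weight.
fused_equal = fuse_motion_priors([0.0], experts, {"walk": 0.0, "reach": 0.0})

# A higher score for "reach" shifts the blended action toward that expert.
fused_reach = fuse_motion_priors([0.0], experts, {"walk": 0.0, "reach": 2.0})
```

In this toy setup a lower `temperature` makes the blend approach hard selection of the single most relevant expert, while a higher one averages the experts more evenly.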

Gentle Manipulation Policy Learning via Demonstrations from VLM Planned Atomic Skills

Robotics

Robots learn complex tasks without human help.

8 Nov 2025 0

91%

Aligning Agentic World Models via Knowledgeable Experience Learning

Computation and Language

Teaches AI to follow real-world rules.

19 Jan 2026 0

90%

Hierarchical Language Models for Semantic Navigation and Manipulation in an Aerial-Ground Robotic System

Robotics

Robots work together better using AI to move things.

5 Jun 2025 1


Page Count

8 pages
