Large Model Empowered Embodied AI: A Survey on Decision-Making and Embodied Learning
By: Wenlong Liang, Rui Zhou, Yang Ma, and more
Potential Business Impact:
Robots learn to do many tasks like humans.
Embodied AI aims to develop intelligent systems with physical forms capable of perceiving, deciding, acting, and learning in real-world environments, offering a promising path toward Artificial General Intelligence (AGI). Despite decades of exploration, it remains challenging for embodied agents to achieve human-level intelligence on general-purpose tasks in open, dynamic environments. Recent breakthroughs in large models have revolutionized embodied AI by enhancing perception, interaction, planning, and learning. In this article, we provide a comprehensive survey of large model empowered embodied AI, focusing on autonomous decision-making and embodied learning. We investigate both hierarchical and end-to-end decision-making paradigms, detailing how large models enhance high-level planning, low-level execution, and feedback in hierarchical decision-making, and how large models enhance Vision-Language-Action (VLA) models for end-to-end decision-making. For embodied learning, we introduce mainstream learning methodologies, elaborating in depth on how large models enhance imitation learning and reinforcement learning. For the first time, we integrate world models into a survey of embodied AI, presenting their design methods and critical roles in enhancing decision-making and learning. Although solid advances have been achieved, challenges remain; these are discussed at the end of this survey as potential directions for future research.
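To make the hierarchical paradigm mentioned in the abstract concrete, below is a minimal, hypothetical sketch (not taken from the paper) of how a large model can serve as a high-level planner while a learned low-level policy executes subgoals and environment feedback triggers replanning. All names (`plan_subgoals`, `policy`, `env`, `llm`) are illustrative placeholders, not APIs from the survey.

```python
# Hypothetical sketch of a hierarchical decision-making loop:
# a large model proposes subgoals, a low-level policy executes them,
# and failure feedback is folded back into the next planning round.
from typing import List


def plan_subgoals(llm, instruction: str, observation: str) -> List[str]:
    """High-level planning: ask a large (language) model to decompose the task."""
    prompt = (
        f"Task: {instruction}\n"
        f"Current observation: {observation}\n"
        "List the subgoals needed to complete the task, one per line."
    )
    return [line.strip() for line in llm(prompt).splitlines() if line.strip()]


def run_episode(llm, policy, env, instruction: str, max_replans: int = 3) -> bool:
    """Plan -> execute each subgoal -> replan when the policy reports failure."""
    observation = env.reset()
    for _ in range(max_replans):
        subgoals = plan_subgoals(llm, instruction, observation)
        for subgoal in subgoals:
            # Low-level execution: a learned controller (e.g., a VLA-style policy)
            # maps the subgoal plus raw observations to motor actions.
            observation, success, feedback = policy.execute(subgoal, env)
            if not success:
                # Feedback: describe the failure so the next planning
                # round can revise the plan.
                instruction = f"{instruction}\nPrevious attempt failed: {feedback}"
                break
        else:
            return True  # all subgoals completed
    return False
```

The end-to-end VLA paradigm discussed in the survey replaces this explicit plan/execute split with a single model that maps vision and language inputs directly to actions.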
Similar Papers
Autonomous Embodied Agents: When Robotics Meets Deep Learning Reasoning
Robotics
Robots learn to do tasks in new places.
Survey of Vision-Language-Action Models for Embodied Manipulation
Robotics
Robots learn to do tasks by watching and acting.
Embodied AI: From LLMs to World Models
Artificial Intelligence
Robots learn to do tasks by watching and thinking.