RoboMIND 2.0: A Multimodal, Bimanual Mobile Manipulation Dataset for Generalizable Embodied Intelligence

Published: December 31, 2025 | arXiv ID: 2512.24653v1

By: Chengkai Hou, Kun Wu, Jiaming Liu, and more

While data-driven imitation learning has revolutionized robotic manipulation, current approaches remain constrained by the scarcity of large-scale, diverse real-world demonstrations. Consequently, the ability of existing models to generalize across long-horizon bimanual tasks and mobile manipulation in unstructured environments remains limited. To bridge this gap, we present RoboMIND 2.0, a comprehensive real-world dataset comprising over 310K dual-arm manipulation trajectories collected across six distinct robot embodiments and 739 complex tasks. Crucially, to support research in contact-rich and spatially extended tasks, the dataset incorporates 12K tactile-enhanced episodes and 20K mobile manipulation trajectories. Complementing this physical data, we construct high-fidelity digital twins of our real-world environments, releasing an additional 20K-trajectory simulated dataset to facilitate robust sim-to-real transfer. To fully exploit the potential of RoboMIND 2.0, we propose the MIND-2 system, a hierarchical dual-system framework optimized via offline reinforcement learning. MIND-2 integrates a high-level semantic planner (MIND-2-VLM), which decomposes abstract natural language instructions into grounded subgoals, with a low-level Vision-Language-Action executor (MIND-2-VLA), which generates precise, proprioception-aware motor actions.

RoboBERT: An End-to-end Multimodal Robotic Manipulation Model

Robotics

Robots learn to do tasks from words and vision.

11 Feb 2025 2

89%

RoboCOIN: An Open-Sourced Bimanual Robotic Data COllection for INtegrated Manipulation

Robotics

Teaches robots to use two hands like people.

21 Nov 2025 0

89%

Hoi! -- A Multimodal Dataset for Force-Grounded, Cross-View Articulated Manipulation

Robotics

Teaches robots to grab and feel objects like humans.

4 Dec 2025 0
