OpenEgo: A Large-Scale Multimodal Egocentric Dataset for Dexterous Manipulation

Published: September 5, 2025 | arXiv ID: 2509.05513v1

By: Ahad Jawaid, Yu Xiang

Potential Business Impact:

Enables robots to learn dexterous manipulation by imitating human hand movements from first-person video.

Business Areas:
Motion Capture, Media and Entertainment, Video

Egocentric human videos provide scalable demonstrations for imitation learning, but existing corpora often lack either fine-grained, temporally localized action descriptions or dexterous hand annotations. We introduce OpenEgo, a multimodal egocentric manipulation dataset with standardized hand-pose annotations and intention-aligned action primitives. OpenEgo totals 1107 hours across six public datasets, covering 290 manipulation tasks in 600+ environments. We unify hand-pose layouts and provide descriptive, timestamped action primitives. To validate its utility, we train language-conditioned imitation-learning policies to predict dexterous hand trajectories. OpenEgo is designed to lower the barrier to learning dexterous manipulation from egocentric video and to support reproducible research in vision-language-action learning. All resources and instructions will be released at www.openegocentric.com.
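
To make the imitation-learning setup concrete, below is a minimal sketch of a language-conditioned policy that maps an embedded action primitive and the current hand pose to a short future hand trajectory, trained with behavior cloning. This is not the authors' architecture: the joint count, embedding size, horizon, field layout, and MLP design are all placeholder assumptions for illustration.

```python
# Minimal sketch (not the paper's code) of language-conditioned imitation
# learning on OpenEgo-style data. All dimensions and field names are
# assumptions, not the released dataset schema.
import torch
import torch.nn as nn

NUM_JOINTS = 21        # assumed per-hand keypoint count (MANO-style)
POSE_DIM = NUM_JOINTS * 3
LANG_DIM = 384         # assumed text-embedding size (e.g., a sentence encoder)
HORIZON = 16           # number of future pose steps to predict

class LangConditionedPolicy(nn.Module):
    """Maps (language embedding, current hand pose) -> future hand trajectory."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LANG_DIM + POSE_DIM, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, HORIZON * POSE_DIM),
        )

    def forward(self, lang_emb, pose):
        x = torch.cat([lang_emb, pose], dim=-1)
        return self.net(x).view(-1, HORIZON, POSE_DIM)

# One behavior-cloning step on a synthetic batch standing in for an
# OpenEgo sample: a timestamped primitive's text embedding plus hand poses.
policy = LangConditionedPolicy()
opt = torch.optim.Adam(policy.parameters(), lr=1e-4)
lang = torch.randn(8, LANG_DIM)              # embedded action-primitive text
pose = torch.randn(8, POSE_DIM)              # current hand pose
target = torch.randn(8, HORIZON, POSE_DIM)   # demonstrated future trajectory
loss = nn.functional.mse_loss(policy(lang, pose), target)
opt.zero_grad()
loss.backward()
opt.step()
```

In a real pipeline, the random tensors would be replaced by the dataset's standardized hand-pose annotations and the embeddings of its timestamped action primitives; the sketch only shows how the language conditioning enters the policy.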

Country of Origin
🇺🇸 United States

Page Count
7 pages

Category
Computer Science:
Computer Vision and Pattern Recognition