GR-Dexter Technical Report

Published: December 30, 2025 | arXiv ID: 2512.24210v1

By: Ruoshi Wen, Guangzeng Chen, Zhongren Cui and more

Vision-language-action (VLA) models have enabled language-conditioned, long-horizon robot manipulation, but most existing systems are limited to grippers. Scaling VLA policies to bimanual robots with high degree-of-freedom (DoF) dexterous hands remains challenging due to the expanded action space, frequent hand-object occlusions, and the cost of collecting real-robot data. We present GR-Dexter, a holistic hardware-model-data framework for VLA-based generalist manipulation on a bimanual dexterous-hand robot. Our approach combines the design of a compact 21-DoF robotic hand, an intuitive bimanual teleoperation system for real-robot data collection, and a training recipe that leverages teleoperated robot trajectories together with large-scale vision-language and carefully curated cross-embodiment datasets. Across real-world evaluations spanning long-horizon everyday manipulation and generalizable pick-and-place, GR-Dexter achieves strong in-domain performance and improved robustness to unseen objects and unseen instructions. We hope GR-Dexter serves as a practical step toward generalist dexterous-hand robotic manipulation.

Information-Theoretic Graph Fusion with Vision-Language-Action Model for Policy Reasoning and Dual Robotic Control

Robotics

Robots learn to build things by watching videos.

7 Aug 2025

93%

DexGraspVLA: A Vision-Language-Action Framework Towards General Dexterous Grasping

Robotics

Robots learn to grab many things from instructions.

28 Feb 2025

93%

DexVLG: Dexterous Vision-Language-Grasp Model at Scale

CV and Pattern Recognition

Robots grasp objects like humans using words.

3 Jul 2025
