RoboChallenge: Large-scale Real-robot Evaluation of Embodied Policies
By: Adina Yakefu, Bin Xie, Chongyang Xu, and more
Potential Business Impact:
Provides an online system for evaluating robot control models across many real-world tasks.
Testing on real machines is indispensable for robotic control algorithms. For learning-based algorithms, especially vision-language-action (VLA) models, the demand for large-scale evaluation, i.e., testing a large number of models on a large number of tasks, is becoming increasingly urgent. Doing this well is highly non-trivial, however, especially once scalability and reproducibility are taken into account. In this report, we describe our methodology for constructing RoboChallenge, an online evaluation system for testing robotic control algorithms, and our survey of recent state-of-the-art VLA models using our initial benchmark, Table30.
Similar Papers
RobotArena ∞: Scalable Robot Benchmarking via Real-to-Sim Translation (Robotics): Benchmarks robot policies at scale by translating real-world scenes into simulation.
Eva-VLA: Evaluating Vision-Language-Action Models' Robustness Under Real-World Physical Variations (Robotics): Measures how well VLA models withstand unexpected physical variations in the real world.
Robobench: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models as Embodied Brain (Robotics): Benchmarks multimodal large language models serving as embodied "brains" for robots.