Static and Plugged: Make Embodied Evaluation Simple
By: Jiahao Xiao, Jianbo Zhang, BoWen Yan, and more
Potential Business Impact:
Tests robots in pictures, not real life.
Embodied intelligence is advancing rapidly, driving the need for efficient evaluation. Current benchmarks typically rely on interactive simulated environments or real-world setups, which are costly, fragmented, and hard to scale. To address this, we introduce StaticEmbodiedBench, a plug-and-play benchmark that enables unified evaluation using static scene representations. Covering 42 diverse scenarios and 8 core dimensions, it supports scalable and comprehensive assessment through a simple interface. We evaluate 19 Vision-Language Models (VLMs) and 11 Vision-Language-Action Models (VLAs), establishing the first unified static leaderboard for embodied intelligence. We also release a subset of 200 samples from our benchmark to accelerate research in the field.
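Because the evaluation is static and plug-and-play, scoring a model essentially reduces to iterating over pre-rendered scene samples rather than running an interactive simulator. The sketch below illustrates that idea under assumed names only: a JSON-lines file with hypothetical `image_path`, `instruction`, `answer`, and `dimension` fields, and a generic `model.predict` call. The benchmark's real data schema and interface may differ.

```python
# Minimal sketch of "plug-and-play" evaluation on a static embodied benchmark.
# The dataset schema and the model.predict() signature are assumptions for
# illustration, not StaticEmbodiedBench's actual API.
import json
from collections import defaultdict

def evaluate(model, samples_path: str) -> dict:
    """Score a model on static scene samples, reporting accuracy per dimension."""
    correct = defaultdict(int)
    total = defaultdict(int)
    with open(samples_path, encoding="utf-8") as f:
        for line in f:
            sample = json.loads(line)            # one static scene per line
            pred = model.predict(
                image=sample["image_path"],      # static scene representation
                prompt=sample["instruction"],    # embodied task query
            )
            dim = sample["dimension"]            # one of the core evaluation dimensions
            total[dim] += 1
            correct[dim] += int(pred.strip().lower() == sample["answer"].strip().lower())
    return {dim: correct[dim] / total[dim] for dim in total}
```

Any model exposing such a prediction call, whether a VLM or a VLA wrapped to emit text, could be dropped into this loop, which is the sense in which a static benchmark sidesteps per-model simulator or hardware integration.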
Similar Papers
EmbodiedBrain: Expanding Performance Boundaries of Task Planning for Embodied Intelligence
CV and Pattern Recognition
Robots learn to do tasks in the real world.
UrbanVideo-Bench: Benchmarking Vision-Language Models on Embodied Intelligence with Video Data in Urban Spaces
CV and Pattern Recognition
Lets robots learn to walk and see cities.
ENACT: Evaluating Embodied Cognition with World Modeling of Egocentric Interaction
Artificial Intelligence
Helps AI learn by doing, not just watching.