Static and Plugged: Make Embodied Evaluation Simple
By: Jiahao Xiao, Jianbo Zhang, BoWen Yan, and more
Potential Business Impact:
Tests robots in pictures, not real life.
Embodied intelligence is advancing rapidly, driving the need for efficient evaluation. Current benchmarks typically rely on interactive simulated environments or real-world setups, which are costly, fragmented, and hard to scale. To address this, we introduce StaticEmbodiedBench, a plug-and-play benchmark that enables unified evaluation using static scene representations. Covering 42 diverse scenarios and 8 core dimensions, it supports scalable and comprehensive assessment through a simple interface. We evaluate 19 Vision-Language Models (VLMs) and 11 Vision-Language-Action Models (VLAs), establishing the first unified static leaderboard for embodied intelligence. We also release a subset of 200 samples from our benchmark to accelerate research in the field.
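Because the evaluation is static and plug-and-play, scoring a model essentially reduces to iterating over pre-rendered scene samples rather than running an interactive simulator. The sketch below illustrates that idea under assumed names only: a JSON-lines file with hypothetical `image_path`, `instruction`, `answer`, and `dimension` fields, and a generic `model.predict` call. The benchmark's real data schema and interface may differ.

```python
# Minimal sketch of "plug-and-play" evaluation on a static embodied benchmark.
# The dataset schema and the model.predict() signature are assumptions for
# illustration, not StaticEmbodiedBench's actual API.
import json
from collections import defaultdict

def evaluate(model, samples_path: str) -> dict:
    """Score a model on static scene samples, reporting accuracy per dimension."""
    correct = defaultdict(int)
    total = defaultdict(int)
    with open(samples_path, encoding="utf-8") as f:
        for line in f:
            sample = json.loads(line)            # one static scene per line
            pred = model.predict(
                image=sample["image_path"],      # static scene representation
                prompt=sample["instruction"],    # embodied task query
            )
            dim = sample["dimension"]            # one of the core evaluation dimensions
            total[dim] += 1
            correct[dim] += int(pred.strip().lower() == sample["answer"].strip().lower())
    return {dim: correct[dim] / total[dim] for dim in total}
```

Any model exposing such a prediction call, whether a VLM or a VLA wrapped to emit text, could be dropped into this loop, which is the sense in which a static benchmark sidesteps per-model simulator or hardware integration.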
Similar Papers
EmbodiedBrain: Expanding Performance Boundaries of Task Planning for Embodied Intelligence
CV and Pattern Recognition
Robots learn to do tasks in the real world.
UrbanVideo-Bench: Benchmarking Vision-Language Models on Embodied Intelligence with Video Data in Urban Spaces
CV and Pattern Recognition
Lets robots learn to walk and see cities.
ENACT: Evaluating Embodied Cognition with World Modeling of Egocentric Interaction
Artificial Intelligence
Helps AI learn by doing, not just watching.