STEP: Structured Training and Evaluation Platform for benchmarking trajectory prediction models
By: Julian F. Schumann, Anna Mészáros, Jens Kober, and more
Potential Business Impact:
Tests self-driving car predictions better.
While trajectory prediction plays a critical role in enabling safe and effective path planning for automated vehicles, standardized practices for evaluating such models remain underdeveloped. Recent efforts have aimed to unify dataset formats and model interfaces for easier comparison, yet existing frameworks often fall short in supporting heterogeneous traffic scenarios, joint prediction models, or user documentation. In this work, we introduce STEP, a new benchmarking framework that addresses these limitations by providing a unified interface to multiple datasets, enforcing consistent training and evaluation conditions, and supporting a wide range of prediction models. We demonstrate the capabilities of STEP in a number of experiments, which reveal 1) the limitations of widely used testing procedures, 2) the importance of jointly modeling agents for better prediction of interactions, and 3) the vulnerability of current state-of-the-art models to both distribution shifts and targeted attacks by adversarial agents. With STEP, we aim to shift the focus from the "leaderboard" approach toward deeper insights into model behavior and generalization in complex multi-agent settings.
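To make the idea of a unified interface concrete, the minimal Python sketch below shows what a shared dataset/model contract with a fixed train-and-evaluate protocol could look like. All names here (Scene, Predictor, evaluate, ConstantVelocity) are hypothetical illustrations, not STEP's actual API.

# Hypothetical sketch of a unified benchmarking interface; not STEP's real API.
from dataclasses import dataclass
from typing import Protocol
import numpy as np

@dataclass
class Scene:
    past: np.ndarray    # observed trajectories, shape (num_agents, past_steps, 2)
    future: np.ndarray  # ground-truth futures, shape (num_agents, future_steps, 2)

class Predictor(Protocol):
    def fit(self, scenes: list[Scene]) -> None: ...
    def predict(self, past: np.ndarray, horizon: int) -> np.ndarray: ...

def ade(pred: np.ndarray, gt: np.ndarray) -> float:
    # Average displacement error over all agents and time steps.
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

def evaluate(model: Predictor, train: list[Scene], test: list[Scene]) -> float:
    # One fixed protocol: every model is trained and scored the same way,
    # so reported numbers are directly comparable.
    model.fit(train)
    errors = [ade(model.predict(s.past, s.future.shape[1]), s.future)
              for s in test]
    return float(np.mean(errors))

class ConstantVelocity:
    # Trivial baseline: extrapolate each agent's last observed velocity.
    def fit(self, scenes: list[Scene]) -> None:
        pass
    def predict(self, past: np.ndarray, horizon: int) -> np.ndarray:
        vel = past[:, -1] - past[:, -2]                   # (num_agents, 2)
        steps = np.arange(1, horizon + 1)[None, :, None]  # (1, horizon, 1)
        return past[:, -1:, :] + steps * vel[:, None, :]

Any model exposing fit/predict could then be benchmarked on any dataset loaded into the Scene format, which is the kind of decoupling the abstract describes.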
Similar Papers
STEP: A Unified Spiking Transformer Evaluation Platform for Fair and Reproducible Benchmarking
Neural and Evolutionary Computing
Tests brain-like computer chips for better performance.
STEP: Success-Rate-Aware Trajectory-Efficient Policy Optimization
Artificial Intelligence
Teaches robots to learn tasks much faster.
STEP: Simultaneous Tracking and Estimation of Pose for Animals and Humans
Computer Vision and Pattern Recognition
Tracks and guesses where the body parts of animals and people are.