Benchmarking that Matters: Rethinking Benchmarking for Practical Impact
By: Anna V. Kononova, Niki van Stein, Olaf Mersmann, and more
Potential Business Impact:
Creates better computer problem-solving tools for real-world tasks.
Benchmarking has driven scientific progress in Evolutionary Computation, yet current practices fall short of real-world needs. Widely used synthetic suites such as BBOB and CEC isolate algorithmic phenomena but poorly reflect the structure, constraints, and information limitations of continuous and mixed-integer optimization problems in practice. This disconnect leads to the misuse of benchmarking suites for competitions, automated algorithm selection, and industrial decision-making, despite these suites being designed for different purposes. We identify key gaps in current benchmarking practices and tooling, including the limited availability of real-world-inspired problems, missing high-level features, and challenges in multi-objective and noisy settings. We propose a vision centered on curated real-world-inspired benchmarks, practitioner-accessible feature spaces, and community-maintained performance databases. Real progress requires coordinated effort: a living benchmarking ecosystem that evolves with real-world insights and supports both scientific understanding and industrial use.
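The gap the abstract describes can be made concrete with a small sketch (hypothetical functions for illustration, not taken from the paper): a smooth, noiseless synthetic objective of the kind found in suites like BBOB, contrasted with a "practice-flavored" variant of the same landscape that adds measurement noise and a feasibility constraint, mimicking the information limitations of real-world problems.

```python
import math
import random

def sphere(x):
    """Smooth, noiseless synthetic objective (sphere-type test function)."""
    return sum(xi * xi for xi in x)

def practical_sphere(x, rng, noise_sd=0.05):
    """Hypothetical 'real-world-flavored' variant of the same landscape:
    observed through Gaussian measurement noise and subject to a
    feasibility constraint, so the optimizer sees only partial,
    corrupted information."""
    if sum(x) < 1.0:            # illustrative constraint: sum(x) >= 1
        return float("inf")     # infeasible points yield no usable signal
    return sphere(x) + rng.gauss(0.0, noise_sd)

rng = random.Random(42)
x = [0.6, 0.6]
exact = sphere(x)                 # exact value, fully informative
noisy = practical_sphere(x, rng)  # noisy observation of the same point
assert abs(noisy - exact) < 1.0   # close to the true value, but not equal
```

An algorithm ranked highly on the noiseless `sphere` may degrade sharply on the noisy, constrained variant, which is one way the mismatch between synthetic suites and practical problems shows up in benchmarking results.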
Similar Papers
MECHBench: A Set of Black-Box Optimization Benchmarks originated from Structural Mechanics
Neural and Evolutionary Computing
Tests computer programs on real car crash problems.
AI Benchmark Democratization and Carpentry
Artificial Intelligence
Makes AI tests stay fair as AI gets smarter.
Randomness as Reference: Benchmark Metric for Optimization in Engineering
Computational Engineering, Finance, and Science
Tests computer programs for real-world engineering problems.