Formally Exploring Time-Series Anomaly Detection Evaluation Metrics

Published: October 20, 2025 | arXiv ID: 2510.17562v1

By: Dennis Wagner, Arjun Nair, Billy Joe Franks and more

Potential Business Impact:

Finds hidden problems before systems break.

Business Areas:

Predictive Analytics, Artificial Intelligence, Data and Analytics, Software

Undetected anomalies in time series can trigger catastrophic failures in safety-critical systems, such as chemical plant explosions or power grid outages. Although many detection methods have been proposed, their performance remains unclear because current metrics capture only narrow aspects of the task and often yield misleading results. We address this issue by introducing verifiable properties that formalize essential requirements for evaluating time-series anomaly detection. These properties enable a theoretical framework that supports principled evaluations and reliable comparisons. Analyzing 37 widely used metrics, we show that most satisfy only a few properties, and none satisfy all, explaining persistent inconsistencies in prior results. To close this gap, we propose LARM, a flexible metric that provably satisfies all properties, and extend it to ALARM, an advanced variant meeting stricter requirements.
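As a rough illustration of how metric choice alone can change the verdict on a detector, the sketch below scores the same predictions with a plain point-wise F1 and with a point-adjusted F1, a protocol commonly seen in the anomaly detection literature. This example is not taken from the paper: the function names, the toy data, and the choice of these two metrics are assumptions made for illustration, and it does not implement LARM or ALARM.

```python
def f1(labels, preds):
    """Point-wise F1 over binary labels and predictions."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def point_adjust(labels, preds):
    """Mark a whole labeled anomaly segment as detected if any point in it is flagged."""
    adjusted = list(preds)
    n = len(labels)
    i = 0
    while i < n:
        if labels[i] == 1:
            j = i
            while j < n and labels[j] == 1:
                j += 1  # j ends one past the last point of the segment
            if any(preds[i:j]):
                adjusted[i:j] = [1] * (j - i)
            i = j
        else:
            i += 1
    return adjusted


# Toy series (hypothetical): one anomaly of length 10, detector flags a single point in it.
labels = [0] * 5 + [1] * 10 + [0] * 5
preds = [0] * 20
preds[7] = 1

pointwise = f1(labels, preds)
adjusted = f1(labels, point_adjust(labels, preds))
print(f"point-wise F1:     {pointwise:.3f}")
print(f"point-adjusted F1: {adjusted:.3f}")
```

On this toy input the point-wise score is low while the point-adjusted score is perfect, even though the detector's output is identical in both cases. Which of the two better reflects detection quality depends on what the evaluation is meant to reward, which is the kind of question formal properties are intended to settle.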