Formally Exploring Time-Series Anomaly Detection Evaluation Metrics
By: Dennis Wagner, Arjun Nair, Billy Joe Franks and more
Potential Business Impact:
Finds hidden problems before systems break.
Undetected anomalies in time series can trigger catastrophic failures in safety-critical systems, such as chemical plant explosions or power grid outages. Although many detection methods have been proposed, their performance remains unclear because current metrics capture only narrow aspects of the task and often yield misleading results. We address this issue by introducing verifiable properties that formalize essential requirements for evaluating time-series anomaly detection. These properties enable a theoretical framework that supports principled evaluations and reliable comparisons. Analyzing 37 widely used metrics, we show that most satisfy only a few properties, and none satisfy all, explaining persistent inconsistencies in prior results. To close this gap, we propose LARM, a flexible metric that provably satisfies all properties, and extend it to ALARM, an advanced variant meeting stricter requirements.
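To make the claim about narrow and misleading metrics concrete, the following minimal sketch compares two simple scores on a toy series. It is not taken from the paper: the function names (`pointwise_f1`, `segments`, `event_recall`), the labels, and both detectors are hypothetical, and neither score is LARM or ALARM. It only illustrates how two commonly used styles of evaluation can rank the same pair of detectors in opposite order.

```python
def pointwise_f1(labels, preds):
    """F1 score computed over individual time steps."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def segments(labels):
    """Return (start, end) pairs, end exclusive, for each run of 1s."""
    segs = []
    start = None
    for i, y in enumerate(labels):
        if y == 1 and start is None:
            start = i
        elif y == 0 and start is not None:
            segs.append((start, i))
            start = None
    if start is not None:
        segs.append((start, len(labels)))
    return segs


def event_recall(labels, preds):
    """Fraction of anomalous segments with at least one flagged point."""
    segs = segments(labels)
    if not segs:
        return 0.0
    hit = sum(1 for s, e in segs if any(preds[s:e]))
    return hit / len(segs)


# Toy ground truth (assumed): one long anomaly (10 points), one short (2 points).
labels = [0] * 2 + [1] * 10 + [0] * 3 + [1] * 2 + [0] * 3

# Hypothetical detector X: covers the long anomaly fully, misses the short one.
preds_x = [0] * 2 + [1] * 10 + [0] * 8

# Hypothetical detector Y: flags a single point inside each anomaly.
preds_y = [0] * 20
preds_y[2] = 1
preds_y[15] = 1

for name, preds in (("X", preds_x), ("Y", preds_y)):
    print(name,
          f"point-wise F1 = {pointwise_f1(labels, preds):.3f}",
          f"event recall = {event_recall(labels, preds):.2f}")
```

On this toy input, detector X scores higher on point-wise F1 (about 0.909 versus 0.286), while detector Y scores higher on event recall (1.00 versus 0.50). Which detector looks better depends entirely on the metric chosen, which is the kind of inconsistency the abstract describes.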
Similar Papers
A Problem-Oriented Taxonomy of Evaluation Metrics for Time Series Anomaly Detection
Artificial Intelligence
Helps find fake data in computer signals.
A Comprehensive Forecasting-Based Framework for Time Series Anomaly Detection: Benchmarking on the Numenta Anomaly Benchmark (NAB)
Machine Learning (CS)
Finds weird computer problems faster and better.
MSAD: A Deep Dive into Model Selection for Time series Anomaly Detection
Machine Learning (CS)
Finds weird patterns in data automatically.