
From Tea Leaves to System Maps: A Survey and Framework on Context-aware Machine Learning Monitoring

Published: June 12, 2025 | arXiv ID: 2506.10770v3

By: Joran Leest, Claudia Raibulet, Patricia Lago and more

Potential Business Impact:

Helps teams understand why a deployed ML model is making mistakes.

Business Areas:
Semantic Search, Internet Services

Machine learning (ML) models in production fail when their broader systems -- from data pipelines to deployment environments -- deviate from training assumptions, not merely due to statistical anomalies in input data. Despite extensive work on data drift, data validation, and out-of-distribution detection, ML monitoring research remains largely model-centric while neglecting contextual information: auxiliary signals about the system around the model (external factors, data pipelines, downstream applications). Incorporating this context turns statistical anomalies into actionable alerts and structured root-cause analysis. Drawing on a systematic review of 94 primary studies, we identify three dimensions of contextual information for ML monitoring: the system element concerned (natural environment or technical infrastructure); the aspect of that element (runtime states, structural relationships, prescriptive properties); and the representation used (formal constructs or informal formats). This forms the Contextual System-Aspect-Representation (C-SAR) framework, a descriptive model synthesizing our findings. We identify 20 recurring triplets across these dimensions and map them to the monitoring activities they support. This study provides a holistic perspective on ML monitoring: from interpreting "tea leaves" (i.e., isolated data and performance statistics) to constructing and managing "system maps" (i.e., end-to-end views that connect data, models, and operating context).
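The three C-SAR dimensions described above can be read as a small typed record. The following is a minimal sketch, not the authors' implementation: the class and member names are hypothetical, and the enum values are only the coarse categories named in the abstract. The paper's 20 recurring triplets are not listed here and presumably use finer-grained values.

```python
from dataclasses import dataclass
from enum import Enum


# Coarse categories taken from the abstract; all identifiers are hypothetical.
class SystemElement(Enum):
    NATURAL_ENVIRONMENT = "natural environment"
    TECHNICAL_INFRASTRUCTURE = "technical infrastructure"


class Aspect(Enum):
    RUNTIME_STATE = "runtime state"
    STRUCTURAL_RELATIONSHIP = "structural relationship"
    PRESCRIPTIVE_PROPERTY = "prescriptive property"


class Representation(Enum):
    FORMAL_CONSTRUCT = "formal construct"
    INFORMAL_FORMAT = "informal format"


@dataclass(frozen=True)
class ContextTriplet:
    """One piece of contextual information, located on the three dimensions."""

    element: SystemElement
    aspect: Aspect
    representation: Representation

    def describe(self) -> str:
        # Render the triplet as a short human-readable phrase.
        return (
            f"{self.aspect.value} of the {self.element.value}, "
            f"given as a {self.representation.value}"
        )


# Illustrative example only (an assumption, not a triplet taken from the paper):
# a machine-checkable data schema that a pipeline is expected to satisfy.
pipeline_schema = ContextTriplet(
    SystemElement.TECHNICAL_INFRASTRUCTURE,
    Aspect.PRESCRIPTIVE_PROPERTY,
    Representation.FORMAL_CONSTRUCT,
)
print(pipeline_schema.describe())
```

Freezing the dataclass makes triplets hashable, so a monitoring tool could use them as dictionary keys when mapping context to the monitoring activities it supports.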

Country of Origin
🇳🇱 Netherlands, 🇮🇹 Italy

Page Count
30 pages

Category
Computer Science:
Software Engineering