Monitoring and Observability of Machine Learning Systems: Current Practices and Gaps

Published: October 28, 2025 | arXiv ID: 2510.24142v1

By: Joran Leest, Ilias Gerostathopoulos, Patricia Lago and more

Potential Business Impact:

Helps computers make right choices, not wrong ones.

Business Areas:
Machine Learning, Artificial Intelligence, Data and Analytics, Software

Production machine learning (ML) systems fail silently -- not with crashes, but through wrong decisions. While observability is recognized as critical for ML operations, there is a lack of empirical evidence of what practitioners actually capture. This study presents empirical results on ML observability in practice, drawn from seven focus group sessions across several domains. We catalog the information practitioners systematically capture across ML systems and their environment and map how they use it to validate models, detect and diagnose faults, and explain observed degradations. Finally, we identify gaps in current practice and outline implications for tooling design and research to establish ML observability practices.

Country of Origin
🇳🇱 🇮🇹 Netherlands, Italy

Page Count
12 pages

Category
Computer Science:
Software Engineering