Continuous Observability Assurance in Cloud-Native Applications
By: Maria C. Borges, Sebastian Werner
Potential Business Impact:
Helps developers detect and diagnose faults in microservice applications faster and at lower cost.
When faults occur in microservice applications, as they inevitably do, developers depend on observability data to quickly identify and diagnose the issue. To collect such data, microservices need to be instrumented and the supporting infrastructure configured. This task is often underestimated and error-prone, and it typically relies on many ad-hoc decisions. Yet some of these decisions can significantly affect how quickly faults are detected, as well as the cost and performance of the application. Given the importance of these decisions, we emphasize the need for a method to guide the observability design process. In this paper, we build on previous work and integrate our observability experiment tool OXN into a novel method for continuous observability assurance. We demonstrate its use and discuss future directions.
Similar Papers
Tracing and Metrics Design Patterns for Monitoring Cloud-native Applications
Software Engineering
Helps fix computer problems faster.
Validating Alerts in Cloud-Native Observability
Software Engineering
Tests alerts before they cause problems.
Monitoring and Observability of Machine Learning Systems: Current Practices and Gaps
Software Engineering
Helps computers make right choices, not wrong ones.