Tracing and Metrics Design Patterns for Monitoring Cloud-native Applications
By: Carlos Albuquerque, Filipe F. Correia
Potential Business Impact:
Helps fix computer problems faster.
Observability helps ensure the reliability and maintainability of cloud-native applications. As software architectures become increasingly distributed and subject to change, diagnosing system issues effectively becomes harder: teams often face fragmented observability and more difficult root cause analysis. This paper builds upon our previous work and introduces three design patterns that address key challenges in monitoring cloud-native applications. Distributed Tracing improves visibility into request flows across services, aiding latency analysis and root cause detection. Application Metrics provides a structured approach to instrumenting applications with meaningful performance indicators, enabling real-time monitoring and anomaly detection. Infrastructure Metrics focuses on monitoring the environment in which the system operates, helping teams assess resource utilization, scalability, and operational health. These patterns are derived from industry practices and observability frameworks and aim to offer guidance for software practitioners.
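As a rough illustration of the first two patterns, the sketch below shows one way trace context might be propagated between two services while a simple request counter is recorded. It is not taken from the paper: the service names, the `trace-id` and `parent-id` header keys, and the in-memory `FINISHED_SPANS` and `REQUEST_COUNT` stores are all hypothetical stand-ins for a real tracing backend and metrics library. Infrastructure Metrics is not covered here.

```python
import time
import uuid
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """One timed unit of work; spans sharing a trace_id belong to one request flow."""
    name: str
    trace_id: str
    parent_id: Optional[str] = None
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex[:16])
    start: float = field(default_factory=time.perf_counter)
    duration: Optional[float] = None

    def finish(self) -> None:
        self.duration = time.perf_counter() - self.start


FINISHED_SPANS = []                # hypothetical stand-in for a tracing backend
REQUEST_COUNT = defaultdict(int)   # hypothetical application metric: requests per service


def start_span(name, headers):
    """Continue the trace named in the incoming headers, or start a new one."""
    trace_id = headers.get("trace-id") or uuid.uuid4().hex
    return Span(name=name, trace_id=trace_id, parent_id=headers.get("parent-id"))


def inventory_service(headers):
    span = start_span("inventory.check", headers)
    REQUEST_COUNT["inventory"] += 1
    span.finish()
    FINISHED_SPANS.append(span)
    return "in stock"


def checkout_service(headers):
    span = start_span("checkout.handle", headers)
    REQUEST_COUNT["checkout"] += 1
    # Propagate the trace context on the outgoing call so the downstream
    # span can be linked back to this one.
    result = inventory_service({"trace-id": span.trace_id, "parent-id": span.span_id})
    span.finish()
    FINISHED_SPANS.append(span)
    return result
```

Calling `checkout_service({})` produces two spans with the same `trace_id`, where the inventory span's `parent_id` equals the checkout span's `span_id`. That link is what lets a tracing tool reconstruct the request flow and attribute latency to each service.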
Similar Papers
Continuous Observability Assurance in Cloud-Native Applications
Software Engineering
Helps fix computer problems faster and cheaper.
A Survey on the Landscape of Self-adaptive Cloud Design and Operations Patterns: Goals, Strategies, Tooling, Evaluation and Dataset Perspectives
Distributed, Parallel, and Cluster Computing
Makes apps automatically fix themselves when problems arise.
Artifact for A Non-Intrusive Framework for Deferred Integration of Cloud Patterns in Energy-Efficient Data-Sharing Pipelines
Distributed, Parallel, and Cluster Computing
Makes data tools work better without changing them.