Score: 0

Resilient Microservices: A Systematic Review of Recovery Patterns, Strategies, and Evaluation Frameworks

Published: December 18, 2025 | arXiv ID: 2512.16959v1

By: Muzeeb Mohammad

Microservice based systems underpin modern distributed computing environments but remain vulnerable to partial failures, cascading timeouts, and inconsistent recovery behavior. Although numerous resilience and recovery patterns have been proposed, existing surveys are largely descriptive and lack systematic evidence synthesis or quantitative rigor. This paper presents a PRISMA aligned systematic literature review of empirical studies on microservice recovery strategies published between 2014 and 2025 across IEEE Xplore, ACM Digital Library, and Scopus. From an initial corpus of 412 records, 26 high quality studies were selected using transparent inclusion, exclusion, and quality assessment criteria. The review identifies nine recurring resilience themes encompassing circuit breakers, retries with jitter and budgets, sagas with compensation, idempotency, bulkheads, adaptive backpressure, observability, and chaos validation. As a data oriented contribution, the paper introduces a Recovery Pattern Taxonomy, a Resilience Evaluation Score checklist for standardized benchmarking, and a constraint aware decision matrix mapping latency, consistency, and cost trade offs to appropriate recovery mechanisms. The results consolidate fragmented resilience research into a structured and analyzable evidence base that supports reproducible evaluation and informed design of fault tolerant and performance aware microservice systems.

Category
Computer Science:
Software Engineering