Hard Shell, Reliable Core: Improving Resilience in Replicated Systems with Selective Hybridization
By: Laura Lawniczak, Tobias Distler
Potential Business Impact:
Makes computer systems safer by choosing what to protect.
Hybrid fault models are known to be an effective means for enhancing the robustness of consensus-based replicated systems. However, existing hybridization approaches suffer from limited flexibility with regard to the composition of crash-tolerant and Byzantine fault-tolerant system parts and/or are associated with a significant diversification overhead. In this paper, we address these issues with ShellFT, a framework that leverages the concept of micro replication to allow system designers to freely choose the parts of the replication logic that need to be resilient against Byzantine faults. As a key benefit, such selective hybridization makes it possible to develop hybrid solutions that are tailored to the specific characteristics and requirements of individual use cases. To illustrate this flexibility, we present three custom ShellFT protocols and analyze the complexity of their implementations. Our evaluation shows that, compared with traditional hybridization approaches, ShellFT is able to decrease diversification costs by more than 70%.
Similar Papers
FTHP-MPI: Towards Providing Replication-based Fault Tolerance in a Fault-Intolerant Native MPI Library
Distributed, Parallel, and Cluster Computing
Keeps supercomputers running when parts break.
FTI-TMR: A Fault Tolerance and Isolation Algorithm for Interconnected Multicore Systems
Distributed, Parallel, and Cluster Computing
Keeps computers working even when parts break.