Fault Tolerant Reconfigurable ML Multiprocessor
By: Tangrui Li, Justin Y. Shi, Matteo Spatola, and more
Potential Business Impact:
Makes computers learn faster and fix themselves.
This paper reports three computational experiments for a von Neumann-inspired, reconfigurable, fault-tolerant multiprocessor targeting neural network (NN) training workflows. The experiments are intended to demonstrate the feasibility of the proposed reconfigurable multiprocessor architecture for non-regular workflows, with emphasis on robustness and adaptability. Potential integration with MLIR compilers is also discussed as a path to supporting diverse accelerator hardware in existing practical applications.
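The abstract does not describe the recovery mechanism itself, but a common software analogue of its fault-tolerance goal is checkpoint-and-resume training. The sketch below is a minimal, hypothetical illustration of that generic pattern, not the paper's hardware architecture; all names, the toy gradient step, and the simulated-fault logic are assumptions introduced for illustration.

```python
# Illustrative sketch only: a generic checkpoint-and-resume pattern for
# fault-tolerant NN training. The paper's approach is a reconfigurable
# hardware architecture; none of the names below come from the paper.
import random

def train_step(weights, lr=0.1):
    """One toy gradient step on f(w) = sum(w_i^2), whose gradient is 2w."""
    return [w - lr * 2.0 * w for w in weights]

def run_training(num_steps=20, failure_rate=0.2, seed=0):
    random.seed(seed)
    weights = [1.0, -2.0, 0.5]
    checkpoint = (0, list(weights))  # last known-good (step, weights)
    step = 0
    while step < num_steps:
        try:
            if random.random() < failure_rate:
                raise RuntimeError("simulated processor fault")
            weights = train_step(weights)
            step += 1
            checkpoint = (step, list(weights))  # commit progress
        except RuntimeError:
            # Roll back to the last checkpoint and retry: a software
            # analogue of re-mapping work away from a failed unit.
            step, weights = checkpoint[0], list(checkpoint[1])
    return weights

if __name__ == "__main__":
    print(run_training())
```

In a real distributed training run, the same idea appears as periodic checkpoints to durable storage plus rescheduling of failed workers; the hardware-level reconfiguration described in the paper would handle faults below that software layer.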
Similar Papers
FTI-TMR: A Fault Tolerance and Isolation Algorithm for Interconnected Multicore Systems
Distributed, Parallel, and Cluster Computing
Keeps computers working even when parts break.
A Scalable FPGA Architecture With Adaptive Memory Utilization for GEMM-Based Operations
Hardware Architecture
Makes AI learn faster and use less power.
Functional Stability of Software-Hardware Neural Network Implementation: The NeuroComp Project
Hardware Architecture
Keeps computer brains working even if parts break.