Fault-Free Analog Computing with Imperfect Hardware
By: Zhicheng Xu, Jiawei Liu, Sitao Huang and more
Potential Business Impact:
Makes computers work even with broken parts.
The growing demand for edge computing and AI drives research into analog in-memory computing using memristors, which overcome data movement bottlenecks by computing directly within memory. However, device failures and variations critically limit the precision and reliability of analog systems. Existing fault-tolerance techniques, such as redundancy and retraining, are often inadequate for high-precision applications or for scenarios requiring fixed matrices and privacy preservation. Here, we introduce and experimentally demonstrate a fault-free matrix representation in which target matrices are decomposed into products of two adjustable sub-matrices programmed onto analog hardware. This indirect, adaptive representation enables mathematical optimization to bypass faulty devices and eliminate differential pairs, significantly enhancing computational density. Our memristor-based system achieved >99.999% cosine similarity for a Discrete Fourier Transform matrix despite a 39% device fault rate, a fidelity unattainable with conventional direct representation, which fails with even a single device fault (0.01% rate). We demonstrated a 56-fold bit-error-rate reduction in wireless communication, along with improvements of >196% in density and 179% in energy efficiency compared to state-of-the-art techniques. This method, validated on memristors, applies broadly to emerging memories and non-electrical computing substrates, showing that device yield is no longer the primary bottleneck in analog computing hardware.
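The idea of representing a target matrix as a product of two adjustable sub-matrices, with optimization routing around faulty devices, can be illustrated with a small numerical sketch. This is not the authors' algorithm or hardware model. It assumes stuck devices hold a fixed value of 0 or 1, fits the remaining entries by alternating least squares, and ignores conductance ranges, programming noise, and any nonnegativity constraint. The names `fit_with_faults` and `cosine_similarity`, the inner dimension `k`, and the 20% fault rate are hypothetical choices, and the target is only the real part of an 8-point DFT matrix.

```python
import numpy as np


def cosine_similarity(X, Y):
    """Cosine similarity between two matrices, treated as flat vectors."""
    x, y = X.ravel(), Y.ravel()
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))


def fit_with_faults(W, k, fault_rate=0.2, sweeps=5, seed=0):
    """Fit W ~= A @ B while leaving randomly chosen 'stuck' entries untouched.

    Assumption: a stuck device holds 0 or 1 and cannot be reprogrammed.
    """
    rng = np.random.default_rng(seed)
    m, n = W.shape
    A = rng.normal(size=(m, k))
    B = rng.normal(size=(k, n))

    # Boolean fault maps: True marks a stuck (non-programmable) entry.
    stuck_A = rng.random((m, k)) < fault_rate
    stuck_B = rng.random((k, n)) < fault_rate
    A[stuck_A] = rng.integers(0, 2, size=stuck_A.sum())
    B[stuck_B] = rng.integers(0, 2, size=stuck_B.sum())

    for _ in range(sweeps):
        # Update the free entries of each row of A with B held fixed.
        for i in range(m):
            free = ~stuck_A[i]
            resid = W[i] - A[i, ~free] @ B[~free]
            A[i, free] = np.linalg.lstsq(B[free].T, resid, rcond=None)[0]
        # Update the free entries of each column of B with A held fixed.
        for j in range(n):
            free = ~stuck_B[:, j]
            resid = W[:, j] - A[:, ~free] @ B[~free, j]
            B[free, j] = np.linalg.lstsq(A[:, free], resid, rcond=None)[0]
    return A, B, stuck_A, stuck_B


if __name__ == "__main__":
    # Real part of the 8-point DFT matrix as an illustrative target.
    W = np.real(np.fft.fft(np.eye(8)))
    A, B, stuck_A, stuck_B = fit_with_faults(W, k=24)
    print("cosine similarity:", cosine_similarity(W, A @ B))
```

Because the inner dimension `k` is larger than the matrix size, each row and column update has more free entries than equations, so the stuck entries can be compensated by the remaining ones. A direct representation has no such slack: a stuck entry of `W` itself is simply wrong.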
Similar Papers
In-memory Training on Analog Devices with Limited Conductance States via Multi-tile Residual Learning
Machine Learning (CS)
Trains AI better with cheaper, simpler computer parts.
All-in-One Analog AI Hardware: On-Chip Training and Inference with Conductive-Metal-Oxide/HfOx ReRAM Devices
Emerging Technologies
AI learns faster and remembers longer.
Efficient and Fault-Tolerant Memristive Neural Networks with In-Situ Training
Emerging Technologies
Makes computers learn faster and use less power.