Extending Straight-Through Estimation for Robust Neural Networks on Analog CIM Hardware
By: Yuannuo Feng, Wenyong Zhou, Yuexi Lyu, and more
Potential Business Impact:
Makes AI chips work better with less power.
Analog Compute-In-Memory (CIM) architectures promise significant energy efficiency gains for neural network inference, but suffer from complex hardware-induced noise that poses major challenges for deployment. While noise-aware training methods have been proposed to address this issue, they typically rely on idealized and differentiable noise models that fail to capture the full complexity of analog CIM hardware variations. Motivated by the Straight-Through Estimator (STE) framework in quantization, we decouple forward noise simulation from backward gradient computation, enabling noise-aware training with more accurate but computationally intractable noise modeling in analog CIM systems. We provide theoretical analysis demonstrating that our approach preserves essential gradient directional information while maintaining computational tractability and optimization stability. Extensive experiments show that our extended STE framework achieves up to 5.3% accuracy improvement on image classification, 0.72 perplexity reduction on text generation, 2.2× speedup in training time, and 37.9% lower peak memory usage compared to standard noise-aware training methods.
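The core idea, decoupling a possibly non-differentiable forward noise model from an identity backward pass, can be sketched in a few lines. This is a minimal NumPy illustration, not the paper's implementation: the `cim_noise` model (value-dependent Gaussian noise plus 4-bit ADC quantization) and the toy objective are hypothetical stand-ins for the hardware-accurate noise simulation the abstract describes.

```python
import numpy as np

rng = np.random.default_rng(0)

def cim_noise(x):
    # Hypothetical analog-CIM noise model: value-dependent Gaussian
    # read noise followed by 4-bit ADC quantization. The rounding step
    # makes this non-differentiable, so standard backprop cannot be used.
    noisy = x + 0.05 * np.abs(x) * rng.standard_normal(x.shape)
    levels = 15  # 4-bit ADC: 15 quantization intervals on [-1, 1]
    return np.round(np.clip(noisy, -1.0, 1.0) * levels / 2) * 2 / levels

def ste_forward(w):
    # Forward pass: simulate inference with the full noise model,
    # exactly as the weights would behave on hardware.
    return cim_noise(w)

def ste_backward(grad_output):
    # Backward pass (straight-through): treat the noise model as the
    # identity map, so gradients flow through unchanged.
    return grad_output

# Toy example: minimize L(w) = ||cim_noise(w)||^2 using STE gradients.
w = np.array([0.8, -0.6, 0.3])
lr = 0.1
for _ in range(50):
    y = ste_forward(w)              # noisy, quantized forward
    grad_y = 2.0 * y                # dL/dy for L = sum(y**2)
    grad_w = ste_backward(grad_y)   # straight-through: dL/dw ~ dL/dy
    w -= lr * grad_w
```

Despite the forward pass being piecewise constant in `w` (so its true gradient is zero almost everywhere), the straight-through gradient still carries directional information and drives the weights toward the low-loss region.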
Similar Papers
High-Dimensional Learning Dynamics of Quantized Models with Straight-Through Estimator
Machine Learning (Stat)
Makes computer learning faster and more accurate.
Improving the Straight-Through Estimator with Zeroth-Order Information
Machine Learning (CS)
Makes AI learn faster and better with less effort.
Computing-In-Memory Aware Model Adaption For Edge Devices
Hardware Architecture
Makes AI chips faster and smaller.