Late Breaking Results: Quamba-SE: Soft-edge Quantizer for Activations in State Space Models
By: Yizhi Chen, Ahmed Hemani
We propose Quamba-SE, a soft-edge quantizer for activation quantization in State Space Models (SSMs). Unlike existing methods that rely on a single standard INT8 scale, Quamba-SE employs three adaptive scales: a high-precision scale for small values, a standard scale for normal values, and a low-precision scale for outliers. This preserves outlier information instead of hard-clipping it, while maintaining precision for the remaining values. We evaluate on Mamba-130M across six zero-shot benchmarks. Results show that Quamba-SE consistently outperforms Quamba, gaining up to +2.68% on individual benchmarks and up to +0.83% in average accuracy across the six datasets.
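The three-scale idea can be illustrated with a minimal sketch. This is not Quamba-SE's actual calibration rule; the thresholds `t_small` and `t_large` and the per-region scale choices below are illustrative assumptions, showing only how partitioning activations by magnitude lets outliers be quantized coarsely instead of clipped:

```python
import numpy as np

def soft_edge_quantize(x, t_small=0.1, t_large=2.0):
    """Hypothetical three-scale symmetric INT8 quantize-dequantize sketch.

    Small-magnitude values use a fine (high-precision) scale, normal
    values the standard scale, and outliers a coarse (low-precision)
    scale, so large values are represented rather than hard-clipped.
    Thresholds and scale formulas are illustrative, not Quamba-SE's.
    """
    qmax = 127  # symmetric INT8 integer range
    small = np.abs(x) < t_small
    large = np.abs(x) > t_large
    normal = ~small & ~large

    s_small = t_small / qmax           # fine scale for small values
    s_norm = t_large / qmax            # standard scale for normal values
    s_large = np.abs(x).max() / qmax   # coarse scale covering outliers

    q = np.empty_like(x, dtype=float)
    for mask, s in [(small, s_small), (normal, s_norm), (large, s_large)]:
        # round to an INT8 code under the region's scale, then dequantize
        q[mask] = np.clip(np.round(x[mask] / s), -qmax, qmax) * s
    return q

x = np.array([0.01, 0.5, 5.0, -8.0])
xq = soft_edge_quantize(x)
```

With a single scale clipped at `t_large = 2.0`, the values 5.0 and -8.0 would collapse to ±2.0; the coarse third scale keeps them close to their true magnitudes at the cost of a larger step size.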