Score: 1

DBellQuant: Breaking the Bell with Double-Bell Transformation for LLMs Post Training Binarization

Published: June 18, 2025 | arXiv ID: 2507.01027v1

By: Zijian Ye, Wei Huang, Yifei Yu, and more

Potential Business Impact:

Compresses large language models to nearly 1-bit weights, sharply reducing the memory and compute needed to deploy them.

Business Areas:
A/B Testing Data and Analytics

Large language models (LLMs) demonstrate remarkable performance but face substantial computational and memory challenges that limit their practical deployment. Quantization has emerged as a promising solution; however, its effectiveness is often limited by quantization errors arising from weight distributions that are not quantization-friendly and from the presence of activation outliers. To address these challenges, we introduce DBellQuant, an innovative post-training quantization (PTQ) framework that achieves nearly 1-bit weight compression and 6-bit activation quantization with minimal performance degradation. DBellQuant uses the Learnable Transformation for Dual-Bell (LTDB) algorithm, which transforms single-bell weight distributions into dual-bell forms to reduce binarization error and applies the inverse transformation to smooth the activations. DBellQuant sets a new state of the art by preserving superior model performance under aggressive weight and activation quantization. For example, on the WikiText2 dataset, DBellQuant achieves a perplexity of 14.39 on LLaMA2-13B with 6-bit activation quantization, significantly outperforming BiLLM's 21.35 obtained without any activation quantization, underscoring its potential for compressing LLMs in real-world applications.
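Below is a minimal NumPy sketch, not the authors' LTDB implementation, of the two ideas the abstract describes: why a dual-bell (bimodal) weight distribution incurs far less sign-binarization error than a single-bell (Gaussian) one, and how an invertible per-channel scale (the hypothetical `s` here) can be folded into the weights while its inverse is applied to the activations without changing the layer output.

```python
# Illustrative sketch only; the actual LTDB transformation is learned per layer.
import numpy as np

rng = np.random.default_rng(0)

def binarize(W):
    # Per-row scaled sign binarization: W ~= alpha * sign(W), alpha = mean(|w|).
    alpha = np.abs(W).mean(axis=1, keepdims=True)
    return alpha * np.sign(W)

def rel_error(W):
    # Relative Frobenius-norm binarization error.
    return np.linalg.norm(W - binarize(W)) / np.linalg.norm(W)

# (1) Single-bell vs. dual-bell weights: the bimodal matrix is far more
# binarization-friendly because its values already cluster near +/- alpha.
single_bell = rng.normal(0.0, 1.0, size=(256, 1024))
dual_bell = rng.choice([-1.0, 1.0], size=(256, 1024)) + rng.normal(0.0, 0.1, size=(256, 1024))
print("relative binarization error, single-bell:", rel_error(single_bell))  # ~0.6
print("relative binarization error, dual-bell  :", rel_error(dual_bell))    # ~0.1

# (2) Folding an invertible per-input-channel scale s into the weights and its
# inverse into the activations leaves y = x @ W.T unchanged; this is the kind
# of identity that lets a weight-side transformation also smooth activations.
W = rng.normal(0.0, 1.0, size=(64, 128))   # (out_features, in_features)
x = rng.normal(0.0, 1.0, size=(4, 128))
s = rng.uniform(0.5, 2.0, size=(128,))     # hypothetical positive per-channel scale
y_ref = x @ W.T
y_folded = (x / s) @ (W * s).T
print("max output drift after folding:", np.abs(y_ref - y_folded).max())    # ~0
```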

Country of Origin
🇭🇰 Hong Kong

Page Count
19 pages

Category
Computer Science:
Machine Learning (CS)