DBellQuant: Breaking the Bell with Double-Bell Transformation for LLMs Post Training Binarization
By: Zijian Ye, Wei Huang, Yifei Yu, and more
Potential Business Impact:
Makes large language models much smaller and cheaper to run.
Large language models (LLMs) demonstrate remarkable performance but face substantial computational and memory challenges that limit their practical deployment. Quantization has emerged as a promising solution; however, its effectiveness is often limited by quantization errors arising from weight distributions that are not quantization-friendly and from the presence of activation outliers. To address these challenges, we introduce DBellQuant, an innovative post-training quantization (PTQ) framework that achieves nearly 1-bit weight compression and 6-bit activation quantization with minimal performance degradation. DBellQuant uses the Learnable Transformation for Dual-Bell (LTDB) algorithm, which transforms single-bell weight distributions into dual-bell forms to reduce binarization errors and applies inverse transformations to smooth activations. DBellQuant sets a new state of the art by preserving superior model performance under aggressive weight and activation quantization. For example, on the WikiText2 dataset, DBellQuant achieves a perplexity of 14.39 on LLaMA2-13B with 6-bit activation quantization, significantly outperforming the 21.35 of BiLLM, which does not quantize activations at all, underscoring its potential for compressing LLMs in real-world applications.
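To make the two mechanisms in the abstract concrete, the toy sketch below illustrates (a) why a dual-bell weight distribution suits sign-based binarization better than the usual single bell centred at zero, and (b) how scaling weight columns while applying the inverse scaling to the matching activation channels leaves the layer output unchanged, which is the equivalence that lets a weight-side transform also smooth activation outliers. This is only an illustrative sketch, not the paper's LTDB algorithm: the bimodal weights, the per-channel scale s, and all shapes are assumptions chosen for the example, whereas LTDB learns its transformation.

import numpy as np

rng = np.random.default_rng(0)

def binarize(w):
    # Sign binarization with a per-row scale alpha = mean |w| (standard 1-bit scheme).
    alpha = np.abs(w).mean(axis=1, keepdims=True)
    return alpha * np.sign(w)

# (a) Why a dual-bell distribution is binarization-friendly: a single bell centred
# at 0 has many weights far from +/-alpha, while a dual bell clustered near +/-mu
# is matched almost exactly by +/-alpha.
shape = (128, 4096)
w_single = rng.normal(0.0, 0.02, size=shape)                                  # single bell at 0
w_dual = rng.choice([-1.0, 1.0], size=shape) * rng.normal(0.02, 0.002, size=shape)  # two modes at +/-0.02
for name, w in [("single-bell", w_single), ("dual-bell", w_dual)]:
    err = np.linalg.norm(w - binarize(w)) / np.linalg.norm(w)
    print(f"{name:11s} relative binarization error: {err:.3f}")

# (b) Output-preserving transform pair: multiplying weight columns by s and
# dividing the matching activation channels by s cancels exactly, so a
# weight-side transform can simultaneously tame activation outliers.
d_in, d_out, batch = 4096, 128, 8
W = rng.normal(0.0, 0.02, size=(d_out, d_in))
X = rng.normal(0.0, 1.0, size=(batch, d_in))
X[:, :8] *= 50.0                      # a few outlier activation channels
s = np.sqrt(np.abs(X).max(axis=0))    # illustrative (not learned) per-channel scale
assert np.allclose(X @ W.T, (X / s) @ (W * s).T)
print("max |activation| before/after smoothing:", np.abs(X).max(), np.abs(X / s).max())

Running the snippet prints a markedly smaller relative binarization error for the dual-bell weights and a much narrower activation range after smoothing, mirroring the intuition behind the dual-bell transform and its activation-smoothing inverse.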
Similar Papers
Binary Neural Networks for Large Language Model: A Survey
Computation and Language
Makes AI models smaller and faster to train.
Achieving binary weight and activation for LLMs using Post-Training Quantization
Machine Learning (CS)
Makes big AI models much smaller and faster.
Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs
Computation and Language
Makes big AI models run on small phones.