Scaling Laws for Task-Stratified Knowledge in Post-Training Quantized Large Language Models

Published: August 26, 2025 | arXiv ID: 2508.18609v1

By: Chenxi Zhou, Pengfei Cao, Jiang Li, and more

Potential Business Impact:

Shows how far big AI models can be compressed before they start losing knowledge, and which abilities suffer first.

Business Areas:
Artificial Intelligence, Science and Engineering

Large language models (LLMs) present significant deployment challenges due to their scale, with post-training quantization (PTQ) emerging as a practical compression solution. However, a comprehensive understanding of how PTQ precisely impacts diverse LLM knowledge capabilities remains elusive, and existing scaling laws for quantized models often overlook crucial PTQ-specific parameters and task-specific sensitivities. This paper addresses these gaps by conducting an extensive empirical investigation to establish task-stratified scaling laws. We disentangle LLM knowledge into memorization and utilization capabilities and develop a unified quantitative framework that incorporates model size, effective bit-width, calibration set size, and group size. Our central finding reveals that knowledge memorization exhibits markedly greater sensitivity to variations in effective bit-width, calibration set size, and model size compared to the more robust knowledge utilization. These findings offer a fine-grained understanding of PTQ's impact and provide guidance for developing knowledge-aware quantization strategies that can better preserve targeted cognitive functions.
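To make the idea of a unified quantitative framework concrete, here is a minimal sketch of how a scaling law over model size, effective bit-width, calibration set size, and group size could be fitted per task category. The multiplicative power-law form, the variable names, and the toy numbers are illustrative assumptions for this sketch; the paper's actual functional form, data, and fitted coefficients are not given in this summary.

```python
# Hypothetical sketch: fitting a task-stratified scaling law of the kind the
# abstract describes. Everything below (functional form, toy data, names) is
# an illustrative assumption, not the paper's actual law or results.
import numpy as np

# Synthetic observations, for illustration only:
# model size N (billions of parameters), effective bit-width B,
# calibration set size C, quantization group size G, and a task score in [0, 1].
N = np.array([1.0, 7.0, 13.0, 7.0, 7.0, 13.0, 30.0, 30.0])
B = np.array([4.0, 4.0, 4.0, 3.0, 8.0, 8.0, 4.0, 8.0])
C = np.array([128, 128, 128, 512, 128, 512, 128, 512], dtype=float)
G = np.array([128, 128, 64, 128, 64, 128, 128, 64], dtype=float)
score = np.array([0.31, 0.52, 0.58, 0.44, 0.57, 0.63, 0.66, 0.71])

# Assume score ~ a * N^alpha * B^beta * C^gamma * G^(-delta); taking logs turns
# the fit into ordinary least squares. Memorization and utilization tasks would
# each get their own fit, which is what makes the law "task-stratified".
X = np.column_stack([np.ones_like(N), np.log(N), np.log(B), np.log(C), -np.log(G)])
coefs, *_ = np.linalg.lstsq(X, np.log(score), rcond=None)
log_a, alpha, beta, gamma, delta = coefs
print(f"a={np.exp(log_a):.3f} alpha={alpha:.3f} beta={beta:.3f} "
      f"gamma={gamma:.3f} delta={delta:.3f}")
```

Under this kind of setup, the paper's central finding would show up as the memorization-task fit having noticeably larger exponents on effective bit-width, calibration set size, and model size than the utilization-task fit.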

Page Count
12 pages

Category
Computer Science:
Computation and Language