Scaling Laws for Task-Stratified Knowledge in Post-Training Quantized Large Language Models
By: Chenxi Zhou, Pengfei Cao, Jiang Li, et al.
Potential Business Impact:
Makes big AI models smaller without losing smarts.
Large language models (LLMs) present significant deployment challenges due to their scale, with post-training quantization (PTQ) emerging as a practical compression solution. However, a comprehensive understanding of how PTQ precisely impacts diverse LLM knowledge capabilities remains elusive, and existing scaling laws for quantized models often overlook crucial PTQ-specific parameters and task-specific sensitivities. This paper addresses these gaps by conducting an extensive empirical investigation to establish task-stratified scaling laws. We disentangle LLM knowledge into memorization and utilization capabilities and develop a unified quantitative framework that incorporates model size, effective bit-width, calibration set size, and group size. Our central finding reveals that knowledge memorization exhibits markedly greater sensitivity to variations in effective bit-width, calibration set size, and model size compared to the more robust knowledge utilization. These findings offer a fine-grained understanding of PTQ's impact and provide guidance for developing knowledge-aware quantization strategies that can better preserve targeted cognitive functions.
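The abstract does not reproduce the fitted scaling-law equation itself. As a rough illustration only, the sketch below assumes a simple multiplicative power-law over model size (N, in billions of parameters), effective bit-width (B), calibration set size (C), and group size (G), fit with scipy.optimize.curve_fit on synthetic data. The functional form, coefficient names, and data are illustrative assumptions, not the authors' actual law.

# Illustrative sketch only: the functional form below is an ASSUMPTION,
# not the paper's fitted scaling law. It shows how a task-stratified
# power-law over model size, bit-width, calibration set size, and
# group size could be fit to benchmark scores.
import numpy as np
from scipy.optimize import curve_fit

def scaling_law(X, a, alpha, beta, gamma, delta):
    # Hypothetical multiplicative power-law for a post-quantization task score.
    N, B, C, G = X
    return a * N**alpha * B**beta * C**gamma * G**(-delta)

# Toy synthetic observations standing in for measured benchmark scores.
rng = np.random.default_rng(0)
N = rng.choice([1.0, 3.0, 7.0, 13.0], size=64)   # model size, billions of parameters
B = rng.choice([2.0, 3.0, 4.0, 8.0], size=64)    # effective bit-width
C = rng.choice([32.0, 128.0, 512.0], size=64)    # calibration set size (samples)
G = rng.choice([32.0, 64.0, 128.0], size=64)     # quantization group size
score = 0.5 * N**0.08 * B**0.30 * C**0.02 * G**(-0.01)
score += rng.normal(0.0, 0.005, size=64)         # observation noise

# Fit the assumed law; in practice one fit would be run per task stratum
# (e.g., knowledge memorization vs. knowledge utilization benchmarks).
popt, _ = curve_fit(scaling_law, (N, B, C, G), score,
                    p0=[0.5, 0.1, 0.2, 0.05, 0.05], maxfev=20000)
print(dict(zip(["a", "alpha", "beta", "gamma", "delta"], popt)))

Comparing the fitted exponents across strata is one way the paper's headline claim could be read: memorization-style tasks would show larger exponents on bit-width, calibration set size, and model size than utilization-style tasks.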
Similar Papers
Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs
Computation and Language
Makes big AI models run on small phones.
You Had One Job: Per-Task Quantization Using LLMs' Hidden Representations
Computation and Language
Makes big AI models run faster and smaller.