LZ Penalty: An information-theoretic repetition penalty for autoregressive language models
By: Antonio A. Ginart, Naveen Kodali, Jason Lee, and more
Potential Business Impact:
Stops AI from repeating itself when writing.
We introduce the LZ penalty, a penalty specialized for reducing degenerate repetitions in autoregressive language models without loss of capability. The penalty is based on the codelengths in the LZ77 universal lossless compression algorithm. Through the lens of the prediction-compression duality, decoding with the LZ penalty can be interpreted as sampling from the residual distribution after removing the information that is highly compressible. We demonstrate that the LZ penalty enables state-of-the-art open-source reasoning models to operate with greedy (temperature zero) decoding without loss of capability and without instances of degenerate repetition. In contrast, the industry-standard frequency penalty and repetition penalty are ineffective, incurring degenerate repetition rates of up to 4%.
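The abstract's core idea, penalizing tokens in proportion to how compressible they are under an LZ77-style match against the recent context, can be illustrated with a minimal sketch. This is not the paper's exact formulation: the function name, the `alpha` scale, the `min_match` threshold, and the `log2(1 + match)` penalty term are all illustrative assumptions standing in for the true LZ77 codelength computation.

```python
import math

def lz_penalty(logits, context, alpha=1.0, min_match=2):
    """Illustrative sketch (not the paper's exact method): subtract from
    each candidate token's logit a penalty that grows with the length of
    the repeated substring that token would extend. Under LZ77, longer
    matches are cheaper to encode, i.e. more compressible, so they are
    penalized more."""
    penalized = dict(logits)
    n = len(context)
    for tok in logits:
        cand = context + [tok]
        # Find the longest suffix of (context + tok) that also occurs
        # as a contiguous substring earlier in the context.
        match = 0
        for L in range(min_match, n + 1):
            suffix = cand[-L:]
            found = any(context[i:i + L] == suffix for i in range(n - L + 1))
            if found:
                match = L
            else:
                break
        if match >= min_match:
            # Hypothetical codelength proxy: penalty grows with match length.
            penalized[tok] -= alpha * math.log2(1 + match)
    return penalized
```

For example, with context `[1, 2, 3, 1, 2]`, the token `3` would extend a repeat of the earlier trigram `[1, 2, 3]` and is penalized, while a novel token like `4` is left untouched. Industry-standard frequency and repetition penalties, by contrast, act on per-token counts only and cannot distinguish a long verbatim repeat from scattered reuse of common tokens.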
Similar Papers
LZD-style Compression Scheme with Truncation and Repetitions
Data Structures and Algorithms
Makes files smaller, faster, and better.
Enhancing Large Language Model Efficiency via Symbolic Compression: A Formal Approach Towards Interpretability
Artificial Intelligence
Makes AI understand code and logic better, cheaper.
zip2zip: Inference-Time Adaptive Vocabularies for Language Models via Token Compression
Computation and Language
Makes computer language models faster and cheaper.