Repetitions are not all alike: distinct mechanisms sustain repetition in language models
By: Matéo Mahaut, Francesca Franzon
Potential Business Impact:
Fixes computer writing that repeats itself too much.
Text generated by language models (LMs) can degrade into repetitive cycles, where identical word sequences are persistently repeated one after another. Prior research has typically treated repetition as a unitary phenomenon. However, repetitive sequences emerge across diverse tasks and contexts, raising the possibility that repetition is driven by multiple underlying factors. Here, we experimentally explore the hypothesis that repetition in LMs can result from distinct mechanisms, reflecting different text generation strategies used by the model. We examine the internal workings of LMs under two conditions that prompt repetition: one in which repeated sequences emerge naturally after human-written text, and another in which repetition is explicitly induced through an in-context learning (ICL) setup. Our analysis reveals key differences between the two conditions: the model exhibits varying levels of confidence, relies on different attention heads, and shows distinct patterns of change in response to controlled perturbations. These findings suggest that distinct internal mechanisms can interact to drive repetition, with implications for its interpretation and for mitigation strategies. More broadly, our results highlight that the same surface behavior in LMs may be sustained by different underlying processes, acting independently or in combination.
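As a rough illustration of the two conditions the abstract contrasts, the sketch below (an assumption on our part, not the paper's code; the model name "gpt2" and both prompts are placeholders) induces repetition once via an ICL-style prompt that demonstrates a cycle and once via ordinary natural text, then compares the model's per-token confidence on its greedy continuations.

```python
# Minimal sketch: compare model confidence under ICL-induced vs. naturally
# emerging repetition. Hypothetical setup; model and prompts are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; the paper's exact models may differ
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

prompts = {
    # ICL condition: the prompt itself demonstrates the repeating pattern.
    "icl": "cat dog bird cat dog bird cat dog bird cat dog",
    # Natural condition: human-written text that may drift into repetition.
    "natural": "The committee met on Tuesday to discuss the budget. The committee",
}

for name, prompt in prompts.items():
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model.generate(
            **inputs,
            max_new_tokens=20,
            do_sample=False,            # greedy decoding, so confidence is well defined
            output_scores=True,
            return_dict_in_generate=True,
        )
    # Probability the model assigned to each token it actually generated.
    gen_tokens = out.sequences[0, inputs["input_ids"].shape[1]:]
    confidences = [
        torch.softmax(step_logits[0], dim=-1)[tok].item()
        for step_logits, tok in zip(out.scores, gen_tokens)
    ]
    print(name, repr(tokenizer.decode(gen_tokens)))
    print("  mean confidence:", sum(confidences) / len(confidences))
```

A systematically higher confidence in one condition than the other would be consistent with the paper's claim that the two repetition regimes are sustained by different internal processes, though the paper's actual analysis also inspects attention heads and controlled perturbations.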
Similar Papers
Solving LLM Repetition Problem in Production: A Comprehensive Study of Multiple Solutions
Artificial Intelligence
Stops AI from repeating itself in code.
Interpreting the Repeated Token Phenomenon in Large Language Models
Machine Learning (CS)
Fixes AI that can't repeat words correctly.
A Neural Model for Word Repetition
Computation and Language
Teaches computers to repeat words like babies.