Clozing the Gap: Exploring Why Language Model Surprisal Outperforms Cloze Surprisal

Published: January 14, 2026 | arXiv ID: 2601.09886v1

By: Sathvik Nair, Byung-Doh Oh

How predictable a word is can be quantified in two ways: using human responses to the cloze task or using probabilities from language models (LMs). When used as predictors of processing effort, LM probabilities outperform probabilities derived from cloze data. However, it is important to establish that LM probabilities do so for the right reasons, since different predictors can lead to different scientific conclusions about the role of prediction in language comprehension. We present evidence for three hypotheses about the advantage of LM probabilities: not suffering from low resolution, distinguishing semantically similar words, and accurately assigning probabilities to low-frequency words. These results call for efforts to improve the resolution of cloze studies, coupled with experiments on whether human-like prediction is also as sensitive to the fine-grained distinctions made by LM probabilities.
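The resolution issue the abstract raises can be sketched numerically: a cloze study with N participants quantizes probabilities to steps of roughly 1/N, while an LM can assign arbitrarily small probabilities. The snippet below is a minimal illustration, assuming bits (log base 2) and add-0.5 smoothing for cloze counts; both choices are this sketch's assumptions, not the paper's method.

```python
import math

def cloze_surprisal(count, n_participants, smooth=0.5):
    """Surprisal (bits) from cloze response counts.

    The add-0.5 smoothing is an assumption made here to avoid
    -log(0) for words that no participant produced."""
    p = (count + smooth) / (n_participants + smooth)
    return -math.log2(p)

def lm_surprisal(p):
    """Surprisal (bits) from a language-model probability."""
    return -math.log2(p)

# With 40 participants, cloze probabilities come in steps of ~1/40,
# so every word produced by zero participants receives the same
# ceiling surprisal, regardless of how implausible it actually is.
print(cloze_surprisal(0, 40))  # ceiling for all unproduced words
print(cloze_surprisal(1, 40))  # next-coarsest value
# An LM can place a low-frequency word anywhere on a fine-grained scale.
print(lm_surprisal(1e-6))
```

The contrast between the clipped cloze values and the much larger LM surprisal for a one-in-a-million word is the "low resolution" hypothesis in miniature: cloze data cannot distinguish among the long tail of unproduced continuations.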

Category
Computer Science:
Computation and Language