Pre-trained Language Models Learn Remarkably Accurate Representations of Numbers
By: Marek Kadlčík, Michal Štefánik, Timothee Mickus, and more
Potential Business Impact:
Could help reduce arithmetic mistakes in AI language models.
Pretrained language models (LMs) are prone to arithmetic errors. Existing work showed limited success in probing numeric values from models' representations, suggesting that these errors stem from an inherent unreliability of distributionally learned embeddings in representing exact quantities. However, we observe that previous probing methods are inadequate for the emergent sinusoidal structure of learned number embeddings. In response, we propose a novel probing technique that decodes numeric values from input embeddings with near-perfect accuracy across a range of open-source LMs. This demonstrates that, after pre-training alone, LMs represent numbers with remarkable precision. Finally, we find that the precision of the embeddings, as measured by our probe's accuracy, explains a large portion of LMs' errors in elementary arithmetic, and we show that aligning the embeddings with the pattern discovered by our probe can mitigate these errors.
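To make the probing idea concrete, below is a minimal, self-contained sketch (not the paper's actual probe): it builds synthetic number-token embeddings from per-digit sinusoidal features, fits a linear ridge-regression probe that maps embeddings back to those sine/cosine features, and decodes each digit from the recovered phases. The embedding construction, dimensionality, and choice of Ridge are illustrative assumptions; with a real model you would substitute the input embeddings of its number tokens.

```python
# Illustrative sketch of a sinusoidal number probe (assumptions, not the paper's method).
# Synthetic embeddings stand in for a real LM's number-token input embeddings.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
numbers = np.arange(1000)                                   # probe targets: 0..999
digits = np.stack([(numbers // 10**k) % 10 for k in range(3)], axis=1)

# Sinusoidal structure: one sin/cos pair per digit position (ones, tens, hundreds).
phases = 2 * np.pi * digits / 10.0                          # shape (1000, 3)
features = np.concatenate([np.sin(phases), np.cos(phases)], axis=1)

# Synthetic embeddings: sinusoidal features mixed into a 256-dim space plus noise.
mix = rng.normal(size=(features.shape[1], 256))
embeddings = features @ mix + 0.05 * rng.normal(size=(1000, 256))

# Probe: a linear map from embeddings back to the sinusoidal features.
probe = Ridge(alpha=1.0).fit(embeddings, features)
pred = probe.predict(embeddings)

# Decode each digit from its predicted (sin, cos) pair via atan2, then reassemble.
pred_phase = np.arctan2(pred[:, :3], pred[:, 3:])           # phase in (-pi, pi]
pred_digits = np.round(pred_phase / (2 * np.pi) * 10) % 10
decoded = (pred_digits * np.array([1, 10, 100])).sum(axis=1).astype(int)

print("probe accuracy:", (decoded == numbers).mean())
```

Because the decoding inverts a periodic code rather than regressing the scalar value directly, small errors in the predicted embedding features do not compound across magnitudes, which is the intuition behind probing sinusoidal structure instead of raw numeric value.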
Similar Papers
Unravelling the Mechanisms of Manipulating Numbers in Language Models
Computation and Language
Finds how computers make math mistakes.
Revealing the Numeracy Gap: An Empirical Investigation of Text Embedding Models
Computation and Language
Computers now understand numbers in words better.
Investigating the interaction of linguistic and mathematical reasoning in language models using multilingual number puzzles
Computation and Language
Computers learn math from different number words.