Semantic Deception: When Reasoning Models Can't Compute an Addition
By: Nathaniël de Leeuw, Marceau Nahon, Mathis Reymond, and more
Potential Business Impact:
Computers sometimes get tricked by how things look.
Large language models (LLMs) are increasingly used in situations where human values are at stake, such as decision-making tasks that involve reasoning when performed by humans. We investigate the so-called reasoning capabilities of LLMs over novel symbolic representations by introducing an experimental framework that tests their ability to process and manipulate unfamiliar symbols. We introduce semantic deceptions: situations in which symbols carry misleading semantic associations due to their form, such as being embedded in specific contexts, designed to probe whether LLMs can maintain symbolic abstraction or whether they default to exploiting learned semantic associations. We redefine standard digits and mathematical operators using novel symbols and task LLMs with solving simple calculations expressed in this altered notation. The objectives are (1) to assess LLMs' capacity for abstraction and manipulation of arbitrary symbol systems, and (2) to evaluate their ability to resist misleading semantic cues that conflict with the task's symbolic logic. Through experiments with four LLMs, we show that semantic cues can significantly degrade reasoning models' performance on very simple tasks. These results reveal limitations in current LLMs' capacity for symbolic manipulation and highlight a tendency to over-rely on surface-level semantics, suggesting that chain-of-thought reasoning may amplify reliance on statistical correlations. Even when LLMs appear to follow instructions correctly, semantic cues still impair basic capabilities. These limitations raise ethical and societal concerns: they undermine the widespread and pernicious tendency to attribute reasoning abilities to LLMs and suggest how LLMs might fail, particularly in decision-making contexts where robust symbolic reasoning is essential and should not be compromised by residual semantic associations inherited from the model's training.
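As a concrete illustration of the kind of setup the abstract describes, the minimal Python sketch below remaps digits and operators to arbitrary symbols and builds a prompt asking a model to solve a simple addition in the altered notation. The symbol pool, the function names (`build_mapping`, `encode`, `make_prompt`), and the prompt wording are illustrative assumptions and do not reproduce the paper's actual protocol or its semantically deceptive symbol choices.

```python
# Illustrative sketch (not the paper's exact protocol): remap digits and
# operators to novel symbols and pose a simple calculation in that notation.
import random

# Hypothetical symbol pool; the paper's actual (possibly semantically
# loaded) symbols are not shown here.
NOVEL_SYMBOLS = list("@#%&§¤¶ØΔΩ")
OPERATOR_SYMBOLS = {"+": "⊕", "=": "⇒"}


def build_mapping(seed: int = 0) -> dict[str, str]:
    """Assign each digit 0-9 a novel symbol under a fixed random seed."""
    rng = random.Random(seed)
    symbols = rng.sample(NOVEL_SYMBOLS, 10)
    return {str(d): symbols[d] for d in range(10)}


def encode(expression: str, digit_map: dict[str, str]) -> str:
    """Rewrite an arithmetic expression using the remapped digits/operators."""
    table = {**digit_map, **OPERATOR_SYMBOLS}
    return "".join(table.get(ch, ch) for ch in expression)


def make_prompt(a: int, b: int, digit_map: dict[str, str]) -> str:
    """State the mapping explicitly, then ask for a + b in the new notation."""
    legend = ", ".join(f"{d} is written {s}" for d, s in digit_map.items())
    question = encode(f"{a}+{b}=", digit_map)
    return (f"In this notation: {legend}; '+' is written '⊕' and '=' is '⇒'. "
            f"Compute {question}? Answer in the same notation.")


if __name__ == "__main__":
    mapping = build_mapping(seed=42)
    print(make_prompt(7, 5, mapping))
```

A prompt of this form makes the correct answer depend only on the stated mapping, so any drop in accuracy relative to plain-digit arithmetic can be attributed to the unfamiliar or misleading surface form of the symbols rather than to task difficulty.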
Similar Papers
Cognitive Foundations for Reasoning and Their Manifestation in LLMs
Artificial Intelligence
Teaches computers to think more like people.
Interpretability Framework for LLMs in Undergraduate Calculus
Computers and Society
Checks math answers by understanding how they're solved.
Thinking Machines: Mathematical Reasoning in the Age of LLMs
Artificial Intelligence
Helps computers prove math ideas like a scientist.