In-Context Algebra
By: Eric Todd, Jannik Brinkmann, Rohit Gandikota, and more
Potential Business Impact:
Teaches computers to solve math with changing symbols.
We investigate the mechanisms that arise when transformers are trained to solve arithmetic on sequences where tokens are variables whose meaning is determined only through their interactions. While prior work has found that transformers develop geometric embeddings that mirror algebraic structure, those findings emerge from settings where arithmetic-valued tokens have fixed meanings. We devise a new task in which the assignment of symbols to specific algebraic group elements varies from one sequence to another. Despite this challenging setup, transformers achieve near-perfect accuracy on the task and even generalize to unseen algebraic groups. We develop targeted data distributions to create causal tests of a set of hypothesized mechanisms, and we isolate three mechanisms that models consistently learn: commutative copying, in which a dedicated head copies answers; identity element recognition, which distinguishes identity-containing facts; and closure-based cancellation, which tracks group membership to constrain valid answers. Complementary to the geometric representations found in fixed-symbol settings, our findings show that models develop symbolic reasoning mechanisms when trained to reason in-context with variables whose meanings are not fixed.
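To make the task setup concrete, below is a minimal, hypothetical sketch of how per-sequence symbol assignments could be generated, assuming the cyclic group Z_n as the underlying group and a simple `a b = c` fact format; the paper's actual data format and group choices may differ.

```python
import random

def make_sequence(n=5, num_symbols=5, num_facts=8, seed=None):
    """Generate one in-context algebra sequence (illustrative sketch).

    Each sequence draws a fresh random assignment of abstract symbol
    tokens to elements of the cyclic group Z_n, so a symbol's meaning
    is determined only by how it interacts with other symbols in-context.
    Assumes num_symbols == n so the symbol-to-element map is a bijection.
    """
    rng = random.Random(seed)
    symbols = [f"s{i}" for i in range(num_symbols)]
    elements = list(range(n))
    rng.shuffle(elements)
    # Per-sequence symbol -> group element assignment (varies every sequence).
    assignment = dict(zip(symbols, elements))

    facts = []
    for _ in range(num_facts):
        a, b = rng.choice(symbols), rng.choice(symbols)
        # Group operation: addition mod n in Z_n.
        c_elem = (assignment[a] + assignment[b]) % n
        # Map the result element back to its symbol.
        c = next(s for s, e in assignment.items() if e == c_elem)
        facts.append(f"{a} {b} = {c}")

    # Earlier facts serve as in-context evidence; the last fact is the query.
    context, query = facts[:-1], facts[-1]
    prompt = " ; ".join(context) + " ; " + query.rsplit("=", 1)[0] + "="
    target = query.rsplit("=", 1)[1].strip()
    return prompt, target

if __name__ == "__main__":
    prompt, target = make_sequence(seed=0)
    print(prompt, "->", target)
```

Because the assignment is resampled for every sequence, a model cannot memorize fixed token meanings and must instead infer each symbol's role from the facts given in-context.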
Similar Papers
Distinct Computations Emerge From Compositional Curricula in In-Context Learning
Machine Learning (CS)
Teaches computers to solve harder problems by breaking them down.
Vector Arithmetic in Concept and Token Subspaces
Computation and Language
Makes AI understand word meanings and spelling better.
Illusion or Algorithm? Investigating Memorization, Emergence, and Symbolic Processing in In-Context Learning
Computation and Language
AI learns new things from just a few examples.