The Geometry of Thought: Unveiling the Transformer as a Tropical Polynomial Circuit
By: Faruk Alpay, Bilge Senturk
We prove that the Transformer self-attention mechanism, in the high-confidence regime ($\beta \to \infty$, where $\beta$ is an inverse temperature), operates in the tropical semiring (max-plus algebra). In particular, we show that taking the tropical limit of softmax attention converts it into a tropical matrix product. This reveals that the Transformer's forward pass effectively executes a dynamic programming recurrence (specifically, a Bellman-Ford path-finding update) on a latent graph defined by token similarities. Our theoretical result provides a new geometric perspective on chain-of-thought reasoning: it emerges from a shortest-path (or longest-path) algorithm inherently carried out within the network's computation.
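The tropical limit rests on the standard log-sum-exp identity $\lim_{\beta\to\infty} \frac{1}{\beta}\log\sum_j e^{\beta x_j} = \max_j x_j$: as $\beta$ grows, the softmax weights collapse onto a one-hot vector at the argmax score, so each query's output converges to the value of its single best-matching key. The following minimal NumPy sketch (ours, not the paper's code; the function names and tensor shapes are illustrative) checks this convergence numerically.

```python
import numpy as np

def softmax_attention(scores, V, beta):
    """Softmax attention with inverse temperature beta (illustrative sketch)."""
    w = np.exp(beta * (scores - scores.max(axis=-1, keepdims=True)))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V

def hard_attention(scores, V):
    """Tropical (beta -> infinity) limit: each query copies its argmax key's value."""
    return V[scores.argmax(axis=-1)]

rng = np.random.default_rng(0)
scores = rng.normal(size=(4, 6))  # attention logits, i.e. Q K^T / sqrt(d)
V = rng.normal(size=(6, 3))       # value vectors

for beta in (1.0, 10.0, 100.0):
    gap = np.abs(softmax_attention(scores, V, beta) - hard_attention(scores, V)).max()
    print(f"beta={beta:6.1f}  max gap to tropical limit: {gap:.2e}")
```

With continuous random scores the argmax is almost surely unique, so the gap shrinks rapidly as $\beta$ increases.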
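In that limit, the attention update takes the form of a tropical (max-plus) matrix product, $(A \odot W)_{ik} = \max_j (A_{ij} + W_{jk})$, which is exactly one Bellman-Ford relaxation round. The sketch below (again our illustration, not the paper's construction; the edge-weight matrix `W` is a toy stand-in for the latent token-similarity graph) iterates this product to recover longest-path weights.

```python
import numpy as np

def maxplus_matmul(A, B):
    """Tropical (max-plus) matrix product: (A ⊗ B)[i, k] = max_j A[i, j] + B[j, k]."""
    return (A[:, :, None] + B[None, :, :]).max(axis=1)

NEG = -np.inf
# Edge weights of a small latent graph (NEG = no edge); a zero diagonal
# keeps already-found paths when relaxing.
W = np.array([[0.0, 2.0, NEG, 1.0],
              [NEG, 0.0, 3.0, NEG],
              [NEG, NEG, 0.0, 2.0],
              [NEG, NEG, NEG, 0.0]])

# Bellman-Ford as repeated tropical products: each product relaxes every edge
# once, so after n-1 rounds D[i, j] is the longest-path weight from i to j.
D = W.copy()
for _ in range(len(W) - 1):
    D = maxplus_matmul(D, W)
print(D)  # D[0, 3] == 7.0 via the path 0 -> 1 -> 2 -> 3
```

Replacing max with min (and NEG with $+\infty$) gives the shortest-path variant of the same recurrence.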