State over Tokens: Characterizing the Role of Reasoning Tokens

Published: December 14, 2025 | arXiv ID: 2512.12777v1

By: Mosh Levy, Zohar Elyoseph, Shauli Ravfogel and more

Potential Business Impact:

Lets computers "think" better by showing their steps.

Business Areas:

Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Large Language Models (LLMs) can generate reasoning tokens before their final answer to boost performance on complex tasks. While these sequences seem like human thought processes, empirical evidence reveals that they are not a faithful explanation of the model's actual reasoning process. To address this gap between appearance and function, we introduce the State over Tokens (SoT) conceptual framework. SoT reframes reasoning tokens not as a linguistic narrative, but as an externalized computational state: the sole persistent information carrier across the model's stateless generation cycles. This explains how the tokens can drive correct reasoning without being a faithful explanation when read as text, and it surfaces previously overlooked research questions about these tokens. We argue that, to truly understand the process LLMs carry out, research must move beyond reading the reasoning tokens as text and focus on decoding them as state.
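
The "stateless generation cycle" idea in the abstract can be illustrated with a toy sketch. This is not the authors' method or a real LLM: the functions `next_token` and `generate`, the `=>` separator, and the running-sum task are all hypothetical, chosen only to show a step function that remembers nothing between calls, so that the emitted tokens are the only thing carrying the computation forward.

```python
from typing import List

END = "<end>"  # hypothetical stop token for this toy example


def next_token(context: List[str]) -> str:
    """Stateless step: the output depends only on the visible token sequence.

    The context is assumed to look like ["3", "4", "5", "=>", ...], where the
    tokens after "=>" are running totals emitted on earlier cycles.
    """
    sep = context.index("=>")
    numbers = [int(t) for t in context[:sep]]
    emitted = context[sep + 1:]  # the externalized state written so far

    if len(emitted) == len(numbers):  # every input has been consumed
        return END

    previous = int(emitted[-1]) if emitted else 0
    return str(previous + numbers[len(emitted)])


def generate(prompt: List[str]) -> List[str]:
    """Autoregressive loop: nothing persists between cycles except tokens."""
    context = list(prompt)
    while True:
        token = next_token(context)
        if token == END:
            return context
        context.append(token)


if __name__ == "__main__":
    print(generate(["3", "4", "5", "=>"]))
    # Resuming from a partial trace gives the same continuation, because
    # the trace itself is the state.
    print(next_token(["3", "4", "5", "=>", "3", "7"]))
    # Corrupting the last emitted token changes what follows.
    print(next_token(["3", "4", "5", "=>", "3", "100"]))
```

In this sketch the intermediate tokens happen to be readable running totals, but the step function only needs them to be decodable as state. Any encoding it could invert would work equally well, which is the distinction between reading the tokens as text and decoding them as state.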

LLM Reasoning for Machine Translation: Synthetic Data Generation over Thinking Tokens

Computation and Language

Makes computer translators better by showing them how.

13 Oct 2025 1

89%

Reasoning Beyond Chain-of-Thought: A Latent Computational Mode in Large Language Models

Computation and Language

Makes computers think better without extra instructions.

12 Jan 2026 0

89%

From Token to Action: State Machine Reasoning to Mitigate Overthinking in Information Retrieval

Information Retrieval

Makes AI think smarter, faster, and use less energy.

29 May 2025 0

Page Count

13 pages
