Prompting Science Report 2: The Decreasing Value of Chain of Thought in Prompting

Published: June 8, 2025 | arXiv ID: 2506.07142v1

By: Lennart Meincke, Ethan Mollick, Lilach Mollick, et al.

Potential Business Impact:

Chain-of-Thought prompting can modestly improve model accuracy in some settings, but it increases token usage, cost, and response time.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

This is the second in a series of short reports that seek to help business, education, and policy leaders understand the technical details of working with AI through rigorous testing. In this report, we investigate Chain-of-Thought (CoT) prompting, a technique that encourages a large language model (LLM) to "think step by step" (Wei et al., 2022). CoT is a widely adopted method for improving performance on reasoning tasks; however, our findings reveal a more nuanced picture of its effectiveness. We demonstrate two things:

- The effectiveness of Chain-of-Thought prompting can vary greatly depending on the type of task and model. For non-reasoning models, CoT generally improves average performance by a small amount, particularly if the model does not inherently engage in step-by-step processing by default. However, CoT can introduce more variability in answers, sometimes triggering occasional errors on questions the model would otherwise get right. We also found that many recent models perform some form of CoT reasoning even if not asked; for these models, a request to perform CoT had little impact. Performing CoT generally requires far more tokens (increasing cost and time) than direct answers.
- For models designed with explicit reasoning capabilities, CoT prompting often results in only marginal, if any, gains in answer accuracy. However, it significantly increases the time and tokens needed to generate a response.
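To make the technique under study concrete, the sketch below contrasts the two prompt framings the report compares: a Chain-of-Thought prompt and a direct-answer prompt. The `build_prompt` helper and its exact wording are illustrative assumptions, not the report's actual experimental prompts.

```python
def build_prompt(question: str, chain_of_thought: bool = True) -> str:
    """Frame a question either with a Chain-of-Thought instruction
    or as a direct-answer request.

    With chain_of_thought=True, the model is nudged to reason step by
    step before answering; the report finds this can slightly help
    non-reasoning models while consuming far more tokens. With False,
    the model is asked for the final answer only.
    """
    if chain_of_thought:
        return (
            f"{question}\n"
            "Think step by step, then state your final answer "
            "on the last line as 'Answer: <value>'."
        )
    return f"{question}\nAnswer with only the final value."


# The same question framed both ways; either string would be sent
# to whatever LLM API is in use.
question = (
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 "
    "more than the ball. How much does the ball cost?"
)
cot_prompt = build_prompt(question, chain_of_thought=True)
direct_prompt = build_prompt(question, chain_of_thought=False)
```

The CoT variant elicits a longer response (the reasoning trace plus the answer), which is the source of the extra token cost the report measures.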

Page Count
19 pages

Category
Computer Science:
Computation and Language