Don't Take Things Out of Context: Attention Intervention for Enhancing Chain-of-Thought Reasoning in Large Language Models
By: Shaotian Yan, Chen Shen, Wenxiao Wang, and more
Potential Business Impact:
Steers the AI's attention while it reasons, so it gets better answers.
Few-shot Chain-of-Thought (CoT) prompting significantly enhances the reasoning capabilities of large language models (LLMs), with the demonstrations functioning as a whole to guide the model in generating reasoning steps toward final answers. However, we observe that isolated segments, words, or tokens within CoT demonstrations can unexpectedly disrupt the generation process of LLMs. The model may overly concentrate on certain local information in the demonstration, introducing irrelevant noise into the reasoning process and potentially leading to incorrect answers. In this paper, we investigate the underlying mechanism of CoT by dynamically tracing and manipulating the inner workings of LLMs at each output step, which demonstrates that tokens exhibiting specific attention characteristics are more likely to induce the model to take things out of context; these tokens attend directly to the hidden states tied to the prediction, without substantially integrating non-local information. Building on these insights, we propose a Few-shot Attention Intervention method (FAI) that dynamically analyzes the attention patterns of demonstrations to accurately identify these tokens and then makes targeted adjustments to the attention weights to suppress their distracting effect on LLMs. Comprehensive experiments across multiple benchmarks demonstrate consistent improvements over baseline methods, including a notable 5.91% improvement on the AQuA dataset, further highlighting the effectiveness of FAI.
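The abstract describes FAI only at a high level, so the sketch below is a minimal illustration of what such an attention intervention could look like, not the paper's actual algorithm. It flags demonstration tokens whose attention mass is overwhelmingly local (a rough stand-in for "attending to prediction-tied hidden states without integrating non-local information") and downweights the attention other tokens pay to them. The locality window, threshold, scaling factor, and the names `locality_scores`, `suppress_distractors`, and `demo_len` are all assumptions introduced for illustration.

```python
# Illustrative sketch of an FAI-style attention intervention.
# The scoring rule and hyperparameters are assumptions; the paper's
# exact criterion for identifying distracting tokens is not reproduced here.
import torch

def locality_scores(attn: torch.Tensor, window: int = 3) -> torch.Tensor:
    """For each token (a row of the attention matrix), measure how much of
    its attention mass stays within a small local window -- a proxy for
    'no substantial integration of non-local information'."""
    seq_len = attn.size(-1)
    idx = torch.arange(seq_len)
    local_mask = (idx[None, :] - idx[:, None]).abs() <= window  # (seq, seq)
    return (attn * local_mask).sum(-1) / attn.sum(-1).clamp_min(1e-9)

def suppress_distractors(attn: torch.Tensor, demo_len: int,
                         threshold: float = 0.9,
                         scale: float = 0.1) -> torch.Tensor:
    """Downweight attention *to* demonstration tokens whose own attention is
    overwhelmingly local, then renormalize each row to sum to 1."""
    scores = locality_scores(attn)        # (seq,)
    distractor = scores > threshold
    distractor[demo_len:] = False         # only intervene on demonstration tokens
    adjusted = attn.clone()
    adjusted[:, distractor] *= scale      # suppress those tokens as keys
    return adjusted / adjusted.sum(-1, keepdim=True).clamp_min(1e-9)

# Toy usage: a random row-stochastic attention matrix over 12 tokens,
# the first 8 of which belong to the few-shot demonstration.
attn = torch.softmax(torch.randn(12, 12), dim=-1)
print(suppress_distractors(attn, demo_len=8).sum(-1))  # rows still sum to 1
```

In this toy version the intervention is applied to a single attention matrix after the fact; in a real deployment one would hook the model's attention layers and apply the adjustment at each decoding step, as the paper's dynamic analysis suggests.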
Similar Papers
Focused Chain-of-Thought: Efficient LLM Reasoning via Structured Input Information
Computation and Language
Makes AI think faster with less information.
Attention Reveals More Than Tokens: Training-Free Long-Context Reasoning with Attention-guided Retrieval
Computation and Language
Helps computers remember more for complex thinking.
Facilitating Long Context Understanding via Supervised Chain-of-Thought Reasoning
Computation and Language
Helps computers understand long stories better.