Score: 2

Decoupling the Effect of Chain-of-Thought Reasoning: A Human Label Variation Perspective

Published: January 6, 2026 | arXiv ID: 2601.03154v1

By: Beiduo Chen, Tiancheng Hu, Caiqi Zhang, and more

Potential Business Impact:

Helps AI systems represent uncertainty and disagreement on ambiguous questions instead of forcing a single answer.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Reasoning-tuned LLMs that use long Chain-of-Thought (CoT) excel at single-answer tasks, yet their ability to model Human Label Variation, which requires capturing probabilistic ambiguity rather than resolving it, remains underexplored. We investigate this through systematic disentanglement experiments on distribution-based tasks, employing Cross-CoT experiments to isolate the effect of the reasoning text from intrinsic model priors. We observe a distinct "decoupled mechanism": while CoT improves distributional alignment, final accuracy is dictated by CoT content (99% variance contribution), whereas distributional ranking is governed by model priors (over 80%). Step-wise analysis further shows that while CoT's influence on accuracy grows monotonically over the course of reasoning, the distributional structure is largely determined by the LLM's intrinsic priors. These findings suggest that long CoT acts as a decisive decision-maker for the top option but fails to function as a granular distribution calibrator on ambiguous tasks.
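To make the setup concrete, here is a minimal, hypothetical sketch of how one might score a model's answer distribution against a human label distribution under a Cross-CoT swap: top-option accuracy, a distributional alignment measure, and a rank-agreement measure. The specific metrics (total variation distance, Spearman correlation) and the toy distributions are assumptions for illustration only; they are not taken from the paper's actual pipeline.

```python
# Illustrative sketch (not the authors' code): compare a model's answer
# distribution -- once conditioned on its own CoT, once on a CoT swapped in
# from another model (Cross-CoT) -- against a human label distribution.
# Metric choices (total variation, Spearman rho) are assumptions.
import numpy as np
from scipy.stats import spearmanr

def top1_accuracy(human_dist, model_dist):
    """1.0 if the model's top option matches the human-majority option."""
    return float(np.argmax(model_dist) == np.argmax(human_dist))

def total_variation(p, q):
    """Total variation distance; lower means closer distributional alignment."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return 0.5 * np.abs(p / p.sum() - q / q.sum()).sum()

def rank_agreement(p, q):
    """Spearman correlation of option rankings (proxy for distributional ranking)."""
    return spearmanr(p, q)[0]

# Toy example: an ambiguous 4-option item with genuine human disagreement.
human = np.array([0.45, 0.30, 0.15, 0.10])
own_cot_dist = np.array([0.70, 0.10, 0.15, 0.05])    # model conditioned on its own CoT
cross_cot_dist = np.array([0.60, 0.25, 0.10, 0.05])  # same model, CoT taken from another model

for name, dist in [("own CoT", own_cot_dist), ("cross CoT", cross_cot_dist)]:
    print(name,
          "acc:", top1_accuracy(human, dist),
          "TV:", round(total_variation(human, dist), 3),
          "rank rho:", round(rank_agreement(human, dist), 3))
```

Under the paper's finding, one would expect the top-option accuracy to track whichever CoT text is supplied, while the ranking and shape of the distribution stay closer to the answering model's own priors.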

Country of Origin
🇬🇧 🇩🇪 United Kingdom, Germany

Repos / Data Links

Page Count
19 pages

Category
Computer Science:
Computation and Language