Expanding Reasoning Potential in Foundation Model by Learning Diverse Chains of Thought Patterns
By: Xuemiao Zhang, Can Ren, Chengying Tu, and more
Potential Business Impact:
Teaches computers to solve math problems better.
Recent progress in large reasoning models on challenging mathematical reasoning has been driven by reinforcement learning (RL). Incorporating long chain-of-thought (CoT) data during mid-training has also been shown to substantially improve reasoning depth. However, current approaches often use CoT data indiscriminately, leaving open the critical question of which data most effectively enhances model reasoning capabilities. In this paper, we define a foundation model's reasoning potential for the first time as the inverse of the number of independent attempts it needs to answer a question correctly, and show that this quantity is strongly correlated with final model performance. We then propose expanding this reasoning potential with diverse data enriched with high-value reasoning patterns. Specifically, we abstract atomic reasoning patterns from CoT sequences, characterized by their commonality and inductive capability, and use them to construct a core reference set enriched with valuable reasoning patterns. We further propose a dual-granularity algorithm, operating over chains of reasoning patterns and token entropy, that efficiently selects high-value CoT data (CoTP) from the data pool aligned with the core set, so that models learn to reason effectively. With only 10B tokens of CoTP data, the 85A6B Mixture-of-Experts (MoE) model improves by 9.58% on the challenging AIME 2024 and 2025 benchmarks and raises the upper bound of downstream RL performance by 7.81%.
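To make the abstract's two central quantities concrete, here is a minimal sketch of (a) reasoning potential as the inverse of the number of independent attempts needed to answer correctly, and (b) a dual-granularity selection pass that scores candidate CoT samples by pattern-chain alignment with a core reference set plus mean token entropy. The function names, weights, and thresholds (reasoning_potential, chain_overlap, select_cotp, overlap_weight, entropy_weight) are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of the abstract's definitions; names and weights are assumptions.
from dataclasses import dataclass


def reasoning_potential(num_attempts_to_first_correct: int) -> float:
    """Reasoning potential as defined in the abstract: the inverse of the
    number of independent attempts required to answer the question correctly."""
    return 1.0 / num_attempts_to_first_correct


@dataclass
class CoTSample:
    text: str
    pattern_chain: list[str]       # abstracted atomic reasoning patterns, in order
    token_entropies: list[float]   # per-token entropy from a reference model


def chain_overlap(chain: list[str], core_patterns: set[str]) -> float:
    """Coarse granularity: fraction of the sample's pattern chain that appears
    in the core reference set of high-value reasoning patterns."""
    if not chain:
        return 0.0
    return sum(p in core_patterns for p in chain) / len(chain)


def mean_entropy(entropies: list[float]) -> float:
    """Fine granularity: average token entropy, a proxy for reasoning signal."""
    return sum(entropies) / len(entropies) if entropies else 0.0


def select_cotp(pool, core_patterns, overlap_weight=0.7, entropy_weight=0.3, top_k=2):
    """Dual-granularity selection sketch: score each candidate by pattern-chain
    alignment with the core set plus normalized token entropy, keep the top-k."""
    max_ent = max((mean_entropy(s.token_entropies) for s in pool), default=1.0) or 1.0
    scored = [
        (overlap_weight * chain_overlap(s.pattern_chain, core_patterns)
         + entropy_weight * mean_entropy(s.token_entropies) / max_ent, s)
        for s in pool
    ]
    scored.sort(key=lambda x: x[0], reverse=True)
    return [s for _, s in scored[:top_k]]


if __name__ == "__main__":
    core = {"decompose", "case_split", "backtrack", "verify"}   # hypothetical core set
    pool = [
        CoTSample("...", ["decompose", "verify"], [2.1, 1.8, 2.4]),
        CoTSample("...", ["restate"], [0.4, 0.3]),
        CoTSample("...", ["case_split", "backtrack", "verify"], [1.9, 2.2, 2.0]),
    ]
    print(reasoning_potential(4))                               # 0.25
    print([s.pattern_chain for s in select_cotp(pool, core)])   # two best-aligned chains
```

In this toy run, a question solved on the 4th independent attempt has reasoning potential 1/4 = 0.25, and the two samples whose pattern chains best match the core set (and carry higher token entropy) are the ones retained as CoTP data.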
Similar Papers
Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models
Artificial Intelligence
Makes computers think deeper to solve hard problems.
Latent Chain-of-Thought for Visual Reasoning
Artificial Intelligence
Makes AI think step-by-step better for new problems.
Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math
Computation and Language
Makes small computers think like big ones.