Enhancing Large Language Model Reasoning via Selective Critical Token Fine-Tuning
By: Zhiwen Ruan, Yixia Li, He Zhu, and more
Potential Business Impact:
Teaches AI to focus on important math steps.
Large language models (LLMs) primarily rely on supervised fine-tuning (SFT) to adapt pre-trained models to domain-specific tasks such as mathematical reasoning. However, standard SFT penalizes all tokens uniformly, overlooking that only a small subset of critical tokens determines reasoning correctness. This uniform supervision often reduces output diversity and limits generalization. We propose Critical Token Fine-tuning (CFT), a simple yet effective approach that updates only the tokens identified as functionally indispensable via counterfactual perturbations. By concentrating gradient signals on these decisive reasoning steps while preserving the diversity of non-critical tokens, CFT improves both generation quality and output diversity. Extensive experiments on five models across three families (Qwen, OLMo, LLaMA) and eleven mathematical reasoning benchmarks show that CFT, despite fine-tuning on fewer than 12% of tokens, consistently outperforms standard SFT. Moreover, CFT enables test-time scaling through improved sampling diversity and provides a stronger initialization for reinforcement learning, sustaining performance gains in later training stages while maintaining higher entropy for better exploration. These results highlight CFT as a practical and general framework for efficient and robust LLM fine-tuning.
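The core mechanism, restricting the SFT loss to critical tokens, can be illustrated with a short sketch. The code below is a minimal illustration, not the authors' released implementation: it assumes a precomputed binary `critical_mask` (which, per the abstract, would come from counterfactual perturbations) and simply restricts the standard token-level cross-entropy to masked positions, so non-critical tokens receive no gradient.

```python
import torch
import torch.nn.functional as F

def critical_token_loss(logits, labels, critical_mask):
    """Cross-entropy computed only on tokens marked as critical.

    logits:        (batch, seq_len, vocab) model outputs
    labels:        (batch, seq_len) target token ids
    critical_mask: (batch, seq_len) 1 where a token was judged
                   functionally indispensable (hypothetically derived
                   from counterfactual perturbations), else 0
    """
    per_token = F.cross_entropy(
        logits.view(-1, logits.size(-1)),
        labels.view(-1),
        reduction="none",
    ).view(labels.shape)
    mask = critical_mask.float()
    # Average the loss over critical positions only; non-critical
    # tokens contribute no gradient, preserving diversity there.
    return (per_token * mask).sum() / mask.sum().clamp(min=1.0)
```

Under this reading, standard SFT is the special case where `critical_mask` is all ones; CFT instead leaves the model's distribution over non-critical tokens untouched, which is consistent with the abstract's claims about higher sampling diversity and entropy.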
Similar Papers
Rethinking Supervised Fine-Tuning: Emphasizing Key Answer Tokens for Improved LLM Accuracy
Computation and Language
Improves AI answers by focusing on the final solution.
Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate
Computation and Language
Teaches computers to think better, not just copy.
Empowering Lightweight MLLMs with Reasoning via Long CoT SFT
CV and Pattern Recognition
Teaches small AI to think better with examples.