From Implicit to Explicit: Token-Efficient Logical Supervision for Mathematical Reasoning in LLMs
By: Shaojie Wang, Liang Zhang
Potential Business Impact:
Teaches computers to think step-by-step for math.
Recent studies reveal that large language models (LLMs) exhibit limited logical reasoning abilities in mathematical problem-solving, instead often relying on pattern-matching and memorization. We systematically analyze this limitation, focusing on logical relationship understanding, which is a core capability underlying genuine logical reasoning, and reveal that errors related to this capability account for over 90\% of incorrect predictions, with Chain-of-Thought Supervised Fine-Tuning (CoT-SFT) failing to substantially reduce these errors. To address this bottleneck, we propose First-Step Logical Reasoning (FSLR), a lightweight training framework targeting logical relationship understanding. Our key insight is that the first planning step-identifying which variables to use and which operation to apply-encourages the model to derive logical relationships directly from the problem statement. By training models on this isolated step, FSLR provides explicit supervision for logical relationship understanding, unlike CoT-SFT which implicitly embeds such relationships within complete solution trajectories. Extensive experiments across multiple models and datasets demonstrate that FSLR consistently outperforms CoT-SFT under both in-distribution and out-of-distribution settings, with average improvements of 3.2\% and 4.6\%, respectively. Moreover, FSLR achieves 4-6x faster training and reduces training token consumption by over 80\%.
Similar Papers
Empowering Lightweight MLLMs with Reasoning via Long CoT SFT
CV and Pattern Recognition
Teaches small AI to think better with examples.
Focused Chain-of-Thought: Efficient LLM Reasoning via Structured Input Information
Computation and Language
Makes AI think faster with less information.
Mitigating Forgetting Between Supervised and Reinforcement Learning Yields Stronger Reasoners
Computation and Language
Makes AI smarter by learning from mistakes.