ConMax: Confidence-Maximizing Compression for Efficient Chain-of-Thought Reasoning
By: Minda Hu, Zexuan Qiu, Zenan Xu and more
Potential Business Impact:
Makes smart computers think less, faster.
Recent breakthroughs in Large Reasoning Models (LRMs) have demonstrated that extensive Chain-of-Thought (CoT) generation is critical for enabling intricate cognitive behaviors, such as self-verification and backtracking, to solve complex tasks. However, this capability often leads to "overthinking", where models generate redundant reasoning paths that inflate computational costs without improving accuracy. While Supervised Fine-Tuning (SFT) on reasoning traces is a standard paradigm for the "cold start" phase, applying existing compression techniques to these traces often compromises logical coherence or incurs prohibitive sampling costs. In this paper, we introduce ConMax (Confidence-Maximizing Compression), a novel reinforcement learning framework designed to automatically compress reasoning traces while preserving essential reasoning patterns. ConMax formulates compression as a reward-driven optimization problem, training a policy to prune redundancy by maximizing a weighted combination of answer confidence for predictive fidelity and thinking confidence for reasoning validity through a frozen auxiliary LRM. Extensive experiments across five reasoning datasets demonstrate that ConMax achieves a superior efficiency-performance trade-off. Specifically, it reduces inference length by 43% over strong baselines at the cost of a mere 0.7% dip in accuracy, proving its effectiveness in generating high-quality, efficient training data for LRMs.
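As a rough illustration of the "weighted combination" of answer confidence and thinking confidence described in the abstract, the sketch below combines two confidence scores into a single scalar reward. The abstract does not say how either confidence is computed or how the weights are set, so everything here is an assumption made for illustration: the function names, the weight `alpha`, and the choice to measure confidence as the geometric-mean token probability over log-probabilities that a frozen auxiliary model would supply.

```python
import math
from typing import Sequence


def mean_token_confidence(logprobs: Sequence[float]) -> float:
    """Geometric-mean token probability, i.e. exp(mean log-prob).

    ASSUMPTION: the abstract does not define "confidence"; this is one
    common proxy, chosen here only for illustration.
    """
    if not logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(sum(logprobs) / len(logprobs))


def combined_confidence_reward(
    answer_logprobs: Sequence[float],
    thinking_logprobs: Sequence[float],
    alpha: float = 0.5,
) -> float:
    """Weighted combination of answer and thinking confidence.

    answer_logprobs:   log-probs a frozen auxiliary model assigns to the
                       reference answer, given the compressed trace.
    thinking_logprobs: log-probs the same model assigns to the reasoning
                       tokens being scored.
    alpha:             hypothetical weight on answer confidence; the
                       remaining (1 - alpha) goes to thinking confidence.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    answer_conf = mean_token_confidence(answer_logprobs)
    thinking_conf = mean_token_confidence(thinking_logprobs)
    return alpha * answer_conf + (1.0 - alpha) * thinking_conf


if __name__ == "__main__":
    # Toy log-probs standing in for the auxiliary model's output.
    reward = combined_confidence_reward(
        answer_logprobs=[-0.1, -0.2],
        thinking_logprobs=[-0.5, -0.7, -0.3],
        alpha=0.6,
    )
    print(f"reward = {reward:.3f}")
```

In a reinforcement learning setup of the kind the abstract describes, a scalar like this would score each candidate compressed trace produced by the policy, with the auxiliary model's parameters held fixed.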
Similar Papers
ConCISE: Confidence-guided Compression in Step-by-step Efficient Reasoning
Machine Learning (CS)
Makes smart computer answers shorter, saving power.
Think Silently, Think Fast: Dynamic Latent Compression of LLM Reasoning Chains
Computation and Language
Makes AI think faster and smarter.
Correct, Concise and Complete: Multi-stage Training For Adaptive Reasoning
Computation and Language
Makes AI think less to solve problems faster.