Reasoning Effort and Problem Complexity: A Scaling Analysis in LLMs
By: Benjamin Estermann, Roger Wattenhofer
Potential Business Impact:
Reasoning AIs stop putting in more effort once puzzles get hard enough, so their answers on complex problems become unreliable.
Large Language Models (LLMs) have demonstrated remarkable text generation capabilities, and recent advances in training paradigms have led to breakthroughs in their reasoning performance. In this work, we investigate how the reasoning effort of such models scales with problem complexity. We use the infinitely scalable Tents puzzle, which has a known linear-time solution, to analyze this scaling behavior. Our results show that reasoning effort scales with problem size, but only up to a critical problem complexity. Beyond this threshold, the reasoning effort does not continue to increase, and may even decrease. This observation highlights a critical limitation in the logical coherence of current LLMs as problem complexity increases, and underscores the need for strategies to improve reasoning scalability. Furthermore, our results reveal significant performance differences between current state-of-the-art reasoning models when faced with increasingly complex logical puzzles.
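To make the scaling claim concrete, below is a minimal sketch (not the authors' code) of the kind of analysis the abstract describes: record the reasoning-token count a model spends per puzzle size and locate the size at which effort stops growing. The measurements here are synthetic and purely illustrative, and `critical_complexity` is a hypothetical helper with an assumed plateau tolerance.

```python
import numpy as np

# Synthetic, purely illustrative data: mean reasoning tokens a hypothetical
# model spends on n x n Tents puzzles. In a real experiment these would come
# from the model's reported reasoning-token counts per instance.
sizes = np.array([4, 6, 8, 10, 12, 14, 16, 18])
reasoning_tokens = np.array([900, 2100, 4800, 9500, 14000, 14500, 13800, 12900])

def critical_complexity(sizes, tokens, tol=0.05):
    """Return the first size beyond which effort stops growing, i.e. the
    next measurement is not at least (1 + tol) times the current one."""
    for i in range(len(tokens) - 1):
        if tokens[i + 1] < tokens[i] * (1 + tol):
            return sizes[i]
    return None  # effort still growing over the measured range

threshold = critical_complexity(sizes, reasoning_tokens)
print(f"Reasoning effort plateaus near puzzle size {threshold}x{threshold}")
```

Using token counts as a proxy for reasoning effort mirrors the paper's framing: if a puzzle has a linear-time solution, effort that plateaus or shrinks while problem size keeps growing signals a breakdown in the model's logical coherence rather than a genuinely easier problem.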
Similar Papers
A Survey of Scaling in Large Language Model Reasoning
Artificial Intelligence
Maps out how extra training and test-time compute scale up AI reasoning.
The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity
Artificial Intelligence
Shows that "thinking" models break down once problems get complex enough.
Do Larger Language Models Imply Better Generalization? A Pretraining Scaling Law for Implicit Reasoning
Artificial Intelligence
Tests whether bigger models are actually better at multi-step implicit reasoning.