Evolutionary System 2 Reasoning: An Empirical Proof
By: Zeyuan Ma, Wenqi Huang, Guo-Huan Song, and more
Machine intelligence marks the ultimate dream of making machines' intelligence comparable to that of human beings. While recent progress in Large Language Models (LLMs) shows substantial specific skills across a wide array of downstream tasks, these models more or less fall short in general intelligence. Following the correlation between intelligence and System 2 reasoning (slow thinking), in this paper we aim to answer a worthwhile research question: could machine intelligence such as LLMs be evolved to acquire reasoning ability (not a specific skill), just like human beings? To this end, we propose the Evolutionary Reasoning Optimization (ERO) framework, which performs survival of the fittest over a population of LLMs to search for an individual with strong reasoning ability. Given a reasoning task, ERO first initializes multiple LLMs as a population, after which an evolutionary strategy evolves the population to maximize the quantified reasoning score of its best individual. Based on experiments on representative test suites, we report two surprising empirical findings: i) the latest LLMs such as GPT-5 still show limited System 2 reasoning ability; ii) with the simple evolution loop of ERO, a relatively weak model (Qwen-7B) can be enhanced to exhibit powerful reasoning ability. Our project is available at https://github.com/MetaEvo/ERO for reproduction.
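The sketch below illustrates the kind of survival-of-the-fittest loop the abstract describes: a population of candidates is scored, the fittest survive, and mutated copies refill the population. It is a minimal assumption-laden illustration, not the authors' implementation (see https://github.com/MetaEvo/ERO): the candidate representation, the toy `reasoning_score` surrogate, and the `mutate` operator are all hypothetical placeholders standing in for running an LLM on a reasoning task.

```python
# Hedged sketch of an ERO-style evolution loop. Everything below is a toy
# stand-in: in the real framework, a candidate would be an LLM (or its
# reasoning-related configuration) and reasoning_score would evaluate it
# on a reasoning benchmark.
import random


def reasoning_score(candidate: dict) -> float:
    # Placeholder surrogate score; ERO would instead run the candidate LLM
    # on the reasoning task and quantify its performance.
    return -abs(candidate["temperature"] - 0.3) - 0.1 * abs(candidate["depth"] - 4)


def mutate(candidate: dict) -> dict:
    # Perturb the candidate's (hypothetical) reasoning-related parameters.
    return {
        "temperature": max(0.0, candidate["temperature"] + random.gauss(0, 0.1)),
        "depth": max(1, candidate["depth"] + random.choice([-1, 0, 1])),
    }


def evolve(pop_size: int = 8, generations: int = 20, elite_frac: float = 0.25) -> dict:
    """Survival-of-the-fittest search for the best-scoring candidate."""
    population = [
        {"temperature": random.uniform(0.0, 1.0), "depth": random.randint(1, 8)}
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        # Rank candidates by their quantified reasoning score.
        population.sort(key=reasoning_score, reverse=True)
        elites = population[: max(1, int(elite_frac * pop_size))]
        # Refill the population with mutated copies of the surviving elites.
        population = elites + [
            mutate(random.choice(elites)) for _ in range(pop_size - len(elites))
        ]
    return max(population, key=reasoning_score)


if __name__ == "__main__":
    best = evolve()
    print("Best candidate:", best, "score:", round(reasoning_score(best), 3))
```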