Reasoning Curriculum: Bootstrapping Broad LLM Reasoning from Math
By: Bo Pang, Deqian Kong, Silvio Savarese, and more
Potential Business Impact:
Teaches computers to think better at many tasks.
Reinforcement learning (RL) can elicit strong reasoning in large language models (LLMs), yet most open efforts focus on math and code. We propose Reasoning Curriculum, a simple two-stage curriculum that first elicits reasoning skills in pretraining-aligned domains such as math, then adapts and refines these skills across other domains via joint RL. Stage 1 performs a brief cold start followed by math-only RL with verifiable rewards to develop reasoning skills. Stage 2 runs joint RL on mixed-domain data to transfer and consolidate these skills. The curriculum is minimal and backbone-agnostic, requiring no specialized reward models beyond standard verifiability checks. Evaluated on Qwen3-4B and Llama-3.1-8B over a multi-domain suite, Reasoning Curriculum yields consistent gains. Ablations and a cognitive-skill analysis indicate that both stages are necessary and that math-first elicitation increases cognitive behaviors important for solving complex problems. Reasoning Curriculum provides a compact, easy-to-adopt recipe for general reasoning.
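The two-stage structure can be sketched in toy form as follows. This is a minimal illustration of the curriculum's control flow, not the authors' implementation: the function names (`verifiable_reward`, `rl_step`, `reasoning_curriculum`), the dictionary "policy," and the data format are all illustrative assumptions standing in for a real LLM, optimizer, and reward harness.

```python
# Hypothetical sketch of the Reasoning Curriculum two-stage loop.
# A dict of per-domain scores stands in for the actual LLM policy.

def verifiable_reward(answer, reference):
    """Standard verifiability check: 1.0 if the answer matches, else 0.0."""
    return 1.0 if answer == reference else 0.0

def rl_step(policy, batch):
    """One (toy) RL update: score each sampled answer with the verifiable
    reward and accumulate it into the per-domain policy score."""
    for domain, answer, reference in batch:
        reward = verifiable_reward(answer, reference)
        policy[domain] = policy.get(domain, 0.0) + reward
    return policy

def reasoning_curriculum(cold_start_data, math_data, mixed_data):
    policy = {}
    # Stage 1a: brief cold start (stands in for a short supervised warm-up).
    for domain, _answer, _reference in cold_start_data:
        policy.setdefault(domain, 0.0)
    # Stage 1b: math-only RL with verifiable rewards to elicit reasoning.
    policy = rl_step(policy, math_data)
    # Stage 2: joint RL on mixed-domain data to transfer and consolidate.
    policy = rl_step(policy, mixed_data)
    return policy
```

The key design point the sketch preserves is the ordering: only math data (with cheap, verifiable rewards) is seen before the joint mixed-domain phase, so no specialized reward model is ever needed.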
Similar Papers
Curriculum Reinforcement Learning from Easy to Hard Tasks Improves LLM Reasoning
Machine Learning (CS)
Teaches computers to solve hard problems step-by-step.
InternBootcamp Technical Report: Boosting LLM Reasoning with Verifiable Task Scaling
Computation and Language
Teaches AI to solve many different problems.
LogicPuzzleRL: Cultivating Robust Mathematical Reasoning in LLMs via Reinforcement Learning
Machine Learning (CS)
Teaches computers to solve puzzles and think better.