SSR: Socratic Self-Refine for Large Language Model Reasoning

Published: November 13, 2025 | arXiv ID: 2511.10621v1

By: Haizhou Shi, Ye Liu, Bo Pang, and more

BigTech Affiliations: Salesforce Research

Potential Business Impact:

Makes AI think better, step by step.

Business Areas:
Semantic Search, Internet Services

Large Language Models (LLMs) have demonstrated remarkable reasoning abilities, yet existing test-time frameworks often rely on coarse self-verification and self-correction, limiting their effectiveness on complex tasks. In this paper, we propose Socratic Self-Refine (SSR), a novel framework for fine-grained evaluation and precise refinement of LLM reasoning. Our proposed SSR decomposes model responses into verifiable (sub-question, sub-answer) pairs, enabling step-level confidence estimation through controlled re-solving and self-consistency checks. By pinpointing unreliable steps and iteratively refining them, SSR produces more accurate and interpretable reasoning chains. Empirical results across five reasoning benchmarks and three LLMs show that SSR consistently outperforms state-of-the-art iterative self-refinement baselines. Beyond performance gains, SSR provides a principled black-box approach for evaluating and understanding the internal reasoning processes of LLMs. Code is available at https://github.com/SalesforceAIResearch/socratic-self-refine-reasoning.
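The refinement loop described in the abstract can be sketched in a few lines: decompose a response into (sub-question, sub-answer) pairs, score each step by re-solving it several times and checking self-consistency with the original answer, then refine the least reliable step and repeat. This is a minimal illustrative sketch, not the paper's implementation; the `resolve` and `refine` callables stand in for LLM calls and are hypothetical, as are the threshold and sample-count defaults.

```python
def step_confidence(resolve, context, sub_question, sub_answer, k=5):
    """Estimate one step's confidence by re-solving the sub-question k times
    (given the preceding steps as context) and measuring how often the
    re-solved answer agrees with the original sub-answer."""
    samples = [resolve(context, sub_question) for _ in range(k)]
    return samples.count(sub_answer) / k

def socratic_self_refine(steps, resolve, refine, threshold=0.6, max_iters=3):
    """steps: list of (sub_question, sub_answer) pairs from a decomposed
    reasoning chain. resolve(context, q) -> answer re-solves a sub-question;
    refine(context, q) -> answer proposes a revised answer for a weak step."""
    for _ in range(max_iters):
        # Score every step via controlled re-solving + self-consistency.
        confidences = [
            step_confidence(resolve, steps[:i], q, a)
            for i, (q, a) in enumerate(steps)
        ]
        # Pinpoint the least reliable step.
        worst = min(range(len(steps)), key=lambda i: confidences[i])
        if confidences[worst] >= threshold:
            break  # all steps look reliable; stop iterating
        q, _ = steps[worst]
        steps[worst] = (q, refine(steps[:worst], q))
    return steps
```

As a toy usage, stubbing the LLM with a lookup table that always answers correctly: starting from `[("2+3", "5"), ("5*4", "21")]`, the second step scores zero agreement, gets refined to `"20"`, and the loop terminates once all steps are self-consistent.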

Country of Origin
🇺🇸 United States

Page Count
32 pages

Category
Computer Science:
Computation and Language