Big Reasoning with Small Models: Instruction Retrieval at Inference Time
By: Kenan Alkiek, David Jurgens, Vinod Vydiswaran
Potential Business Impact:
Helps small computers solve hard problems.
Can we bring large-scale reasoning to local-scale compute? Small language models (SLMs) are increasingly attractive because they run efficiently on local hardware, offering strong privacy, low cost, and reduced environmental impact. Yet they often struggle with tasks that require multi-step reasoning or domain-specific knowledge. We address this limitation through instruction intervention at inference time, where an SLM retrieves structured reasoning procedures rather than generating them from scratch. Our method builds an Instruction Corpus by grouping similar training questions and creating instructions via GPT-5. During inference, the SLM retrieves the most relevant instructions and follows their steps. Unlike retrieval-augmented generation, which retrieves text passages, instruction retrieval gives the model structured guidance for reasoning. We evaluate this framework on MedQA (medical board exams), MMLU Professional Law, and MathQA using models from 3B to 14B parameters without any additional fine-tuning. Instruction retrieval yields consistent gains: 9.4% on MedQA, 7.9% on MMLU Law, and 5.1% on MathQA. Concise instructions outperform longer ones, and the magnitude of improvement depends strongly on model family and intrinsic reasoning ability.
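The pipeline the abstract describes can be sketched in a few lines: match an incoming question against an instruction corpus and prepend the best-matching reasoning procedure to the SLM's prompt. The sketch below is illustrative only, with a hypothetical two-entry corpus and a toy bag-of-words similarity standing in for whatever embedding model the retriever actually uses; the instruction texts are invented examples, not the paper's GPT-5-authored instructions.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real retriever would use a
    # neural sentence encoder, but the selection logic is the same.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical Instruction Corpus: each entry pairs a description of a
# question cluster with a structured reasoning procedure (in the paper,
# instructions are created by GPT-5 over groups of similar training questions).
INSTRUCTION_CORPUS = [
    ("drug dosage renal clearance pharmacology",
     "1. Identify the drug and its elimination route.\n"
     "2. Adjust the dose for the stated renal function.\n"
     "3. Check each answer option against the adjusted dose."),
    ("contract breach damages remedy law",
     "1. Identify the contract terms and the alleged breach.\n"
     "2. Determine which remedies the facts support.\n"
     "3. Eliminate options inconsistent with the facts."),
]

def retrieve_instruction(question: str) -> str:
    # Return the instruction whose cluster description is most
    # similar to the incoming question.
    q = embed(question)
    return max(INSTRUCTION_CORPUS,
               key=lambda item: cosine(q, embed(item[0])))[1]

def build_prompt(question: str) -> str:
    # The SLM is handed the retrieved procedure plus the question;
    # no fine-tuning is involved, only inference-time intervention.
    steps = retrieve_instruction(question)
    return f"Follow these steps:\n{steps}\n\nQuestion: {question}\nAnswer:"

print(build_prompt("What damages apply for breach of a sales contract?"))
```

Note the contrast with RAG in the abstract: what is retrieved here is a procedure to follow, not a passage of evidence to condition on.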
Similar Papers
ReTraceQA: Evaluating Reasoning Traces of Small Language Models in Commonsense Question Answering
Computation and Language
Finds when AI answers right but thinks wrong.
Think Before You Retrieve: Learning Test-Time Adaptive Search with Small Language Models
Artificial Intelligence
Teaches small computers to find information better.
A Short Survey on Small Reasoning Models: Training, Inference, Applications and Research Directions
Artificial Intelligence
Makes smart computer programs think faster and better.