DETAIL Matters: Measuring the Impact of Prompt Specificity on Reasoning in Large Language Models

Published: December 1, 2025 | arXiv ID: 2512.02246v1

By: Olivia Kim

Potential Business Impact:

Shows that giving AI models more specific, detailed instructions measurably improves their reasoning accuracy.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Prompt design plays a critical role in the reasoning performance of large language models (LLMs), yet the impact of prompt specificity (how detailed or vague a prompt is) remains understudied. This paper introduces DETAIL, a framework for evaluating LLM performance across varying levels of prompt specificity. We generate multi-level prompts using GPT-4, quantify specificity via perplexity, and assess correctness via GPT-based semantic equivalence judgments. Experiments on 30 novel reasoning tasks across GPT-4 and o3-mini reveal that specificity improves accuracy, especially for smaller models and procedural tasks. Our results highlight the need for adaptive prompting strategies and provide tools and data to support further research.
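The abstract notes that specificity is quantified via perplexity: more specific prompts tend to be longer and more constrained, and their likelihood under a language model can serve as a proxy for that. The sketch below illustrates the general idea only; the choice of scoring model (gpt2) and the exact scoring procedure are assumptions, not the paper's implementation.

```python
# Minimal sketch (assumptions, not the DETAIL framework's code):
# estimate a prompt's perplexity under a small causal LM as a rough
# proxy for how specific or generic its wording is.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def prompt_perplexity(prompt: str, model_name: str = "gpt2") -> float:
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()

    # Score the prompt against itself (teacher forcing); the returned
    # loss is the mean per-token negative log-likelihood.
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs, labels=inputs["input_ids"])

    # Perplexity is the exponential of the mean negative log-likelihood.
    return torch.exp(outputs.loss).item()


if __name__ == "__main__":
    vague = "Explain how to sort a list."
    detailed = (
        "Explain, step by step, how to sort a Python list of integers in "
        "ascending order using the built-in sorted() function, and show "
        "the expected output for the input [3, 1, 2]."
    )
    print("vague:", prompt_perplexity(vague))
    print("detailed:", prompt_perplexity(detailed))
```

How the paper maps perplexity values onto its specificity levels, and which model it uses for scoring, is not stated in this summary; the snippet only demonstrates the measurement mechanic.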

Country of Origin
🇺🇸 United States

Page Count
10 pages

Category
Computer Science:
Computation and Language