Prompting Science Report 4: Playing Pretend: Expert Personas Don't Improve Factual Accuracy
By: Savir Basil, Ina Shapiro, Dan Shapiro, and more
Potential Business Impact:
Giving AI pretend jobs doesn't help it answer questions.
This is the fourth in a series of short reports that help business, education, and policy leaders understand the technical details of working with AI through rigorous testing. Here, we ask whether assigning personas to models improves performance on difficult, objective multiple-choice questions. We study both domain-specific expert personas and low-knowledge personas, evaluating six models on GPQA Diamond (Rein et al. 2024) and MMLU-Pro (Wang et al. 2024), graduate-level question sets spanning science, engineering, and law.

We tested three approaches (sketched in the prompt example below):

- In-Domain Experts: Assigning the model an expert persona ("you are a physics expert") matched to the problem type (physics problems) had no significant impact on performance (with the exception of the Gemini 2.0 Flash model).
- Off-Domain Experts (Domain-Mismatched): Assigning the model an expert persona ("you are a physics expert") not matched to the problem type (law problems) resulted in marginal differences.
- Low-Knowledge Personas: Assigning the model negative-capability personas (layperson, young child, toddler) was generally harmful to benchmark accuracy.

Across both benchmarks, persona prompts generally did not improve accuracy relative to a no-persona baseline. Expert personas showed no consistent benefit across models, with only a few exceptions, and domain-mismatched expert personas sometimes degraded performance. Low-knowledge personas often reduced accuracy. These results concern the accuracy of answers only; personas may serve other purposes beyond improving factual performance, such as altering the tone of outputs.
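To make the setup concrete, here is a minimal sketch of how the no-persona baseline and the three persona conditions could be expressed as prompts around the same multiple-choice question. The persona wordings, the sample question, and the query_model helper are illustrative assumptions, not the report's actual code or prompts.

```python
# Illustrative sketch of the persona conditions described above.
# Persona texts, the sample question, and query_model() are hypothetical
# stand-ins, not the report's implementation.

PERSONAS = {
    "baseline": None,                                  # no-persona control
    "in_domain_expert": "You are a physics expert.",   # matched to a physics question
    "off_domain_expert": "You are a law expert.",      # domain-mismatched persona
    "low_knowledge": "You are a young child.",         # negative-capability persona
}

def build_messages(persona_text, question, options):
    """Assemble a chat-style prompt: optional persona system message plus the question."""
    option_lines = "\n".join(f"{letter}. {text}" for letter, text in options.items())
    user_prompt = (
        f"{question}\n{option_lines}\n"
        "Answer with the single letter of the correct option."
    )
    messages = []
    if persona_text:
        messages.append({"role": "system", "content": persona_text})
    messages.append({"role": "user", "content": user_prompt})
    return messages

def query_model(messages):
    """Placeholder for a real model API call (e.g., a chat-completions endpoint)."""
    raise NotImplementedError

if __name__ == "__main__":
    question = "Which quantity is conserved in a perfectly elastic collision?"
    options = {
        "A": "Momentum only",
        "B": "Kinetic energy only",
        "C": "Both momentum and kinetic energy",
        "D": "Neither",
    }
    for condition, persona in PERSONAS.items():
        messages = build_messages(persona, question, options)
        # answer = query_model(messages)  # score against the answer key per condition
        print(condition, "->", messages[0]["content"] if persona else "(no persona)")
```

Accuracy under each persona condition would then be compared against the no-persona baseline, which is the comparison the report draws across six models and the two benchmarks.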
Similar Papers
Principled Personas: Defining and Measuring the Intended Effects of Persona Prompting on Task Performance
Computation and Language
Makes AI smarter by telling it who to be.
No for Some, Yes for Others: Persona Prompts and Other Sources of False Refusal in Language Models
Computation and Language
AI sometimes refuses requests based on fake identities.
Self-Transparency Failures in Expert-Persona LLMs: A Large-Scale Behavioral Audit
Artificial Intelligence
AI often fails to admit when it's faking expertise.