Should you use LLMs to simulate opinions? Quality checks for early-stage deliberation

Published: April 11, 2025 | arXiv ID: 2504.08954v3

By: Terrence Neumann, Maria De-Arteaga, Sina Fazelpour

Potential Business Impact:

Tests if AI opinions are trustworthy for surveys.

Business Areas:

Simulation Software

The emergent capabilities of large language models (LLMs) have prompted interest in using them as surrogates for human subjects in opinion surveys. However, prior evaluations of LLM-based opinion simulation have relied heavily on costly, domain-specific survey data, and mixed empirical results leave their reliability in question. To enable cost-effective, early-stage evaluation, we introduce a quality control assessment designed to test the viability of LLM-simulated opinions on Likert-scale tasks without requiring large-scale human data for validation. This assessment comprises two key tests, logical consistency and alignment with stakeholder expectations, offering a low-cost, domain-adaptable validation tool. We apply our quality control assessment to an opinion simulation task relevant to AI-assisted content moderation and fact-checking workflows, a socially impactful use case, and evaluate seven LLMs using a baseline prompt engineering method (backstory prompting), as well as fine-tuning and in-context learning variants. None of the models or methods pass the full assessment, revealing several failure modes. We conclude with a discussion of the risk management implications and release TopicMisinfo, a benchmark dataset with paired human and LLM annotations simulated by various models and approaches, to support future research.

An Analysis of Large Language Models for Simulating User Responses in Surveys

Computation and Language

Helps computers understand many different opinions.

7 Dec 2025

Synthesizing Public Opinions with LLMs: Role Creation, Impacts, and the Future to eDemorcacy

Computation and Language

Makes computers guess what people think better.

31 Mar 2025

Hypothesis Testing for Quantifying LLM-Human Misalignment in Multiple Choice Settings

Computers and Society

Tests if computer brains copy people's choices.

17 Jun 2025

Page Count

19 pages
