Large Language Models as Virtual Survey Respondents: Evaluating Sociodemographic Response Generation

Published: September 8, 2025 | arXiv ID: 2509.06337v1

By: Jianpeng Zhao, Chenyu Yuan, Weiming Luo and more

Potential Business Impact:

Simulates survey respondents with LLMs to cut the time and cost of running surveys.

Business Areas:

Simulation Software

Questionnaire-based surveys are foundational to social science research and public policymaking, yet traditional survey methods remain costly, time-consuming, and often limited in scale. This paper explores a new paradigm: simulating virtual survey respondents using Large Language Models (LLMs). We introduce two novel simulation settings, namely Partial Attribute Simulation (PAS) and Full Attribute Simulation (FAS), to systematically evaluate the ability of LLMs to generate accurate and demographically coherent responses. In PAS, the model predicts missing attributes based on partial respondent profiles, whereas FAS involves generating complete synthetic datasets under both zero-context and context-enhanced conditions. We curate a comprehensive benchmark suite, LLM-S^3 (Large Language Model-based Sociodemographic Survey Simulation), that spans 11 real-world public datasets across four sociological domains. Our evaluation of multiple mainstream LLMs (GPT-3.5/4 Turbo, LLaMA 3.0/3.1-8B) reveals consistent trends in prediction performance, highlights failure modes, and demonstrates how context and prompt design impact simulation fidelity. This work establishes a rigorous foundation for LLM-driven survey simulations, offering scalable and cost-effective tools for sociological research and policy evaluation. Our code and dataset are available at: https://github.com/dart-lab-research/LLM-S-Cube-Benchmark
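
To make the Partial Attribute Simulation (PAS) setting more concrete, the following is a minimal sketch of how a partial respondent profile might be turned into a prompt and how predictions could be scored against the held-out attribute. The abstract does not specify the prompt wording, the attribute names, or the evaluation metric, so everything here (the function names `build_pas_prompt` and `pas_accuracy`, the example attributes, the exact-match accuracy, and the stub predictor standing in for an LLM call) is an assumption for illustration and is not taken from the LLM-S^3 codebase.

```python
from typing import Callable, Dict, List


def build_pas_prompt(profile: Dict[str, str], target: str, options: List[str]) -> str:
    """Format a partial profile as a prompt asking for one missing attribute.

    Hypothetical prompt wording; the paper's actual template may differ.
    """
    # List every known attribute except the one the model must predict.
    known = "\n".join(f"- {k}: {v}" for k, v in profile.items() if k != target)
    choices = ", ".join(options)
    return (
        "A survey respondent has the following attributes:\n"
        f"{known}\n"
        f"Predict the respondent's {target}. "
        f"Answer with exactly one of: {choices}."
    )


def pas_accuracy(
    records: List[Dict[str, str]],
    target: str,
    options: List[str],
    predict: Callable[[str], str],
) -> float:
    """Exact-match accuracy of predictions for the held-out attribute.

    `predict` is any callable mapping a prompt to an answer string; in
    practice it would wrap an LLM call (assumed, not shown here).
    """
    if not records:
        return 0.0
    correct = 0
    for rec in records:
        answer = predict(build_pas_prompt(rec, target, options)).strip()
        correct += int(answer == rec[target])
    return correct / len(records)


# Toy records with made-up attributes, used only to exercise the sketch.
records = [
    {"age": "34", "education": "college", "employment": "employed"},
    {"age": "61", "education": "high school", "employment": "retired"},
]
options = ["employed", "unemployed", "retired"]


def stub_predict(prompt: str) -> str:
    """Stand-in for a model: always answers 'employed'."""
    return "employed"


score = pas_accuracy(records, "employment", options, stub_predict)
```

Keeping the predictor as a plain callable means the same scoring loop can be reused across models; the Full Attribute Simulation (FAS) setting would need a different harness, since it generates whole datasets and would be compared at the distribution level.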

Emulating Public Opinion: A Proof-of-Concept of AI-Generated Synthetic Survey Responses for the Chilean Case

Computation and Language

Computers can answer survey questions like people.

11 Sep 2025 0

91%

Delving Into the Psychology of Machines: Exploring the Structure of Self-Regulated Learning via LLM-Generated Survey Responses

Artificial Intelligence

Computers can pretend to be students learning.

16 Jun 2025 0

91%

SocioBench: Modeling Human Behavior in Sociological Surveys with Large Language Models

Social and Information Networks

Helps computers understand how people think.

13 Oct 2025 1

Repos / Data Links

github.com

Page Count

24 pages
