SocioBench: Modeling Human Behavior in Sociological Surveys with Large Language Models

Published: October 13, 2025 | arXiv ID: 2510.11131v1

By: Jia Wang, Ziyu Zhao, Tingjuntao Ni and more

Potential Business Impact:

Helps computers understand how people think.

Business Areas:

Simulation Software

Large language models (LLMs) show strong potential for simulating human social behaviors and interactions, yet the field lacks large-scale, systematically constructed benchmarks for evaluating their alignment with real-world social attitudes. To bridge this gap, we introduce SocioBench, a comprehensive benchmark derived from the annually collected, standardized survey data of the International Social Survey Programme (ISSP). The benchmark aggregates over 480,000 real respondent records from more than 30 countries, spanning 10 sociological domains and over 40 demographic attributes. Our experiments indicate that LLMs achieve only 30-40% accuracy when simulating individuals in complex survey scenarios, with statistically significant differences across domains and demographic subgroups. These findings highlight several limitations of current LLMs in survey scenarios, including insufficient individual-level data coverage, inadequate scenario diversity, and missing group-level modeling.
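The per-domain accuracy comparison mentioned in the abstract can be illustrated with a minimal sketch. It assumes each record pairs a model-simulated answer with the real respondent's answer and that accuracy means exact match; the record layout, the field names (`domain`, `predicted`, `actual`), the domain labels, and the helper `accuracy_by_group` are all hypothetical and are not taken from the paper or its repositories.

```python
from collections import defaultdict


def accuracy_by_group(records, group_key):
    """Return {group: fraction of records whose predicted answer equals the actual one}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for rec in records:
        group = rec[group_key]
        totals[group] += 1
        if rec["predicted"] == rec["actual"]:
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


# Hypothetical toy records (placeholder domains and answer codes, not benchmark data).
records = [
    {"domain": "domain_a", "predicted": 2, "actual": 2},
    {"domain": "domain_a", "predicted": 1, "actual": 3},
    {"domain": "domain_b", "predicted": 4, "actual": 4},
    {"domain": "domain_b", "predicted": 4, "actual": 4},
]

result = accuracy_by_group(records, "domain")
print(result)
```

The same helper could be grouped by a demographic field instead of `domain` to compare subgroups, again assuming records carry such a field.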

SimBench: Benchmarking the Ability of Large Language Models to Simulate Human Behaviors

Computation and Language

Tests if AI acts like real people.

20 Oct 2025

Country of Origin

🇨🇳 China

Repos / Data Links

github.com github.com github.com

Page Count

33 pages
