
In Silico Development of Psychometric Scales: Feasibility of Representative Population Data Simulation with LLMs

Published: December 2, 2025 | arXiv ID: 2512.02910v1

By: Enrico Cipriani, Pavel Okopnyi, Danilo Menicucci and more

Potential Business Impact:

Lets computers create fake people for testing.

Business Areas:

Simulation Software

Developing and validating psychometric scales requires large samples, multiple testing phases, and substantial resources. Recent advances in Large Language Models (LLMs) enable the generation of synthetic participant data by prompting models to answer items while impersonating individuals of specific demographic profiles, potentially allowing in silico piloting before real data collection. Across four preregistered studies (N = circa 300 each), we tested whether LLM-simulated datasets can reproduce the latent structures and measurement properties of human responses. In Studies 1-2, we compared LLM-generated data with real datasets for two validated scales; in Studies 3-4, we created new scales using EFA on simulated data and then examined whether these structures generalized to newly collected human samples. Simulated datasets replicated the intended factor structures in three of four studies and showed consistent configural and metric invariance, with scalar invariance achieved for the two newly developed scales. However, correlation-based tests revealed substantial differences between real and synthetic datasets, and notable discrepancies appeared in score distributions and variances. Thus, while LLMs capture group-level latent structures, they do not approximate individual-level data properties. Simulated datasets also showed full internal invariance across gender. Overall, LLM-generated data appear useful for early-stage, group-level psychometric prototyping, but not as substitutes for individual-level validation. We discuss methodological limitations, risks of bias and data pollution, and ethical considerations related to in silico psychometric simulations.
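The contrast the abstract draws, between matching group-level structure and diverging score variances, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the two tiny response tables are invented toy data (not the paper's datasets), the helper names are hypothetical, and plain inter-item correlations stand in for the EFA and invariance analyses the authors actually ran.

```python
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def item_correlations(rows):
    """Upper-triangle inter-item correlations; rows are respondents, columns are items."""
    cols = list(zip(*rows))
    return [
        pearson(cols[i], cols[j])
        for i in range(len(cols))
        for j in range(i + 1, len(cols))
    ]


def total_score_variance(rows):
    """Sample variance of each respondent's summed scale score."""
    return variance([sum(r) for r in rows])


# Toy 5-point Likert responses to a 3-item scale (invented for illustration).
human = [[1, 2, 1], [2, 2, 3], [3, 3, 3], [4, 5, 4], [5, 4, 5], [3, 4, 3]]
# Hypothetical "simulated" respondents that cluster near the scale midpoint.
synthetic = [[2, 3, 2], [3, 3, 3], [3, 3, 3], [4, 4, 4], [4, 4, 4], [3, 3, 3]]

for name, rows in (("human", human), ("synthetic", synthetic)):
    rs = ", ".join(f"{r:.2f}" for r in item_correlations(rows))
    print(f"{name}: inter-item r = [{rs}], "
          f"total-score variance = {total_score_variance(rows):.2f}")
```

In this toy case both tables show positive inter-item correlations, so a one-factor reading would look plausible for each, while the synthetic total scores are far less spread out. That is the kind of pattern the abstract describes, though a real comparison would rely on factor-analytic and invariance tests rather than raw correlations.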

Scaling Law in LLM Simulated Personality: More Detailed and Realistic Persona Profile Is All You Need

Computers and Society

Computers can now pretend to be people.

10 Oct 2025

91%

From Five Dimensions to Many: Large Language Models as Precise and Interpretable Psychological Profilers

Artificial Intelligence

Computers guess your personality from a few answers.

5 Nov 2025

91%

Do LLMs Give Psychometrically Plausible Responses in Educational Assessments?

Computation and Language

Computers can't yet help make tests better.

11 Jun 2025


Country of Origin

🇳🇴 Norway

Page Count

52 pages
