Value-Action Alignment in Large Language Models under Privacy-Prosocial Conflict
By: Guanyu Chen, Chenxiao Yu, Xiyang Hu
Potential Business Impact:
Tests whether AI acts on its stated privacy values when sharing your personal data.
Large language models (LLMs) are increasingly used to simulate decision-making tasks involving personal data sharing, where privacy concerns and prosocial motivations can push choices in opposite directions. Existing evaluations often measure privacy-related attitudes or sharing intentions in isolation, which makes it difficult to determine whether a model's expressed values jointly predict its downstream data-sharing actions, as they do in real human behavior. We introduce a context-based assessment protocol that sequentially administers standardized questionnaires for privacy attitudes, prosocialness, and acceptance of data sharing within a bounded, history-carrying session. To evaluate value-action alignment under competing attitudes, we use multi-group structural equation modeling (MGSEM) to estimate the paths from privacy concerns and prosocialness to data sharing. We propose the Value-Action Alignment Rate (VAAR), a human-referenced directional agreement metric that aggregates path-level evidence for the expected signs. Across multiple LLMs, we observe stable but model-specific Privacy-PSA-AoDS (privacy attitudes, prosocialness, acceptance of data sharing) profiles, and substantial heterogeneity in value-action alignment.
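To make the idea of a directional agreement metric concrete, the sketch below shows one way a VAAR-style score could be computed from fitted SEM paths. The path names, significance handling, and aggregation rule here are illustrative assumptions, not the authors' exact specification.

```python
# Minimal sketch of a Value-Action Alignment Rate (VAAR)-style computation.
# Assumption: each structural path is summarized by its standardized estimate
# and significance, and compared against a human-referenced expected sign.

from dataclasses import dataclass

@dataclass
class PathEstimate:
    name: str          # e.g. "privacy_concern -> data_sharing"
    estimate: float    # standardized path coefficient from the (MG)SEM fit
    significant: bool  # whether the path is statistically significant

# Human-referenced expected directions for each value-action path
# (illustrative signs; the paper derives expectations from human behavior).
EXPECTED_SIGNS = {
    "privacy_concern -> data_sharing": -1,  # more concern, less sharing
    "prosocialness -> data_sharing": +1,    # more prosocial, more sharing
}

def vaar(paths, expected_signs=EXPECTED_SIGNS, require_significance=True):
    """Fraction of evaluated paths whose estimated sign matches the
    human-referenced expected sign (one possible aggregation rule)."""
    evaluated, aligned = 0, 0
    for p in paths:
        if p.name not in expected_signs:
            continue
        evaluated += 1
        if require_significance and not p.significant:
            # A non-significant path contributes no directional evidence here.
            continue
        if (p.estimate > 0) == (expected_signs[p.name] > 0):
            aligned += 1
    return aligned / evaluated if evaluated else float("nan")

# Example: a model whose privacy path points the "wrong" way.
model_paths = [
    PathEstimate("privacy_concern -> data_sharing", +0.12, True),
    PathEstimate("prosocialness -> data_sharing", +0.35, True),
]
print(f"VAAR = {vaar(model_paths):.2f}")  # -> VAAR = 0.50
```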
Similar Papers
Operationalizing Pluralistic Values in Large Language Model Alignment Reveals Trade-offs in Safety, Inclusivity, and Model Behavior
Artificial Intelligence
Makes AI understand different people better.
ProSocialAlign: Preference Conditioned Test Time Alignment in Language Models
Computation and Language
Makes AI helpful and safe, even when upset.
VAL-Bench: Measuring Value Alignment in Language Models
Artificial Intelligence
Checks if AI has fair and steady opinions.