Exploring Persona-dependent LLM Alignment for the Moral Machine Experiment
By: Jiseon Kim, Jea Kwon, Luiz Felipe Vecchietti, and more
Potential Business Impact:
AI makes different moral choices based on who it pretends to be.
Deploying large language models (LLMs) with agency in real-world applications raises critical questions about how these models will behave. In particular, how will their decisions align with humans' when faced with moral dilemmas? This study examines the alignment between LLM-driven decisions and human judgment in various contexts of the moral machine experiment, including personas reflecting different sociodemographics. We find that the moral decisions of LLMs vary substantially by persona, showing greater shifts in moral decisions for critical tasks than humans do. Our data also indicate an interesting partisan sorting phenomenon, where political persona predominantly determines the direction and degree of LLM decisions. We discuss the ethical implications and risks associated with deploying these models in applications that involve moral decisions.
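As a rough illustration of the persona-prompting setup the abstract describes, the sketch below conditions a chat model on a sociodemographic persona before presenting a Moral Machine-style dilemma and records its choice. The persona texts, scenario wording, and the `query_llm` helper are hypothetical placeholders, not the authors' actual prompts, models, or code.

```python
# Minimal sketch (not from the paper): condition an LLM on a persona
# before a Moral Machine-style dilemma and record its binary choice.
# Personas, scenario text, and the query_llm stub are illustrative only.

from typing import Callable

PERSONAS = [
    "You are a 65-year-old conservative retiree.",
    "You are a 25-year-old progressive graduate student.",
]

SCENARIO = (
    "A self-driving car's brakes fail. It must either swerve, harming one "
    "elderly pedestrian, or stay on course, harming three young pedestrians. "
    "Answer with exactly one word: 'swerve' or 'stay'."
)


def collect_decisions(query_llm: Callable[[str, str], str]) -> dict[str, str]:
    """Ask the model for a decision under each persona.

    query_llm(system_prompt, user_prompt) is assumed to wrap whatever
    chat-completion API is in use and return the model's text reply.
    """
    decisions = {}
    for persona in PERSONAS:
        reply = query_llm(persona, SCENARIO).strip().lower()
        decisions[persona] = "swerve" if "swerve" in reply else "stay"
    return decisions


if __name__ == "__main__":
    # Stand-in model so the sketch runs without an API key: it always
    # answers "stay". Swap in a real client call to observe persona shifts.
    fake_llm = lambda system, user: "stay"
    print(collect_decisions(fake_llm))
```

In a study like this, the per-persona choice distributions would then be compared against the corresponding human responses from the Moral Machine data to quantify how far the model's decisions shift with persona.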
Similar Papers
When Ethics and Payoffs Diverge: LLM Agents in Morally Charged Social Dilemmas
Computation and Language
AI struggles when doing good conflicts with getting rewards.
Misalignment of LLM-Generated Personas with Human Perceptions in Low-Resource Settings
Computers and Society
AI personalities don't understand people like real humans.
Normative Evaluation of Large Language Models with Everyday Moral Dilemmas
Artificial Intelligence
Tests AI's right and wrong choices in real-life problems.