
The Morality of Probability: How Implicit Moral Biases in LLMs May Shape the Future of Human-AI Symbiosis

Published: September 12, 2025 | arXiv ID: 2509.10297v1

By: Eoin O'Doherty, Nicole Weinrauch, Andrew Talone, and more

Potential Business Impact:

Leading AI models consistently rate caring and virtuous choices as more moral than self-interested ones.

Business Areas:
Machine Learning, Artificial Intelligence, Data and Analytics, Software

Artificial intelligence (AI) is advancing at a pace that raises urgent questions about how to align machine decision-making with human moral values. This working paper investigates how leading AI systems prioritize moral outcomes and what this reveals about the prospects for human-AI symbiosis. We address two central questions: (1) What moral values do state-of-the-art large language models (LLMs) implicitly favour when confronted with dilemmas? (2) How do differences in model architecture, cultural origin, and explainability affect these moral preferences? To explore these questions, we conduct a quantitative experiment with six LLMs, ranking and scoring outcomes across 18 dilemmas representing five moral frameworks. Our findings uncover strikingly consistent value biases. Across all models, outcomes aligned with Care and Virtue values were rated most moral, while libertarian choices were consistently penalized. Reasoning-enabled models exhibited greater sensitivity to context and provided richer explanations, whereas non-reasoning models produced more uniform but opaque judgments. This research makes three contributions: (i) Empirically, it delivers a large-scale comparison of moral reasoning across culturally distinct LLMs; (ii) Theoretically, it links probabilistic model behaviour with underlying value encodings; (iii) Practically, it highlights the need for explainability and cultural awareness as critical design principles to guide AI toward a transparent, aligned, and symbiotic future.
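
The abstract does not detail the scoring protocol, but a minimal sketch of the kind of experiment it describes, in which each model rates dilemma outcomes and ratings are averaged per moral framework, might look like the following Python snippet. The dilemma texts, model names, and the `rate_outcome` helper are hypothetical placeholders for illustration, not the authors' actual materials or code.

```python
# Sketch of a dilemma-rating experiment (assumed setup, not the paper's exact protocol):
# each model scores how moral an outcome is, and scores are averaged per framework.
from collections import defaultdict
from statistics import mean

# Hypothetical (framework, outcome) pairs; the paper uses 18 dilemmas over five frameworks.
DILEMMAS = [
    ("Care", "Divert resources to comfort a distressed stranger."),
    ("Virtue", "Tell an uncomfortable truth to preserve personal integrity."),
    ("Libertarian", "Keep the found wallet; no rule obliges you to return it."),
]

MODELS = ["model_a", "model_b"]  # placeholders for the six LLMs compared


def rate_outcome(model: str, outcome: str) -> float:
    """Placeholder for an API call asking `model` to rate the outcome's
    morality on a fixed scale (e.g. 1-10). Swap in a real client call."""
    return 5.0  # stub value


def framework_scores(models, dilemmas):
    """Collect each model's ratings and average them per moral framework."""
    ratings = {m: defaultdict(list) for m in models}
    for model in models:
        for framework, outcome in dilemmas:
            ratings[model][framework].append(rate_outcome(model, outcome))
    return {
        m: {fw: mean(vals) for fw, vals in per_fw.items()}
        for m, per_fw in ratings.items()
    }


if __name__ == "__main__":
    for model, per_fw in framework_scores(MODELS, DILEMMAS).items():
        print(model, per_fw)
```

Comparing the per-framework averages across models is what would surface the reported pattern of Care and Virtue outcomes scoring highest while libertarian choices are penalized.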

Country of Origin
🇨🇳 China

Page Count
22 pages

Category
Computer Science:
Artificial Intelligence