EthicsMH: A Pilot Benchmark for Ethical Reasoning in Mental Health AI
By: Sai Kartheek Reddy Kasu
Potential Business Impact:
Teaches AI to make good choices in therapy.
The deployment of large language models (LLMs) in mental health and other sensitive domains raises urgent questions about ethical reasoning, fairness, and responsible alignment. Yet, existing benchmarks for moral and clinical decision-making do not adequately capture the unique ethical dilemmas encountered in mental health practice, where confidentiality, autonomy, beneficence, and bias frequently intersect. To address this gap, we introduce Ethical Reasoning in Mental Health (EthicsMH), a pilot dataset of 125 scenarios designed to evaluate how AI systems navigate ethically charged situations in therapeutic and psychiatric contexts. Each scenario is enriched with structured fields, including multiple decision options, expert-aligned reasoning, expected model behavior, real-world impact, and multi-stakeholder viewpoints. This structure enables evaluation not only of decision accuracy but also of explanation quality and alignment with professional norms. Although modest in scale and developed with model-assisted generation, EthicsMH establishes a task framework that bridges AI ethics and mental health decision-making. By releasing this dataset, we aim to provide a seed resource that can be expanded through community and expert contributions, fostering the development of AI systems capable of responsibly handling some of society's most delicate decisions.
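The structured fields the abstract describes (decision options, expert-aligned reasoning, expected model behavior, real-world impact, multi-stakeholder viewpoints) can be pictured as one record. The sketch below is purely illustrative: the field names, scenario text, and JSON-style layout are assumptions for exposition, not the released EthicsMH format.

```python
# Hypothetical sketch of a single EthicsMH-style record.
# All field names and scenario content are illustrative assumptions,
# inferred from the abstract's description of the dataset's structure.
scenario = {
    "id": "example-001",
    "context": (
        "A therapist learns a client may be at risk of harm "
        "after promising confidentiality."
    ),
    "decision_options": [
        "Maintain confidentiality",
        "Escalate under duty-of-care protocols",
    ],
    "expert_aligned_reasoning": (
        "Beneficence and safety can override confidentiality "
        "when the risk of serious harm is credible."
    ),
    "expected_model_behavior": (
        "Recommend escalation while explaining the limits of confidentiality."
    ),
    "real_world_impact": "Delayed disclosure can increase risk to the client.",
    "stakeholder_viewpoints": {
        "client": "Values privacy and trust in the therapist.",
        "clinician": "Bound by professional duty-of-care obligations.",
        "guardian": "Needs timely information to ensure safety.",
    },
}

# A record structured this way supports the two evaluation axes the paper
# names: decision accuracy (which option is chosen) and explanation quality
# (how well the model's rationale aligns with the expert reasoning).
required_fields = {
    "decision_options",
    "expert_aligned_reasoning",
    "expected_model_behavior",
    "real_world_impact",
    "stakeholder_viewpoints",
}
assert required_fields <= scenario.keys()
```

Keeping per-stakeholder viewpoints as a separate mapping, rather than folding them into the reasoning text, is what makes alignment with each party's perspective separately checkable.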
Similar Papers
LLM Ethics Benchmark: A Three-Dimensional Assessment System for Evaluating Moral Reasoning in Large Language Models
Computers and Society
Tests if AI makes good and fair choices.
MindEval: Benchmarking Language Models on Multi-turn Mental Health Support
Computation and Language
Tests AI mental health helpers for real problems.