MoVa: Towards Generalizable Classification of Human Morals and Values
By: Ziyu Chen, Junfei Sun, Chenxi Li and more
Potential Business Impact:
Helps computers understand what's right and wrong.
Identifying human morals and values embedded in language is essential to empirical studies of communication. However, researchers often face substantial difficulty navigating the diversity of theoretical frameworks and data available for their analysis. Here, we contribute MoVa, a well-documented suite of resources for generalizable classification of human morals and values, consisting of (1) 16 labeled datasets and benchmarking results from four theoretically grounded frameworks; (2) a lightweight LLM prompting strategy that outperforms fine-tuned models across multiple domains and frameworks; and (3) a new application that helps evaluate psychological surveys. In practice, we specifically recommend a classification strategy, all@once, that scores all related concepts simultaneously, resembling the well-known multi-label classifier chain. The data and methods in MoVa can facilitate many fine-grained interpretations of human and machine communication, with potential implications for the alignment of machine behavior.
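The all@once idea of scoring every related concept in a single pass can be pictured with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the concept names, the prompt wording, the JSON reply format, the 0 to 1 scale, and the 0.5 threshold. The call to an actual LLM is deliberately left out, so the reply is a hand-written stand-in.

```python
import json

# Hypothetical concept set for illustration; not the label set used in MoVa.
CONCEPTS = ["care", "fairness", "loyalty", "authority", "purity"]


def build_all_at_once_prompt(text, concepts):
    """Build one prompt that requests a score for every concept in a single call."""
    concept_list = ", ".join(concepts)
    return (
        "Rate how strongly the text expresses each of the following concepts "
        f"on a scale from 0 to 1: {concept_list}.\n"
        "Answer with a single JSON object mapping each concept to its score.\n\n"
        f"Text: {text}"
    )


def parse_scores(response, concepts):
    """Parse a JSON reply; concepts missing from the reply default to 0.0."""
    raw = json.loads(response)
    return {c: float(raw.get(c, 0.0)) for c in concepts}


def predict_labels(scores, threshold=0.5):
    """Turn per-concept scores into a multi-label prediction (assumed threshold)."""
    return [c for c, s in scores.items() if s >= threshold]


if __name__ == "__main__":
    prompt = build_all_at_once_prompt("We must protect the vulnerable.", CONCEPTS)
    print(prompt)
    # Stand-in for a model reply; a real system would send `prompt` to an LLM.
    mock_reply = '{"care": 0.9, "fairness": 0.2}'
    scores = parse_scores(mock_reply, CONCEPTS)
    print(predict_labels(scores))
```

The contrast is with a one-concept-per-call setup, which would issue a separate prompt for each entry in `CONCEPTS`; here the model sees all concepts together and returns all scores in one reply.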
Similar Papers
Differences in the Moral Foundations of Large Language Models
Computers and Society
Models show different values than people.
The Moral Consistency Pipeline: Continuous Ethical Evaluation for Large Language Models
Computation and Language
Keeps AI acting good, even in new situations.
Structured Moral Reasoning in Language Models: A Value-Grounded Evaluation Framework
Human-Computer Interaction
Teaches computers to make fair, good choices.