Automatic Classifiers Underdetect Emotions Expressed by Men

Published: January 8, 2026 | arXiv ID: 2601.04730v1

By: Ivan Smirnov, Segun T. Aroyehun, Paul Plener, and more

Potential Business Impact:

Emotion-detection software misses men's feelings more often than women's.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

The widespread adoption of automatic sentiment and emotion classifiers makes it important to ensure that these tools perform reliably across different populations. Yet their reliability is typically assessed using benchmarks that rely on third-party annotators rather than the individuals experiencing the emotions themselves, potentially concealing systematic biases. In this paper, we use a unique, large-scale dataset of more than one million self-annotated posts and a pre-registered research design to investigate gender biases in emotion detection across 414 combinations of models and emotion-related classes. We find that across different types of automatic classifiers and various underlying emotions, error rates are consistently higher for texts authored by men compared to those authored by women. We quantify how this bias could affect results in downstream applications and show that current machine learning tools, including large language models, should be applied with caution when the gender composition of a sample is not known or variable. Our findings demonstrate that sentiment analysis is not yet a solved problem, especially in ensuring equitable model behaviour across demographic groups.
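For a single model and emotion class, the comparison the abstract describes reduces to computing false-negative rates per gender against the authors' own labels. A minimal sketch of that comparison, assuming a pandas DataFrame with hypothetical gender, expressed, and predicted columns (the toy data and column names are illustrative, not from the paper, which aggregates this over 414 model-emotion combinations):

```python
import pandas as pd

# Toy stand-in for the paper's setup: posts self-annotated by their
# authors (ground truth) plus a classifier's predictions. All column
# names and values here are illustrative, not from the actual dataset.
df = pd.DataFrame({
    "gender":    ["m", "m", "m", "f", "f", "f"],
    "expressed": [1, 1, 0, 1, 1, 0],  # 1 = author reports the emotion
    "predicted": [0, 1, 0, 1, 1, 0],  # 1 = classifier detects it
})

# Underdetection shows up as false negatives: the author expressed the
# emotion, but the classifier did not detect it.
for gender, group in df.groupby("gender"):
    expressed = group[group["expressed"] == 1]
    fnr = (expressed["predicted"] == 0).mean()
    print(f"{gender}: false-negative rate = {fnr:.2f}")
```

A gap in false-negative rates on one toy sample means little; the paper's finding is that the gap points the same way, against men's texts, across many classifiers and emotion classes.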

Country of Origin
🇦🇺 Australia

Page Count
23 pages

Category
Computer Science: Computation and Language