Who Gets Cited? Gender- and Majority-Bias in LLM-Driven Reference Selection
By: Jiangen He
Potential Business Impact:
AI unfairly favors male authors when selecting research papers to cite.
Large language models (LLMs) are rapidly being adopted as research assistants, particularly for literature review and reference recommendation, yet little is known about whether they introduce demographic bias into citation workflows. This study systematically investigates gender bias in LLM-driven reference selection using controlled experiments with pseudonymous author names. We evaluate several LLMs (GPT-4o, GPT-4o-mini, Claude Sonnet, and Claude Haiku) by varying the gender composition within candidate reference pools and analyzing selection patterns across fields. Our results reveal two forms of bias: a persistent preference for male-authored references and a majority-group bias that favors whichever gender is more prevalent in the candidate pool. These biases are amplified in larger candidate pools and only modestly attenuated by prompt-based mitigation strategies. Field-level analysis indicates that bias magnitude varies across scientific domains, with the social sciences showing the least bias. Our findings indicate that LLMs can reinforce or exacerbate existing gender imbalances in scholarly recognition. Effective mitigation strategies are needed before LLMs are integrated into high-stakes academic workflows, so that existing gender disparities in scientific citation practices are not perpetuated.
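The experimental setup described in the abstract can be sketched roughly as follows. This is a minimal illustration only: the pseudonymous names, pool sizes, and the `llm_select` placeholder are assumptions for demonstration, not the paper's actual materials, prompts, or code. The bias measure shown here simply compares the female-authored share among selected references to the female-authored share in the candidate pool.

```python
import random

# Hypothetical pseudonymous author names (illustrative; not the paper's name lists).
FEMALE_NAMES = ["Emily Carter", "Sophia Nguyen", "Hannah Weber"]
MALE_NAMES = ["James Miller", "Daniel Okafor", "Lukas Schmidt"]


def build_pool(n_female: int, n_male: int) -> list[dict]:
    """Build a candidate reference pool with a controlled gender composition."""
    pool = [{"author": random.choice(FEMALE_NAMES), "gender": "F"} for _ in range(n_female)]
    pool += [{"author": random.choice(MALE_NAMES), "gender": "M"} for _ in range(n_male)]
    random.shuffle(pool)
    return pool


def llm_select(pool: list[dict], k: int) -> list[dict]:
    """Stand-in for prompting an LLM (e.g., GPT-4o) to choose k references
    from the pool; here it just samples at random as a placeholder."""
    return random.sample(pool, k)


def selection_bias(pool: list[dict], selected: list[dict]) -> float:
    """Female share among selected references minus female share in the pool.
    0 indicates parity; negative values indicate under-selection of
    female-authored references relative to their availability."""
    pool_share = sum(r["gender"] == "F" for r in pool) / len(pool)
    selected_share = sum(r["gender"] == "F" for r in selected) / len(selected)
    return selected_share - pool_share


# Example: a male-majority candidate pool, as in the majority-bias condition.
pool = build_pool(n_female=5, n_male=15)
selected = llm_select(pool, k=5)
print(f"selection bias: {selection_bias(pool, selected):+.2f}")
```

In the study itself, the placeholder selection step would be replaced by actual model calls, and the bias statistic would be aggregated over many pools, gender compositions, and fields.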
Similar Papers
Gender Bias in LLMs: Preliminary Evidence from Shared Parenting Scenario in Czech Family Law
Computation and Language
AI gives unfair legal advice based on gender.
Fairness Evaluation of Large Language Models in Academic Library Reference Services
Computation and Language
AI helps libraries treat everyone fairly.
Justice in Judgment: Unveiling (Hidden) Bias in LLM-assisted Peer Reviews
Computers and Society
Finds AI reviews unfairly favor famous schools.