More Women, Same Stereotypes: Unpacking the Gender Bias Paradox in Large Language Models
By: Evan Chen, Run-Jun Zhan, Yan-Bai Lin, and more
Potential Business Impact:
Finds that AI language models tell stories that reflect unfair gender stereotypes about jobs.
Large Language Models (LLMs) have revolutionized natural language processing, yet concerns persist regarding their tendency to reflect or amplify social biases. This study introduces a novel evaluation framework to uncover gender biases in LLMs: using free-form storytelling to surface biases embedded within the models. A systematic analysis of ten prominent LLMs shows a consistent pattern of overrepresenting female characters across occupations, likely due to supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF). Paradoxically, despite this overrepresentation, the occupational gender distributions produced by these LLMs align more closely with human stereotypes than with real-world labor data. This highlights the challenge and importance of implementing balanced mitigation measures to promote fairness and prevent the introduction of new biases. We release the prompts and LLM-generated stories on GitHub.
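To make the storytelling-based evaluation concrete, below is a minimal sketch of how such a probe might be implemented: prompt an LLM for a free-form story about each occupation, infer the protagonist's gender from pronoun counts, and tally the resulting per-occupation distribution for comparison against labor statistics or stereotype norms. The occupation list, prompt wording, and the generate_story / infer_gender helpers are illustrative assumptions, not the authors' released code (which is available on GitHub).

```python
import re
from collections import Counter

# Illustrative occupation list and pronoun sets; the paper's actual prompts
# and occupation taxonomy may differ (see the authors' GitHub release).
OCCUPATIONS = ["nurse", "engineer", "teacher", "carpenter"]
FEMALE_PRONOUNS = {"she", "her", "hers", "herself"}
MALE_PRONOUNS = {"he", "him", "his", "himself"}

def story_prompt(occupation: str) -> str:
    # Free-form storytelling prompt; wording here is a placeholder.
    return f"Write a short story about a {occupation} and their day at work."

def infer_gender(story: str) -> str:
    # Crude heuristic: label the protagonist by majority pronoun count.
    tokens = re.findall(r"[a-z']+", story.lower())
    counts = Counter(tokens)
    female = sum(counts[p] for p in FEMALE_PRONOUNS)
    male = sum(counts[p] for p in MALE_PRONOUNS)
    if female > male:
        return "female"
    if male > female:
        return "male"
    return "unknown"

def tally(generate_story, n_samples: int = 50) -> dict:
    # generate_story(prompt: str) -> str wraps whichever LLM API is under test.
    results = {occ: Counter() for occ in OCCUPATIONS}
    for occ in OCCUPATIONS:
        for _ in range(n_samples):
            story = generate_story(story_prompt(occ))
            results[occ][infer_gender(story)] += 1
    return results

if __name__ == "__main__":
    # Stand-in generator so the sketch runs without an API key.
    def dummy_generate(prompt: str) -> str:
        return "She arrived early and checked on her first patient."

    for occ, counts in tally(dummy_generate, n_samples=3).items():
        total = sum(counts.values()) or 1
        print(occ, {g: c / total for g, c in counts.items()})
```

In a real evaluation, the per-occupation gender proportions produced this way would be compared both to real-world labor statistics and to measured human stereotype norms, which is how the paper's paradox (overrepresentation of female characters alongside stereotype-aligned occupational distributions) can be quantified.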
Similar Papers
Investigating Gender Bias in LLM-Generated Stories via Psychological Stereotypes
Computation and Language
Examines how LLM-generated stories reveal gender bias.
Addressing Stereotypes in Large Language Models: A Critical Examination and Mitigation
Computation and Language
Critically examines and mitigates stereotyping in large language models.
Automated Evaluation of Gender Bias Across 13 Large Multimodal Models
CV and Pattern Recognition
Finds that multimodal AI models produce gender-biased depictions of occupations.