The Impact of Annotator Personas on LLM Behavior Across the Perspectivism Spectrum
By: Olufunke O. Sarumi, Charles Welch, Daniel Braun, and more
Potential Business Impact:
Helps AI systems judge online hate speech more fairly by accounting for the perspectives of different human annotators.
In this work, we explore the capability of Large Language Models (LLMs) to annotate hate speech and abusiveness while considering predefined annotator personas along the strong-to-weak data perspectivism spectrum. We evaluate LLM-generated annotations against existing annotator modeling techniques for perspective modeling. Our findings show that LLMs selectively use demographic attributes from the personas. We identify prototypical annotators whose persona features show varying degrees of alignment with the original human annotators. Within the data perspectivism paradigm, annotator modeling techniques that do not explicitly rely on annotator information performed better under weak data perspectivism than under either strong data perspectivism or human annotations, suggesting that LLM-generated views tend toward aggregation despite subjective prompting. However, for more personalized datasets tailored to strong perspectivism, the performance of LLM annotator modeling approached, but did not exceed, that of human annotators.
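To make the persona-conditioning concrete, the sketch below shows one way an LLM annotation prompt can be built from demographic persona attributes. It is a minimal illustration only: the persona fields, label set, and prompt wording are assumptions for exposition and are not the authors' actual prompts or data schema.

```python
# Minimal sketch of persona-conditioned annotation prompting.
# Assumptions (not from the paper): the Persona fields, the binary
# hateful / not-hateful label set, and the prompt wording are illustrative.

from dataclasses import dataclass


@dataclass
class Persona:
    """Demographic attributes used to condition the annotator persona."""
    age: str
    gender: str
    education: str
    political_leaning: str


def build_annotation_prompt(persona: Persona, text: str) -> str:
    """Compose a prompt asking the LLM to label a post as this persona would."""
    return (
        "You are an annotator with the following profile: "
        f"age {persona.age}, gender {persona.gender}, "
        f"education {persona.education}, political leaning {persona.political_leaning}.\n"
        "Label the following post as 'hateful' or 'not hateful', "
        "answering as this annotator would.\n\n"
        f"Post: {text}\nLabel:"
    )


if __name__ == "__main__":
    p = Persona(age="25-34", gender="female",
                education="college", political_leaning="liberal")
    print(build_annotation_prompt(p, "Example post to be annotated."))
    # The resulting prompt would be sent to an LLM; the returned label can then
    # be compared against the original human annotator's label (strong
    # perspectivism) or against aggregated labels (weak perspectivism).
```

In this framing, strong perspectivism keeps each persona's label separate for per-annotator evaluation, while weak perspectivism aggregates labels before evaluation, which is where the abstract reports LLM-generated views tend to land.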
Similar Papers
Ideology-Based LLMs for Content Moderation
Computation and Language
AI models can be tricked into favoring certain opinions.
Algorithmic Fairness in NLP: Persona-Infused LLMs for Human-Centric Hate Speech Detection
Computation and Language
Makes AI better at spotting hate speech fairly.
Hateful Person or Hateful Model? Investigating the Role of Personas in Hate Speech Detection by Large Language Models
Computation and Language
Makes AI less biased when judging hateful language.