BAID: A Benchmark for Bias Assessment of AI Detectors

Published: December 12, 2025 | arXiv ID: 2512.11505v1

By: Priyam Basu, Yunfeng Zhang, Vipul Raheja

AI-generated text detectors have recently gained adoption in educational and professional contexts. Prior research has uncovered isolated cases of bias, particularly against English Language Learners (ELLs); however, such systems have not been systematically evaluated across broader sociolinguistic factors. In this work, we propose BAID, a comprehensive evaluation framework for AI detectors across various types of bias. As part of the framework, we introduce over 200k samples spanning 7 major categories: demographics, age, educational grade level, dialect, formality, political leaning, and topic. We also generate a synthetic version of each sample, using carefully crafted prompts that preserve the original content while reflecting subgroup-specific writing styles. Using this benchmark, we evaluate four open-source state-of-the-art AI text detectors and find consistent disparities in detection performance, particularly low recall rates for texts from underrepresented groups. Our contributions provide a scalable, transparent approach for auditing AI detectors and emphasize the need for bias-aware evaluation before these tools are deployed for public use.
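To make the per-subgroup recall comparison described in the abstract concrete, here is a minimal Python sketch. It is not the authors' code: the record layout, the subgroup names, and the toy data are all assumptions made for illustration. Recall is computed here only over AI-generated samples, that is, the share of synthetic texts a detector correctly flags within each subgroup.

```python
from collections import defaultdict


def recall_by_subgroup(records):
    """Compute detector recall per subgroup.

    `records` is an iterable of (subgroup, is_ai_generated, flagged_as_ai)
    tuples. This layout is hypothetical, not the BAID data format.
    """
    hits = defaultdict(int)    # AI-generated samples the detector flagged
    totals = defaultdict(int)  # all AI-generated samples in the subgroup
    for subgroup, is_ai_generated, flagged_as_ai in records:
        if is_ai_generated:
            totals[subgroup] += 1
            if flagged_as_ai:
                hits[subgroup] += 1
    return {group: hits[group] / totals[group] for group in totals}


def recall_gap(recalls):
    """Largest difference in recall between any two subgroups."""
    return max(recalls.values()) - min(recalls.values())


# Toy records with made-up subgroup labels, for illustration only.
toy_records = [
    ("dialect_a", True, True),
    ("dialect_a", True, True),
    ("dialect_a", True, False),
    ("dialect_a", False, False),  # human-written: ignored for recall
    ("dialect_b", True, True),
    ("dialect_b", True, False),
    ("dialect_b", True, False),
    ("dialect_b", True, False),
]

recalls = recall_by_subgroup(toy_records)
gap = recall_gap(recalls)
print(recalls, gap)
```

A large gap between subgroups is the kind of disparity the abstract reports. A real audit would also need confidence intervals and false-positive rates on human-written text, which this sketch omits.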

How AI Fails: An Interactive Pedagogical Tool for Demonstrating Dialectal Bias in Automated Toxicity Models

Computation and Language

AI unfairly flags Black people's words as bad.

10 Nov 2025 1

88%

Identifying Bias in Machine-generated Text Detection

Computation and Language

Detectors unfairly flag some students' writing as fake.

10 Dec 2025 0

88%

Evaluating LLMs for Demographic-Targeted Social Bias Detection: A Comprehensive Benchmark Study

Computation and Language

Finds unfairness in AI's words.

6 Oct 2025 1
