Evaluating LLM Safety Across Child Development Stages: A Simulated Agent Approach
By: Abhejay Murali, Saleh Afroogh, Kevin Chen, and more
Potential Business Impact:
Tests AI for kids' safety and understanding.
Large Language Models (LLMs) are rapidly becoming part of tools used by children; however, existing benchmarks fail to capture how these models handle the language, reasoning, and safety needs specific to different ages. We present ChildSafe, a benchmark that evaluates LLM safety through simulated child agents embodying four developmental stages. These agents, grounded in developmental psychology, enable a systematic study of child safety without the ethical concerns of involving real children. ChildSafe assesses responses across nine safety dimensions (including privacy, misinformation, and emotional support) using age-weighted scoring in both sensitive and neutral contexts. Multi-turn experiments with multiple LLMs uncover consistent vulnerabilities that vary by simulated age, exposing shortcomings in existing alignment practices. By releasing agent templates, evaluation protocols, and an experimental corpus, we provide a reproducible framework for age-aware safety research. We encourage the community to extend this work with real child-centered data and studies, advancing the development of LLMs that are genuinely safe and developmentally aligned.
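To make the "age-weighted scoring" idea concrete, the sketch below shows one plausible way such a score could be computed over nine safety dimensions. This is not the authors' released code: the dimension names beyond the three listed in the abstract, the stage labels, and the weight values are all illustrative assumptions.

```python
# Illustrative sketch (not ChildSafe's actual implementation) of an
# age-weighted safety score over nine dimensions. Dimension names,
# stage labels, and weights are assumptions for demonstration only.

SAFETY_DIMENSIONS = [
    "privacy", "misinformation", "emotional_support",   # named in the abstract
    "dim_4", "dim_5", "dim_6", "dim_7", "dim_8", "dim_9",  # placeholder names
]

# Hypothetical per-stage weights: a younger simulated agent might weight
# emotional support more heavily, an older one privacy and misinformation.
STAGE_WEIGHTS = {
    "early_childhood": {"privacy": 1.0, "misinformation": 0.8, "emotional_support": 1.5},
    "adolescence":     {"privacy": 1.5, "misinformation": 1.2, "emotional_support": 1.0},
}

def age_weighted_score(dimension_scores: dict[str, float], stage: str) -> float:
    """Weighted average of per-dimension scores (each in [0, 1]) for one stage."""
    weights = STAGE_WEIGHTS[stage]
    total, norm = 0.0, 0.0
    for dim, score in dimension_scores.items():
        w = weights.get(dim, 1.0)  # dimensions without an explicit weight default to 1.0
        total += w * score
        norm += w
    return total / norm if norm else 0.0

# Example: score one response judged on three dimensions for a young simulated agent.
print(age_weighted_score(
    {"privacy": 0.9, "misinformation": 0.7, "emotional_support": 0.4},
    stage="early_childhood",
))
```

A weighted average is only one design choice; the benchmark could equally use per-stage thresholds or worst-dimension scoring, which the released evaluation protocols would specify.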
Similar Papers
Safe-Child-LLM: A Developmental Benchmark for Evaluating LLM Safety in Child-LLM Interactions
Computers and Society
Tests if AI is safe for kids and teens.
The Scales of Justitia: A Comprehensive Survey on Safety Evaluation of LLMs
Computation and Language
Makes AI safer by checking its bad ideas.
A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment
Cryptography and Security
Makes AI safer from start to finish.