Measuring What Matters: Connecting AI Ethics Evaluations to System Attributes, Hazards, and Harms
By: Shalaleh Rismani, Renee Shelby, Leah Davis, and more
Potential Business Impact:
Helps AI systems avoid causing harm.
Over the past decade, an ecosystem of measures has emerged to evaluate the social and ethical implications of AI systems, largely shaped by high-level ethics principles. These measures are developed and used in fragmented ways, without adequate attention to how they are situated in AI systems. In this paper, we examine how existing measures used in the computing literature map to AI system components, attributes, hazards, and harms. Our analysis draws on a scoping review that identified nearly 800 measures corresponding to 11 AI ethics principles. We find that most measures focus on four principles (fairness, transparency, privacy, and trust) and primarily assess the model or output components of a system. Few measures account for interactions across system elements, and only a narrow set of hazards is typically considered for each harm type. Many measures are disconnected from where harm is experienced and lack guidance for setting meaningful thresholds. These patterns reveal how current evaluation practices remain fragmented, measuring in pieces rather than capturing how harms emerge across systems. Framing measures with respect to system attributes, hazards, and harms can strengthen regulatory oversight, support actionable practices in industry, and ground future research in systems-level understanding.
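To make concrete what kind of measure the review catalogs, the sketch below computes a simple group fairness measure (the demographic parity gap) over a system's binary outputs and compares it against a reporting threshold. This is a minimal illustration, not a method from the paper: the function name, the toy data, and the 0.1 threshold are all assumptions chosen for the example, which also illustrates the paper's point that thresholds are often set without principled guidance.

```python
# Illustrative sketch (not from the paper): a group fairness measure of the
# kind the scoping review catalogs, applied to model outputs.
from collections import defaultdict

def demographic_parity_gap(predictions, groups):
    """Largest absolute difference in positive-prediction rates across groups.

    predictions: iterable of 0/1 model outputs
    groups: iterable of group labels, aligned with predictions
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += pred
    rates = {g: positives[g] / totals[g] for g in totals}
    return max(rates.values()) - min(rates.values())

# Hypothetical outputs and group labels, for illustration only.
preds  = [1, 0, 1, 1, 0, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap = demographic_parity_gap(preds, groups)

# The paper notes many measures lack guidance for setting meaningful
# thresholds; 0.1 here is an arbitrary placeholder, not a recommended value.
THRESHOLD = 0.1
print(f"demographic parity gap: {gap:.2f} (threshold {THRESHOLD})")
```

A measure like this assesses only the output component of a system; per the paper's findings, it says nothing about hazards arising from interactions across components or about where the resulting harm is actually experienced.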
Similar Papers
Safety by Measurement: A Systematic Literature Review of AI Safety Evaluation Methods
Artificial Intelligence
Tests AI for dangerous tricks and hidden goals.
Do Ethical AI Principles Matter to Users? A Large-Scale Analysis of User Sentiment and Satisfaction
Human-Computer Interaction
Makes AI more liked by people using it.
Measuring the right thing: justifying metrics in AI impact assessments
Computers and Society
Makes AI fair by explaining why we measure it.