Human-in-the-Loop Testing of AI Agents for Air Traffic Control with a Regulated Assessment Framework

Published: January 7, 2026 | arXiv ID: 2601.04288v1

By: Ben Carvell, Marc Thomas, Andrew Pace and more

Potential Business Impact:

Tests AI air traffic control agents with the same regulated assessments used for human trainee controllers.

Business Areas:

Artificial Intelligence, Data and Analytics, Science and Engineering, Software

We present a rigorous, human-in-the-loop evaluation framework for assessing the performance of AI agents on the task of Air Traffic Control, grounded in a regulator-certified simulator-based curriculum used for training and testing real-world trainee controllers. By leveraging legally regulated assessments and involving expert human instructors in the evaluation process, our framework enables a more authentic and domain-accurate measurement of AI performance. This work addresses a critical gap in the existing literature: the frequent misalignment between academic representations of Air Traffic Control and the complexities of the actual operational environment. It also lays the foundations for effective future human-machine teaming paradigms by aligning machine performance with human assessment targets.

Trustworthy AI: UK Air Traffic Control Revisited

Computers and Society

Helps air traffic controllers trust AI tools.

25 Jul 2025

A Probabilistic Digital Twin of UK En Route Airspace for Training and Evaluating AI Agents for Air Traffic Control

Computational Engineering, Finance, and Science

Lets AI practice controlling planes safely.

6 Jan 2026

Towards Human-Centric Evaluation of Interaction-Aware Automated Vehicle Controllers: A Framework and Case Study

Human-Computer Interaction

Makes self-driving cars safer for human drivers.

7 Aug 2025

Page Count

18 pages
