Multimodal Deep Learning for ATCO Command Lifecycle Modeling and Workload Prediction
By: Kaizhen Tan
Potential Business Impact:
Helps air traffic controllers time their voice commands and manage workload in busy airspace.
Air traffic controllers (ATCOs) issue high-intensity voice commands in dense airspace, where accurate workload modeling is critical for safety and efficiency. This paper proposes a multimodal deep learning framework that integrates structured data, trajectory sequences, and image features to estimate two key parameters in the ATCO command lifecycle: the time offset between a command and the resulting aircraft maneuver, and the command duration. A high-quality dataset was constructed, with maneuver points detected using sliding window and histogram-based methods. A CNN-Transformer ensemble model was developed for accurate, generalizable, and interpretable predictions. By linking trajectories to voice commands, this work offers the first model of its kind to support intelligent command generation and provides practical value for workload assessment, staffing, and scheduling.
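The abstract does not spell out how maneuver points are detected; one plausible sliding-window approach is to flag trajectory indices where the cumulative heading change within a window exceeds a threshold. The function name, window size, and threshold below are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def detect_maneuver_points(headings, window=3, threshold=15.0):
    """Illustrative sliding-window maneuver detector (not the paper's method).

    headings: 1-D sequence of aircraft headings in degrees, one per track point.
    Returns the start indices of windows whose cumulative turn exceeds
    `threshold` degrees, i.e. likely maneuver onsets.
    """
    headings = np.asarray(headings, dtype=float)
    # Wrap successive heading differences into [-180, 180) so that a
    # 359° -> 1° change counts as a 2° turn, not a 358° one.
    diffs = (np.diff(headings) + 180.0) % 360.0 - 180.0
    maneuvers = []
    for i in range(len(diffs) - window + 1):
        # Large cumulative turn over the window suggests a commanded maneuver.
        if abs(diffs[i:i + window].sum()) >= threshold:
            maneuvers.append(i)
    return maneuvers

# Straight flight yields no detections; a sustained turn is flagged.
print(detect_maneuver_points([0] * 10))                      # -> []
print(detect_maneuver_points([0, 0, 0, 0, 0, 10, 20, 30, 40]))  # -> [3, 4, 5]
```

A histogram-based variant, as the abstract also mentions, would instead bin the heading (or climb-rate) changes and treat points falling in low-frequency, high-magnitude bins as maneuver candidates.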
Similar Papers
Air Traffic Controller Task Demand via Graph Neural Networks: An Interpretable Approach to Airspace Complexity
Machine Learning (CS)
Helps air traffic controllers manage busy skies.
Learning to Explain Air Traffic Situation
Machine Learning (CS)
Helps air traffic controllers see the whole sky.
From Voice to Safety: Language AI Powered Pilot-ATC Communication Understanding for Airport Surface Movement Collision Risk Assessment
Audio and Speech Processing
Helps planes avoid crashing on the ground.