Online Action-Stacking Improves Reinforcement Learning Performance for Air Traffic Control

Ben Carvell; George De Ath; Eseoghene Benjamin; Richard Everson

arXiv:2601.04287 · cs.LG · January 12, 2026

Open Access

TL;DR

This paper presents online action-stacking, an inference-time wrapper that lets reinforcement learning policies for air traffic control train on a small discrete action space while still issuing realistic compound commands.

Contribution

Introduces online action-stacking, an inference-time wrapper that compiles bursts of primitive actions into domain-appropriate commands, so that RL agents can handle complex ATC tasks while training on a small set of primitive actions.
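As a rough illustration of what compiling primitive actions into a single command could look like, the sketch below sums a burst of incremental heading changes and emits one compound heading change when the burst ends. This is an assumption-laden toy, not the paper's implementation: the `ActionStacker` class, the `NOOP = 0` encoding, the use of degrees, and the rule that a burst ends at the first no-op are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

NOOP = 0  # hypothetical encoding: 0 means "no instruction this step"


@dataclass
class ActionStacker:
    """Toy stacker: sums a burst of heading increments into one command."""

    pending: int = 0  # net heading change (degrees) accumulated in the current burst

    def step(self, delta_deg: int) -> Optional[int]:
        """Feed one primitive action; return a compound change when a burst ends."""
        if delta_deg != NOOP:
            # Still inside a burst: keep accumulating, issue nothing yet.
            self.pending += delta_deg
            return None
        # A no-op closes the burst (assumed rule). Emit only a non-zero net change.
        return self.flush()

    def flush(self) -> Optional[int]:
        """Emit whatever is pending, e.g. at the end of an episode."""
        if self.pending == 0:
            return None
        compound, self.pending = self.pending, 0
        return compound


def compile_bursts(deltas: List[int]) -> List[int]:
    """Run the stacker over a whole sequence of primitive actions."""
    stacker = ActionStacker()
    commands = []
    for delta in deltas:
        command = stacker.step(delta)
        if command is not None:
            commands.append(command)
    tail = stacker.flush()
    if tail is not None:
        commands.append(tail)
    return commands
```

For instance, under these assumptions three consecutive +5 degree steps followed by a no-op would be issued as a single +15 degree instruction rather than three separate ones.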

Findings

01. Reduces instruction frequency compared to baseline

02. Achieves similar performance with fewer actions

03. Facilitates scaling to complex control scenarios

Abstract

We introduce online action-stacking, an inference-time wrapper for reinforcement learning policies that produces realistic air traffic control commands while allowing training on a much smaller discrete action space. Policies are trained with simple incremental heading or level adjustments, together with an action-damping penalty that reduces instruction frequency and leads agents to issue commands in short bursts. At inference, online action-stacking compiles these bursts of primitive actions into domain-appropriate compound clearances. Using Proximal Policy Optimisation and the BluebirdDT digital twin platform, we train agents to navigate aircraft along lateral routes, manage climb and descent to target flight levels, and perform two-aircraft collision avoidance under a minimum separation constraint. In our lateral navigation experiments, action stacking greatly reduces the number of…
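The action-damping penalty mentioned in the abstract can be pictured as a reward-shaping term that charges the agent for each instruction it issues. The abstract does not give the penalty's form, so the sketch below assumes the simplest possible version: a fixed cost subtracted whenever the action is not a no-op. The function name `damped_reward`, the `NOOP = 0` encoding, and the default `penalty` value are hypothetical.

```python
NOOP = 0  # hypothetical encoding: 0 means "no instruction this step"


def damped_reward(task_reward: float, action: int, penalty: float = 0.1) -> float:
    """Subtract a fixed cost from the task reward whenever an instruction is issued.

    Assumed form only: the paper may use a different penalty structure.
    """
    if action == NOOP:
        return task_reward
    return task_reward - penalty
```

With a cost of this kind, an agent is discouraged from issuing instructions at every step, which is consistent with the abstract's description of commands arriving in short bursts.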

Taxonomy

Topics: Air Traffic Management and Optimization · Aerospace and Aviation Technology · Reinforcement Learning in Robotics