TL;DR
Aeolus is a comprehensive, multi-modal flight delay dataset that captures spatiotemporal dynamics and relational structures, enabling advanced research in delay prediction and foundation models for structured data.
Contribution
The paper introduces Aeolus, a large-scale multi-modal dataset with aligned tabular, sequential, and graph data for flight delays, addressing limitations of existing datasets.
Findings
Provides data on over 50 million flights with rich features
Models delay propagation through flight chains and network graphs
Supports diverse tasks like regression, classification, and graph learning
Abstract
We introduce Aeolus, a large-scale Multi-modal Flight Delay Dataset designed to advance research on flight delay prediction and support the development of foundation models for tabular data. Existing datasets in this domain are typically limited to flat tabular structures and fail to capture the spatiotemporal dynamics inherent in delay propagation. Aeolus addresses this limitation by providing three aligned modalities: (i) a tabular dataset with rich operational, meteorological, and airport-level features for over 50 million flights; (ii) a flight chain module that models delay propagation along sequential flight legs, capturing upstream and downstream dependencies; and (iii) a flight network graph that encodes shared aircraft, crew, and airport resource connections, enabling cross-flight relational reasoning. The dataset is carefully constructed with temporal splits, comprehensive…
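To make the relationship between the three modalities concrete, here is a minimal sketch of how flight chains and a flight network graph might be derived from flat tabular records. It is an illustration only, not the authors' construction pipeline: the field names (`flight_id`, `tail`, `origin`, `dep`), the toy records, and the helpers `build_chains` and `build_edges` are all hypothetical, and crew links are left out because the toy data carries no crew information.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records; the field names are assumptions, not the Aeolus schema.
flights = [
    {"flight_id": "F1", "tail": "N100", "origin": "JFK", "dest": "ORD", "dep": 800},
    {"flight_id": "F2", "tail": "N100", "origin": "ORD", "dest": "DEN", "dep": 1100},
    {"flight_id": "F3", "tail": "N200", "origin": "ORD", "dest": "SFO", "dep": 1130},
    {"flight_id": "F4", "tail": "N100", "origin": "DEN", "dest": "LAX", "dep": 1500},
]


def build_chains(records):
    """Group flights by aircraft and order each group by departure time."""
    by_tail = defaultdict(list)
    for record in records:
        by_tail[record["tail"]].append(record)
    return {
        tail: [r["flight_id"] for r in sorted(legs, key=lambda r: r["dep"])]
        for tail, legs in by_tail.items()
    }


def build_edges(records):
    """Link pairs of flights that share an aircraft or a departure airport."""
    edges = set()
    for a, b in combinations(records, 2):
        if a["tail"] == b["tail"]:
            edges.add((a["flight_id"], b["flight_id"], "aircraft"))
        if a["origin"] == b["origin"]:
            edges.add((a["flight_id"], b["flight_id"], "airport"))
    return edges


chains = build_chains(flights)  # sequential modality: ordered legs per aircraft
edges = build_edges(flights)    # graph modality: typed edges between flights
```

In this sketch the chain for one aircraft is simply its legs sorted by departure time, and graph edges are typed by the shared resource. A real pipeline would likely also restrict edges to a time window so that flights days apart at the same airport are not connected.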