Scalable Transit Delay Prediction at City Scale: A Systematic Approach with Multi-Resolution Feature Engineering and Deep Learning
Emna Boudabbous, Mohamed Karaa, Lokman Sboui, Julio Montecinos, Omar Alam

TL;DR
This paper introduces a scalable, deep learning-based system for city-wide bus delay prediction that leverages multi-resolution features, clustering, and dimensionality reduction to improve accuracy and efficiency.
Contribution
The paper presents a novel city-scale prediction pipeline combining multi-resolution feature engineering, hybrid clustering, and deep learning, enabling scalable and accurate delay predictions.
Findings
The best model is a cluster-aware LSTM, which outperforms transformer models by 18-52%.
The pipeline is suitable for real-time deployment on city-scale transit networks.
Dimensionality reduction preserves 95% of the variance with only 83 components.
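The variance-retention finding can be illustrated with a small sketch. The summary does not name the dimensionality reduction method, so this example assumes plain PCA computed through an SVD and runs on synthetic data rather than the paper's features. The function name `components_for_variance` and the data shapes are hypothetical.

```python
import numpy as np

def components_for_variance(X, target=0.95):
    """Return (k, ratio): the smallest number of principal components whose
    cumulative explained variance reaches `target`, and the cumulative curve.
    Assumption: plain PCA via SVD; the paper's actual method may differ."""
    Xc = X - X.mean(axis=0)                    # center each feature
    s = np.linalg.svd(Xc, compute_uv=False)    # singular values, descending
    var = s ** 2                               # proportional to component variance
    ratio = np.cumsum(var) / var.sum()         # cumulative explained variance
    k = int(np.searchsorted(ratio, target)) + 1
    return min(k, len(ratio)), ratio

# Synthetic stand-in for a wide feature matrix: 40 correlated columns
# driven by 5 latent factors plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 5))
mixing = rng.normal(size=(5, 40))
X = latent @ mixing + 0.05 * rng.normal(size=(500, 40))

k, ratio = components_for_variance(X, target=0.95)
print(f"{k} of {X.shape[1]} components retain {ratio[k - 1]:.1%} of the variance")
```

When columns are strongly correlated, as aggregated spatiotemporal features tend to be, the count returned is far smaller than the original feature count.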
Abstract
Urban bus transit agencies need reliable, network-wide delay predictions to provide accurate arrival information to passengers and support real-time operational control. Accurate predictions help passengers plan their trips, reduce waiting time, and allow operations staff to adjust headways, dispatch extra vehicles, and manage disruptions. Although real-time feeds such as GTFS-Realtime (GTFS-RT) are now widely available, most existing delay prediction systems handle only a few routes, depend on hand-crafted features, and offer little guidance on how to design a scalable, reusable architecture. We present a city-scale prediction pipeline that combines multi-resolution feature engineering, dimensionality reduction, and deep learning. The framework generates 1,683 spatiotemporal features by exploring 23 aggregation combinations over H3 cells, routes, segments, and temporal patterns, and…
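As a rough illustration of multi-resolution aggregation, the sketch below computes mean delay at several grouping levels (cell, route, cell and hour, route and hour) and attaches each mean to every record as a feature. It is not the authors' pipeline: the record fields, the grouping list, and the function name `aggregate_features` are hypothetical, and the cell IDs are placeholder strings standing in for H3 indexes.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical delay observations; "cell" stands in for an H3 cell index.
records = [
    {"cell": "cellA", "route": "10", "hour": 8, "delay_s": 120.0},
    {"cell": "cellA", "route": "10", "hour": 9, "delay_s": 60.0},
    {"cell": "cellA", "route": "20", "hour": 8, "delay_s": 180.0},
    {"cell": "cellB", "route": "20", "hour": 8, "delay_s": 30.0},
]

# Each tuple is one aggregation combination (illustrative, not the paper's 23).
GROUPINGS = [("cell",), ("route",), ("cell", "hour"), ("route", "hour")]

def aggregate_features(records, groupings, value="delay_s"):
    """Build one mean-delay feature per grouping for every record."""
    stats = {}
    for keys in groupings:
        buckets = defaultdict(list)
        for r in records:
            buckets[tuple(r[k] for k in keys)].append(r[value])
        stats[keys] = {group: mean(vals) for group, vals in buckets.items()}

    rows = []
    for r in records:
        row = {}
        for keys in groupings:
            name = "mean_delay_by_" + "_".join(keys)
            row[name] = stats[keys][tuple(r[k] for k in keys)]
        rows.append(row)
    return rows

features = aggregate_features(records, GROUPINGS)
print(features[0])
```

Note that this toy version includes each record's own delay in its group mean. A predictive setting would compute the aggregates from past observations only, to avoid leaking the target into the features.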
Taxonomy
Topics: Traffic Prediction and Management Techniques · Human Mobility and Location-Based Analysis · Transportation Planning and Optimization
