Mixed-Effects Modeling of NYC Subway Ridership Using MTA and Weather Data
Zoe Curtis, Jake Haines

TL;DR
This paper fits mixed-effects models to MTA origin-destination and weather data, showing how origin borough and maximum gust speed shape monthly NYC subway ridership.
Contribution
It combines a longitudinal mixed-effects model with PCA and an automated ETL pipeline to analyze large-scale transit and weather data.
Findings
Manhattan origin significantly affects ridership levels.
Maximum gust speed reduces ridership, especially in Manhattan.
December ridership decline is largely due to increased wind speeds.
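One way to read these findings together is as a random-intercept model with a borough-by-gust interaction. The specification below is an illustrative assumption, not the authors' stated model; every symbol is introduced here for the sketch.

```latex
% Hypothetical specification (assumed, not taken from the paper):
%   y_{it} : average monthly ridership for origin unit i in month t
%   M_i    : indicator that origin unit i is in Manhattan
%   G_t    : maximum gust speed in month t
%   u_i    : random intercept for origin unit i
\begin{equation}
  y_{it} = \beta_0 + \beta_1 M_i + \beta_2 G_t + \beta_3 M_i G_t + u_i + \varepsilon_{it},
  \qquad u_i \sim \mathcal{N}(0, \sigma_u^2),
  \quad \varepsilon_{it} \sim \mathcal{N}(0, \sigma^2)
\end{equation}
```

Under this reading, a Manhattan effect corresponds to $\beta_1 \neq 0$, and a gust effect concentrated in Manhattan corresponds to $\beta_3 < 0$ with $\beta_2$ close to zero.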
Abstract
This study investigates monthly trends in New York City subway ridership throughout 2023 by integrating Metropolitan Transportation Authority (MTA) origin-destination data with weather data from Weather Underground. Using a longitudinal mixed-effects modeling framework, we assess how origin borough, seasonal variation, and weather, particularly maximum gust speed, influence average monthly ridership. The dataset was processed using an automated ETL pipeline built with Apache Airflow and PostgreSQL to handle over 115 million records. Principal component analysis (PCA) was applied to reduce multicollinearity among weather covariates. Our findings indicate that origin borough, especially Manhattan, plays a dominant role in ridership levels, while maximum gust speed significantly reduces ridership, primarily for trips originating in Manhattan. Further analysis reveals that December's…
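The PCA step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a small synthetic table of monthly weather summaries; the column meanings, the helper name `pca_scores`, and the choice of two components are hypothetical and not taken from the paper. The point is that correlated covariates are replaced by uncorrelated component scores before they enter a regression.

```python
import numpy as np

def pca_scores(X, n_components):
    """Return (scores, explained_variance_ratio) for the rows of X."""
    X = np.asarray(X, dtype=float)
    # Standardize each column so covariates on different scales contribute equally.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # SVD of the standardized matrix: rows of Vt are the principal axes.
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    ratio = s**2 / np.sum(s**2)
    # Project the standardized data onto the leading axes.
    scores = Z @ Vt[:n_components].T
    return scores, ratio[:n_components]

# Synthetic data (assumption): 12 months, 3 weather covariates,
# two of which are strongly correlated by construction.
rng = np.random.default_rng(0)
base = rng.normal(size=12)
weather = np.column_stack([
    base + 0.1 * rng.normal(size=12),  # hypothetical: mean wind speed
    base + 0.1 * rng.normal(size=12),  # hypothetical: maximum wind speed
    rng.normal(size=12),               # hypothetical: precipitation
])

scores, ratio = pca_scores(weather, n_components=2)
```

The columns of `scores` are mutually uncorrelated, so they can be used as regressors without the multicollinearity present in the raw covariates.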
Taxonomy
Topics: Urban Transport and Accessibility · Transportation Planning and Optimization · Underground infrastructure and sustainability
