Comprehensive and Spatially Detailed Passenger Vehicle and Truck Traffic Volume Data for the United States Estimated by Machine Learning
Brittany Antonczak, Meg Fay, Aviral Chawla, Gregory Rowangould

TL;DR
This paper uses random forest regression to estimate missing medium- and heavy-duty vehicle (truck) traffic volumes on US roadways, producing a comprehensive, high-resolution traffic dataset intended to support transportation and public health research.
Contribution
It introduces a novel application of random forest regression to fill data gaps in truck traffic volumes, resulting in the most complete publicly available dataset of its kind.
Findings
Achieved coverage of 85.2% of US public roadways.
Produced validated high-resolution traffic density data.
Supports research and policy analysis on the community impacts of truck traffic, especially air quality and public health.
Abstract
The Highway Performance Monitoring System, managed by the Federal Highway Administration, provides data on average annual daily traffic volume across roadways in the United States, but it has limited representation of medium- and heavy-duty vehicle traffic on lower-volume roadways that are not part of the national highway system. This gap limits research and policy analysis on the community impacts of truck traffic, especially concerning air quality and public health. To address this, we use random forest regression to estimate medium- and heavy-duty vehicle traffic volumes on network links where these data are missing. The result is a comprehensive vehicle traffic dataset that covers 85.2% of public roadways in the United States. From these data, we also calculate traffic density values for each census block and vehicle class that can serve as a high-resolution surrogate for…
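The gap-filling idea in the abstract (train a random forest on links where truck volumes are observed, then predict the links where they are missing) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the predictor names `total_aadt` and `lanes`, the synthetic data, the tree depth, and the number of trees are all hypothetical, and a small hand-written forest stands in for whatever library the paper used so that the sketch runs with the standard library alone.

```python
import random
from statistics import mean

def build_tree(rows, targets, depth, rng, n_features):
    """Grow one regression tree; a leaf is the mean target of its rows."""
    if depth == 0 or len(rows) < 4:
        return mean(targets)
    best = None  # (sum of squared errors, feature index, threshold)
    for f in rng.sample(range(len(rows[0])), n_features):  # random feature subset
        for t in sorted({r[f] for r in rows})[1:]:  # skipping the minimum keeps both sides non-empty
            left = [y for r, y in zip(rows, targets) if r[f] < t]
            right = [y for r, y in zip(rows, targets) if r[f] >= t]
            ml, mr = mean(left), mean(right)
            sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
            if best is None or sse < best[0]:
                best = (sse, f, t)
    if best is None:  # every sampled feature was constant
        return mean(targets)
    _, f, t = best
    li = [i for i, r in enumerate(rows) if r[f] < t]
    ri = [i for i, r in enumerate(rows) if r[f] >= t]
    return (f, t,
            build_tree([rows[i] for i in li], [targets[i] for i in li], depth - 1, rng, n_features),
            build_tree([rows[i] for i in ri], [targets[i] for i in ri], depth - 1, rng, n_features))

def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] < t else right
    return node

def fit_forest(rows, targets, n_trees=30, depth=4, seed=0):
    """Fit each tree on a bootstrap resample of the training links."""
    rng = random.Random(seed)
    n_features = max(1, len(rows[0]) // 2)  # assumed subset size, not from the paper
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [targets[i] for i in idx],
                                 depth, rng, n_features))
    return forest

def predict_forest(forest, row):
    """Average the per-tree predictions."""
    return mean(predict_tree(tree, row) for tree in forest)

# Synthetic road links (hypothetical data): every fourth link lacks truck AADT.
data_rng = random.Random(42)
links = []
for i in range(200):
    total_aadt = data_rng.uniform(200, 20000)
    lanes = data_rng.choice([2, 4, 6])
    truck = max(0.0, total_aadt * (0.03 + 0.01 * lanes) + data_rng.gauss(0, 20))
    links.append({"total_aadt": total_aadt, "lanes": lanes,
                  "truck_aadt": truck if i % 4 else None, "estimated": False})

features = lambda link: (link["total_aadt"], link["lanes"])
observed = [l for l in links if l["truck_aadt"] is not None]
missing = [l for l in links if l["truck_aadt"] is None]

X_train = [features(l) for l in observed]
y_train = [l["truck_aadt"] for l in observed]
forest = fit_forest(X_train, y_train)

# Fill only the gaps and flag them, leaving observed counts untouched.
for link in missing:
    link["truck_aadt"] = predict_forest(forest, features(link))
    link["estimated"] = True
```

Flagging filled values with `estimated` keeps measured and modeled volumes distinguishable downstream. Because each leaf is a mean of training targets, the estimates in this sketch always stay within the observed range, so the forest cannot extrapolate beyond the volumes it was trained on.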
