A Data-Driven Supervised Machine Learning Approach to Estimating Global Ambient Air Pollution Concentrations With Associated Prediction Intervals
Liam J Berrisford, Hugo Barbosa, Ronaldo Menezes

TL;DR
This paper presents a scalable machine learning framework that imputes missing air pollution data globally, providing detailed estimates with prediction intervals to improve environmental assessments and inform station placement.
Contribution
The study introduces a novel supervised machine learning approach for filling gaps in air quality data, offering high-resolution estimates with prediction intervals and insights for monitoring station placement.
Findings
Effective imputation of missing data across multiple pollutants.
Generation of hourly, high-resolution pollution datasets with prediction intervals.
Insights into optimal placement of monitoring stations for improved accuracy.
Abstract
Global ambient air pollution, a transboundary challenge, is typically addressed through interventions relying on data from spatially sparse and heterogeneously placed monitoring stations. These stations often encounter temporal data gaps due to issues such as power outages. In response, we have developed a scalable, data-driven, supervised machine learning framework. This model is designed to impute missing temporal and spatial measurements, thereby generating a comprehensive dataset for pollutants including NO2, O3, PM10, PM2.5, and SO2. The dataset, with a fine granularity of 0.25° at hourly intervals and accompanied by prediction intervals for each estimate, caters to a wide range of stakeholders relying on outdoor air pollution data for downstream assessments. This enables more detailed studies. Additionally, the model's performance across various…
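To make "prediction intervals for each estimate" concrete, here is a minimal sketch of one generic way to attach an interval to a point estimate: take quantiles of held-out residuals (observed minus predicted) and add them to the prediction. This is an illustrative assumption, not the authors' method, which the summary above does not specify. The function names `empirical_quantile` and `residual_interval`, the 90% default coverage, and the example numbers are all hypothetical.

```python
import math


def empirical_quantile(values, q):
    """Nearest-rank empirical quantile of a non-empty sequence, for 0 < q <= 1."""
    ordered = sorted(values)
    # Nearest-rank definition: the smallest value with at least a fraction q
    # of the sample at or below it.
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def residual_interval(prediction, calibration_residuals, coverage=0.9):
    """Return (lower, upper) bounds around a point prediction.

    calibration_residuals are observed-minus-predicted errors measured on
    held-out data (hypothetical here). The interval is the prediction shifted
    by the lower and upper residual quantiles for the requested coverage.
    """
    tail = (1.0 - coverage) / 2.0
    lower = prediction + empirical_quantile(calibration_residuals, tail)
    upper = prediction + empirical_quantile(calibration_residuals, 1.0 - tail)
    return lower, upper
```

For example, with 21 made-up residuals running from -10 to 10 in steps of 1 and a point estimate of 40.0, `residual_interval(40.0, list(range(-10, 11)))` returns `(31.0, 49.0)`. A real pipeline for concentrations would likely also clip the lower bound at zero and calibrate residuals per pollutant, which this sketch leaves out.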
Taxonomy
Topics: Air Quality Monitoring and Forecasting · Air Quality and Health Impacts · Vehicle emissions and performance
