A data-driven approach to the forecasting of ground-level ozone concentration
Dario Marvin, Lorenzo Nespoli, Davide Strepparava, Vasco Medici

TL;DR
This paper presents a machine learning approach for forecasting ground-level ozone concentrations in southern Switzerland, emphasizing feature selection and explainability to improve accuracy and understand atmospheric dependencies.
Contribution
It introduces a data-driven forecasting method that accounts for complex terrain and low station density, utilizing feature selection and Shapley values for model interpretability.
Findings
Effective feature selection improves forecast accuracy.
Shapley values reveal key atmospheric variable interactions.
Weighted observations enhance predictions of ozone peaks.
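As a rough illustration of the last finding, the sketch below fits a straight line with and without observation weights and compares the error on the highest value. The weighting scheme (weights proportional to the observed concentration), the function name `weighted_linear_fit`, and the toy numbers are all assumptions made for illustration; the summary does not state how the authors weight observations or which model they use.

```python
def weighted_linear_fit(x, y, w):
    """Closed-form weighted least squares for y ~ a + b*x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    a = my - b * mx
    return a, b


# Hypothetical toy data: temperature (deg C) vs. daily maximum ozone.
temperature = [15.0, 20.0, 25.0, 30.0, 35.0]
ozone = [40.0, 50.0, 70.0, 110.0, 170.0]

# Uniform weights: every observation counts equally.
a_u, b_u = weighted_linear_fit(temperature, ozone, [1.0] * len(ozone))

# Assumed scheme: weight each observation by its ozone value,
# so that high-concentration days dominate the fit.
a_w, b_w = weighted_linear_fit(temperature, ozone, ozone)

# Absolute error on the peak observation (last point) for each fit.
peak_err_uniform = abs(a_u + b_u * temperature[-1] - ozone[-1])
peak_err_weighted = abs(a_w + b_w * temperature[-1] - ozone[-1])
```

On this toy data the weighted fit has a smaller error on the peak, at the cost of a worse fit on low-concentration days. The same trade-off applies to any weighting scheme that emphasizes high values.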
Abstract
The ability to forecast the concentration of air pollutants in an urban region is crucial for decision-makers wishing to reduce the impact of pollution on public health through active measures (e.g. temporary traffic closures). In this study, we present a machine learning approach to forecasting the day-ahead maximum ozone concentration at several geographical locations in southern Switzerland. Because of the low density of measurement stations and the complex orography of the study area, we adopted feature selection methods instead of explicitly restricting relevant features to a neighbourhood of the prediction sites, as is common in spatio-temporal forecasting methods. We then used Shapley values to assess the explainability of the learned models in terms of feature importance and feature interactions in relation to ozone predictions; our analysis suggests…
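To make the Shapley-value idea concrete, here is a minimal sketch that computes exact Shapley values by enumerating feature subsets for a small toy function. The model `toy_model`, its coefficients, the feature names, and the baseline and sample values are hypothetical and are not taken from the paper. Replacing absent features with a fixed baseline is one common convention, assumed here for simplicity. In practice, libraries approximate these values, since exact enumeration scales exponentially with the number of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values of a set function value_fn over `features`."""
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for k in range(n):  # size of the coalition that excludes i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of feature i to coalition s.
                total += weight * (value_fn(s | {i}) - value_fn(s))
        phi[i] = total
    return phi


# Hypothetical reference point and the sample to be explained.
BASELINE = {"temperature": 20.0, "radiation": 300.0, "wind": 3.0}
SAMPLE = {"temperature": 30.0, "radiation": 800.0, "wind": 1.0}


def toy_model(x):
    """Made-up ozone predictor with one temperature-radiation interaction."""
    return (2.0 * x["temperature"] + 0.05 * x["radiation"]
            - 4.0 * x["wind"]
            + 0.001 * x["temperature"] * x["radiation"])


def value_fn(subset):
    # Features in `subset` take the sample value; the rest stay at baseline.
    x = {name: (SAMPLE[name] if name in subset else BASELINE[name])
         for name in BASELINE}
    return toy_model(x)


phi = shapley_values(value_fn, list(BASELINE))
```

The attributions sum to the difference between the prediction for the sample and the prediction for the baseline (the efficiency property). The interaction term is split between temperature and radiation, while wind, which enters only additively, receives exactly its own effect.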
Taxonomy
Methods: Feature Selection
