Benchmarking Scientific Machine Learning Models for Air Quality Data
Khawja Imran Masud, Venkata Sai Rahul Unnam, and Sahara Ali

TL;DR
This study benchmarks classical, machine learning, and deep learning models for air quality index forecasting in North Texas, demonstrating that physics-guided deep learning models enhance stability and physical consistency, especially for short-term predictions.
Contribution
It introduces a comprehensive benchmark and physics-guided variants for AQI forecasting, providing practical guidance for model selection in regional air quality prediction.
Findings
Deep learning models outperform traditional baselines in AQI forecasting.
Physics guidance improves model stability and physical consistency.
Short-horizon predictions benefit most from physics-guided models.
Abstract
Accurate air quality index (AQI) forecasting is essential for protecting public health in rapidly growing urban regions, yet practical model evaluation and selection are often hindered by the lack of rigorous, region-specific benchmarking on standardized datasets. Physics-guided machine learning and deep learning models could address this gap by delivering more accurate and efficient AQI forecasts. This study presents an explainable, comprehensive benchmark of classical time-series, machine-learning, and deep-learning approaches for multi-horizon AQI forecasting in North Texas (Dallas County), yielding model-selection guidance and a proposed physics-guided best model. Using publicly available U.S. Environmental Protection Agency (EPA) daily air quality observations from 2022 to 2024, we curate city-level time series for…
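To make the idea of a multi-horizon benchmark concrete, the following is a minimal sketch of how several forecasters might be scored at different lead times on a daily AQI series. It is not the authors' pipeline: the function names, the lookback of 7 days, the horizons (1, 3, 7), the two naive baselines, and the AQI values are all assumptions chosen for illustration.

```python
from math import sqrt


def make_windows(series, lookback, horizon):
    """Pair each lookback window with the value `horizon` steps after its end."""
    pairs = []
    for end in range(lookback, len(series) - horizon + 1):
        window = series[end - lookback:end]
        target = series[end + horizon - 1]
        pairs.append((window, target))
    return pairs


def persistence_forecast(window):
    """Naive baseline: predict that the last observed value persists."""
    return window[-1]


def window_mean_forecast(window):
    """Naive baseline: predict the mean of the lookback window."""
    return sum(window) / len(window)


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def benchmark(series, models, lookback=7, horizons=(1, 3, 7)):
    """Score every model at every horizon; returns {(model_name, horizon): rmse}."""
    results = {}
    for name, predict in models.items():
        for h in horizons:
            pairs = make_windows(series, lookback, h)
            y_true = [target for _, target in pairs]
            y_pred = [predict(window) for window, _ in pairs]
            results[(name, h)] = rmse(y_true, y_pred)
    return results


# Made-up daily AQI values, for illustration only (not EPA data).
aqi = [42, 45, 51, 48, 55, 60, 58, 52, 49, 47, 53, 61, 66, 59, 50, 46]

scores = benchmark(
    aqi,
    {"persistence": persistence_forecast, "window_mean": window_mean_forecast},
)
for (name, h), score in sorted(scores.items()):
    print(f"{name:12s} horizon={h}  RMSE={score:.2f}")
```

Any trained model could be dropped into the `models` dictionary as a callable that maps a lookback window to a single prediction, which keeps the comparison across model families on identical windows and targets.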
