Using Machine Learning in Analyzing Air Quality Discrepancies of Environmental Impact
Shuangbao Paul Wang, Lucas Yang, Rahouane Chouchane, Jin Guo, Michael Bailey

TL;DR
This paper uses machine learning to analyze air quality disparities in Baltimore, revealing how socioeconomic and demographic factors influence pollution levels and highlighting ongoing environmental justice issues.
Contribution
The study integrates diverse data sources with machine learning to uncover socio-economic and demographic disparities in urban air pollution levels.
Findings
Air pollution correlates with biased insurance risk estimation.
Significant NO2 disparities between income groups.
Ethnic disparities in air pollution levels.
Abstract
In this study, we apply machine learning and software engineering to analyze air pollution levels in the City of Baltimore. The data model draws on three primary data sources: 1) a biased method of estimating insurance risk used by the Home Owners' Loan Corporation, 2) demographics of Baltimore residents, and 3) census data estimates of NO2 and PM2.5 concentrations. The dataset covers 650,643 Baltimore residents out of 44.7 million residents in 202 major US cities. The results show that air pollution levels have a clear association with the biased insurance estimating method. Large disparities are present in NO2 levels between more desirable and low-income blocks. Similar disparities exist in air pollution levels across residents' ethnic groups. Because Baltimore's population includes a greater proportion of people of color, the finding reveals how decades-old policies have continued to discriminate and…
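The group comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes the risk estimate is available as a categorical grade per census block, and the field names (`grade`, `no2`, `population`), the helper `weighted_mean_by_group`, and all numbers are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def weighted_mean_by_group(records, group_key, value_key, weight_key):
    """Return {group: weighted mean of value_key}, weighting by weight_key."""
    totals = defaultdict(float)
    weights = defaultdict(float)
    for row in records:
        totals[row[group_key]] += row[value_key] * row[weight_key]
        weights[row[group_key]] += row[weight_key]
    # Skip groups with zero total weight to avoid division by zero.
    return {g: totals[g] / weights[g] for g in totals if weights[g] > 0}


# Made-up toy rows, NOT values from the paper.
toy_blocks = [
    {"grade": "A", "no2": 10.0, "population": 100},
    {"grade": "A", "no2": 12.0, "population": 300},
    {"grade": "D", "no2": 18.0, "population": 200},
    {"grade": "D", "no2": 20.0, "population": 200},
]

# Population-weighted mean NO2 per (hypothetical) risk grade.
means = weighted_mean_by_group(toy_blocks, "grade", "no2", "population")

# Disparity between the lowest-rated and highest-rated grade in the toy data.
gap = means["D"] - means["A"]
print(means, gap)
```

Weighting by population means each group average reflects the exposure of a typical resident rather than a typical block, which is one reasonable choice when block sizes differ widely.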
