Denoising ESG: quantifying data uncertainty from missing data with Machine Learning and prediction intervals
Sergio Caprioli, Jacopo Foschi, Riccardo Crupi, Alessandro Sabatino

TL;DR
This paper applies machine learning to impute missing ESG data, emphasizing the importance of quantifying uncertainty with prediction intervals to improve the reliability of ESG ratings.
Contribution
It introduces the use of probabilistic machine learning models with prediction intervals for ESG data imputation, addressing data uncertainty and robustness.
Findings
Prediction intervals effectively quantify data uncertainty.
Multiple imputation strategies improve ESG data robustness.
Probabilistic models enhance ESG rating reliability.
Abstract
Environmental, Social, and Governance (ESG) datasets frequently contain significant data gaps, and because providers fill those gaps with different imputation methods, the resulting ESG ratings are inconsistent. This paper explores the application of established machine learning techniques for imputing missing data in a real-world ESG dataset, emphasizing the quantification of uncertainty through prediction intervals. By employing multiple imputation strategies, the study assesses the robustness of imputation methods and quantifies the uncertainty associated with missing data. The findings highlight the importance of probabilistic machine learning models in providing a better understanding of ESG scores, thereby addressing the risk of incorrect ratings caused by incomplete data. The approach aims to improve imputation practices and so enhance the reliability of ESG ratings.
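The idea of pairing each imputed value with a prediction interval can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not name specific models, so this example assumes a one-feature least-squares regressor and a split-conformal interval built from absolute residuals on a held-out calibration set. The function names (`fit_line`, `conformal_halfwidth`, `impute_with_interval`) and the synthetic data are hypothetical and chosen only for illustration.

```python
import math
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (single feature)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def conformal_halfwidth(abs_residuals, alpha):
    """Split-conformal quantile of absolute calibration residuals.

    Simplification: the rank is clamped to n, whereas the exact
    procedure would return an infinite width in that case.
    """
    n = len(abs_residuals)
    k = min(math.ceil((n + 1) * (1 - alpha)), n)
    return sorted(abs_residuals)[k - 1]


def impute_with_interval(rows, alpha=0.1, seed=0):
    """Impute missing y values and attach a (1 - alpha) prediction interval.

    rows: list of (x, y) pairs where y is None when missing.
    Returns {row_index: (lower, prediction, upper)} for the missing rows.
    """
    rng = random.Random(seed)
    observed = [(x, y) for x, y in rows if y is not None]
    rng.shuffle(observed)
    half = len(observed) // 2
    train, calib = observed[:half], observed[half:]

    # Fit on one half, measure errors on the other half.
    a, b = fit_line([x for x, _ in train], [y for _, y in train])
    q = conformal_halfwidth([abs(y - (a + b * x)) for x, y in calib], alpha)

    intervals = {}
    for i, (x, y) in enumerate(rows):
        if y is None:
            pred = a + b * x
            intervals[i] = (pred - q, pred, pred + q)
    return intervals


# Synthetic stand-in for an ESG indicator y driven by one covariate x.
rng = random.Random(42)
rows = []
for _ in range(200):
    x = rng.uniform(0.0, 10.0)
    y = 3.0 + 0.5 * x + rng.gauss(0.0, 1.0)
    rows.append((x, None if rng.random() < 0.25 else y))

intervals = impute_with_interval(rows, alpha=0.1)
print(f"imputed {len(intervals)} missing values with 90% intervals")
```

A wide interval flags an imputed value that should be treated with caution when it feeds into a rating, which is the kind of uncertainty signal the abstract argues for. A multiple-imputation variant could repeat the procedure with different seeds or models and compare the resulting intervals.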
Taxonomy
Topics: Air Quality Monitoring and Forecasting
