Navigating the corporate disclosure gap: Modelling of Missing Not at Random Carbon Data
Malgorzata Olesiewicz, Jaakko Kooroshy, Sonja Greven

TL;DR
This paper develops a novel multiple imputation approach to estimate missing corporate carbon emissions data, accounting for bias and uncertainty, thereby improving prediction accuracy and reliability for climate risk assessment.
Contribution
It introduces a multiple imputation method based on Heckman's sample selection model to address MNAR bias in corporate carbon data estimation.
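To illustrate why a selection model matters here, the following is a minimal sketch of the bias term at the core of Heckman's sample selection model. It is not the paper's multiple imputation procedure. All names and parameter values (`simulate_disclosure_gap`, `rho`, `index`, the true mean of 5.0) are hypothetical choices for illustration. The sketch assumes the selection parameters are known, whereas in practice they would have to be estimated, for example with a first-stage probit.

```python
import math
import random


def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def inverse_mills(x):
    """Inverse Mills ratio phi(x) / Phi(x), the Heckman correction term."""
    return normal_pdf(x) / normal_cdf(x)


def simulate_disclosure_gap(n=200_000, true_mean=5.0, sigma=1.0,
                            rho=-0.6, index=0.4, seed=0):
    """Simulate MNAR disclosure (all parameter values are hypothetical).

    A company discloses when index + u > 0. The outcome error e is
    correlated with u through rho, so disclosure depends on the
    unobserved part of the outcome and the data are MNAR.
    """
    rng = random.Random(seed)
    disclosed = []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)  # selection-equation error
        v = rng.gauss(0.0, 1.0)  # independent noise
        e = sigma * (rho * u + math.sqrt(1.0 - rho ** 2) * v)
        if index + u > 0.0:      # outcome observed only for disclosers
            disclosed.append(true_mean + e)

    naive_mean = sum(disclosed) / len(disclosed)
    # Heckman: E[e | disclosed] = rho * sigma * inverse_mills(index)
    selection_bias = rho * sigma * inverse_mills(index)
    return {
        "disclosure_rate": len(disclosed) / n,
        "naive_mean": naive_mean,
        "selection_bias": selection_bias,
        "corrected_mean": naive_mean - selection_bias,
    }


result = simulate_disclosure_gap()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With a negative `rho`, companies with higher unexplained emissions are less likely to disclose, so the mean over disclosers understates the population mean. Subtracting the inverse Mills ratio term recovers it in this idealised setting.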
Findings
Up to 30% accuracy improvement in simulation
Up to 40% higher coverage rate in simulation
Up to 20% higher coverage in real data
Abstract
Corporate carbon emissions data is disclosed by only approximately 65% of large and mid-sized companies globally, despite being a key indicator of corporate climate performance. With investors increasingly looking to integrate climate risk into their investment strategies and risk reporting, this creates demand for robust prediction models that can generate reliable estimates for missing carbon disclosures. However, these estimates lack transparency and are frequently used in the investment decision process with the same confidence as corporate reported data. As disclosures remain mostly voluntary and the propensity to disclose is shaped by several factors (e.g. size, sector, geography), missing emissions data should be assumed to be missing not at random (MNAR). However, widely used estimation methods (e.g. linear regression models) typically do not correct for MNAR bias and do not…
Taxonomy
Topics: Energy, Environment, Economic Growth · Economic and Environmental Valuation · Forecasting Techniques and Applications
