Performing energy modelling exercises in a transparent way - The issue of data quality in power plant databases
Fabian Gotzens, Heidi Heinrichs, Jonas Hörsch, Fabian Hofmann

TL;DR
This paper introduces an open source Python tool for cleaning and combining power plant databases, highlighting data quality issues and the importance of proprietary data for accurate energy modelling.
Contribution
The paper presents 'powerplantmatching', a new open source toolset for improving data quality in power plant databases and compares open data with proprietary sources.
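To make the idea of cleaning, standardizing and combining databases concrete, here is a minimal sketch in plain Python. It does not use the powerplantmatching API and is not the authors' algorithm. The alias table, the similarity threshold, the helper names (`standardize`, `match`, `combine`) and the plant records are all hypothetical, chosen only for illustration.

```python
from difflib import SequenceMatcher
from statistics import median

# Hypothetical alias table mapping raw fuel labels to one harmonized label.
FUEL_ALIASES = {
    "lignite": "Lignite",
    "brown coal": "Lignite",
    "hard coal": "Hard Coal",
    "gas": "Natural Gas",
    "natural gas": "Natural Gas",
}

def standardize(record):
    """Return a copy with a cleaned plant name and a harmonized fuel label."""
    name = record["name"].lower()
    for filler in ("power plant", "kraftwerk"):  # drop generic words
        name = name.replace(filler, "")
    name = " ".join(name.split())  # collapse whitespace
    fuel = FUEL_ALIASES.get(record["fuel"].strip().lower(), "Other")
    return {**record, "name": name, "fuel": fuel}

def match(db_a, db_b, threshold=0.8):
    """Greedily pair records that share country and fuel and have similar names."""
    pairs, used = [], set()
    for a in db_a:
        best, best_score = None, threshold
        for j, b in enumerate(db_b):
            if j in used or a["country"] != b["country"] or a["fuel"] != b["fuel"]:
                continue
            score = SequenceMatcher(None, a["name"], b["name"]).ratio()
            if score >= best_score:
                best, best_score = j, score
        if best is not None:
            used.add(best)
            pairs.append((a, db_b[best]))
    return pairs

def combine(pairs):
    """Merge each matched pair, taking the median of the reported capacities."""
    return [
        {
            "name": a["name"],
            "country": a["country"],
            "fuel": a["fuel"],
            "capacity_mw": median([a["capacity_mw"], b["capacity_mw"]]),
        }
        for a, b in pairs
    ]

# Made-up records standing in for two source databases.
db_a = [
    {"name": "Kraftwerk Nordhafen", "country": "DE", "fuel": "Brown Coal", "capacity_mw": 4400.0},
    {"name": "Suedtal", "country": "DE", "fuel": "gas", "capacity_mw": 800.0},
]
db_b = [
    {"name": "Nordhafen Power Plant", "country": "DE", "fuel": "lignite", "capacity_mw": 4200.0},
    {"name": "Westfeld", "country": "FR", "fuel": "Hard Coal", "capacity_mw": 600.0},
]

merged = combine(match([standardize(r) for r in db_a], [standardize(r) for r in db_b]))
print(merged)
```

Only the first plant appears in both made-up sources, so the merged list has a single entry. Plants without a counterpart are dropped here for brevity; a real workflow would have to decide whether to keep them.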
Findings
Open data alone may not match proprietary data quality.
Proprietary data significantly improves commissioning year accuracy.
Matching capacities shows high consistency across datasets.
Abstract
In energy modelling, open data and open source code can enhance the traceability and reproducibility of modelling exercises, which helps facilitate controversial debates and improve policy advice. While the availability of open power plant databases has increased in recent years, they often differ considerably from each other, and their data quality has not yet been systematically compared to proprietary sources. Here, we introduce the Python-based "powerplantmatching" (PPM), an open source toolset for cleaning, standardizing and combining multiple power plant databases. We apply it once with open databases only and once with an additional proprietary database in order to discuss the issue of data quality by analysing capacities, countries, fuel types, geographic coordinates and commissioning years for conventional power plants. We find that a derived dataset purely based…
