Augmenting data-driven models for energy systems through feature engineering: A Python framework for feature engineering
Sandra Wilfling

TL;DR
This paper introduces a Python framework for feature engineering to enhance data quality in energy systems modeling, demonstrating improved prediction accuracy in an energy demand case study.
Contribution
It presents a new Python framework based on scikit-learn that offers various feature engineering methods tailored for energy system data modeling.
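Because the framework is described as scikit-learn based, a feature creation step can be pictured as a custom transformer that follows scikit-learn's `fit`/`transform` convention. The sketch below is an assumption for illustration only: the class name `CyclicHourFeatures` and its `period` parameter are hypothetical and are not taken from the paper's framework. It encodes the hour of day as a sine/cosine pair so that 23:00 and 00:00 end up close together.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class CyclicHourFeatures(BaseEstimator, TransformerMixin):
    """Hypothetical transformer (not from the paper): encode a periodic
    quantity such as hour of day as a (sin, cos) pair."""

    def __init__(self, period=24.0):
        self.period = period

    def fit(self, X, y=None):
        # Stateless transformer: nothing is learned from the data.
        return self

    def transform(self, X):
        # Accept a 1-D or single-column input and map it onto the unit circle.
        values = np.asarray(X, dtype=float).reshape(-1, 1)
        angle = 2.0 * np.pi * values / self.period
        return np.hstack([np.sin(angle), np.cos(angle)])


hours = np.array([0, 6, 12, 18, 23])
encoded = CyclicHourFeatures().fit_transform(hours)  # shape (5, 2)
```

Inheriting from `BaseEstimator` and `TransformerMixin` is what would let such a step be placed inside a scikit-learn `Pipeline` next to standard estimators.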
Findings
Improved prediction accuracy with engineered features
Framework effectively supports feature creation, expansion, and selection
Demonstrated on energy demand prediction case study
Abstract
Data-driven modeling is gaining popularity in energy systems modeling. It applies machine learning methods such as linear regression, neural networks, or decision-tree-based methods. While these methods do not require domain knowledge, they are sensitive to data quality, so improving the quality of a dataset benefits machine learning-based models. Data quality can be improved through preprocessing. One type of preprocessing is feature engineering, which focuses on evaluating and improving the quality of individual features in the dataset. Feature engineering includes methods such as feature creation, feature expansion, and feature selection. In this work, a Python framework containing different feature engineering methods is presented. This framework contains…
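The three method families named in the abstract (feature creation, expansion, and selection) can be illustrated with plain scikit-learn components. This is a minimal sketch under stated assumptions, not the paper's framework or its case study: the data is synthetic, the variable names are made up, and `PolynomialFeatures` and `SelectKBest` stand in for whatever expansion and selection methods the framework actually provides.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic "energy demand" data (assumption, for illustration only):
# demand depends quadratically on temperature and cyclically on hour of day.
rng = np.random.default_rng(0)
n = 500
temperature = rng.uniform(-5.0, 30.0, n)
hour = rng.integers(0, 24, n).astype(float)
irrelevant = rng.normal(size=n)  # a feature with no influence on demand
demand = (
    0.05 * (temperature - 18.0) ** 2
    + 2.0 * np.sin(2.0 * np.pi * hour / 24.0)
    + rng.normal(scale=0.3, size=n)
)

# Raw features for the baseline model.
X_raw = np.column_stack([temperature, hour, irrelevant])

# Feature creation: replace the raw hour with a cyclic sine/cosine encoding.
X_created = np.column_stack([
    temperature,
    np.sin(2.0 * np.pi * hour / 24.0),
    np.cos(2.0 * np.pi * hour / 24.0),
    irrelevant,
])

# Split all arrays with the same random partition.
Xr_tr, Xr_te, Xc_tr, Xc_te, y_tr, y_te = train_test_split(
    X_raw, X_created, demand, random_state=0
)

# Baseline: linear regression on the raw features.
baseline = LinearRegression().fit(Xr_tr, y_tr)
baseline_r2 = baseline.score(Xr_te, y_te)

# Engineered: expansion (degree-2 polynomial terms), then selection
# (keep the 6 features with the highest univariate F-score), then regression.
engineered = Pipeline([
    ("expand", PolynomialFeatures(degree=2, include_bias=False)),
    ("select", SelectKBest(score_func=f_regression, k=6)),
    ("model", LinearRegression()),
])
engineered.fit(Xc_tr, y_tr)
engineered_r2 = engineered.score(Xc_te, y_te)

print(f"R^2 with raw features:        {baseline_r2:.3f}")
print(f"R^2 with engineered features: {engineered_r2:.3f}")
```

On this synthetic data the engineered pipeline should score higher than the baseline, because the squared temperature term and the cyclic hour encoding expose structure that a linear model cannot recover from the raw columns. That outcome is a property of how the toy data was generated and says nothing about the results reported in the paper.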
Taxonomy
Topics: Energy Load and Power Forecasting · Integrated Energy Systems Optimization · Advanced Data Processing Techniques
Methods: Lib
