Statistically Enhanced Learning: a feature engineering framework to boost (any) learning algorithms
Florian Felice, Christophe Ley, Andreas Groll, Stéphane Bordas

TL;DR
Statistically Enhanced Learning (SEL) is a formal framework for feature engineering that improves machine learning performance by using statistical estimators as predictors, validated through simulations and real-world applications.
Contribution
This paper introduces SEL, a novel formalization framework for feature engineering that incorporates statistical estimators as predictors to enhance learning algorithms.
Findings
SEL improves model accuracy in simulations.
Application of SEL on real data shows performance gains.
The framework formalizes the feature engineering process.
Abstract
Feature engineering is of critical importance in the field of Data Science. While any data scientist knows the importance of rigorously preparing data to obtain well-performing models, only scarce literature formalizes its benefits. In this work, we present the method of Statistically Enhanced Learning (SEL), a formalization framework of existing feature engineering and extraction tasks in Machine Learning (ML). The difference from classical ML is that certain predictors are not directly observed but obtained as statistical estimators. Our goal is to study SEL, aiming to establish a formalized framework and illustrate its improved performance by means of simulations as well as applications to real-life use cases.
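To make the idea of a predictor "obtained as a statistical estimator" concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation. The function name `add_estimated_feature`, the `histories` mapping, the toy data, and the choice of the sample mean as the estimator are all hypothetical.

```python
from statistics import mean


def add_estimated_feature(rows, histories, key="team", name="est_mean"):
    """Return copies of `rows` with a statistical estimate attached as a predictor.

    Hypothetical helper: `histories` maps each entity to raw observations that
    are not themselves usable as a single covariate.
    """
    # Estimate one parameter per entity from its raw history.
    # The sample mean is an illustrative choice of estimator.
    estimates = {entity: mean(values) for entity, values in histories.items()}
    # Attach the estimate to each row as a new covariate, leaving the input untouched.
    return [{**row, name: estimates[row[key]]} for row in rows]


# Toy data (made up for illustration): two entities with observed covariates
# and a separate history of raw measurements per entity.
rows = [{"team": "A", "home": 1}, {"team": "B", "home": 0}]
histories = {"A": [2, 3, 1], "B": [0, 1, 1, 2]}

enhanced = add_estimated_feature(rows, histories)
```

Any learning algorithm can then be trained on `enhanced` instead of `rows`. The added column is estimated from data rather than directly observed, which is the distinction the abstract draws.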
Taxonomy
Topics: Machine Learning and Data Classification · Neural Networks and Applications · Gaussian Processes and Bayesian Inference
