One button machine for automating feature engineering in relational databases
Hoang Thanh Lam, Johann-Michael Thiebaut, Mathieu Sinn, Bei Chen, Tiep Mai, Oznur Alkan

TL;DR
One Button Machine automates feature engineering in relational databases, significantly reducing time and effort for data scientists and enabling non-experts to extract valuable features efficiently.
Contribution
The paper introduces OneBM, an automated system for feature discovery in relational databases that outperforms existing methods and aids both experts and non-experts in predictive analytics.
Findings
Performed as well as the top 16% to 24% of data scientists in three Kaggle competitions.
Outperformed the state-of-the-art system in one Kaggle competition, in both prediction accuracy and leaderboard ranking.
Reduced data exploration time for data scientists.
Abstract
Feature engineering is one of the most important and time-consuming tasks in predictive analytics projects. It involves understanding domain knowledge and exploring data to discover relevant hand-crafted features from raw data. In this paper, we introduce a system called One Button Machine, or OneBM for short, which automates feature discovery in relational databases. OneBM automatically performs a key activity of data scientists, namely joining database tables and applying advanced data transformations to extract useful features from data. We validated OneBM in three Kaggle competitions, in which it performed as well as the top 16% to 24% of data scientists. More importantly, OneBM outperformed the state-of-the-art system in a Kaggle competition in terms of prediction accuracy and ranking on the Kaggle leaderboard. The results show that OneBM can be…
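To make the idea of joining tables and transforming the joined rows into features concrete, here is a minimal sketch using Python's standard `sqlite3` module. It is an illustration only and not OneBM's actual algorithm: the `customers` and `orders` tables, their columns, and the three aggregates (count, sum, max) are all hypothetical choices made for this example.

```python
import sqlite3


def make_toy_database():
    """Build a small in-memory database (hypothetical schema for illustration)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (customer_id INTEGER PRIMARY KEY);
        CREATE TABLE orders (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customers(customer_id),
            amount      REAL
        );
        INSERT INTO customers VALUES (1), (2), (3);
        INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 30.0), (3, 2, 5.0);
        """
    )
    return conn


def build_features(conn):
    """Join each customer to its orders and aggregate the joined rows.

    Returns one row per customer: (customer_id, order_count,
    total_amount, max_amount). Customers without orders get zeros.
    """
    query = """
        SELECT c.customer_id,
               COUNT(o.order_id)          AS order_count,
               COALESCE(SUM(o.amount), 0) AS total_amount,
               COALESCE(MAX(o.amount), 0) AS max_amount
        FROM customers AS c
        LEFT JOIN orders AS o ON o.customer_id = c.customer_id
        GROUP BY c.customer_id
        ORDER BY c.customer_id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in build_features(make_toy_database()):
        print(row)
```

The `LEFT JOIN` keeps every row of the main table, so each entity receives a feature vector even when it has no related records. An automated system would presumably enumerate many such join paths and transformations rather than a single hand-written query.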
Taxonomy
Topics: Data Mining Algorithms and Applications · Data Management and Algorithms · Advanced Database Systems and Queries
