How Usable is Automated Feature Engineering for Tabular Data?

Bastian Schäfer; Lennart Purucker; Maciej Janowski; Frank Hutter

arXiv:2508.13932·cs.LG·August 20, 2025


TL;DR

This paper evaluates the usability of 53 automated feature engineering (AutoFE) methods for tabular data. It finds that most are hard to use, poorly documented, and lack an active community, and that none lets users set time or memory constraints, which underscores the need for more user-friendly AutoFE tools.

Contribution

The study provides a comprehensive usability assessment of existing AutoFE methods, highlighting critical gaps and setting directions for developing more practical, user-oriented AutoFE solutions.

Findings

01

Most AutoFE methods are difficult to use and poorly documented.

02

None of the 53 surveyed AutoFE methods supports setting time or memory constraints.

03

There is a significant need for more usable and well-engineered AutoFE tools.

Abstract

Tabular data, consisting of rows and columns, is omnipresent across various machine learning applications. Each column represents a feature, and features can be combined or transformed to create new, more informative features. Such feature engineering is essential to achieve peak performance in machine learning. Since manual feature engineering is expensive and time-consuming, a substantial effort has been put into automating it. Yet, existing automated feature engineering (AutoFE) methods have never been investigated regarding their usability for practitioners. Thus, we investigated 53 AutoFE methods. We found that these methods are, in general, hard to use, lack documentation, and have no active communities. Furthermore, no method allows users to set time and memory constraints, which we see as a necessity for usable automation. Our survey highlights the need for future work on…
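To make the abstract's two central ideas concrete, here is a minimal sketch of combining columns into new features while respecting a user-supplied time limit. It is not taken from the paper or from any of the surveyed methods; the function name `generate_features`, the parameter `time_budget_s`, and the toy columns are all hypothetical, and pairwise products stand in for whatever transformations a real AutoFE tool would search over.

```python
import itertools
import time


def generate_features(columns, time_budget_s):
    """Build pairwise product features until the time budget runs out.

    columns: dict mapping a column name to a list of numeric values.
    time_budget_s: hypothetical user-facing limit in seconds.
    Returns a dict of newly created features (possibly incomplete).
    """
    deadline = time.monotonic() + time_budget_s
    new_features = {}
    for (name_a, col_a), (name_b, col_b) in itertools.combinations(columns.items(), 2):
        # Check the budget before each candidate, so the search stops early
        # and returns whatever has been built so far.
        if time.monotonic() >= deadline:
            break
        new_features[f"{name_a}*{name_b}"] = [a * b for a, b in zip(col_a, col_b)]
    return new_features


# Toy table with three numeric columns (illustrative values only).
data = {"height": [1.0, 2.0], "width": [3.0, 4.0], "depth": [5.0, 6.0]}
features = generate_features(data, time_budget_s=1.0)
```

The budget is checked between candidates rather than enforced by interrupting a running transformation, so a single slow candidate could still overshoot it. A memory limit would need a separate mechanism and is not shown here.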
