Toward Efficient Automated Feature Engineering
Kafeng Wang, Pengyang Wang, Chengzhong Xu

TL;DR
This paper introduces a reinforcement learning-based framework for automated feature engineering that significantly improves efficiency and maintains high effectiveness, enabling faster deployment on large-scale datasets.
Contribution
The work proposes a novel AFE framework with a feature pre-evaluation model and a two-stage policy training strategy to enhance efficiency without sacrificing performance.
Findings
Achieved 2.9% higher average performance
Doubled the computational efficiency compared to existing methods
Validated on 36 datasets for classification and regression
Abstract
Automated Feature Engineering (AFE) refers to automatically generating and selecting optimal feature sets for downstream tasks, and it has achieved great success in real-world applications. Current AFE methods mainly focus on improving the effectiveness of the produced features but ignore the low-efficiency issue that hinders large-scale deployment. Therefore, in this work, we propose a generic framework to improve the efficiency of AFE. Specifically, we construct the AFE pipeline in a reinforcement learning setting, where each feature is assigned an agent to perform feature transformation and selection, and the evaluation score of the produced features in downstream tasks serves as the reward to update the policy. We improve the efficiency of AFE from two perspectives. On the one hand, we develop a Feature Pre-Evaluation (FPE) Model to reduce the sample size and feature size that are two…
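The reinforcement learning setup described in the abstract (one agent per feature, with the downstream evaluation score as a shared reward) can be illustrated with a small sketch. This is not the authors' implementation: the action set (`keep`, `square`, `drop`), the least-squares evaluator, and the gradient-bandit style update are all assumptions chosen to keep the example short, and the FPE model and two-stage policy training are omitted entirely.

```python
import numpy as np

# Hypothetical action set: the abstract does not list the actual transformations.
# "drop" stands in for feature selection (the feature is removed).
ACTIONS = {
    "keep": lambda x: x,
    "square": lambda x: x ** 2,
    "drop": None,
}
NAMES = list(ACTIONS)


def apply_actions(X, choices):
    """Apply each agent's chosen action to its own feature column."""
    cols = []
    for j, a in enumerate(choices):
        fn = ACTIONS[NAMES[a]]
        if fn is not None:
            cols.append(fn(X[:, j]))
    return np.column_stack(cols) if cols else np.empty((X.shape[0], 0))


def downstream_score(Z_train, y_train, Z_test, y_test):
    """Reward: held-out R^2 of a least-squares linear model with intercept."""
    A_train = np.column_stack([Z_train, np.ones(len(Z_train))])
    A_test = np.column_stack([Z_test, np.ones(len(Z_test))])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    resid = y_test - A_test @ coef
    return 1.0 - np.sum(resid ** 2) / np.sum((y_test - y_test.mean()) ** 2)


def train_agents(X, y, episodes=300, lr=0.5, seed=0):
    """Train one softmax policy per feature with a shared scalar reward."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    split = n // 2
    prefs = np.zeros((d, len(NAMES)))  # one row of action preferences per agent
    baseline = 0.0                     # running mean reward, used as a baseline
    for t in range(episodes):
        probs = np.exp(prefs - prefs.max(axis=1, keepdims=True))
        probs /= probs.sum(axis=1, keepdims=True)
        choices = np.array([rng.choice(len(NAMES), p=p) for p in probs])
        Z = apply_actions(X, choices)
        reward = downstream_score(Z[:split], y[:split], Z[split:], y[split:])
        # Gradient-bandit style update (an assumed stand-in for the paper's policy update).
        onehot = np.eye(len(NAMES))[choices]
        prefs += lr * (reward - baseline) * (onehot - probs)
        baseline += (reward - baseline) / (t + 1)
    return [NAMES[a] for a in prefs.argmax(axis=1)]
```

In this toy version every episode retrains the downstream model on all samples, so evaluation dominates the cost. That is the efficiency bottleneck the abstract targets.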
Taxonomy
Topics: Machine Learning and Data Classification · Software Engineering Research · Anomaly Detection Techniques and Applications
