FeatGeNN: Improving Model Performance for Tabular Data with Correlation-based Feature Extraction
Sammuel Ramos Silva, Rodrigo Silva

TL;DR
FeatGeNN introduces a correlation-based feature extraction method for AutoFE that enhances model performance on tabular data by capturing linear relationships more effectively than traditional pooling methods.
Contribution
The paper presents a novel convolutional AutoFE approach using correlation-based pooling, improving feature extraction for tabular data over existing methods.
Findings
FeatGeNN outperforms existing AutoFE methods on benchmark datasets.
Correlation-based pooling captures linear relationships better than max-pooling.
The method is computationally efficient and reduces overfitting.
Abstract
Automated Feature Engineering (AutoFE) has become an important task for any machine learning project, as it can help improve model performance and gain more information for statistical analysis. However, most current approaches for AutoFE rely on manual feature creation or use methods that can generate a large number of features, which can be computationally intensive and lead to overfitting. To address these challenges, we propose a novel convolutional method called FeatGeNN that extracts and creates new features using correlation as a pooling function. Unlike traditional pooling functions like max-pooling, correlation-based pooling considers the linear relationship between the features in the data matrix, making it more suitable for tabular data. We evaluate our method on various benchmark datasets and demonstrate that FeatGeNN outperforms existing AutoFE approaches regarding model…
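The contrast the abstract draws between max-pooling and correlation-based pooling can be illustrated with a small sketch. The abstract does not specify FeatGeNN's actual pooling rule, so everything below is an assumption made for illustration only: the function names max_pool_columns and correlation_pool_columns are hypothetical, the windows are non-overlapping groups of adjacent feature columns, and the correlation rule (keep the column with the highest mean absolute Pearson correlation to the other columns in its window) is one plausible reading, not the authors' method.

```python
import numpy as np


def max_pool_columns(X, window):
    """Row-wise max over non-overlapping windows of adjacent feature columns."""
    n_samples, n_features = X.shape
    n_windows = n_features // window
    trimmed = X[:, : n_windows * window].reshape(n_samples, n_windows, window)
    return trimmed.max(axis=2)


def correlation_pool_columns(X, window):
    """Hypothetical correlation pooling (an assumption, not the paper's rule).

    For each window, keep the column whose mean absolute Pearson correlation
    with the other columns in that window is highest. Requires window >= 2.
    """
    n_samples, n_features = X.shape
    n_windows = n_features // window
    pooled = np.empty((n_samples, n_windows))
    for w in range(n_windows):
        block = X[:, w * window : (w + 1) * window]
        # Pairwise Pearson correlations between the columns of this window.
        corr = np.abs(np.corrcoef(block, rowvar=False))
        corr = np.nan_to_num(corr)      # constant columns produce NaN
        np.fill_diagonal(corr, 0.0)     # ignore self-correlation
        scores = corr.sum(axis=0) / (window - 1)
        pooled[:, w] = block[:, np.argmax(scores)]
    return pooled


# Synthetic data: two linearly related columns and one unrelated column.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
X = np.column_stack([a, a + 0.5 * rng.normal(size=200), rng.normal(size=200)])

print(max_pool_columns(X, window=3).shape)
print(correlation_pool_columns(X, window=3).shape)
```

In this sketch, max-pooling looks at each row in isolation, so its output depends only on which value happens to be largest in that row. The correlation rule instead uses statistics computed across all samples, which is the sense in which it "considers the linear relationship between the features".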
Taxonomy
Topics: Data Mining Algorithms and Applications · Machine Learning and Data Classification · Data Management and Algorithms
