Evolutionary Large Language Model for Automated Feature Transformation

Nanxu Gong; Chandan K. Reddy; Wangyang Ying; Haifeng Chen; Yanjie Fu

arXiv:2405.16203 · cs.LG · December 19, 2024 · 1 citation

TL;DR

This paper introduces an evolutionary framework combining Large Language Models and evolutionary algorithms to automate feature transformation, improving exploration efficiency and generality over traditional methods.

Contribution

The paper presents an evolutionary LLM approach that pairs a multi-population database with few-shot prompting to broaden the exploration of feature transformations.

Findings

01 Effective exploration of large feature spaces

02 Improved feature transformation quality

03 Demonstrated generality across domains

Abstract

Feature transformation aims to reconstruct the feature space of raw features to enhance the performance of downstream models. However, the number of possible combinations of features and operations grows exponentially, which makes it difficult for existing methods to explore a wide space efficiently. In addition, their optimization is driven solely by the accuracy of downstream models in specific domains and neglects the acquisition of general feature knowledge. To fill this research gap, we propose an evolutionary LLM framework for automated feature transformation. The framework consists of two parts: 1) constructing a multi-population database through an RL data collector, with evolutionary algorithm strategies used to maintain the database, and 2) exploiting the sequence-understanding ability of Large Language Models (LLMs) by employing few-shot prompts to guide the LLM in generating…
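The two parts named in the abstract can be pictured with a minimal sketch. This is not the authors' implementation. Every name below (`Candidate`, `MultiPopulationDatabase`, `build_prompt`), the top-k elimination rule, the worst-to-best ordering of examples, and the prompt wording are assumptions made for illustration. The RL data collector and the LLM call are left out, so scores are supplied by hand.

```python
import random
from dataclasses import dataclass, field


@dataclass(order=True)
class Candidate:
    # Candidates compare on score only; the sequence text is excluded from ordering.
    score: float
    sequence: str = field(compare=False)


class MultiPopulationDatabase:
    """Hypothetical store of several independent, size-capped populations."""

    def __init__(self, n_populations: int, capacity: int) -> None:
        self.capacity = capacity
        self.populations = [[] for _ in range(n_populations)]

    def add(self, pop_id: int, sequence: str, score: float) -> None:
        pop = self.populations[pop_id]
        pop.append(Candidate(score, sequence))
        # Assumed elimination rule: keep only the `capacity` best-scoring candidates.
        pop.sort(reverse=True)
        del pop[self.capacity:]

    def sample_few_shot(self, pop_id: int, k: int, rng: random.Random) -> list:
        pop = self.populations[pop_id]
        chosen = rng.sample(pop, min(k, len(pop)))
        # Assumed ordering: worst to best, so the prompt ends on the strongest example.
        return sorted(chosen)


def build_prompt(examples: list) -> str:
    """Format sampled candidates as a few-shot prompt (wording is illustrative)."""
    lines = ["Each line is a feature transformation sequence and its score."]
    for ex in examples:
        lines.append(f"{ex.sequence} -> {ex.score:.3f}")
    lines.append("Propose a new sequence with a higher score:")
    return "\n".join(lines)
```

In a full loop, the text returned by `build_prompt` would be sent to an LLM, the generated sequence would be scored on a downstream model, and the result would be passed back to `add`, so each population evolves over time.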

Code & Models

Repositories

NanxuGong/ELLM-FT
PyTorch · Official

Videos

Evolutionary Large Language Model for Automated Feature Transformation · underline

Taxonomy

Topics: Machine Learning and Data Classification · Text and Document Classification Technologies