The Semantic Architect: How FEAML Bridges Structured Data and LLMs for Multi-Label Tasks
Wanfu Gao, Zebin He, Jun Gao

TL;DR
FEAML introduces an automated, feedback-driven feature engineering approach using LLMs tailored for multi-label classification, effectively modeling label dependencies and improving model accuracy.
Contribution
This paper presents FEAML, a novel method that leverages LLMs with feedback to automate feature engineering specifically for multi-label learning tasks.
Findings
FEAML outperforms existing feature engineering methods on multiple datasets.
The feedback mechanism enhances the quality and relevance of generated features.
FEAML improves model accuracy and interpretability in multi-label classification.
Abstract
Existing feature engineering methods based on large language models (LLMs) have not yet been applied to multi-label learning tasks. They lack the ability to model complex label dependencies and are not specifically adapted to the characteristics of multi-label tasks. To address the above issues, we propose Feature Engineering Automation for Multi-Label Learning (FEAML), an automated feature engineering method for multi-label classification which leverages the code generation capabilities of LLMs. By utilizing metadata and label co-occurrence matrices, LLMs are guided to understand the relationships between data features and task objectives, based on which high-quality features are generated. The newly generated features are evaluated in terms of model accuracy to assess their effectiveness, while Pearson correlation coefficients are used to detect redundancy. FEAML further incorporates…
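The abstract names two concrete quantities: a label co-occurrence matrix, which is given to the LLM as context, and Pearson correlation coefficients, which are used to flag redundant features. The sketch below illustrates both in plain Python. It is not the authors' implementation. The function names (`label_cooccurrence`, `pearson`, `is_redundant`) and the 0.95 redundancy threshold are assumptions made for illustration, and the excerpt above does not say how FEAML sets its threshold or formats the matrix for the prompt.

```python
from math import sqrt


def label_cooccurrence(Y):
    """Count how often each pair of labels is active in the same sample.

    Y is a list of rows, each a list of 0/1 label indicators.
    Entry [a][b] of the result is the number of samples where labels a and b
    are both 1; the diagonal holds per-label frequencies.
    """
    n_labels = len(Y[0])
    co = [[0] * n_labels for _ in range(n_labels)]
    for row in Y:
        active = [j for j, v in enumerate(row) if v]
        for a in active:
            for b in active:
                co[a][b] += 1
    return co


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences.

    Returns 0.0 when either sequence is constant (an assumption made here
    to avoid dividing by zero; the correlation is undefined in that case).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def is_redundant(candidate, existing_columns, threshold=0.95):
    """Flag a candidate feature whose |r| with any existing column reaches
    the threshold. The 0.95 default is illustrative, not from the paper."""
    return any(abs(pearson(candidate, col)) >= threshold
               for col in existing_columns)


# Toy data: 4 samples, 3 labels, 2 existing feature columns.
Y = [[1, 1, 0], [1, 0, 0], [0, 1, 1], [1, 1, 0]]
columns = [[1, 2, 3, 4], [10, 8, 7, 1]]

cooc = label_cooccurrence(Y)
linear_copy = [2 * v + 1 for v in columns[0]]  # linear function of column 0
keep_linear_copy = not is_redundant(linear_copy, columns)
keep_other = not is_redundant([1, -1, -1, 1], columns)
```

In this toy data, `linear_copy` is a linear function of the first column, so its correlation with that column is 1 and it would be rejected, while the second candidate is only weakly correlated with either column and would be kept for the accuracy-based evaluation the abstract describes.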
Taxonomy
Topics: Text and Document Classification Technologies · Topic Modeling · Sentiment Analysis and Opinion Mining
