Retrieval-Augmented Feature Generation for Domain-Specific Classification
Xinhao Zhang, Jinghan Zhang, Fengran Mo, Dakshak Keerthi Chandra, Yu-Zhong Chen, Fei Xie, Kunpeng Liu

TL;DR
This paper presents RAFG, a retrieval-augmented method that uses knowledge retrieval and large language models to generate interpretable, domain-specific features, significantly boosting classification accuracy in limited-data scenarios.
Contribution
The paper introduces RAFG, a novel retrieval-augmented framework leveraging knowledge retrieval and LLM reasoning to generate explainable, high-quality features for domain-specific classification tasks.
Findings
RAFG improves classification performance across multiple domains.
Generated features are interpretable and domain-relevant.
RAFG outperforms baseline feature generation methods.
Abstract
Feature generation can significantly enhance learning outcomes, particularly for tasks with limited data. An effective way to improve feature generation is to expand the current feature space by using existing features and enriching their informational content. However, generating new, interpretable features usually requires domain-specific knowledge on top of the existing features. In this paper, we introduce a Retrieval-Augmented Feature Generation method, RAFG, to generate useful and explainable features specific to domain classification tasks. To increase the interpretability of the generated features, we conduct knowledge retrieval among the existing features in the domain to identify potential feature associations. These associations are expected to help generate useful features. Moreover, we develop a framework based on large language models (LLMs) for feature generation with…
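The idea of expanding a feature space from associations between existing features can be illustrated with a minimal Python sketch. It assumes the associations have already been identified; the retrieval and LLM steps described in the abstract are not modeled here. The feature names, the `ASSOCIATIONS` table, the operator names, and `expand_features` are hypothetical illustrations, not the paper's implementation.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, float]

# Hypothetical associations: (new feature name, left feature, right feature, operator).
# In RAFG these would come from knowledge retrieval and LLM reasoning; here they
# are hand-written purely for illustration.
ASSOCIATIONS: List[Tuple[str, str, str, str]] = [
    ("bmi", "weight_kg", "height_m", "ratio_sq"),
]

# Hypothetical operator table; division guards against a zero denominator.
OPS: Dict[str, Callable[[float, float], float]] = {
    "ratio": lambda a, b: a / b if b else 0.0,
    "ratio_sq": lambda a, b: a / (b * b) if b else 0.0,
    "product": lambda a, b: a * b,
}


def expand_features(rows: List[Row],
                    associations: List[Tuple[str, str, str, str]]) -> List[Row]:
    """Return copies of the rows with one derived feature added per association."""
    expanded = []
    for row in rows:
        new_row = dict(row)  # copy, so the input rows are left unchanged
        for name, left, right, op in associations:
            new_row[name] = OPS[op](row[left], row[right])
        expanded.append(new_row)
    return expanded


rows = [
    {"weight_kg": 70.0, "height_m": 1.75},
    {"weight_kg": 90.0, "height_m": 1.80},
]
expanded = expand_features(rows, ASSOCIATIONS)
```

Each derived column carries a readable name and an explicit formula, which is one simple way a generated feature can stay explainable; whether a given feature actually helps would still need to be checked against a downstream classifier.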
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
