Knowledge-Informed Automatic Feature Extraction via Collaborative Large Language Model Agents

Henrik Bradland; Morten Goodwin; Vladimir I. Zadorozhny; Per-Arne Andersen

arXiv:2511.15074 · cs.AI · November 20, 2025

TL;DR

Rogue One is a multi-agent LLM framework that enhances feature extraction for tabular data by integrating external knowledge, qualitative feedback, and iterative collaboration, leading to more meaningful and powerful features.

Contribution

The paper introduces a decentralized multi-agent system that uses qualitative feedback and knowledge retrieval to improve automatic feature extraction beyond existing monolithic LLM approaches.

Findings

01. Outperforms state-of-the-art methods on multiple datasets

02. Generates semantically meaningful and interpretable features

03. Identifies novel hypotheses like potential biomarkers

Abstract

The performance of machine learning models on tabular data is critically dependent on high-quality feature engineering. While Large Language Models (LLMs) have shown promise in automating feature extraction (AutoFE), existing methods are often limited by monolithic LLM architectures, simplistic quantitative feedback, and a failure to systematically integrate external domain knowledge. This paper introduces Rogue One, a novel, LLM-based multi-agent framework for knowledge-informed automatic feature extraction. Rogue One operationalizes a decentralized system of three specialized agents (Scientist, Extractor, and Tester) that collaborate iteratively to discover, generate, and validate predictive features. Crucially, the framework moves beyond primitive accuracy scores by introducing a rich, qualitative feedback mechanism and a "flooding-pruning" strategy, allowing it to dynamically balance…

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare · Topic Modeling