CoFEE: Reasoning Control for LLM-Based Feature Discovery

Maximilian Westermann; Ben Griffin; Aaron Ontoyin Yin; Zakari Salifu; Yagiz Ihlamur; Kelvin Amoaba; Joseph Ternasky; Fuat Alican; Yigit Ihlamur

arXiv:2604.21584 · cs.AI · April 24, 2026

TL;DR

This paper introduces CoFEE, a framework that enforces cognitive reasoning behaviors in LLMs during feature discovery from unstructured data, yielding higher-quality features at lower computational cost.

Contribution

The paper proposes a reasoning control framework for LLMs that enforces cognitive behaviors during feature discovery, improving feature quality and reducing cost.

Findings

01. CoFEE achieves a 15.2% higher success rate than vanilla LLM prompts.

02. CoFEE generates 29% fewer features, reducing computational costs by 53.3%.

03. Cognitive reasoning control improves feature quality and generalization.

Abstract

Feature discovery from complex unstructured data is fundamentally a reasoning problem: it requires identifying abstractions that are predictive of a target outcome while avoiding leakage, proxies, and post-outcome signals. With ever-improving Large Language Models (LLMs), our method offers a structured approach to this challenge. LLMs are well suited to this task because they can process large amounts of information, but unconstrained feature generation can lead to weak features. In this work, we study reasoning control in LLMs by inducing cognitive behaviors to improve feature discovery. We introduce CoFEE (Cognitive Feature Engineering Engine), a reasoning control framework that enforces cognitive behaviors in how the LLM reasons during feature discovery. From a machine learning perspective, these cognitive behaviors act as structured inductive…
