Translating Expert Intuition into Quantifiable Features: Encode Investigator Domain Knowledge via LLM for Enhanced Predictive Analytics
Phoebe Jing, Yijing Gao, Yuanhang Zhang, Xianlong Zeng

TL;DR
This paper introduces a framework using Large Language Models to convert investigator expertise into quantifiable features, significantly improving predictive analytics by integrating human domain knowledge into machine learning models.
Contribution
It presents a novel method for encoding expert insights into structured features using LLMs, enhancing model performance across multiple prediction tasks.
Findings
Improved risk assessment accuracy
Enhanced decision-making performance
Effective integration of human expertise into ML models
Abstract
In the realm of predictive analytics, the nuanced domain knowledge of investigators often remains underutilized, confined largely to subjective interpretations and ad hoc decision-making. This paper explores the potential of Large Language Models (LLMs) to bridge this gap by systematically converting investigator-derived insights into quantifiable, actionable features that enhance model performance. We present a framework that leverages LLMs' natural language understanding capabilities to encode the red flags that investigators identify into a structured feature set that can be readily integrated into existing predictive models. Through a series of case studies, we demonstrate how this approach not only preserves the critical human expertise within the investigative process but also scales the impact of this knowledge across various prediction tasks. The results indicate significant improvements in risk…
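The abstract does not give implementation details, so the following is only a minimal sketch of the general idea, under stated assumptions: investigator red flags are written as plain-language descriptions, a judge function scores each case note against each description, and the scores are appended to a model's existing feature row. All names here (`RED_FLAGS`, `stub_llm_judge`, `encode_case`, `augment_features`) and the example flags are hypothetical, and the judge is a keyword-overlap placeholder standing in for a real LLM call, not the authors' method.

```python
import re
from typing import Callable, Dict

# Hypothetical red flags, phrased the way an investigator might describe them.
RED_FLAGS: Dict[str, str] = {
    "flag_rushed_filing": "claim filed within days of policy start",
    "flag_inconsistent_story": "statement contradicts earlier account",
}


def _words(text: str) -> set:
    """Lowercase alphabetic tokens longer than four characters."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 4}


def stub_llm_judge(flag_description: str, case_note: str) -> float:
    """Placeholder for an LLM call (assumption, not a real API).

    Returns 1.0 when at least two longer words of the flag description
    also occur in the case note, otherwise 0.0. A real system would
    prompt a language model with both texts instead.
    """
    overlap = _words(flag_description) & _words(case_note)
    return 1.0 if len(overlap) >= 2 else 0.0


def encode_case(
    case_note: str,
    judge: Callable[[str, str], float] = stub_llm_judge,
) -> Dict[str, float]:
    """Turn one free-text case note into one numeric feature per red flag."""
    return {name: judge(desc, case_note) for name, desc in RED_FLAGS.items()}


def augment_features(
    row: Dict[str, float],
    case_note: str,
    judge: Callable[[str, str], float] = stub_llm_judge,
) -> Dict[str, float]:
    """Append the red-flag features to an existing feature row."""
    return {**row, **encode_case(case_note, judge)}
```

Passing the judge as a parameter keeps the feature-building code independent of whichever model eventually supplies the scores, so the placeholder can be swapped out without touching the downstream predictive model.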
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Big Data and Business Intelligence · Scientific Computing and Data Management
Methods: Sparse Evolutionary Training · High-Order Consensuses
