Cohort-Aware Agents for Individualized Lung Cancer Risk Prediction Using a Retrieval-Augmented Model Selection Framework

Chongyu Qu; Allen J. Luna; Thomas Z. Li; Junchao Zhu; Junlin Guo; Juming Xiong; Kim L. Sandler; Bennett A. Landman; Yuankai Huo

arXiv:2508.14940·cs.LG·August 27, 2025

TL;DR

This paper introduces a personalized lung cancer risk prediction system that dynamically selects the best model for each patient by combining cohort retrieval with reasoning via large language models, improving individual risk assessment accuracy.

Contribution

It presents a novel retrieval-augmented framework that enables cohort-aware, individualized model selection for lung cancer risk prediction using multi-modal data and LLM reasoning.

Findings

1. Effective cohort retrieval using FAISS improves model relevance.

2. LLM-based reasoning enhances personalized model selection.

3. Framework demonstrates potential for real-world clinical application.

Abstract

Accurate lung cancer risk prediction remains challenging due to substantial variability across patient populations and clinical settings -- no single model performs best for all cohorts. To address this, we propose a personalized lung cancer risk prediction agent that dynamically selects the most appropriate model for each patient by combining cohort-specific knowledge with modern retrieval and reasoning techniques. Given a patient's CT scan and structured metadata -- including demographic, clinical, and nodule-level features -- the agent first performs cohort retrieval using FAISS-based similarity search across nine diverse real-world cohorts to identify the most relevant patient population from a multi-institutional database. Second, a Large Language Model (LLM) is prompted with the retrieved cohort and its associated performance metrics to recommend the optimal prediction algorithm…
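
The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: an exact brute-force L2 search in plain Python stands in for the FAISS index, and a simple "pick the highest metric" rule stands in for the LLM's recommendation. The cohort names, model names, feature vectors, and AUC values below are hypothetical placeholders, not values from the paper.

```python
import math
from collections import Counter

# Hypothetical reference database: (cohort name, patient feature vector).
# In the paper these would be derived from CT and structured metadata.
REFERENCE_PATIENTS = [
    ("cohort_A", [0.9, 0.1, 0.2]),
    ("cohort_A", [0.8, 0.2, 0.1]),
    ("cohort_B", [0.1, 0.9, 0.7]),
    ("cohort_B", [0.2, 0.8, 0.9]),
]

# Hypothetical per-cohort performance of made-up models (placeholder AUCs).
COHORT_MODEL_AUC = {
    "cohort_A": {"model_x": 0.70, "model_y": 0.80},
    "cohort_B": {"model_x": 0.85, "model_y": 0.60},
}


def retrieve_cohort(query, k=3):
    """Step 1: return the majority cohort among the k nearest reference patients.

    Exact L2 search over a Python list; a FAISS index would replace this
    loop at scale.
    """
    ranked = sorted(REFERENCE_PATIENTS, key=lambda item: math.dist(query, item[1]))
    votes = Counter(name for name, _ in ranked[:k])
    return votes.most_common(1)[0][0]


def build_prompt(cohort):
    """Step 2a: format the retrieved cohort and its metrics as an LLM prompt."""
    lines = [f"Retrieved cohort: {cohort}", "Model performance (AUC):"]
    for model, auc in sorted(COHORT_MODEL_AUC[cohort].items()):
        lines.append(f"- {model}: {auc:.2f}")
    lines.append("Recommend the most appropriate model for this patient.")
    return "\n".join(lines)


def select_model(cohort):
    """Step 2b: stand-in for the LLM's answer, choosing the highest-AUC model."""
    metrics = COHORT_MODEL_AUC[cohort]
    return max(metrics, key=metrics.get)


if __name__ == "__main__":
    patient = [0.85, 0.15, 0.15]  # hypothetical feature vector for a new patient
    cohort = retrieve_cohort(patient)
    print(build_prompt(cohort))
    print("Selected:", select_model(cohort))
```

In a fuller version the prompt would be sent to an LLM, whose reasoning could weigh factors beyond a single metric; the argmax rule here only makes the data flow from retrieval to selection concrete.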
