Budget-Aware Routing for Long Clinical Text

Khizar Qureshi; Geoffrey Martin; Yifan Peng

arXiv:2605.00336·cs.CL·May 4, 2026

TL;DR

This paper studies budgeted context selection for large language models on long clinical text. It proposes RCD, a submodular selection objective, together with a budget-adaptive routing heuristic, to decide which document units to keep under a strict token budget for summarization and extraction tasks.

Contribution

The paper introduces RCD, a monotone submodular objective for selecting document units under a token budget, and compares selection heuristics across several unitization (document segmentation) strategies in clinical NLP.
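To make "unitization" concrete, here is a minimal sketch of two of the segmentation schemes named in the abstract (sentence and window units). It is not taken from the paper's repository: the `Unit` class, the function names, the regex-based sentence split, and the whitespace token count are all illustrative assumptions standing in for a real sentence splitter and tokenizer.

```python
import re
from dataclasses import dataclass


@dataclass
class Unit:
    text: str
    tokens: int  # whitespace token count; an assumed stand-in for a real tokenizer


def count_tokens(text: str) -> int:
    # Assumption: one whitespace-separated word counts as one token.
    return len(text.split())


def sentence_units(doc: str) -> list[Unit]:
    # Naive split after '.', '!' or '?' followed by whitespace.
    parts = [s.strip() for s in re.split(r"(?<=[.!?])\s+", doc) if s.strip()]
    return [Unit(p, count_tokens(p)) for p in parts]


def window_units(doc: str, size: int = 8, stride: int = 8) -> list[Unit]:
    # Fixed-size word windows; stride == size gives non-overlapping windows.
    words = doc.split()
    units = []
    for start in range(0, len(words), stride):
        chunk = words[start:start + size]
        units.append(Unit(" ".join(chunk), len(chunk)))
    return units


doc = "Patient admitted with chest pain. Troponin was elevated. Discharged on aspirin."
sentences = sentence_units(doc)
windows = window_units(doc, size=4, stride=4)
```

Each scheme yields a list of units with a token cost, which is the input a budgeted selector needs. Section and cluster-based unitization would need section headers or embeddings and are left out of this sketch.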

Findings

01. Positional heuristics excel at low budgets for extractive tasks.

02. Diversity-aware methods like MMR enhance LLM generation quality.

03. Cluster-based grouping reduces performance compared to other unitization schemes.
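As an illustration of the diversity-aware selection mentioned in the findings, the following is a minimal sketch of Maximal Marginal Relevance (MMR) with a token budget. It is not the paper's implementation: the bag-of-words cosine similarity, the whitespace token cost, the `lam` trade-off weight, and the function names are assumptions chosen to keep the example self-contained.

```python
import math
from collections import Counter


def bow(text: str) -> Counter:
    # Bag-of-words vector; an assumed stand-in for a real embedding.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def mmr_select(query: str, units: list[str], budget: int, lam: float = 0.5) -> list[int]:
    """Greedy MMR: repeatedly add the unit that fits the remaining budget
    and maximizes lam * relevance - (1 - lam) * max similarity to chosen units."""
    q = bow(query)
    vecs = [bow(u) for u in units]
    costs = [len(u.split()) for u in units]  # assumed token cost
    selected: list[int] = []
    remaining = budget
    candidates = set(range(len(units)))
    while True:
        best, best_score = None, -math.inf
        for i in sorted(candidates):
            if costs[i] > remaining:
                continue  # unit does not fit in what is left of the budget
            redundancy = max((cosine(vecs[i], vecs[j]) for j in selected), default=0.0)
            score = lam * cosine(vecs[i], q) - (1 - lam) * redundancy
            if score > best_score:
                best, best_score = i, score
        if best is None:
            break
        selected.append(best)
        candidates.remove(best)
        remaining -= costs[best]
    return selected


units = [
    "chest pain on admission",
    "chest pain on admission today",
    "discharged on aspirin daily",
]
picked = mmr_select("chest pain aspirin", units, budget=9, lam=0.5)
```

In this toy input the second unit is nearly a duplicate of the first, so the redundancy penalty pushes the selector toward the third unit even though the second is more similar to the query. Note that this sketch keeps adding units while any still fit, even if their MMR score is negative.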

Abstract

A key challenge for large language models is token cost per query and overall deployment cost. Clinical inputs are long, heterogeneous, and often redundant, while downstream tasks are short and high stakes. We study budgeted context selection, where a subset of document units is chosen under a strict token budget so an off-the-shelf generator can meet fixed cost and latency constraints. We cast this as a knapsack-constrained subset selection problem with two design choices, unitization that defines document segmentation and selection that determines which units are kept. We propose RCD, a monotone submodular objective that balances relevance, coverage, and diversity. We compare sentence, section, window, and cluster-based unitization, and introduce a routing heuristic that adapts to the budget regime. Experiments on MIMIC discharge notes, Cochrane abstracts, and L-Eval show…
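The knapsack-constrained selection described in the abstract can be sketched with a cost-benefit greedy over a toy monotone submodular objective. The exact form of RCD is not given in the text above, so `toy_objective` below is an assumed stand-in, not the paper's objective: it sums a relevance term (query-term overlap), a coverage term (distinct terms covered), and a diversity term (square root of relevance mass per group). The weights `beta` and `gamma`, the `group` labels, and all function names are hypothetical.

```python
import math


def make_unit(text: str, group: str, query: str) -> dict:
    terms = set(text.lower().split())
    q = set(query.lower().split())
    return {
        "text": text,
        "group": group,                      # e.g. a note section (assumed label)
        "terms": terms,
        "rel": float(len(terms & q)),        # assumed relevance: query-term overlap
        "cost": max(1, len(text.split())),   # assumed token cost, at least 1
    }


def toy_objective(selected: list[int], units: list[dict],
                  beta: float = 0.1, gamma: float = 1.0) -> float:
    # Relevance is modular; coverage and the sqrt-per-group diversity term are
    # monotone submodular, so the sum is monotone submodular for rel >= 0.
    covered, mass = set(), {}
    for i in selected:
        u = units[i]
        covered |= u["terms"]
        mass[u["group"]] = mass.get(u["group"], 0.0) + u["rel"]
    relevance = sum(units[i]["rel"] for i in selected)
    diversity = sum(math.sqrt(m) for m in mass.values())
    return relevance + beta * len(covered) + gamma * diversity


def greedy_budgeted(units: list[dict], budget: int) -> tuple[list[int], int]:
    """Cost-benefit greedy: add the affordable unit with the best
    marginal gain per token until nothing affordable improves the objective."""
    selected, spent = [], 0
    remaining = set(range(len(units)))
    while remaining:
        base = toy_objective(selected, units)
        best, best_ratio = None, 0.0
        for i in sorted(remaining):
            if spent + units[i]["cost"] > budget:
                continue
            gain = toy_objective(selected + [i], units) - base
            ratio = gain / units[i]["cost"]
            if ratio > best_ratio:
                best, best_ratio = i, ratio
        if best is None:
            break
        selected.append(best)
        spent += units[best]["cost"]
        remaining.remove(best)
    return selected, spent


query = "chest pain aspirin"
units = [
    make_unit("chest pain on admission", "HPI", query),
    make_unit("chest pain on admission today", "HPI", query),
    make_unit("discharged on aspirin daily", "Plan", query),
]
chosen, tokens_used = greedy_budgeted(units, budget=9)
```

With these assumed weights the near-duplicate second unit adds little coverage or diversity, so the greedy pass prefers the unit from a different group. Plain ratio greedy is a common heuristic; variants with approximation guarantees typically also compare the result against the best single affordable unit, which this sketch omits.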

Code & Models

Repositories

stone-technologies/ACL_budget_paper
github
