Optimal Query Allocation in Extractive QA with LLMs: A Learning-to-Defer Framework with Theoretical Guarantees

Yannis Montreuil; Shu Heng Yeo; Axel Carlier; Lai Xing Ng; Wei Tsang Ooi

arXiv:2410.15761 · cs.CL · May 21, 2026

TL;DR

This paper introduces a Learning-to-Defer framework for extractive question answering with LLMs. The framework allocates each query to a specialized expert, with theoretical guarantees on the deferral strategy, to improve answer reliability while reducing computational cost.

Contribution

The paper presents an allocation strategy that balances performance against cost, supports it with theoretical guarantees on optimal deferral, and evaluates it on SQuADv1, SQuADv2, and TriviaQA.

Findings

01 Improves answer reliability in extractive QA tasks.

02 Reduces computational overhead significantly.

03 Provides theoretical guarantees on deferral strategy.

Abstract

Large Language Models excel in generative tasks but exhibit inefficiencies in structured text selection, particularly in extractive question answering. This challenge is magnified in resource-constrained environments, where deploying multiple specialized models for different tasks is impractical. We propose a Learning-to-Defer framework that allocates queries to specialized experts, ensuring high-confidence predictions while optimizing computational efficiency. Our approach integrates a principled allocation strategy with theoretical guarantees on optimal deferral that balances performance and cost. Empirical evaluations on SQuADv1, SQuADv2, and TriviaQA demonstrate that our method enhances answer reliability while significantly reducing computational overhead, making it well-suited for scalable and efficient EQA deployment.
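
To make the allocation idea in the abstract concrete, here is a minimal sketch of a generic cost-aware deferral rule: each query goes to whichever agent has the lowest estimated loss plus consultation cost. This is an illustration under stated assumptions, not the authors' method. The `Agent` class, the `allocate` function, the cost values, and the assumption that per-agent expected losses are already available (for example, from a learned rejector) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Agent:
    """Hypothetical agent: a predictor or expert a query can be routed to."""
    name: str
    cost: float  # assumed per-query consultation cost, in the same units as the loss


def allocate(expected_losses: Sequence[float], agents: Sequence[Agent]) -> int:
    """Return the index of the agent minimizing expected loss plus cost.

    expected_losses[j] is an assumed estimate of agent j's loss on this query.
    Ties are broken in favor of the agent listed first.
    """
    if len(expected_losses) != len(agents):
        raise ValueError("need exactly one expected loss per agent")
    if not agents:
        raise ValueError("need at least one agent")
    scores = [loss + agent.cost for loss, agent in zip(expected_losses, agents)]
    return min(range(len(agents)), key=scores.__getitem__)


if __name__ == "__main__":
    # Illustrative numbers only: a cheap small model and a costlier large one.
    agents = [Agent("small_model", 0.0), Agent("large_model", 0.2)]
    chosen = allocate([0.5, 0.1], agents)
    print(agents[chosen].name)
```

Under this rule a query is deferred to the costlier agent only when its estimated accuracy gain outweighs the extra cost, which is the performance-versus-cost trade-off the abstract describes.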

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems