Halo: Domain-Aware Query Optimization for Long-Context Question Answering
Pramod Chunduri, Francisco Romero, Ali Payani, Kexin Rong, Joy Arulraj

TL;DR
Halo is a framework that enhances long-context question answering by automatically leveraging domain knowledge to improve accuracy and reduce computational costs through a multi-stage, knowledge-guided query processing pipeline.
Contribution
Halo introduces a systematic method to extract and apply domain knowledge as executable operators in a multi-stage QA pipeline, improving accuracy and efficiency over existing approaches.
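To make "domain knowledge as executable operators" concrete, here is a minimal sketch of the general idea. It is not Halo's actual implementation or API. It assumes a domain hint can be compiled into cheap filters that run before any LLM call. The names `Chunk`, `section_filter`, `keyword_filter`, and `run_pipeline`, and the sample document, are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class Chunk:
    section: str  # section heading the chunk came from
    text: str     # chunk contents


# An operator maps a list of chunks to a (usually smaller) list of chunks.
Operator = Callable[[List[Chunk]], List[Chunk]]


def section_filter(allowed: Set[str]) -> Operator:
    """Hypothetical operator: keep only chunks from the named sections."""
    def apply(chunks: List[Chunk]) -> List[Chunk]:
        return [c for c in chunks if c.section in allowed]
    return apply


def keyword_filter(keywords: List[str]) -> Operator:
    """Hypothetical operator: keep chunks mentioning any keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]

    def apply(chunks: List[Chunk]) -> List[Chunk]:
        return [c for c in chunks if any(k in c.text.lower() for k in lowered)]
    return apply


def run_pipeline(chunks: List[Chunk], operators: List[Operator]) -> List[Chunk]:
    """Apply operators in order; each stage narrows what later stages must read."""
    for op in operators:
        chunks = op(chunks)
    return chunks


# Made-up sample document, for illustration only.
document = [
    Chunk("Risk Factors", "Currency fluctuations may affect results."),
    Chunk("Financial Statements", "Total revenue was $4.2 billion in 2023."),
    Chunk("Financial Statements", "Operating lease liabilities are listed below."),
    Chunk("Exhibits", "Revenue recognition policy appendix."),
]

# Assumed domain hint: "revenue figures are reported in the Financial Statements section".
selected = run_pipeline(
    document,
    [section_filter({"Financial Statements"}), keyword_filter(["revenue"])],
)

for chunk in selected:
    print(chunk.section, "|", chunk.text)
```

In this sketch the hint is enforced by code and is not left as prompt tokens the model may ignore, so only the surviving chunks would need to be sent to an LLM in a later stage.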
Findings
Achieves up to 13% higher accuracy
Reduces cost by 4.8x compared to baselines
Enables lightweight models to approach frontier LLM accuracy at 78x lower cost
Abstract
Long-context question answering (QA) over lengthy documents is critical for applications such as financial analysis, legal review, and scientific research. Current approaches, such as processing entire documents via a single LLM call or retrieving relevant chunks via RAG, have two drawbacks. First, as context size increases, response quality can degrade, impacting accuracy. Second, iteratively processing hundreds of input documents can incur prohibitively high costs in API calls. To improve response quality and reduce the number of iterations needed to get the desired response, users tend to add domain knowledge to their prompts. However, existing systems fail to systematically capture and use this knowledge to guide query processing. Domain knowledge is treated as prompt tokens alongside the document: the LLM may or may not follow it, there is no reduction in computational cost, and…
Taxonomy
Topics: Topic Modeling · Information Retrieval and Search Behavior · Expert finding and Q&A systems
