Structure and Diversity Aware Context Bubble Construction for Enterprise Retrieval Augmented Systems

Amir Khurshid; Abhishek Sehgal

arXiv:2601.10681 · cs.AI · January 16, 2026

TL;DR

This paper introduces a structure-aware, diversity-constrained method for constructing coherent context bundles in enterprise retrieval systems, improving relevance and coverage while reducing redundancy within token limits.

Contribution

The paper proposes a framework that uses document structure and diversity constraints to assemble compact, informative context sets, and reports that it outperforms traditional top-k retrieval.

Findings

01 Reduces redundant context significantly

02 Improves coverage of secondary facets

03 Enhances answer quality and citation faithfulness

Abstract

Large language model (LLM) contexts are typically constructed using retrieval-augmented generation (RAG), which ranks passages and selects the top-k. This approach fragments the information graph implicit in document structure, over-retrieves and duplicates content, and still leaves query context insufficient, including second- and third-order facets. This paper proposes a structure-informed, diversity-constrained framework for constructing context bubbles: coherent, citable bundles of spans assembled under a strict token budget. The method preserves and exploits inherent document structure by organising multi-granular spans (e.g., sections and rows) and using task-conditioned structural priors to guide retrieval. Starting from high-relevance anchor spans, a context bubble is constructed through constrained selection that balances query relevance, marginal coverage, and…
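
To make the selection idea concrete, here is a minimal Python sketch of anchor-seeded greedy selection under a token budget. It is an illustration only, not the authors' algorithm: the abstract is truncated before it names the third balancing term, so the redundancy penalty used here is an assumption, as are the `Span` fields, the facet labels, the weights `w_cov` and `w_red`, and the per-section cap standing in for a structural diversity constraint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    span_id: str
    section: str        # structural unit the span came from (hypothetical field)
    tokens: int         # token cost of including the span
    relevance: float    # query relevance, assumed precomputed by a retriever
    facets: frozenset   # query facets the span covers (hypothetical labels)


def build_bubble(spans, budget, n_anchors=1, w_cov=0.5, w_red=0.5, section_cap=2):
    """Greedy sketch: seed with anchors, then add the best-scoring span that fits."""
    pool = sorted(spans, key=lambda s: s.relevance, reverse=True)
    selected, covered, per_section, used = [], set(), {}, 0

    def take(span):
        nonlocal used
        selected.append(span)
        covered.update(span.facets)
        per_section[span.section] = per_section.get(span.section, 0) + 1
        used += span.tokens
        pool.remove(span)

    # Step 1: seed the bubble with the highest-relevance anchors that fit.
    for span in pool[:n_anchors]:
        if used + span.tokens <= budget:
            take(span)

    # Step 2: repeatedly add the span with the best positive score.
    while True:
        best, best_score = None, 0.0
        for span in pool:
            if used + span.tokens > budget:
                continue  # would break the token budget
            if per_section.get(span.section, 0) >= section_cap:
                continue  # assumed diversity constraint: cap spans per section
            gain = len(span.facets - covered)      # newly covered facets
            overlap = len(span.facets & covered)   # facets already covered
            score = span.relevance + w_cov * gain - w_red * overlap
            if score > best_score:
                best, best_score = span, score
        if best is None:
            break
        take(best)
    return selected, used


spans = [
    Span("a", "pricing", 40, 0.90, frozenset({"price"})),
    Span("b", "pricing", 40, 0.85, frozenset({"price"})),
    Span("c", "terms", 50, 0.60, frozenset({"sla", "penalty"})),
    Span("d", "appendix", 80, 0.30, frozenset({"price"})),
]
bubble, used = build_bubble(spans, budget=100)
print([s.span_id for s in bubble], used)  # ['a', 'c'] 90
```

In this toy input, plain top-k by relevance would pick `a` and `b`, which both cover only the `price` facet. The sketch instead keeps the anchor `a` and spends the remaining budget on `c`, which adds two uncovered facets.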

Taxonomy

Topics: Information Retrieval and Search Behavior · Topic Modeling · Advanced Graph Neural Networks