ConTReGen: Context-driven Tree-structured Retrieval for Open-domain Long-form Text Generation
Kashob Kumar Roy, Pritom Saha Akash, Kevin Chen-Chuan Chang, Lucian Popa

TL;DR
ConTReGen is a novel framework that uses a hierarchical, tree-structured retrieval method to improve the depth, relevance, and coherence of open-domain long-form text generation, effectively addressing complex queries.
Contribution
It introduces a context-driven, tree-structured retrieval approach that combines hierarchical exploration with bottom-up synthesis, enhancing existing retrieval-augmented generation methods.
Findings
Outperforms state-of-the-art RAG models on multiple datasets
Demonstrates improved depth and relevance in generated responses
Validates effectiveness on a newly introduced dataset, ODSUM-WikiHow
Abstract
Open-domain long-form text generation requires generating coherent, comprehensive responses that address complex queries with both breadth and depth. This task is challenging due to the need to accurately capture diverse facets of input queries. Existing iterative retrieval-augmented generation (RAG) approaches often struggle to delve deeply into each facet of complex queries and integrate knowledge from various sources effectively. This paper introduces ConTReGen, a novel framework that employs a context-driven, tree-structured retrieval approach to enhance the depth and relevance of retrieved content. ConTReGen integrates a hierarchical, top-down in-depth exploration of query facets with a systematic bottom-up synthesis, ensuring comprehensive coverage and coherent integration of multifaceted information. Extensive experiments on multiple datasets, including LFQA and ODSUM, alongside…
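The top-down exploration and bottom-up synthesis described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the names `Node`, `build_tree`, `synthesize`, and the toy `retrieve`, `decompose`, and `summarize` callables are all hypothetical stand-ins. In a real system these would presumably be a retriever and LLM calls, and the abstract does not specify how facets are chosen or when expansion stops, so a fixed `max_depth` is assumed here.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """One query facet in the retrieval tree (hypothetical structure)."""
    query: str
    passages: List[str] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)
    summary: str = ""


def build_tree(
    query: str,
    retrieve: Callable[[str], List[str]],
    decompose: Callable[[str, List[str]], List[str]],
    max_depth: int,
    depth: int = 0,
) -> Node:
    """Top-down phase: retrieve for a query, then recurse into its sub-queries."""
    node = Node(query=query, passages=retrieve(query))
    if depth < max_depth:  # assumed stopping rule, not taken from the paper
        for sub_query in decompose(query, node.passages):
            node.children.append(
                build_tree(sub_query, retrieve, decompose, max_depth, depth + 1)
            )
    return node


def synthesize(node: Node, summarize: Callable[[str, List[str]], str]) -> str:
    """Bottom-up phase: summarize children first, then merge into the parent."""
    child_summaries = [synthesize(child, summarize) for child in node.children]
    node.summary = summarize(node.query, node.passages + child_summaries)
    return node.summary


# Toy stand-ins for a retriever, a facet generator, and a summarizer.
CORPUS = {
    "how to bake bread": ["Bread needs dough and an oven."],
    "how to make dough": ["Mix flour, water, yeast."],
    "how to use an oven": ["Preheat to 220C."],
}
FACETS = {"how to bake bread": ["how to make dough", "how to use an oven"]}


def toy_retrieve(query: str) -> List[str]:
    return CORPUS.get(query, [])


def toy_decompose(query: str, passages: List[str]) -> List[str]:
    return FACETS.get(query, [])


def toy_summarize(query: str, evidence: List[str]) -> str:
    return " ".join(evidence)


root = build_tree("how to bake bread", toy_retrieve, toy_decompose, max_depth=2)
answer = synthesize(root, toy_summarize)
print(answer)
```

Passing the retriever, decomposer, and summarizer in as callables keeps the tree logic independent of any particular model or index, so the toy functions can be swapped for real components without changing the traversal.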
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Semantic Web and Ontologies
Methods: Attention Is All You Need · Adam · Linear Layer · Dropout · Byte Pair Encoding · Layer Normalization · Residual Connection · Linear Warmup With Linear Decay · Dense Connections
