LightPAL: Lightweight Passage Retrieval for Open Domain Multi-Document Summarization
Masafumi Enomoto, Kunihiro Takeoka, Kosuke Akimoto, Kiril Gashteovski, Masafumi Oyamada

TL;DR
LightPAL introduces a lightweight, graph-based passage retrieval method that enhances open-domain multi-document summarization efficiency and effectiveness without iterative large language model inference.
Contribution
The paper presents LightPAL, a novel retrieval approach that pre-constructs passage graphs and uses random walks, reducing latency and improving performance over existing methods.
Findings
LightPAL outperforms naive sparse and dense retrievers in retrieval metrics.
LightPAL achieves better summarization quality compared to baseline retrieval methods.
LightPAL is more efficient than iterative MQA approaches.
Abstract
Open-Domain Multi-Document Summarization (ODMDS) is the task of generating summaries from large document collections in response to user queries. This task is crucial for efficiently addressing diverse information needs from users. Traditional retrieve-then-summarize approaches fall short for open-ended queries in ODMDS tasks. These queries often require broader context than the initially retrieved passages provide, making it challenging to retrieve all relevant information in a single search. While iterative retrieval methods have been explored for multi-hop question answering (MQA), they are impractical for ODMDS due to the high latency of repeated LLM inference. Accordingly, we propose LightPAL, a lightweight passage retrieval method for ODMDS. LightPAL leverages an LLM to pre-construct a graph representing passage relationships, then employs random walk during retrieval, avoiding iterative LLM…
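The retrieval step described in the abstract, a random walk over a pre-built passage graph, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the abstract is truncated and does not specify the walk variant, so the sketch uses a random walk with restart computed by power iteration, and the function name random_walk_scores, the restart and steps parameters, and the toy graph are all hypothetical. The graph is assumed to have been built offline (in the paper, with an LLM), so no LLM call happens at query time.

```python
from typing import Dict, List


def random_walk_scores(
    graph: Dict[str, List[str]],
    seed_scores: Dict[str, float],
    restart: float = 0.5,
    steps: int = 20,
) -> Dict[str, float]:
    """Score passages by a random walk with restart (hypothetical sketch).

    graph: passage id -> ids of linked passages (every id must be a key).
    seed_scores: scores from an initial retriever; must sum to more than 0.
    """
    total = sum(seed_scores.values())
    # The walker restarts at initially retrieved passages, in proportion
    # to their retriever scores.
    restart_dist = {p: seed_scores.get(p, 0.0) / total for p in graph}
    scores = dict(restart_dist)
    for _ in range(steps):
        nxt = {p: restart * restart_dist[p] for p in graph}
        for p, mass in scores.items():
            neighbors = graph[p]
            if not neighbors:
                # Passage with no outgoing links: return its mass to the
                # restart distribution so the total stays at 1.
                for q in graph:
                    nxt[q] += (1 - restart) * mass * restart_dist[q]
                continue
            share = (1 - restart) * mass / len(neighbors)
            for q in neighbors:
                nxt[q] += share
        scores = nxt
    return scores


# Toy graph: only p1 is retrieved initially; p3 is reachable through p2.
toy_graph = {"p1": ["p2"], "p2": ["p1", "p3"], "p3": ["p2"], "p4": []}
toy_scores = random_walk_scores(toy_graph, {"p1": 1.0})
ranked = sorted(toy_scores, key=toy_scores.get, reverse=True)
```

In this toy case p3 receives a nonzero score even though the initial retriever did not return it, while the unconnected p4 stays at zero. That is the intended effect of walking the graph: passages linked to the initial hits are surfaced without another round of LLM inference.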
Taxonomy
Topics: Topic Modeling · Data Quality and Management · Natural Language Processing Techniques
