CompAct: Compressing Retrieved Documents Actively for Question Answering

Chanwoong Yoon; Taewhoo Lee; Hyeon Hwang; Minbyul Jeong; Jaewoo Kang

arXiv:2407.09014·cs.CL·October 15, 2024

TL;DR

CompAct is a framework that actively compresses retrieved documents to improve question-answering performance. It achieves high compression rates and works as a flexible plug-in module for retrieval-augmented systems.

Contribution

The paper introduces an active document compression method that condenses extensive information without losing key details, improving multi-hop question answering.

Findings

01

Significant performance improvements on multi-hop QA benchmarks.

02

Achieves a compression rate of up to 47x.

03

Operates as a flexible, cost-efficient plug-in module.
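
To make the 47x figure concrete, here is a minimal sketch of how a compression rate could be computed. It assumes the rate is the ratio of token counts before and after compression, which this page does not spell out, and it uses whitespace splitting as a stand-in for a real tokenizer. The function name `compression_rate` is hypothetical and is not taken from the CompAct repository.

```python
from typing import List


def compression_rate(original_docs: List[str], compressed_text: str) -> float:
    """Ratio of tokens before compression to tokens after compression.

    Assumption: tokens are approximated by whitespace-separated words.
    A real evaluation would count tokens with the reader model's tokenizer.
    """
    original_tokens = sum(len(doc.split()) for doc in original_docs)
    compressed_tokens = len(compressed_text.split())
    if compressed_tokens == 0:
        raise ValueError("compressed text is empty; the rate is undefined")
    return original_tokens / compressed_tokens


# Toy illustration: 8 words in, 2 words out.
docs = ["a b c d", "e f g h"]
rate = compression_rate(docs, "a e")
print(f"{rate:.1f}x")
```

Under this definition, a 47x rate would mean the reader model sees roughly one token for every 47 tokens that were retrieved.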

Abstract

Retrieval-augmented generation helps language models strengthen their factual grounding by providing external contexts. However, language models often face challenges when given extensive information, which diminishes their effectiveness in answering questions. Context compression tackles this issue by filtering out irrelevant information, but current methods still struggle in realistic scenarios where crucial information cannot be captured with a single-step approach. To overcome this limitation, we introduce CompAct, a novel framework that employs an active strategy to condense extensive documents without losing key information. Our experiments demonstrate that CompAct brings significant improvements in both performance and compression rate on multi-hop question-answering benchmarks. CompAct flexibly operates as a cost-efficient plug-in module with various off-the-shelf retrievers or…
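
The contrast the abstract draws between a single-step approach and an active strategy can be sketched as a loop that reads retrieved documents in segments, carries a running compressed context forward, and stops once that context looks sufficient. This is an illustrative sketch under stated assumptions, not the authors' implementation: `active_compress`, `make_toy_compressor`, `segment_size`, and `max_iterations` are hypothetical names, and the keyword-matching compressor merely stands in for a trained language model.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: (question, previous_summary, segment) -> (summary, is_complete)
Compressor = Callable[[str, str, List[str]], Tuple[str, bool]]


def active_compress(
    question: str,
    documents: List[str],
    compressor: Compressor,
    segment_size: int = 5,
    max_iterations: int = 6,
) -> str:
    """Compress documents segment by segment, reusing the previous summary."""
    summary = ""
    for step, start in enumerate(range(0, len(documents), segment_size)):
        if step >= max_iterations:
            break
        segment = documents[start:start + segment_size]
        # Each step sees the running summary plus only the new segment.
        summary, is_complete = compressor(question, summary, segment)
        if is_complete:
            break  # stop early once the summary is judged sufficient
    return summary


def make_toy_compressor(required_terms: List[str]) -> Compressor:
    """Toy stand-in for a model: keep documents that mention a required term."""
    terms = [t.lower() for t in required_terms]

    def compress(question: str, previous_summary: str, segment: List[str]):
        kept = [doc for doc in segment if any(t in doc.lower() for t in terms)]
        summary = " ".join(s for s in [previous_summary, *kept] if s)
        is_complete = all(t in summary.lower() for t in terms)
        return summary, is_complete

    return compress


docs = [
    "Inception was directed by Christopher Nolan.",
    "The Eiffel Tower is in Paris.",
    "Christopher Nolan was born in London.",
    "Bananas are rich in potassium.",
]
compressor = make_toy_compressor(["directed", "born"])
print(active_compress("Where was the director of Inception born?", docs,
                      compressor, segment_size=2))
```

In this toy run, the first segment supplies only the director, so the loop continues to the second segment to find the birthplace. That is the kind of multi-hop case where a single pass over one segment would miss a needed fact.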

Code & Models

Repositories

dmis-lab/compact (PyTorch, official)

Videos

COMPACT: Compressing Retrieved Documents Actively for Question Answering (Underline)

Taxonomy

Topics

Topic Modeling