Topic-DPR: Topic-based Prompts for Dense Passage Retrieval
Qingfa Xiao, Shuangyin Li, Lei Chen

TL;DR
Topic-DPR introduces topic-based prompts and a novel sampling strategy to enhance dense passage retrieval, addressing semantic space collapse and improving retrieval accuracy over previous methods.
Contribution
The paper proposes a new dense passage retrieval model using multiple topic prompts and a semi-structured data sampling strategy, improving semantic space distribution and retrieval performance.
Findings
Outperforms previous state-of-the-art retrieval methods.
Uses multiple topic prompts for better semantic space distribution.
Employs semi-structured data for improved sampling efficiency.
Abstract
Prompt-based learning's efficacy across numerous natural language processing tasks has led to its integration into dense passage retrieval. Prior research has mainly focused on enhancing the semantic understanding of pre-trained language models by optimizing a single vector as a continuous prompt. This approach, however, leads to a semantic space collapse; identical semantic information seeps into all representations, causing their distributions to converge in a restricted region. This hinders differentiation between relevant and irrelevant passages during dense retrieval. To tackle this issue, we present Topic-DPR, a dense passage retrieval model that uses topic-based prompts. Unlike the single prompt method, multiple topic-based prompts are established over a probabilistic simplex and optimized simultaneously through contrastive learning. This encourages representations to align with…
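The abstract names two ingredients, topic prompts defined over a probabilistic simplex and contrastive optimization, without giving formulas in the portion available here. The following is a minimal sketch of one possible reading, not the authors' implementation. The helpers `softmax`, `mix_topic_prompts`, and `in_batch_contrastive_loss` are hypothetical names. Mixing prompts as a convex combination and using an InfoNCE-style loss with in-batch negatives are both assumptions made for illustration.

```python
import math


def softmax(scores):
    """Map real-valued scores to a point on the probability simplex."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_topic_prompts(topic_weights, topic_prompts):
    """Convex combination of K prompt vectors.

    ASSUMPTION: this mixing rule is illustrative only; the source does not
    state how topic weights and prompts are combined.
    """
    dim = len(topic_prompts[0])
    return [
        sum(w * prompt[d] for w, prompt in zip(topic_weights, topic_prompts))
        for d in range(dim)
    ]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def in_batch_contrastive_loss(queries, passages, temperature=0.05):
    """InfoNCE-style loss: passages[i] is the positive for queries[i],
    and every other passage in the batch serves as a negative.

    ASSUMPTION: a generic contrastive objective, not the paper's exact loss.
    """
    losses = []
    for i, q in enumerate(queries):
        logits = [cosine(q, p) / temperature for p in passages]
        m = max(logits)
        # Numerically stable log-sum-exp over the whole batch.
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy data: three hypothetical topic prompts and one topic distribution.
topic_prompts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
topic_weights = softmax([2.0, 0.5, -1.0])
mixed_prompt = mix_topic_prompts(topic_weights, topic_prompts)

# Toy embeddings: matched pairs should give a lower loss than mismatched ones.
queries = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
aligned_loss = in_batch_contrastive_loss(queries, queries)
shuffled_loss = in_batch_contrastive_loss(queries, queries[1:] + queries[:1])
```

In this toy setup the loss is lower when each query is paired with its matching passage than when the pairs are shuffled, which is the behavior a contrastive objective is meant to reward. In a real system the embeddings would come from a pre-trained encoder conditioned on the prompts rather than from hand-written vectors.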
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
Methods: ALIGN
