POQD: Performance-Oriented Query Decomposer for Multi-vector retrieval
Yaoyang Liu, Junlin Li, Yinjun Wu, Zhen Chen

TL;DR
POQD is a query decomposition framework for multi-vector retrieval (MVR). One LLM decomposes each query, an LLM-based optimizer searches for the decomposition prompt, and that prompt is trained alternately with downstream retrieval-based models such as RAG systems. The paper reports gains in both retrieval and end-to-end QA.
Contribution
The paper presents POQD, an end-to-end trainable query decomposer for MVR. It uses an LLM-based optimizer to search for the prompt that drives query decomposition, with the aim of improving retrieval and QA accuracy.
Findings
Outperforms existing query decomposition methods in retrieval tasks.
Improves end-to-end QA accuracy in RAG systems.
Achieves superior MVR performance with reasonable training costs.
Abstract
Although Multi-Vector Retrieval (MVR) has achieved the state of the art on many information retrieval (IR) tasks, its performance depends heavily on how queries are decomposed into smaller pieces, such as phrases or tokens. However, optimizing query decomposition for MVR performance is not end-to-end differentiable. Even worse, jointly solving this problem and training downstream retrieval-based systems, such as RAG systems, could be highly inefficient. To overcome these challenges, we propose Performance-Oriented Query Decomposer (POQD), a novel query decomposition framework for MVR. POQD leverages one LLM for query decomposition and searches for the optimal prompt with an LLM-based optimizer. We further propose an end-to-end training algorithm that alternately optimizes the prompt for query decomposition and the downstream models. This algorithm can achieve superior MVR performance at a reasonable…
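To make concrete why MVR performance depends on the decomposition, here is a minimal sketch in Python. It assumes a sum-of-max-similarity (late-interaction, ColBERT-style) score over sub-query vectors; the abstract does not state which scoring function POQD uses. The names `cosine`, `mvr_score`, `rank`, and the toy 2-D `corpus` are hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mvr_score(sub_query_vecs, doc_vecs):
    """Assumed scoring rule: for each sub-query vector, take its
    best-matching document vector, then sum over sub-queries."""
    return sum(max(cosine(q, d) for d in doc_vecs) for q in sub_query_vecs)


def rank(sub_query_vecs, corpus):
    """Return (doc_id, score) pairs sorted from best to worst."""
    scored = [(doc_id, mvr_score(sub_query_vecs, vecs))
              for doc_id, vecs in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy corpus (made-up 2-D embeddings): document "A" is stored as two
# vectors, document "B" as a single vector.
corpus = {
    "A": [(1.0, 0.0), (0.0, 1.0)],
    "B": [(1.0, 1.0)],
}

# Two decompositions of the same hypothetical query.
decomposed = [(1.0, 0.0), (0.0, 1.0)]  # query split into two pieces
undecomposed = [(1.0, 1.0)]            # query kept as one vector

ranking_decomposed = rank(decomposed, corpus)
ranking_undecomposed = rank(undecomposed, corpus)
```

In this toy setup the two-piece decomposition ranks document "A" first, while the undecomposed query ranks "B" first. Because the ranking changes with a discrete choice of decomposition, the retrieval objective cannot be optimized by gradient descent over the decomposition directly, which is the difficulty the abstract describes.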
Taxonomy
Topics: Data Management and Algorithms · Data Mining Algorithms and Applications · Advanced Database Systems and Queries
Methods: Attention Is All You Need · Linear Layer · Attention Dropout · Softmax · WordPiece · Weight Decay · Multi-Head Attention · Layer Normalization · Byte Pair Encoding
