Dual-View Training for Instruction-Following Information Retrieval
Qingcheng Zeng, Puxuan Yu, Aman Mehta, Fuheng Zhao, Rajhans Samdani

TL;DR
This paper introduces a dual-view data synthesis method based on polarity reversal for training instruction-following retrieval models, improving their ability to obey explicit user constraints (a 45% gain on the FollowIR benchmark with a 305M-parameter encoder).
Contribution
The authors propose a data synthesis strategy based on polarity reversal that makes retrieval systems more sensitive to instructions than standard relevance training does.
Findings
Improves FollowIR benchmark performance by 45% on a 305M-parameter encoder.
Demonstrates that data diversity and instruction supervision are complementary.
Outperforms larger general-purpose embedding models on instruction-aware retrieval tasks.
Abstract
Instruction-following information retrieval (IF-IR) studies retrieval systems that must not only find documents relevant to a query, but also obey explicit user constraints such as required attributes, exclusions, or output preferences. However, most retrievers are trained primarily for semantic relevance and often fail to distinguish documents that match the topic from those that satisfy the instruction. We propose a dual-view data synthesis strategy based on polarity reversal: given a query, a document that is relevant under the instruction, and a hard negative that matches the query but violates the instruction, we prompt an LLM to generate a complementary instruction under which the two documents swap relevance labels. By presenting the same document pair under complementary instructions that invert their relevance labels, the training signal forces the retriever to reconsider the…
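The polarity-reversal construction described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `Triple`, `build_dual_views`, and `toy_reverser` are hypothetical, and `toy_reverser` is only a stand-in for the LLM prompt that would generate the complementary instruction. The sketch shows the one property the abstract states: the same document pair appears under two instructions, with the relevance labels swapped.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Triple:
    """One training example: under `instruction`, `positive` should outrank `negative`."""
    query: str
    instruction: str
    positive: str
    negative: str


# Hypothetical signature for the LLM step: (query, instruction, positive, hard_negative) -> new instruction
Reverser = Callable[[str, str, str, str], str]


def build_dual_views(query: str, instruction: str, positive: str,
                     hard_negative: str, reverse_instruction: Reverser) -> List[Triple]:
    """Return the original view plus a reversed view with swapped relevance labels."""
    complementary = reverse_instruction(query, instruction, positive, hard_negative)
    original_view = Triple(query, instruction, positive, hard_negative)
    # Same two documents, complementary instruction, labels inverted.
    reversed_view = Triple(query, complementary, hard_negative, positive)
    return [original_view, reversed_view]


def toy_reverser(query: str, instruction: str, positive: str, hard_negative: str) -> str:
    """Placeholder for an LLM call (assumption): a real system would prompt a model here."""
    return "Prefer documents that violate this constraint: " + instruction


views = build_dual_views(
    query="python web frameworks",
    instruction="Only include frameworks first released after 2015.",
    positive="FastAPI is a Python web framework first released in 2018.",
    hard_negative="Django is a Python web framework first released in 2005.",
    reverse_instruction=toy_reverser,
)
```

Both triples share the query and the document pair, so the only signal a retriever can use to rank them differently is the instruction text. How the resulting triples are fed to a contrastive loss is not specified in the summary above and is left out here.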
