TL;DR
This paper introduces a method to interpret neural retrieval models by decoding their latent space into queries, enabling better understanding and improved query suggestion through synthetic data generation and a T5-based feedback model.
Contribution
It presents a novel query decoder that interprets neural retrieval latent space and leverages it for generating synthetic queries to enhance retrieval performance.
Findings
Decoding latent space yields meaningful queries.
Synthetic query data improves retrieval results.
T5-based PRF model outperforms baselines.
Abstract
Neural retrieval models have superseded classic bag-of-words methods such as BM25 as the retrieval framework of choice. However, neural systems lack the interpretability of bag-of-words models; it is not trivial to connect a query change to a change in the latent space that ultimately determines the retrieval results. To shed light on this embedding space, we learn a "query decoder" that, given a latent representation of a neural search engine, generates the corresponding query. We show that it is possible to decode a meaningful query from its latent representation and, when moving in the right direction in latent space, to decode a query that retrieves the relevant paragraph. In particular, the query decoder can be useful to understand "what should have been asked" to retrieve a particular paragraph from the collection. We employ the query decoder to generate a large synthetic dataset…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
MethodsGated Linear Unit · Refunds@Expedia|||How do I get a full refund from Expedia? · Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Residual Connection · Dropout · Adafactor · Inverse Square Root Schedule
