Lost in Decoding? Reproducing and Stress-Testing the Look-Ahead Prior in Generative Retrieval
Kidist Amde Mekonnen, Yongkang Li, Yubao Tang, Simon Lupart, Maarten de Rijke

TL;DR
This paper reproduces and stress-tests Planning Ahead in Generative Retrieval (PAG), showing where its look-ahead prior is effective and where it breaks down under realistic query variations and in multilingual settings.
Contribution
The paper reproduces PAG's reported effectiveness, introduces diagnostics for plan stability, and evaluates robustness strategies that require no re-indexing.
Findings
PAG's planning signal is brittle under lexical variations.
Typos can cause plan collapse, reducing guidance effectiveness.
Query translation improves robustness in cross-lingual retrieval.
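One way to picture "plan drift" is to compare the set of documents the planning stage favors for a clean query with the set it favors for a perturbed version of the same query. The sketch below uses Jaccard overlap for that comparison. This is an illustrative assumption, not the diagnostic defined in the paper; the function name `plan_overlap` and the document IDs are hypothetical.

```python
def plan_overlap(plan_a, plan_b):
    """Jaccard overlap between two planned candidate sets.

    ASSUMPTION: a "plan" is modeled here as the list of top-n document IDs
    favored by the look-ahead prior. The paper's own diagnostics may differ.
    """
    a, b = set(plan_a), set(plan_b)
    if not a and not b:
        return 1.0  # two empty plans are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical plans for a clean query and a typo-perturbed variant.
clean_plan = ["d1", "d2", "d3", "d4"]
typo_plan = ["d3", "d4", "d7", "d9"]

overlap = plan_overlap(clean_plan, typo_plan)
print(f"plan overlap: {overlap:.3f}")
```

An overlap near 1 would indicate a stable plan, while an overlap near 0 would correspond to the kind of plan collapse described above.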
Abstract
Generative retrieval (GR) ranks documents by autoregressively generating document identifiers. Because many GR methods rely on trie-constrained beam search, they are vulnerable to early pruning of relevant prefixes under finite-beam decoding. Planning Ahead in Generative Retrieval (PAG) mitigates this failure mode by using simultaneous decoding to compute a document-level look-ahead prior that guides subsequent sequential decoding. We reproduce PAG at inference time and stress-test its decoding behavior. Using the authors' released checkpoint and identifier/trie artifacts under the reported decoding setup, we reproduce the main effectiveness results on MS MARCO Dev and TREC-DL 2019/2020, and corroborate the reported beam-size-latency trade-off in our hardware setting. Beyond reproduction, we introduce plan drift diagnostics that quantify how intent-preserving query variations alter the…
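The early-pruning failure mode and the effect of a document-level prior can be illustrated with a toy example. The sketch below is a simplified trie-constrained beam search, not the PAG implementation: token scores are fixed per step rather than produced by a model, the prior is used only to rank beams, and all names (`build_trie`, `beam_search`, `doc_ids`, `token_scores`, `prior`) are hypothetical.

```python
from collections import defaultdict


def build_trie(doc_ids):
    """Map each valid identifier prefix to the tokens allowed to follow it."""
    trie = defaultdict(set)
    for tokens in doc_ids.values():
        for i in range(len(tokens)):
            trie[tuple(tokens[:i])].add(tokens[i])
    return trie


def beam_search(doc_ids, token_scores, beam_size, prior=None):
    """Toy trie-constrained beam search over fixed-length identifiers.

    ASSUMPTIONS: token_scores[step][token] is a fixed toy score, and
    `prior` maps each document to a look-ahead score. A prefix is ranked
    by its accumulated score plus the best prior among the documents
    that are still reachable from it.
    """
    trie = build_trie(doc_ids)
    length = len(next(iter(doc_ids.values())))
    beams = [((), 0.0)]

    def guided(candidate):
        prefix, score = candidate
        if prior is None:
            return score
        reachable = [d for d, t in doc_ids.items()
                     if tuple(t[:len(prefix)]) == prefix]
        return score + max(prior[d] for d in reachable)

    for step in range(length):
        candidates = [
            (prefix + (tok,), score + token_scores[step][tok])
            for prefix, score in beams
            for tok in sorted(trie[prefix])  # only trie-valid continuations
        ]
        candidates.sort(key=guided, reverse=True)
        beams = candidates[:beam_size]  # finite beam: everything else is pruned

    by_tokens = {tuple(t): d for d, t in doc_ids.items()}
    return [by_tokens[prefix] for prefix, _ in beams]


# d1 has the best full-sequence score, but its first token scores lower.
doc_ids = {"d1": [0, 1], "d2": [1, 0]}
token_scores = [{0: 0.4, 1: 0.5}, {0: 0.1, 1: 0.9}]
prior = {"d1": 1.0, "d2": 0.0}  # hypothetical look-ahead scores

narrow = beam_search(doc_ids, token_scores, beam_size=1)
wide = beam_search(doc_ids, token_scores, beam_size=2)
guided_narrow = beam_search(doc_ids, token_scores, beam_size=1, prior=prior)
print(narrow, wide, guided_narrow)
```

With a beam of size 1 and no prior, the prefix of `d1` is discarded at the first step even though `d1` has the highest full-sequence score. Widening the beam or adding the prior keeps that prefix alive, which is the intuition behind guiding sequential decoding with a look-ahead signal.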
