Scaling the Vocabulary of Non-autoregressive Models for Efficient Generative Retrieval

Ravisri Valluri; Akash Kumar Mohankumar; Kushal Dave; Amit Singh; Jian Jiao; Manik Varma; Gaurav Sinha

arXiv:2406.06739·cs.CL·June 12, 2024

Open Access

TL;DR

This paper introduces PIXAR, a non-autoregressive model with an expanded vocabulary for efficient generative retrieval, significantly improving retrieval performance while maintaining low latency and cost.

Contribution

PIXAR expands NAR model vocabularies to include multi-word entities, reducing token dependencies and enhancing retrieval accuracy without increasing inference latency.
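
To illustrate why multi-word target tokens reduce the dependencies a NAR model has to capture, here is a minimal sketch of greedy longest-match segmentation against a phrase vocabulary. It is an illustration only, not PIXAR's actual tokenizer: the function name `tokenize_with_phrases`, the toy `phrase_vocab`, and the `max_phrase_len` parameter are all assumptions made for this example.

```python
def tokenize_with_phrases(text, phrase_vocab, max_phrase_len=3):
    """Greedy longest-match segmentation into words and multi-word phrases.

    Hypothetical helper for illustration; not taken from the paper.
    """
    words = text.lower().split()
    tokens = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_phrase_len, len(words) - i), 1, -1):
            candidate = " ".join(words[i:i + n])
            if candidate in phrase_vocab:
                tokens.append(candidate)
                i += n
                break
        else:
            # No phrase matched: fall back to a single-word token.
            tokens.append(words[i])
            i += 1
    return tokens


# Toy phrase vocabulary (assumed for this example).
phrase_vocab = {"new york", "new york city", "machine learning"}
query = "machine learning jobs in new york city"

word_tokens = query.split()
phrase_tokens = tokenize_with_phrases(query, phrase_vocab)

print(len(word_tokens), word_tokens)
print(len(phrase_tokens), phrase_tokens)
```

With word-level targets the model must predict seven tokens independently and still get "new", "york", and "city" to agree; with phrase-level targets the same text needs four tokens, and each multi-word entity is a single prediction.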

Findings

01. 31.0% improvement in MRR@10 on MS MARCO

02. 23.2% increase in Hits@5 on Natural Questions

03. 5.08% increase in ad clicks in online experiments

Abstract

Generative Retrieval introduces a new approach to Information Retrieval by reframing it as a constrained generation task, leveraging recent advancements in Autoregressive (AR) language models. However, AR-based Generative Retrieval methods suffer from high inference latency and cost compared to traditional dense retrieval techniques, limiting their practical applicability. This paper investigates fully Non-autoregressive (NAR) language models as a more efficient alternative for generative retrieval. While standard NAR models alleviate latency and cost concerns, they exhibit a significant drop in retrieval performance (compared to AR models) due to their inability to capture dependencies between target tokens. To address this, we question the conventional choice of limiting the target token space to solely words or sub-words. We propose PIXAR, a novel approach that expands the target…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques