QUILL: Query Intent with Large Language Models using Retrieval Augmentation and Multi-stage Distillation

Krishna Srinivasan; Karthik Raman; Anupam Samanta; Lingrui Liao; Luca Bertelli; Mike Bendersky

arXiv:2210.15718·cs.CL·October 31, 2022·1 cites

Open Access

TL;DR

This paper introduces QUILL, a method that enhances query understanding in large language models by combining retrieval augmentation with a novel two-stage distillation process, leading to significant real-world performance improvements.

Contribution

The paper presents a practical two-stage distillation technique that preserves retrieval augmentation benefits in LLMs without increased computational costs.

Findings

01 Improved query understanding with retrieval augmentation.
02 Effective distillation retains retrieval benefits.
03 Significant gains on real-world large-scale query systems.

Abstract

Large Language Models (LLMs) have shown impressive results on a variety of text understanding tasks. Search queries, though, pose a unique challenge, given their short length and lack of nuance or context. Complicated feature engineering efforts do not always lead to downstream improvements, as their performance benefits may be offset by the increased complexity of knowledge distillation. Thus, in this paper we make the following contributions: (1) We demonstrate that Retrieval Augmentation of queries provides LLMs with valuable additional context, enabling improved understanding. While Retrieval Augmentation typically increases the latency of LMs (thus hurting distillation efficacy), (2) we provide a practical and effective way of distilling Retrieval Augmentation LLMs. Specifically, we use a novel two-stage distillation approach that allows us to carry over the gains of retrieval augmentation,…
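
To make contribution (1) concrete, the following is a minimal sketch of what retrieval augmentation of a short query could look like. It is an illustration under stated assumptions, not the paper's pipeline: the `Doc` type, the token-overlap `retrieve` function, the toy `CORPUS`, and the `query: ... | title: ... url: ...` input format are all hypothetical stand-ins. A real system would use a production retriever and whatever input format its LLM expects.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """A retrievable document (hypothetical schema for this sketch)."""
    title: str
    url: str


def tokenize(text: str) -> set[str]:
    """Lowercase whitespace tokenization; a stand-in for a real tokenizer."""
    return set(text.lower().split())


def retrieve(query: str, corpus: list[Doc], k: int = 2) -> list[Doc]:
    """Return up to k docs ranked by title/query token overlap (toy retriever)."""
    q = tokenize(query)
    scored = [(len(q & tokenize(d.title)), i, d) for i, d in enumerate(corpus)]
    # Drop docs with no overlap, then sort by score (descending), ties by corpus order.
    scored = [s for s in scored if s[0] > 0]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [d for _, _, d in scored[:k]]


def augment(query: str, docs: list[Doc]) -> str:
    """Concatenate the query with retrieved titles and URLs into one model input."""
    parts = [f"query: {query}"]
    for d in docs:
        parts.append(f"title: {d.title} url: {d.url}")
    return " | ".join(parts)


# Toy corpus, invented for illustration only.
CORPUS = [
    Doc("Jaguar top speed facts", "example.org/jaguar-speed"),
    Doc("Jaguar cars official site", "example.org/jaguar-cars"),
    Doc("Banana bread recipe", "example.org/banana-bread"),
]

if __name__ == "__main__":
    query = "jaguar speed"
    print(augment(query, retrieve(query, CORPUS)))
```

The point of the sketch is only that the short query reaches the model together with retrieved context (here, titles and URLs) that helps disambiguate it. The extra retrieval step at inference time is the kind of cost that contribution (2) aims to avoid through distillation.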

Taxonomy

Topics: Topic Modeling · Information Retrieval and Search Behavior · Advanced Graph Neural Networks