Differentiable Retrieval Augmentation via Generative Language Modeling for E-commerce Query Intent Classification

Chenyu Zhao; Yunjiang Jiang; Yiming Qiu; Han Zhang; Wen-Yun Yang

arXiv:2308.09308 · cs.IR · September 18, 2023

TL;DR

This paper introduces Dragan, a differentiable retrieval augmentation method using generative language modeling, which improves e-commerce query intent classification by enabling end-to-end training.

Contribution

The paper proposes a novel differentiable reformulation for retrieval augmentation, allowing joint training of retriever and classifier in NLP tasks.

Findings

01

Significant improvement over state-of-the-art baselines.

02

Effective in offline and online evaluations.

03

Enhances query intent classification accuracy.

Abstract

Retrieval augmentation, which enhances downstream models with a knowledge retriever and an external corpus rather than merely increasing the number of model parameters, has been successfully applied to many natural language processing (NLP) tasks such as text classification and question answering. However, existing methods train the retriever and the downstream model separately or asynchronously, mainly because of the non-differentiability between the two parts, and this usually leads to degraded performance compared to end-to-end joint training. In this paper, we propose Differentiable Retrieval Augmentation via Generative lANguage modeling (Dragan) to address this problem with a novel differentiable reformulation. We demonstrate the effectiveness of our proposed method on a challenging NLP task in e-commerce search, namely query intent classification. Both the experimental results and ablation…
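The non-differentiability mentioned in the abstract can be illustrated with a small toy sketch. This is not the Dragan reformulation, which this page does not describe. It only contrasts a hard top-1 lookup with a generic softmax-weighted relaxation, a different and well-known technique swapped in for illustration. The corpus, the one-parameter retriever `theta`, and all function names are hypothetical.

```python
import math

# Hypothetical toy corpus: each document has a 2-d key and a scalar feature
# that a downstream classifier would consume.
DOC_KEYS = [(1.0, 0.0), (0.0, 1.0), (0.7, 0.7)]
DOC_VALUES = [0.2, 0.9, 0.5]


def scores(theta, query):
    """Dot-product scores; theta is an assumed single retriever parameter."""
    q = (theta * query[0], query[1])
    return [q[0] * k[0] + q[1] * k[1] for k in DOC_KEYS]


def hard_retrieve(theta, query):
    """Top-1 retrieval: argmax is piecewise constant in theta."""
    s = scores(theta, query)
    best = max(range(len(s)), key=lambda i: s[i])
    return DOC_VALUES[best]


def soft_retrieve(theta, query):
    """Softmax-weighted mix of document features (a generic relaxation)."""
    s = scores(theta, query)
    m = max(s)  # subtract the max for numerical stability
    w = [math.exp(x - m) for x in s]
    z = sum(w)
    return sum(wi / z * v for wi, v in zip(w, DOC_VALUES))


def loss(retrieved, target=1.0):
    """Squared error standing in for the downstream classification loss."""
    return (retrieved - target) ** 2


def finite_diff_grad(retrieve, theta, query, eps=1e-4):
    """Central finite difference of the loss with respect to theta."""
    up = loss(retrieve(theta + eps, query))
    down = loss(retrieve(theta - eps, query))
    return (up - down) / (2 * eps)


query = (1.0, 0.5)
theta = 1.0
hard_grad = finite_diff_grad(hard_retrieve, theta, query)
soft_grad = finite_diff_grad(soft_retrieve, theta, query)

print("hard top-1 gradient:", hard_grad)  # 0.0: the retriever gets no signal
print("soft relaxation gradient:", soft_grad)
```

With the hard lookup, a small change in `theta` does not change which document is selected, so the loss is flat and the retriever receives no training signal. The relaxed version yields a nonzero gradient, which is the property that makes end-to-end joint training possible.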
