Knowledge Distillation for Enhancing Walmart E-commerce Search Relevance Using Large Language Models
Hongwei Shang, Nguyen Vo, Nitin Yadav, Tian Zhang, Ajit Puthenputhussery, Xunfan Cai, Shuyi Chen, Prijith Chandran, Changsung Kang

TL;DR
This paper introduces a knowledge distillation framework that transfers the strong ranking capabilities of large language models (LLMs) into a smaller, more efficient model for e-commerce search relevance, improving relevance performance while keeping latency low.
Contribution
The paper presents a novel distillation method that expands the training data with unlabeled examples and yields a student model that outperforms the teacher LLM on search relevance tasks.
Findings
Student model performance improves with more augmented data.
The student model can outperform the teacher model with sufficient data.
Successfully deployed in Walmart's production system with positive metrics.
Abstract
Ensuring that the products displayed in e-commerce search results are relevant to users' queries is crucial for improving the user experience. With their advanced semantic understanding, deep learning models have been widely used for relevance matching in search tasks. While large language models (LLMs) offer superior ranking capabilities, they are challenging to deploy in real-time systems because of their high latency. To leverage the ranking power of LLMs while meeting the low-latency demands of production systems, we propose a novel framework that distills a high-performing LLM into a more efficient, low-latency student model. To help the student model learn more effectively from the teacher model, we first train the teacher LLM as a classification model with soft targets. Then, we train the student model to capture the relevance margin between pairs of products for a given query…
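The idea of having a student "capture the relevance margin between pairs of products" can be illustrated with a minimal sketch. The abstract as quoted here does not give the exact loss, so the squared-error form below, the function name `margin_mse_loss`, and the example scores are all assumptions made for illustration, not the authors' implementation.

```python
from itertools import combinations


def margin_mse_loss(student_scores, teacher_scores):
    """Mean squared difference between student and teacher score margins,
    taken over all unordered pairs of products scored for one query.

    Assumption: a squared-error penalty on margins; the source does not
    specify the loss form.
    """
    if len(student_scores) != len(teacher_scores):
        raise ValueError("score lists must have the same length")
    pairs = list(combinations(range(len(student_scores)), 2))
    if not pairs:
        raise ValueError("need at least two products to form a pair")

    total = 0.0
    for i, j in pairs:
        # Margin = how much more relevant product i is scored than product j.
        student_margin = student_scores[i] - student_scores[j]
        teacher_margin = teacher_scores[i] - teacher_scores[j]
        total += (student_margin - teacher_margin) ** 2
    return total / len(pairs)


# Hypothetical teacher and student scores for three products under one query.
teacher = [0.9, 0.4, 0.1]
student_shifted = [1.9, 1.4, 1.1]  # same margins as the teacher, different offset
student_flat = [0.5, 0.5, 0.5]     # ignores relevance differences entirely

print(margin_mse_loss(student_shifted, teacher))  # close to zero
print(margin_mse_loss(student_flat, teacher))     # clearly positive
```

In this sketch, adding a constant to every student score leaves the loss essentially unchanged, so the student is only asked to reproduce the teacher's relative gaps between products rather than its absolute scores.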
