An Embedding-Based Grocery Search Model at Instacart
Yuqing Xie, Taesik Na, Xiao Xiao, Saurav Manchanda, Young Rao, Zhihong Xu, Guanghua Shu, Esther Vasiete, Tejaswi Tenneti, and Haixun Wang

TL;DR
This paper introduces an embedding-based grocery search model at Instacart that leverages transformer encoders, content features, and novel training methods to improve search relevance and online shopping metrics.
Contribution
The paper presents a new embedding-based search system that relies on content-based features to address the cold-start problem and on self-adversarial and cascade training techniques to learn from noisy e-commerce log data.
Findings
10% relative improvement in RECALL@20 on an offline human evaluation dataset
4.1% increase in cart-adds per search (CAPS) in online A/B testing
1.5% increase in gross merchandise value (GMV) in online A/B testing
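To make the offline metric concrete, here is a minimal sketch of how recall at a cutoff and a relative improvement could be computed. It assumes the common definition of recall@k (the share of relevant items that appear in the top k results); this page does not state the exact definition used in the paper, and the function names and sample values are illustrative only.

```python
def recall_at_k(ranked_ids, relevant_ids, k=20):
    """Fraction of relevant items found in the top-k ranked results.

    Assumes the common definition; the paper's exact formula may differ.
    """
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = sum(1 for pid in ranked_ids[:k] if pid in relevant)
    return hits / len(relevant)


def relative_improvement(baseline, treatment):
    """Relative change of treatment over baseline, e.g. 0.10 for +10%."""
    return (treatment - baseline) / baseline


# Hypothetical example: two relevant products, one retrieved in the top 2.
example_recall = recall_at_k(["a", "b", "c"], {"a", "d"}, k=2)
# Hypothetical metric values chosen only to illustrate a 10% relative gain.
example_gain = relative_improvement(0.50, 0.55)
```

A relative improvement is measured against the baseline value, so a 10% relative gain is not the same as a 10 percentage point gain.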
Abstract
The key to e-commerce search is how to best utilize the large yet noisy log data. In this paper, we present our embedding-based model for grocery search at Instacart. The system learns query and product representations with a two-tower transformer-based encoder architecture. To tackle the cold-start problem, we focus on content-based features. To train the model efficiently on noisy data, we propose a self-adversarial learning method and a cascade training method. On an offline human evaluation dataset, we achieve a 10% relative improvement in RECALL@20, and in online A/B testing, we achieve a 4.1% improvement in cart-adds per search (CAPS) and a 1.5% improvement in gross merchandise value (GMV). We describe how we train and deploy the embedding-based search model and give a detailed analysis of the effectiveness of our method.
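The two-tower retrieval idea in the abstract can be sketched roughly as follows. This is not the authors' implementation: the transformer encoders are replaced by a hypothetical hashed bag-of-words `encode` function so the example runs with the standard library alone, and the product titles, the embedding size, and all function names are assumptions made for illustration.

```python
import math
import zlib

DIM = 256  # illustrative embedding size, not taken from the paper


def encode(text, dim=DIM):
    """Stand-in for one tower: hashed bag-of-words, L2-normalised.

    The paper uses transformer encoders; this toy encoder only mimics
    the interface (text in, fixed-size vector out).
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        vec[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


def score(query_vec, product_vec):
    """Dot product of unit vectors, i.e. cosine similarity."""
    return sum(q * p for q, p in zip(query_vec, product_vec))


def retrieve(query, product_index, k=20):
    """Return the ids of the k products most similar to the query."""
    q = encode(query)
    ranked = sorted(product_index.items(),
                    key=lambda item: score(q, item[1]),
                    reverse=True)
    return [pid for pid, _ in ranked[:k]]


# Hypothetical catalog built from content (titles) only, so a product
# with no interaction history can still be embedded and retrieved.
products = {
    "p1": "organic whole milk 1 gallon",
    "p2": "whole wheat bread",
    "p3": "ripe bananas bunch",
}
index = {pid: encode(title) for pid, title in products.items()}
top = retrieve("organic whole milk", index, k=2)
```

Because product vectors depend only on content, they can be computed offline and indexed ahead of time, while the query vector is computed at request time. A production system would typically replace the exhaustive sort with an approximate nearest-neighbor index.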
Taxonomy
Topics: Web Data Mining and Analysis · Identification and Quantification in Food · Text and Document Classification Technologies
