Towards Generalizable Semantic Product Search by Text Similarity Pre-training on Search Click Logs
Zheng Liu, Wei Zhang, Yan Chen, Weiyi Sun, Tianchuan Du, Benjamin Schroeder

TL;DR
This paper investigates how domain-specific pre-training on search click logs improves the generalization of semantic product search models, revealing that targeted fine-tuning enhances performance on unseen queries and products.
Contribution
It demonstrates that domain-specific fine-tuning with clickstream data significantly improves the generalization ability of semantic search models over general-domain pre-training.
Findings
Domain-specific fine-tuning outperforms general-domain approaches.
Clickstream data enhances model generalization.
Generalization does not improve with general-domain fine-tuning.
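To make the idea of measuring generalization on unseen queries and products concrete, here is a minimal sketch of a bucketed evaluation. It is an illustration under stated assumptions, not the paper's actual protocol: the bucket definitions (seen vs. unseen query, seen vs. unseen product), the function names, and the accuracy metric are all hypothetical choices made for this example.

```python
from collections import defaultdict


def bucket_name(query, product, seen_queries, seen_products):
    """Assign a pair to one of four buckets (hypothetical scheme, not from the paper)."""
    q = "seen_query" if query in seen_queries else "unseen_query"
    p = "seen_product" if product in seen_products else "unseen_product"
    return f"{q}/{p}"


def bucketed_accuracy(eval_pairs, seen_queries, seen_products):
    """Compute per-bucket accuracy.

    eval_pairs: iterable of (query, product, gold_label, predicted_label).
    seen_queries / seen_products: sets of items that appeared in training data.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for query, product, gold, pred in eval_pairs:
        bucket = bucket_name(query, product, seen_queries, seen_products)
        totals[bucket] += 1
        hits[bucket] += int(gold == pred)
    # Only buckets that contain at least one pair are reported.
    return {bucket: hits[bucket] / totals[bucket] for bucket in totals}
```

Comparing the unseen buckets across models is one way to check whether a fine-tuning strategy helps beyond the queries and products it was trained on.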
Abstract
Recently, semantic search has been successfully applied to e-commerce product search, and the learned semantic space(s) for query and product encoding are expected to generalize to unseen queries or products. Yet, whether generalization can conveniently emerge has not been thoroughly studied in the domain thus far. In this paper, we examine several general-domain and domain-specific pre-trained RoBERTa variants and discover that general-domain fine-tuning does not help generalization, which aligns with the discovery of prior art. Proper domain-specific fine-tuning with clickstream data can lead to better model generalization, based on a bucketed analysis of a publicly available manually annotated query-product pair da
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
Methods: Attention Is All You Need · Linear Layer · Adam · Dense Connections · Attention Dropout · Multi-Head Attention · Layer Normalization · Weight Decay · Residual Connection
