Approximating Categorical Similarity in Sponsored Search Relevance
Hiba Ahsan, Rahul Agrawal

TL;DR
This paper introduces a neural network-based method for approximating categorical similarity in sponsored search. It extends coverage to tail queries and yields a 5.23% improvement in AUC ROC offline and an 8.2% increase in relevance in A/B testing.
Contribution
It proposes using neural embeddings, specifically CLSM with tri-letter representations, to better approximate category similarity for query-ad relevance.
Findings
5.23% improvement in AUC ROC in offline experiments
8.2% increase in relevance in A/B testing
Enhanced coverage for tail queries using neural embeddings
Abstract
Sponsored search is a major source of revenue for web search engines. Since sponsored search follows a pay-per-click model, showing relevant ads that receive clicks is crucial. Matching the categories of a query and its ad candidates has been explored for modeling the relevance of query-ad pairs. The approach involves matching cached categories of queries seen in the past to categories of candidate ads. Since queries have a heavy-tailed distribution, the approach has limited coverage. In this work, we propose approximating the categorical similarity of query-ad pairs using neural networks, particularly CLSM. The embedding of a query (or document) is generated using its tri-letter representation, which allows coverage of tail queries. Offline experiments incorporating this feature, as opposed to using the categories directly, show a 5.23% improvement in AUC ROC. A/B testing results show an improvement…
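The tri-letter representation mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names (`tri_letters`, `tri_letter_vector`, `cosine`) are hypothetical, the `#` word-boundary marker follows the common letter-trigram hashing convention, and the cosine here is computed on raw trigram counts only to show why an unseen query still gets a usable representation. In CLSM, these trigram vectors would instead be fed to a convolutional network that produces the learned embeddings.

```python
import math
from collections import Counter


def tri_letters(text):
    """Split each word into letter trigrams, using '#' as an assumed boundary marker."""
    grams = []
    for word in text.lower().split():
        padded = f"#{word}#"
        # Slide a window of three characters over the padded word.
        grams.extend(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def tri_letter_vector(text):
    """Sparse bag-of-trigrams count vector for a query or ad text."""
    return Counter(tri_letters(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Illustrative strings only; not data from the paper.
query_vec = tri_letter_vector("cheap flights")
ad_vec = tri_letter_vector("cheap flight deals")
similarity = cosine(query_vec, ad_vec)
```

Because the vocabulary of letter trigrams is small and fixed, a tail query that was never cached still maps onto known trigrams, which is the property that a lookup of previously seen queries lacks.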
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Text and Document Classification Technologies · Web Data Mining and Analysis
