Query Clustering using Segment Specific Context Embeddings
S.K Kolluru, Prasenjit Mukherjee

TL;DR
This paper introduces a new query clustering method using context embeddings derived from search results, enabling effective identification of user interest areas with high monetization potential.
Contribution
It extends word2vec to create query2vec embeddings from search result contexts and applies a scalable Divide & Merge clustering algorithm.
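To make the query2vec idea concrete, here is a minimal sketch that builds a query vector by averaging word vectors over the text of a query's top search results. The averaging scheme, the toy `WORD_VECS` table, and the example queries and snippets are all illustrative assumptions; the summary above does not specify how the paper derives query vectors from contexts, and a real system would train word2vec on a large corpus.

```python
import math

# Hypothetical toy word vectors; a real system would learn these with
# word2vec rather than hand-writing them.
WORD_VECS = {
    "flight":  [0.9, 0.1, 0.0],
    "hotel":   [0.8, 0.2, 0.0],
    "airline": [0.9, 0.0, 0.1],
    "phone":   [0.0, 0.9, 0.1],
    "battery": [0.1, 0.8, 0.1],
    "screen":  [0.0, 0.9, 0.2],
}

def query2vec(snippets):
    """Average the vectors of all known words in a query's result snippets."""
    dim = len(next(iter(WORD_VECS.values())))
    total = [0.0] * dim
    count = 0
    for snippet in snippets:
        for token in snippet.lower().split():
            vec = WORD_VECS.get(token)
            if vec is not None:  # skip out-of-vocabulary words
                total = [t + v for t, v in zip(total, vec)]
                count += 1
    if count == 0:
        return total  # no known context words: zero vector
    return [t / count for t in total]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical top-result snippets standing in for real search results.
contexts = {
    "cheap tickets to goa": ["flight airline deals", "hotel flight packages"],
    "goa trip booking": ["hotel airline offers"],
    "best budget mobile": ["phone battery screen review"],
}
vectors = {q: query2vec(s) for q, s in contexts.items()}
```

In this toy setup, the two travel queries share no words with each other, yet their vectors end up close because their result contexts overlap. That is the motivation for embedding queries through their search-result context rather than through the query text alone.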
Findings
Effective clustering of user interests across multiple segments
High relevance of discovered interest areas for monetization
Scalable approach suitable for large-scale search data
Abstract
This paper presents a novel query clustering approach to capture the broad interest areas of users querying search engines. We build on a recent advance in NLP, word2vec, and extend it to obtain query2vec: vector representations of queries based on query contexts obtained from the top search results for each query. We then apply a highly scalable Divide & Merge clustering algorithm to the query vectors to obtain the clusters. We have tried this approach on a variety of segments, including Retail, Travel, Health, and Phones, and found the clusters to be effective in discovering users' interest areas that have high monetization potential.
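The Divide & Merge step mentioned in the abstract can be illustrated with a simplified stand-in: a top-down divide phase that bisects clusters until each is tight, followed by a bottom-up merge phase that recombines leaves whose centroids are similar. The bisection rule, the `tight` and `threshold` parameters, and the toy 2-D vectors below are assumptions chosen for illustration, not the algorithm or settings used in the paper.

```python
import math

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(p[i] for p in points) / len(points) for i in range(len(points[0]))]

def bisect(points, iters=10):
    """Split points in two with a small, deterministically seeded 2-means loop."""
    ca = points[0]
    cb = min(points, key=lambda p: cosine(p, ca))  # least similar point to ca
    left, right = [], []
    for _ in range(iters):
        left = [p for p in points if cosine(p, ca) >= cosine(p, cb)]
        right = [p for p in points if cosine(p, ca) < cosine(p, cb)]
        if not left or not right:
            break
        ca, cb = centroid(left), centroid(right)
    return left, right

def divide(points, tight=0.999):
    """Divide phase: recursively bisect until every cluster is tight."""
    c = centroid(points)
    if len(points) < 2 or min(cosine(p, c) for p in points) >= tight:
        return [points]
    left, right = bisect(points)
    if not left or not right:
        return [points]  # cannot split further
    return divide(left, tight) + divide(right, tight)

def merge(clusters, threshold=0.98):
    """Merge phase: greedily join the most similar pair of clusters."""
    clusters = [list(c) for c in clusters]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = cosine(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or s > best[0]:
                    best = (s, i, j)
        if best[0] < threshold:
            break  # no remaining pair is similar enough
        _, i, j = best
        clusters[i].extend(clusters.pop(j))
    return clusters

# Toy query vectors: three "travel-like" and two "phone-like" points.
points = [[0.9, 0.1], [0.8, 0.2], [0.85, 0.15], [0.1, 0.9], [0.2, 0.8]]
leaves = divide(points)
clusters = merge(leaves)
```

The deliberately strict `tight` value makes the divide phase over-split, so the merge phase has something to recombine. The pairwise merge loop here is quadratic in the number of leaves and is only meant to show the two-phase structure, not the scalability the paper claims.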
Taxonomy
Topics: Data Management and Algorithms · Recommender Systems and Techniques · Topic Modeling
