Query Clustering using Segment Specific Context Embeddings

S.K Kolluru; Prasenjit Mukherjee

arXiv:1608.01247·cs.IR·November 8, 2016·2 cites

Open Access

TL;DR

This paper introduces a new query clustering method using context embeddings derived from search results, enabling effective identification of user interest areas with high monetization potential.

Contribution

The paper extends word2vec to query2vec, representing each query as a vector built from the context of its top search results, and clusters those vectors with a scalable Divide & Merge algorithm.
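To make the query2vec idea concrete, here is a minimal sketch of one way a query vector could be derived from its search-result context. The page does not say how the context is turned into a vector, so the averaging step is an assumption, and `WORD_VECTORS`, `query_vector`, and `cosine` are hypothetical names. The toy vectors stand in for a trained word2vec model.

```python
from math import sqrt

# Hypothetical toy word vectors; a real system would load them from a trained
# word2vec model instead of hard-coding them.
WORD_VECTORS = {
    "cheap": [0.9, 0.1, 0.0],
    "flights": [0.8, 0.2, 0.1],
    "hotel": [0.7, 0.3, 0.0],
    "phone": [0.0, 0.1, 0.9],
    "battery": [0.1, 0.0, 0.8],
}


def query_vector(context_snippets, word_vectors):
    """Average the vectors of all known words in a query's context.

    context_snippets stands for the text of the top search results for the
    query. Averaging is an assumption of this sketch, not a step confirmed
    by the source.
    """
    dim = len(next(iter(word_vectors.values())))
    total = [0.0] * dim
    count = 0
    for snippet in context_snippets:
        for token in snippet.lower().split():
            vec = word_vectors.get(token)
            if vec is None:
                continue  # skip out-of-vocabulary words
            total = [t + v for t, v in zip(total, vec)]
            count += 1
    if count == 0:
        return total  # no known words: return the zero vector
    return [t / count for t in total]


def cosine(a, b):
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


travel = query_vector(["cheap flights", "hotel flights"], WORD_VECTORS)
phones = query_vector(["phone battery"], WORD_VECTORS)
```

With these toy vectors, queries whose contexts share vocabulary end up with nearly parallel vectors, while `travel` and `phones` are far apart, which is the property a downstream clustering step relies on.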

Findings

01. Effective clustering of user interests across multiple segments

02. High relevance of discovered interest areas for monetization

03. Scalable approach suitable for large-scale search data

Abstract

This paper presents a novel query clustering approach to capture the broad interest areas of users querying search engines. We build on word2vec, a recent advance in NLP, and extend it to query2vec: vector representations of queries based on query contexts obtained from the top search results for each query. We then apply a highly scalable Divide & Merge clustering algorithm to the query vectors to obtain the clusters. We tried this approach on a variety of segments, including Retail, Travel, Health, and Phones, and found the clusters effective in discovering users' interest areas that have high monetization potential.
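The abstract names a Divide & Merge clustering algorithm but does not spell out its steps, so the following is only a rough sketch of one generic two-phase scheme of that kind, not the authors' algorithm. It assumes a divide phase that recursively bisects the query vectors with a small deterministic 2-means, and a merge phase that greedily joins clusters with similar centroids. The function names and both similarity thresholds are hypothetical.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def bisect(vectors, iterations=10):
    """Split vectors into two groups with a small, deterministic 2-means."""
    # Seed with the first vector and the vector least similar to it.
    seed_a = vectors[0]
    seed_b = min(vectors, key=lambda v: cosine(seed_a, v))
    left, right = [], []
    for _ in range(iterations):
        left = [v for v in vectors if cosine(v, seed_a) >= cosine(v, seed_b)]
        right = [v for v in vectors if cosine(v, seed_a) < cosine(v, seed_b)]
        if not left or not right:
            break
        seed_a, seed_b = centroid(left), centroid(right)
    return left, right


def divide(vectors, min_similarity=0.9):
    """Divide phase: split until every member is close to its centroid."""
    center = centroid(vectors)
    if len(vectors) < 2 or all(cosine(v, center) >= min_similarity for v in vectors):
        return [vectors]
    left, right = bisect(vectors)
    if not left or not right:
        return [vectors]  # could not split further
    return divide(left, min_similarity) + divide(right, min_similarity)


def merge(clusters, merge_similarity=0.95):
    """Merge phase: greedily join clusters whose centroids are nearly parallel."""
    merged = [list(c) for c in clusters]
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if cosine(centroid(merged[i]), centroid(merged[j])) >= merge_similarity:
                    merged[i].extend(merged.pop(j))
                    changed = True
                    break
            if changed:
                break
    return merged


# Toy query vectors: three travel-like queries and two phone-like queries.
travel = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1], [0.7, 0.3, 0.0]]
phones = [[0.0, 0.1, 0.9], [0.1, 0.0, 0.8]]
clusters = merge(divide(travel + phones))
```

The divide phase can over-split a broad interest area, and the merge phase then recombines fragments that point in nearly the same direction. At scale, the pairwise merge loop here would need a more efficient neighbor search.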

Taxonomy

Topics: Data Management and Algorithms · Recommender Systems and Techniques · Topic Modeling