Query Expansion with Locally-Trained Word Embeddings
Fernando Diaz, Bhaskar Mitra, Nick Craswell

TL;DR
This paper investigates locally trained word embeddings for query expansion in information retrieval, showing that they outperform globally trained embeddings such as word2vec and GloVe on retrieval tasks.
Contribution
It introduces corpus- and query-specific embeddings for query expansion and demonstrates that they outperform globally trained embeddings.
Findings
Locally trained embeddings outperform globally trained embeddings on ad hoc retrieval tasks.
Query-specific embeddings make query expansion more effective.
Embeddings such as word2vec and GloVe underperform corpus- and query-specific embeddings when trained globally.
Abstract
Continuous space word embeddings have received a great deal of attention in the natural language processing and machine learning communities for their ability to model term similarity and other relationships. We study the use of term relatedness in the context of query expansion for ad hoc information retrieval. We demonstrate that word embeddings such as word2vec and GloVe, when trained globally, underperform corpus and query specific embeddings for retrieval tasks. These results suggest that other tasks benefiting from global embeddings may also benefit from local embeddings.
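To make the idea of embedding-based query expansion concrete, here is a minimal sketch, not the authors' implementation. It assumes an embedding table already exists (in the local setting it would be trained on documents related to the query) and picks expansion terms by cosine similarity to the query centroid. The function names, the toy vectors, and the centroid-plus-cosine scoring are all illustrative assumptions; the summary above does not specify how the paper scores candidate terms.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_query(query_terms, embeddings, k=2):
    """Return the k vocabulary terms closest to the query centroid.

    Hypothetical helper: `embeddings` maps term -> vector and stands in
    for embeddings trained on a query-specific set of documents.
    """
    known = [embeddings[t] for t in query_terms if t in embeddings]
    if not known:
        return []
    dim = len(known[0])
    centroid = [sum(vec[i] for vec in known) / len(known) for i in range(dim)]
    candidates = [
        (cosine(centroid, vec), term)
        for term, vec in embeddings.items()
        if term not in query_terms
    ]
    candidates.sort(reverse=True)  # highest similarity first
    return [term for _, term in candidates[:k]]


# Made-up 2-d vectors, purely for illustration (not from the paper).
local_embeddings = {
    "cut": [0.9, 0.1],
    "tax": [0.8, 0.3],
    "deficit": [0.7, 0.4],
    "scissors": [0.1, 0.9],
}

expanded = expand_query(["cut"], local_embeddings, k=2)
print(expanded)
```

With these toy vectors, the terms nearest to "cut" are "tax" and "deficit" rather than "scissors". This illustrates how topic-specific training data can shift which neighbors a term has, which is the intuition behind preferring local over global embeddings.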
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Graph Neural Networks
Methods: GloVe Embeddings
