Enhancing Translation Language Models with Word Embedding for Information Retrieval
Jibril Frej (LIG), Jean-Pierre Chevallet (LGI - IMAG), Didier Schwab (UGA)

TL;DR
This paper investigates using Word Embedding semantic resources to improve Information Retrieval models by addressing term mismatch, but initial results did not show significant improvements over classical models.
Contribution
The study applies neural Word Embedding to enhance IR Language Models by estimating translation probabilities through cosine similarity.
Findings
No statistically significant improvement over classical models
Applied neural Word Embedding to the IR task
Explored translation probability estimation using cosine similarity
Abstract
In this paper, we explore the use of Word Embedding semantic resources for the Information Retrieval (IR) task. These embeddings, produced by a shallow neural network, have been shown to capture semantic similarities between words (Mikolov et al., 2013). Our goal is therefore to enhance IR Language Models by addressing the term mismatch problem. To do so, we applied the model presented in "Integrating and Evaluating Neural Word Embedding in Information Retrieval" by Zuccon et al. (2015), which estimates the translation probability of a Translation Language Model using the cosine similarity between Word Embeddings. The results obtained so far do not show a statistically significant improvement over a classical Language Model.
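The idea of estimating translation probabilities from cosine similarity can be illustrated with a minimal sketch. It is not the authors' implementation. The toy 2-dimensional vectors in `EMBEDDINGS`, the function names, the clipping of negative similarities to zero, and the choice to normalize over the whole vocabulary are all assumptions made for illustration, and the smoothing that a full retrieval model would normally include is omitted.

```python
import math
from collections import Counter

# Hypothetical 2-d embeddings, for illustration only (not from the paper).
EMBEDDINGS = {
    "car": [1.0, 0.1],
    "automobile": [0.9, 0.2],
    "road": [0.5, 0.5],
    "banana": [0.0, 1.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def translation_prob(w, u, emb):
    """Estimate p_t(w | u) as the cosine similarity between w and u,
    normalized over every vocabulary term so the values sum to 1.
    Assumption: negative similarities are clipped to zero."""
    sims = {t: max(0.0, cosine(vec, emb[u])) for t, vec in emb.items()}
    total = sum(sims.values())
    return sims[w] / total if total > 0.0 else 0.0


def translation_lm(w, doc_tokens, emb):
    """p(w | d) = sum over u in d of p_t(w | u) * p(u | d),
    with p(u | d) estimated by maximum likelihood and no smoothing."""
    known = [t for t in doc_tokens if t in emb]  # skip out-of-vocabulary terms
    if not known:
        return 0.0
    counts = Counter(known)
    return sum(
        translation_prob(w, u, emb) * c / len(known) for u, c in counts.items()
    )


# "car" never occurs in the document, yet it receives probability mass
# through its similarity to "automobile" and "road".
doc = ["automobile", "road", "road"]
p_car = translation_lm("car", doc, EMBEDDINGS)
p_banana = translation_lm("banana", doc, EMBEDDINGS)
```

In this toy setting a query term that is absent from the document still gets a non-zero score, which is how a translation model is meant to reduce term mismatch. Whether that helps on a real collection is an empirical question, and the abstract reports no significant gain so far.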
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
