Keyword Embeddings for Query Suggestion
Jorge Gabín, M. Eduardo Ares, Javier Parapar

TL;DR
This paper introduces two novel keyword embedding models based on Word2Vec and FastText, trained on scientific literature, to improve semantic keyword suggestion for complex query formulation tasks.
Contribution
It presents new models and a negative sampling approach tailored for academic keywords, enhancing semantic keyword suggestion capabilities.
Findings
The proposed models outperform state-of-the-art embeddings in keyword suggestion tasks.
The proposed negative sampling approach improves keyword embedding quality.
Evaluation shows significant gains in retrieval scenarios.
Abstract
Nowadays, search engine users commonly rely on query suggestions to improve their initial inputs. Current systems are very good at recommending lexical adaptations or spelling corrections to users' queries. However, they often struggle to suggest semantically related keywords given a user's query. The construction of a detailed query is crucial in some tasks, such as legal retrieval or academic search. In these scenarios, keyword suggestion methods are critical to guide the user during the query formulation. This paper proposes two novel models for the keyword suggestion task trained on scientific literature. Our techniques adapt the architecture of Word2Vec and FastText to generate keyword embeddings by leveraging documents' keyword co-occurrence. Along with these models, we also present a specially tailored negative sampling approach that exploits how keywords appear in academic…
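The idea of learning from documents' keyword co-occurrence can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`keyword_pairs`, `draw_negatives`), the toy documents, and the choice to treat every other keyword of the same document as context are assumptions made for illustration. The negative sampler uses the standard Word2Vec heuristic (frequency raised to 0.75), restricted to keywords that never co-occur with the target; the paper's tailored negative sampling is not described in the text above and is not reproduced here.

```python
import random
from collections import Counter
from itertools import permutations


def keyword_pairs(documents):
    """Yield (target, context) pairs for keywords that share a document.

    Assumption: the whole keyword list of a document acts as the context
    window, so every ordered pair of distinct keywords is a positive example.
    """
    for keywords in documents:
        unique = list(dict.fromkeys(keywords))  # drop duplicates, keep order
        yield from permutations(unique, 2)


def draw_negatives(target, positives, frequencies, k, rng):
    """Draw k negatives for `target`, weighted by frequency ** 0.75.

    This is the generic Word2Vec heuristic, not the paper's tailored sampler.
    Keywords that co-occur with the target are excluded from the candidates.
    """
    candidates = [kw for kw in frequencies if kw != target and kw not in positives]
    if not candidates:
        return []
    weights = [frequencies[kw] ** 0.75 for kw in candidates]
    return rng.choices(candidates, weights=weights, k=k)


# Hypothetical toy corpus: each document is represented by its keyword list.
docs = [
    ["query suggestion", "word embeddings", "information retrieval"],
    ["word embeddings", "negative sampling"],
    ["legal retrieval", "information retrieval"],
]

pairs = list(keyword_pairs(docs))

# Document frequency of each keyword, used to weight negative samples.
frequencies = Counter(kw for doc in docs for kw in set(doc))

# Map each keyword to the set of keywords it co-occurs with.
positives = {}
for target, context in pairs:
    positives.setdefault(target, set()).add(context)

rng = random.Random(0)
negatives = draw_negatives(
    "query suggestion", positives["query suggestion"], frequencies, 2, rng
)
```

The resulting positive pairs and sampled negatives are the kind of training examples a skip-gram model with negative sampling consumes. A FastText-style variant would additionally represent each keyword through its character n-grams.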
Taxonomy
Topics: Advanced Text Analysis Techniques · Topic Modeling · Information Retrieval and Search Behavior
Methods: fastText
