LEMON: LanguagE ModeL for Negative Sampling of Knowledge Graph Embeddings
Md Rashad Al Hasan Rony, Mirza Mohtashim Alam, Semab Ali, Jens Lehmann, Sahar Vahdati

TL;DR
This paper introduces LEMON, a novel method using pre-trained language models to generate informative negative samples for knowledge graph embedding, improving link prediction performance by leveraging textual information.
Contribution
LEMON is the first approach to utilize language models for negative sampling in knowledge graph embeddings, incorporating textual data to enhance sample informativeness.
Findings
LEMON outperforms traditional random sampling methods in link prediction tasks.
Using textual information improves the quality of negative samples.
The approach is effective on benchmark knowledge graphs with textual data.
Abstract
Knowledge Graph Embedding models have become an important area of machine learning. Those models provide a latent representation of entities and relations in a knowledge graph which can then be used in downstream machine learning tasks such as link prediction. The learning process of such models can be performed by contrasting positive and negative triples. While all triples of a KG are considered positive, negative triples are usually not readily available. Therefore, the choice of the sampling method to obtain the negative triples plays a crucial role in the performance and effectiveness of Knowledge Graph Embedding models. Most of the current methods fetch negative samples from a random distribution of entities in the underlying Knowledge Graph, which also often includes meaningless triples. Other known methods use adversarial techniques or generative neural networks which consequently…
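The random baseline mentioned in the abstract can be illustrated with a minimal sketch. This is not LEMON itself; it only shows the common practice of corrupting the head or tail of a positive triple with a uniformly drawn entity. The function name `corrupt_triple`, the `max_tries` parameter, and the toy triples are assumptions made for illustration and do not come from the paper.

```python
import random


def corrupt_triple(triple, entities, known_triples, rng, max_tries=100):
    """Build one negative triple by replacing the head or the tail of a
    positive triple with a uniformly sampled entity (hypothetical helper)."""
    head, relation, tail = triple
    for _ in range(max_tries):
        candidate = rng.choice(entities)
        # Corrupt the head or the tail with equal probability.
        if rng.random() < 0.5:
            negative = (candidate, relation, tail)
        else:
            negative = (head, relation, candidate)
        # Reject candidates that are actually positive triples in the KG.
        if negative not in known_triples:
            return negative
    return None  # no valid negative found within max_tries


# Toy knowledge graph (illustrative data only).
triples = {
    ("Berlin", "capital_of", "Germany"),
    ("Paris", "capital_of", "France"),
    ("Germany", "located_in", "Europe"),
}
entities = sorted({e for h, _, t in triples for e in (h, t)})

rng = random.Random(0)
negatives = [corrupt_triple(t, entities, triples, rng) for t in sorted(triples)]
for positive, negative in zip(sorted(triples), negatives):
    print(positive, "->", negative)
```

Because the replacement entity is drawn without regard to the relation, a corruption such as (Europe, capital_of, France) passes the filter even though it carries little training signal. This is the kind of meaningless triple the abstract refers to.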
Taxonomy
Topics: Advanced Graph Neural Networks · Topic Modeling · Data Quality and Management
