Biomedical Entity Linking with Triple-aware Pre-Training
Xi Yan, Cedric Möller, Ricardo Usbeck

TL;DR
This paper introduces a novel pre-training framework for biomedical entity linking that synthesizes a corpus from knowledge graphs to enhance LLM understanding, but finds limited benefits from additional relational information.
Contribution
The paper proposes a new KG-based pre-training method for LLMs to improve biomedical entity linking, addressing issues of data scarcity and semantic connection awareness.
Findings
Limited benefit from including synonyms, descriptions, or relational info.
Pre-training with KG-synthesized corpus offers a new approach.
Addresses knowledge integration challenges in biomedical NLP.
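As a rough illustration of what synthesizing a corpus from a knowledge graph can look like, the sketch below verbalizes (head, relation, tail) triples into sentences, optionally appending synonyms and a description. It is not the authors' pipeline: the `Entity` layout, the function names `verbalize_triple` and `synthesize_corpus`, the sentence templates, and the toy data are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Entity:
    # Hypothetical record layout; the real KG schema is not specified here.
    identifier: str
    name: str
    synonyms: List[str] = field(default_factory=list)
    description: str = ""


def verbalize_triple(head: Entity, relation: str, tail: Entity,
                     use_synonyms: bool = True,
                     use_description: bool = True) -> str:
    """Turn one (head, relation, tail) triple into a short text passage."""
    # Assumed template: relation names use underscores, e.g. "may_treat".
    parts = [f"{head.name} {relation.replace('_', ' ')} {tail.name}."]
    if use_synonyms and head.synonyms:
        parts.append(f"{head.name} is also known as {', '.join(head.synonyms)}.")
    if use_description and head.description:
        parts.append(head.description)
    return " ".join(parts)


def synthesize_corpus(entities: Dict[str, Entity],
                      triples: List[Tuple[str, str, str]],
                      **options: bool) -> List[str]:
    """Verbalize every triple; entity ids are looked up in `entities`."""
    return [verbalize_triple(entities[h], r, entities[t], **options)
            for h, r, t in triples]


# Toy data, invented for illustration only.
entities = {
    "E1": Entity("E1", "aspirin", ["acetylsalicylic acid"], "A common pain reliever."),
    "E2": Entity("E2", "headache"),
}
triples = [("E1", "may_treat", "E2")]

corpus = synthesize_corpus(entities, triples)
plain = synthesize_corpus(entities, triples, use_synonyms=False, use_description=False)
```

The `use_synonyms` and `use_description` switches mirror the kind of ablation the findings describe, where each type of side information can be included or left out of the synthesized text.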
Abstract
Linking biomedical entities is an essential aspect of biomedical natural language processing tasks, such as text mining and question answering. However, a difficulty of linking biomedical entities using current large language models (LLMs) trained on a general corpus is that biomedical entities are sparsely distributed in texts and have therefore rarely been seen by the LLM during training. At the same time, those LLMs are not aware of high-level semantic connections between different biomedical entities, which are useful in identifying similar concepts in different textual contexts. To cope with the aforementioned problems, some recent works have focused on injecting knowledge graph information into LLMs. However, former methods either ignore the relational knowledge of the entities or lead to catastrophic forgetting. Therefore, we propose a novel framework to pre-train the powerful…
Taxonomy
Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Natural Language Processing Techniques
Methods: Attentive Walk-Aggregating Graph Neural Network
