Learning Domain-Specialised Representations for Cross-Lingual Biomedical Entity Linking

Fangyu Liu; Ivan Vulić; Anna Korhonen; Nigel Collier

arXiv:2105.14398·cs.CL·June 1, 2021

TL;DR

This paper introduces a new cross-lingual biomedical entity linking benchmark across 10 languages, investigates knowledge transfer from resource-rich to resource-poor languages, and proposes methods that improve performance without in-domain data.

Contribution

The paper establishes the XL-BEL benchmark, analyzes the limitations of existing models, and proposes cross-lingual transfer methods that use general-domain bitext to carry domain knowledge to resource-poor languages.

Findings

01

Biomedical entity linking scores in the other languages fall well short of English performance.

02

Cross-lingual transfer methods improve results by up to 20 Precision@1 points.

03

Domain-specific transfer methods work effectively without in-domain data.
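
The findings above are reported in Precision@1. As a rough illustration of how such a score can be computed for entity linking, the sketch below embeds nothing itself: it assumes mention and entity-name vectors are already available, retrieves the nearest entity name by cosine similarity, and counts a hit when the retrieved concept ID matches the gold one. The function names, the toy vectors, and the concept IDs are all made up for this example, and the exact evaluation protocol used in the paper may differ.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def precision_at_1(mention_vecs, gold_ids, entity_vecs, entity_ids):
    """Fraction of mentions whose nearest entity name carries the gold ID."""
    hits = 0
    for vec, gold in zip(mention_vecs, gold_ids):
        # Index of the entity name most similar to this mention.
        best = max(range(len(entity_vecs)),
                   key=lambda i: cosine(vec, entity_vecs[i]))
        hits += entity_ids[best] == gold
    return hits / len(gold_ids)


# Toy data (hypothetical): three entity names and four mentions in 2-D.
entity_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
entity_ids = ["C1", "C2", "C3"]
mention_vecs = [[0.9, 0.1], [0.1, 0.8], [0.7, 0.6], [0.5, 0.5]]
gold_ids = ["C1", "C2", "C1", "C3"]

score = precision_at_1(mention_vecs, gold_ids, entity_vecs, entity_ids)
print(f"Precision@1 = {score:.2f}")
```

In this toy setup the third mention is closest to the "C3" vector although its gold label is "C1", so three of four mentions are linked correctly. A gain of 20 Precision@1 points would correspond to this fraction rising by 0.20.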

Abstract

Injecting external domain-specific knowledge (e.g., UMLS) into pretrained language models (LMs) advances their capability to handle specialised in-domain tasks such as biomedical entity linking (BEL). However, such abundant expert knowledge is available only for a handful of languages (e.g., English). In this work, by proposing a novel cross-lingual biomedical entity linking task (XL-BEL) and establishing a new XL-BEL benchmark spanning 10 typologically diverse languages, we first investigate the ability of standard knowledge-agnostic as well as knowledge-enhanced monolingual and multilingual LMs beyond the standard monolingual English BEL task. The scores indicate large gaps to English performance. We then address the challenge of transferring domain-specific knowledge in resource-rich languages to resource-poor ones. To this end, we propose and evaluate a series of cross-lingual…

Code & Models

Repositories

cambridgeltl/sapbert
PyTorch · Official
