Drug and Disease Interpretation Learning with Biomedical Entity Representation Transformer

Zulfat Miftahutdinov, Artur Kadurin, Roman Kudrin, and Elena Tutubalina

arXiv:2101.09311 · cs.CL · January 26, 2021

TL;DR

This paper presents a two-stage BERT-based neural approach for zero-shot concept normalization in biomedical texts, effectively transferring knowledge from scientific literature to clinical trial data.

Contribution

It introduces a simple, effective method combining metric learning and embedding similarity for biomedical concept normalization across domains.
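
As a rough illustration of the embedding-similarity half of this idea, the sketch below assigns a mention to the concept whose vector is most similar to it. The toy vectors, the placeholder identifiers `CONCEPT_A` and `CONCEPT_B`, the helper names, and the choice of cosine similarity are all assumptions made for this example; in the paper the vectors would come from the fine-tuned BERT encoder, and its exact similarity function is not stated in this summary.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def normalize_mention(mention_vec, concept_vecs):
    """Return (concept_id, similarity) for the most similar concept vector."""
    scored = (
        (concept_id, cosine_similarity(mention_vec, vec))
        for concept_id, vec in concept_vecs.items()
    )
    return max(scored, key=lambda pair: pair[1])


# Toy 3-dimensional embeddings (hypothetical values, not model output).
concept_vecs = {
    "CONCEPT_A": [0.9, 0.1, 0.0],
    "CONCEPT_B": [0.0, 1.0, 0.2],
}
mention_vec = [0.8, 0.2, 0.1]

best_id, best_sim = normalize_mention(mention_vec, concept_vecs)
print(best_id, round(best_sim, 3))
```

Because the lookup only compares vectors, new concepts can be added to `concept_vecs` without retraining, which is what makes a zero-shot setting plausible.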

Findings

01. Effective transfer of concept normalization from scientific abstracts to clinical trial data.

02. Outperforms state-of-the-art models in zero-shot biomedical concept normalization.

03. Validated on real-world clinical trial datasets.

Abstract

Concept normalization in free-form texts is a crucial step in every text-mining pipeline. Neural architectures based on Bidirectional Encoder Representations from Transformers (BERT) have achieved state-of-the-art results in the biomedical domain. In the context of drug discovery and development, clinical trials are necessary to establish the efficacy and safety of drugs. We investigate the effectiveness of transferring concept normalization from the general biomedical domain to the clinical trials domain in a zero-shot setting with an absence of labeled data. We propose a simple and effective two-stage neural approach based on fine-tuned BERT architectures. In the first stage, we train a metric learning model that optimizes relative similarity of mentions and concepts via triplet loss. The model is trained on available labeled corpora of scientific abstracts to obtain vector embeddings…
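
The first-stage objective named in the abstract can be sketched in a few lines. This is a minimal illustration of a standard triplet loss, assuming Euclidean distance and a margin of 1.0; neither choice is confirmed by the abstract, and the function names and toy vectors are hypothetical. The anchor stands for a mention embedding, the positive for its correct concept name, and the negative for an incorrect concept name.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin` (an assumed hyperparameter).
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy 2-dimensional embeddings (hypothetical values, not model output).
mention = [0.0, 0.0]
correct_concept = [0.0, 1.0]   # distance 1.0 from the mention
far_negative = [3.0, 4.0]      # distance 5.0: margin already satisfied
near_negative = [0.0, 1.5]     # distance 1.5: margin violated

easy = triplet_loss(mention, correct_concept, far_negative)
hard = triplet_loss(mention, correct_concept, near_negative)
print(easy, hard)
```

Only triplets that violate the margin, such as the `near_negative` case, contribute a non-zero loss, so training effort concentrates on concept names that are easily confused with the correct one.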

Code & Models

Repositories

insilicomedicine/DILBERT (PyTorch, official)

Taxonomy

Methods: Linear Layer · Layer Normalization · WordPiece · Residual Connection · Attention Dropout · Attention Is All You Need · Dense Connections · Adam · Linear Warmup With Linear Decay · Dropout