TL;DR
This paper presents a two-stage BERT-based neural approach for zero-shot concept normalization in biomedical texts, effectively transferring knowledge from scientific literature to clinical trial data.
Contribution
It introduces a simple, effective method combining metric learning and embedding similarity for biomedical concept normalization across domains.
Findings
Effective transfer of concept normalization from scientific abstracts to clinical trial data.
Outperforms state-of-the-art models in zero-shot biomedical concept normalization.
Validated on real-world clinical trial datasets.
Abstract
Concept normalization in free-form texts is a crucial step in every text-mining pipeline. Neural architectures based on Bidirectional Encoder Representations from Transformers (BERT) have achieved state-of-the-art results in the biomedical domain. In the context of drug discovery and development, clinical trials are necessary to establish the efficacy and safety of drugs. We investigate the effectiveness of transferring concept normalization from the general biomedical domain to the clinical trials domain in a zero-shot setting, where no labeled data are available. We propose a simple and effective two-stage neural approach based on fine-tuned BERT architectures. In the first stage, we train a metric learning model that optimizes the relative similarity of mentions and concepts via triplet loss. The model is trained on available labeled corpora of scientific abstracts to obtain vector embeddings…
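The triplet-loss objective named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the cosine distance, the margin of 0.2, the concept IDs, and the toy 3-dimensional vectors (standing in for BERT embeddings) are all assumptions made for illustration. The nearest-concept lookup reflects the "embedding similarity" mentioned under Contribution; the abstract is truncated before the second stage is described, so that step is a guess at the general idea only.

```python
import math

def cosine_distance(u, v):
    """Return 1 - cosine similarity for two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the correct concept (positive) is closer to the
    mention (anchor) than the wrong concept (negative) by at least `margin`.
    The margin value is an assumption, not taken from the paper.
    """
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

def nearest_concept(mention_vec, concept_vecs):
    """Return the ID of the concept whose embedding is closest to the mention."""
    return min(
        concept_vecs,
        key=lambda cid: cosine_distance(mention_vec, concept_vecs[cid]),
    )

# Hypothetical concept IDs and toy embeddings (not real model outputs).
concepts = {
    "C_HEADACHE": [0.9, 0.1, 0.0],
    "C_NAUSEA": [0.0, 0.2, 0.9],
}
mention = [0.8, 0.2, 0.1]  # stands in for the embedding of a mention such as "head pain"

# Well-separated triplet: the loss is already zero.
loss = triplet_loss(mention, concepts["C_HEADACHE"], concepts["C_NAUSEA"])

# Swapping positive and negative yields a positive loss that training would reduce.
swapped_loss = triplet_loss(mention, concepts["C_NAUSEA"], concepts["C_HEADACHE"])

# Zero-shot style lookup: rank concepts by embedding distance, no target-domain labels used.
predicted = nearest_concept(mention, concepts)
```

In a zero-shot transfer setting, a lookup like this would be run over mentions from clinical trial texts using embeddings trained only on scientific abstracts, which is why no labeled clinical trial data is needed at prediction time.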
Taxonomy
Methods: Linear Layer · Layer Normalization · WordPiece · Residual Connection · Attention Dropout · Attention Is All You Need · Dense Connections · Adam · Linear Warmup With Linear Decay · Dropout
