CODER: Knowledge infused cross-lingual medical term embedding for term normalization
Zheng Yuan, Zhengyun Zhao, Haixia Sun, Jiao Li, Fei Wang, and Sheng Yu

TL;DR
CODER introduces a contrastive learning approach on medical knowledge graphs to generate cross-lingual medical term embeddings, improving zero-shot term normalization, semantic similarity, and relation classification over prior biomedical embeddings.
Contribution
The paper presents a novel contrastive learning method on medical knowledge graphs for cross-lingual medical term embedding, enhancing normalization and semantic understanding.
Findings
Outperforms state-of-the-art biomedical embeddings in benchmarks
Effective zero-shot cross-lingual term normalization
Improves semantic similarity and relation classification tasks
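As an illustration of how embedding-based zero-shot term normalization is typically done, the sketch below maps a query term to the dictionary concept whose term embedding is nearest by cosine similarity. It is a minimal sketch, not the authors' pipeline: the function name `normalize_term`, the concept IDs, and the 3-dimensional vectors are hypothetical stand-ins for real encoder outputs.

```python
import numpy as np

def normalize_term(query_vec, dict_vecs, dict_ids):
    """Return the concept ID (and score) of the dictionary term most cosine-similar to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    d = dict_vecs / np.linalg.norm(dict_vecs, axis=1, keepdims=True)
    sims = d @ q                      # cosine similarity to every dictionary term
    best = int(np.argmax(sims))
    return dict_ids[best], float(sims[best])

# Hypothetical concept IDs and toy vectors (assumptions, not real model outputs).
dict_ids = ["C_HEART_ATTACK", "C_DIABETES", "C_HEADACHE"]
dict_vecs = np.array([
    [0.90, 0.10, 0.00],
    [0.00, 1.00, 0.10],
    [0.10, 0.00, 0.95],
])

# Stand-in for the embedding of a synonym or a translated term.
query = np.array([0.80, 0.20, 0.05])

concept, score = normalize_term(query, dict_vecs, dict_ids)
print(concept, round(score, 3))
```

No task-specific training is involved in this step, which is what makes the setting zero-shot: normalization quality depends entirely on how close the encoder places synonymous and translated terms.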
Abstract
This paper proposes CODER: contrastive learning on knowledge graphs for cross-lingual medical term representation. CODER is designed for medical term normalization by providing close vector representations for different terms that represent the same or similar medical concepts with cross-lingual support. We train CODER via contrastive learning on a medical knowledge graph (KG) named the Unified Medical Language System, where similarities are calculated utilizing both terms and relation triplets from the KG. Training with relations injects medical knowledge into embeddings and aims to provide potentially better machine learning features. We evaluate CODER in zero-shot term normalization, semantic similarity, and relation classification benchmarks, which show that CODER outperforms various state-of-the-art biomedical word embeddings, concept embeddings, and contextual embeddings. Our codes and…
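The term-level part of the training idea described above (terms of the same concept pulled together, terms of different concepts pushed apart) can be sketched with a generic contrastive objective. The abstract does not state the exact loss, so the code below uses a supervised InfoNCE-style loss as an assumed stand-in, and it leaves out the relation-triplet similarity entirely. The function name, the temperature value, and the toy vectors are all hypothetical.

```python
import numpy as np

def contrastive_loss(emb, concept_ids, temperature=0.1):
    """Supervised InfoNCE-style loss: terms sharing a concept ID are positives.

    Assumed stand-in for illustration; not necessarily the loss used in the paper.
    """
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = emb @ emb.T / temperature               # scaled cosine similarities
    ids = np.asarray(concept_ids)
    n = len(ids)
    not_self = ~np.eye(n, dtype=bool)
    pos = (ids[:, None] == ids[None, :]) & not_self

    # Log-softmax of each row over all *other* terms in the batch.
    masked = np.where(not_self, sim, -np.inf)
    row_max = masked.max(axis=1, keepdims=True)
    log_denom = row_max + np.log(np.exp(masked - row_max).sum(axis=1, keepdims=True))
    log_prob = sim - log_denom

    # Average negative log-probability of positives, over anchors that have any.
    has_pos = pos.any(axis=1)
    pos_sum = np.where(pos, log_prob, 0.0).sum(axis=1)
    per_anchor = -(pos_sum[has_pos] / pos.sum(axis=1)[has_pos])
    return float(per_anchor.mean())

# Toy batch: rows 0-1 point one way, rows 2-3 another (hypothetical vectors).
emb = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])

loss_matched = contrastive_loss(emb, [0, 0, 1, 1])     # labels agree with geometry
loss_mismatched = contrastive_loss(emb, [0, 1, 0, 1])  # labels disagree with geometry
print(loss_matched < loss_mismatched)
```

The loss is small when terms labelled as the same concept are already close and large when they are not, which is the signal that drives synonymous and cross-lingual terms toward nearby vectors during training.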
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Linguistics and Terminology Studies · Medical and Biological Sciences
Methods: Linear Layer · Contrastive Learning · Softmax · Dense Connections · WordPiece · Linear Warmup With Linear Decay · Attention Dropout · Residual Connection · Adam · Dropout
