MED-SE: Medical Entity Definition-based Sentence Embedding
Hyeonbin Hwang, Haanju Yoo, Yera Choi

TL;DR
MED-SE is an unsupervised contrastive learning method that uses medical entity definitions to improve sentence embeddings for clinical texts. In the authors' entity-centric semantic textual similarity setting, it outperforms existing unsupervised methods, including SimCSE.
Contribution
The paper presents MED-SE, a novel entity definition-based contrastive learning framework tailored for clinical sentence embeddings, together with an analysis of multiple sentence embedding techniques in clinical semantic textual similarity (STS) settings.
Findings
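In the entity-centric clinical semantic textual similarity (STS) setting designed by the authors, MED-SE performs significantly better, while existing unsupervised methods, including SimCSE, show degraded performance.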
Entity-centric contrastive approaches better capture clinical sentence semantics.
The experiments expose inherent discrepancies between general- and clinical-domain texts and suggest that entity-centric contrastive approaches may help bridge this gap.
Abstract
We propose Medical Entity Definition-based Sentence Embedding (MED-SE), a novel unsupervised contrastive learning framework designed for clinical texts, which exploits the definitions of medical entities. To this end, we conduct an extensive analysis of multiple sentence embedding techniques in clinical semantic textual similarity (STS) settings. In the entity-centric setting that we have designed, MED-SE achieves significantly better performance, while the existing unsupervised methods including SimCSE show degraded performance. Our experiments elucidate the inherent discrepancies between the general- and clinical-domain texts, and suggest that entity-centric contrastive approaches may help bridge this gap and lead to a better representation of clinical sentences.
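The abstract does not spell out the training objective. As a rough illustration of what an entity definition-based contrastive objective could look like, the sketch below computes a standard InfoNCE loss with in-batch negatives, treating each sentence embedding and the embedding of the definition of an entity it mentions as a positive pair. The pairing scheme, the function name `info_nce_loss`, and the temperature value are assumptions made for illustration, not the authors' confirmed method.

```python
import numpy as np


def info_nce_loss(sentence_emb, definition_emb, temperature=0.05):
    """InfoNCE loss with in-batch negatives (illustrative sketch).

    Assumption: row i of `sentence_emb` and row i of `definition_emb`
    form a positive pair; all other rows of `definition_emb` act as
    negatives for sentence i. Both arrays have shape (batch, dim).
    """
    # L2-normalise rows so that dot products are cosine similarities.
    s = sentence_emb / np.linalg.norm(sentence_emb, axis=1, keepdims=True)
    d = definition_emb / np.linalg.norm(definition_emb, axis=1, keepdims=True)

    # Pairwise similarity matrix, scaled by the temperature.
    logits = (s @ d.T) / temperature  # shape (batch, batch)

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Positive pairs sit on the diagonal; average their negative log-likelihood.
    return float(-np.mean(np.diag(log_probs)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sentences = rng.normal(size=(8, 16))
    # Hypothetical "definition" embeddings: noisy copies of the sentence embeddings.
    definitions = sentences + 0.01 * rng.normal(size=(8, 16))
    print("aligned pairs:   ", info_nce_loss(sentences, definitions))
    print("mismatched pairs:", info_nce_loss(sentences, definitions[::-1]))
```

In an actual training loop the two inputs would come from a trainable encoder, and the loss would be minimised so that sentences move closer to the definitions of the entities they contain than to the other definitions in the batch.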
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
Methods: Contrastive Learning · SimCSE
