UVA Resources for the Biomedical Vocabulary Alignment at Scale in the UMLS Metathesaurus
Vinh Nguyen, Olivier Bodenreider

TL;DR
This paper introduces reusable resources and baselines for the UVA (UMLS Vocabulary Alignment) task. The goal is to make UMLS Metathesaurus construction more efficient and accurate, using a dataset generator together with rule-based and neural baseline models.
Contribution
It provides a dataset generator, three datasets produced with it, and three baseline approaches for the UVA task, supporting scalable and reproducible research on UMLS vocabulary alignment.
Findings
The generator was used to produce datasets for three UMLS releases, demonstrating its flexibility.
The baselines for UVA cover logical rules (RBA) and neural networks (LexLM and ConLM).
Resources are publicly available for community use and further development.
Abstract
The construction and maintenance process of the UMLS (Unified Medical Language System) Metathesaurus is time-consuming, costly, and error-prone, as it relies on (1) lexical and semantic processing for suggesting synonymous terms, and (2) the expertise of UMLS editors for curating the suggestions. To improve the UMLS Metathesaurus construction process, our research group has defined a new task called UVA (UMLS Vocabulary Alignment) and generated a dataset for evaluating the task. Our group has also developed different baselines for this task using logical rules (RBA) and neural networks (LexLM and ConLM). In this paper, we present a set of reusable and reproducible resources including (1) a dataset generator, (2) three datasets generated with the generator, and (3) three baseline approaches. We describe the UVA dataset generator and its implementation generalized for any…
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Semantic Web and Ontologies · Topic Modeling
