A dataset for mapping the Japanese drugs to RxNorm standard concepts
Eizen Kimura, Yukinobu Kawakami, Shingo Inoue, Ai Okajima

TL;DR
This paper introduces a dataset that maps Japanese drugs to the RxNorm standard, enabling international research using Japanese real-world healthcare data.
Contribution
The paper provides a validated dataset for mapping Japanese pharmaceutical terms to RxNorm using a large language model and expert validation.
Findings
A large language model accurately identified RxNorm mapping candidates for Japanese drugs.
Pharmacists and researchers validated the mappings, resulting in an ingredient-based alignment.
The dataset supports NLP and ML development for terminology mapping in healthcare.
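As a rough illustration of what an ingredient-based alignment could look like to someone using it, the sketch below looks up validated rows in a small in-memory table. It is not the authors' pipeline or the dataset's actual schema: the field names, the local codes (`LOCAL-001`, ...), and the RxNorm identifiers (`PLACEHOLDER-...`) are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class IngredientMapping:
    """One row of a hypothetical mapping table (all field names are assumptions)."""
    local_code: str            # hypothetical Japanese drug code
    local_name: str            # drug name as recorded locally
    rxnorm_ingredient_id: str  # placeholder, NOT a real RxNorm identifier
    validated: bool            # True once an expert has confirmed the candidate


def build_index(rows: Iterable[IngredientMapping]) -> Dict[str, IngredientMapping]:
    # Keep only expert-validated rows so unreviewed candidates are never used.
    return {row.local_code: row for row in rows if row.validated}


def map_drug(index: Dict[str, IngredientMapping], local_code: str) -> Optional[str]:
    # Return the ingredient-level target, or None when no validated mapping exists.
    row = index.get(local_code)
    return row.rxnorm_ingredient_id if row is not None else None


# Illustrative rows only; none of these values come from the dataset.
ROWS = [
    IngredientMapping("LOCAL-001", "アセトアミノフェン錠", "PLACEHOLDER-INGREDIENT-A", True),
    IngredientMapping("LOCAL-002", "アスピリン錠", "PLACEHOLDER-INGREDIENT-B", True),
    IngredientMapping("LOCAL-003", "未検証の薬剤", "PLACEHOLDER-INGREDIENT-C", False),
]

INDEX = build_index(ROWS)
```

Under these assumptions, `map_drug(INDEX, "LOCAL-001")` returns the placeholder ingredient identifier, while an unvalidated or unknown code returns `None` so that downstream code has to handle unmapped drugs explicitly.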
Abstract
Observational Health Data Sciences and Informatics (OHDSI) is an international research community dedicated to large-scale observational studies using real-world healthcare data. Participation in OHDSI requires mapping local terminology systems to the OHDSI Standard Vocabulary (OSV) and transforming healthcare data into the Observational Medical Outcomes Partnership Common Data Model (OMOP CDM), a standardized database schema. Adherence to the OSV and CDM enables the integration of datasets from different countries and regions, facilitating international cross-sectional analyses and supporting the discovery of large-scale evidence and new medical knowledge. Despite Japan's globally recognized excellence in healthcare technology and systems, Japanese real-world data (RWD) remain underutilized in international research. This is primarily due to reliance on domestically managed…
Taxonomy
Topics: Computational Drug Discovery Methods · Biomedical Text Mining and Ontologies
