Ensemble BERT for Medication Event Classification on Electronic Health Records (EHRs)
Shouvon Sarker, Xishuang Dong, and Lijun Qian

TL;DR
This paper presents an ensemble approach using multiple BERT models, pretrained on diverse datasets and fine-tuned on clinical data, to improve medication event classification accuracy in electronic health records.
Contribution
The study introduces a novel BERT-based ensemble model that enhances medication event classification in clinical notes through pretrained models and voting strategies.
Findings
Ensemble BERT improved strict Micro-F score by about 5%.
Ensemble BERT improved strict Macro-F score by about 6%.
Pretraining on diverse datasets benefits clinical NLP tasks.
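To make the two reported metrics concrete, here is a minimal sketch of how micro- and macro-averaged F scores differ. It assumes a simplified single-label setting in which gold and predicted labels are already aligned item by item; the "strict" matching criterion of the shared task is not modeled, and the function name `micro_macro_f1` is hypothetical.

```python
def micro_macro_f1(gold, pred):
    """Return (micro_f1, macro_f1) for two aligned label sequences.

    Assumption: one gold label and one predicted label per item.
    Span matching (the "strict" criterion) is not handled here.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    labels = sorted(set(gold) | set(pred))
    tp = {label: 0 for label in labels}
    fp = {label: 0 for label in labels}
    fn = {label: 0 for label in labels}

    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p where it was wrong
            fn[g] += 1  # missed the gold label g

    def f1(t, f_pos, f_neg):
        denom = 2 * t + f_pos + f_neg
        return 2 * t / denom if denom else 0.0

    # Macro: average the per-class F1 values, so every class counts equally.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    # Micro: pool the counts over all classes, so frequent classes dominate.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return micro, macro
```

Because macro-averaging weights every class equally, a gain in macro-F that exceeds the gain in micro-F usually indicates better performance on less frequent classes.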
Abstract
Identifying key variables such as medications, diseases, and relations in health records and clinical notes has a wide range of applications in the clinical domain. The 2022 n2c2 challenge offered shared tasks on natural language processing for clinical data analytics on electronic health records (EHRs), for which it built a comprehensive annotated clinical dataset, the Contextualized Medication Event Dataset (CMED). This study focuses on subtask 2 in Track 1 of this challenge, which is to detect and classify medication events from clinical notes, and addresses it by building a novel BERT-based ensemble model. The approach starts by pretraining BERT models on different types of large corpora such as Wikipedia and MIMIC. These pretrained BERT models are then fine-tuned on the CMED training data. The fine-tuned BERT models are employed to accomplish medication event classification on CMED testing data with multiple…
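The summary above mentions combining several fine-tuned BERT models through voting strategies. A minimal sketch of one such strategy, hard majority voting over per-model predicted labels, is shown below. The function name `majority_vote`, the tie-breaking rule (the earliest listed model wins), and the label strings in the usage example are assumptions for illustration, not details taken from the paper.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label predictions by hard majority voting.

    predictions_per_model: list of label lists, one list per model,
    all aligned to the same medication mentions.
    Ties are broken in favor of the earliest model in the list
    (an assumption of this sketch).
    """
    if not predictions_per_model:
        raise ValueError("at least one model's predictions are required")
    n_items = len(predictions_per_model[0])
    if any(len(p) != n_items for p in predictions_per_model):
        raise ValueError("all models must predict the same number of items")

    voted = []
    for labels in zip(*predictions_per_model):
        counts = Counter(labels)
        top = max(counts.values())
        # First label, in model order, that reaches the top vote count.
        winner = next(label for label in labels if counts[label] == top)
        voted.append(winner)
    return voted


# Hypothetical predictions from three fine-tuned models for three mentions.
model_outputs = [
    ["Disposition", "NoDisposition", "Undetermined"],
    ["Disposition", "Disposition", "NoDisposition"],
    ["NoDisposition", "Disposition", "Disposition"],
]
ensemble_labels = majority_vote(model_outputs)
```

In this example the third mention receives three different votes, so the tie-breaking rule decides the outcome; in practice the model order could be chosen by validation performance.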
Taxonomy
Topics: Machine Learning in Healthcare · Topic Modeling · Electronic Health Records Systems
