Active learning for medical code assignment
Martha Dais Ferreira, Michal Malyska, Nicola Sahar, Riccardo Miotto, Fernando Paulovich, Evangelos Milios

TL;DR
This paper demonstrates that active learning can reduce the amount of labeled data needed for automatic medical code assignment from electronic health records, retaining satisfactory classification performance with far fewer annotations.
Contribution
It applies well-known active learning methods to multi-label clinical text classification, showing significant reduction in annotation effort for ICD-9 code assignment.
Findings
Achieved satisfactory classification with only 8.3% of total instances
Active learning reduces manual annotation costs substantially
Maintains model performance with fewer labeled examples
Abstract
Machine Learning (ML) is widely used to automatically extract meaningful information from Electronic Health Records (EHR) to support operational, clinical, and financial decision-making. However, ML models require a large number of annotated examples to provide satisfactory results, which is not possible in most healthcare scenarios due to the high cost of clinician-labeled data. Active Learning (AL) is a process of selecting the most informative instances to be labeled by an expert to further train a supervised algorithm. We demonstrate the effectiveness of AL in multi-label text classification in the clinical domain. In this context, we apply a set of well-known AL methods to help automatically assign ICD-9 codes on the MIMIC-III dataset. Our results show that the selection of informative instances provides satisfactory classification with a significantly reduced training set (8.3%…
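The selection step the abstract describes, picking the most informative unlabeled instances for an expert to label, can be illustrated with a small sketch. The abstract does not name the specific AL methods used, so this example assumes plain uncertainty sampling, one common choice, adapted to the multi-label case by averaging per-label uncertainty. The function names, the averaging rule, and the toy probabilities are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def label_uncertainty(p: float) -> float:
    """Uncertainty of one binary label decision: 1.0 at p=0.5, 0.0 at p=0 or p=1."""
    return 1.0 - abs(2.0 * p - 1.0)


def instance_uncertainty(label_probs: Sequence[float]) -> float:
    """Average the per-label uncertainties of one document.

    Averaging is an assumed aggregation for illustration; other rules
    (e.g. taking the maximum) are equally plausible.
    """
    return sum(label_uncertainty(p) for p in label_probs) / len(label_probs)


def select_for_annotation(
    pool_probs: Sequence[Sequence[float]], batch_size: int
) -> List[int]:
    """Return indices of the batch_size most uncertain pool documents,
    ordered from most to least uncertain."""
    scores = [instance_uncertainty(probs) for probs in pool_probs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:batch_size]


# Toy pool: a model's predicted probabilities for three codes
# on four unlabeled clinical notes (made-up numbers).
pool_probs = [
    [0.98, 0.01, 0.03],  # confident on every code
    [0.55, 0.48, 0.60],  # uncertain on every code
    [0.90, 0.50, 0.10],  # uncertain on one code
    [0.05, 0.02, 0.99],  # confident on every code
]

chosen = select_for_annotation(pool_probs, batch_size=2)
print(chosen)  # [1, 2]
```

In a full loop, the chosen notes would be sent to an annotator, added to the training set, and the classifier retrained before the next round of selection.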
Taxonomy
Topics: Machine Learning in Healthcare · Topic Modeling · Biomedical Text Mining and Ontologies
