Medical Coding with Biomedical Transformer Ensembles and Zero/Few-shot Learning
Angelo Ziletti, Alan Akbik, Christoph Berns, Thomas Herold, Marion Legler, Martina Viell

TL;DR
This paper introduces xTARS, an ensemble method for medical coding that combines BERT-based classification with a zero/few-shot learning approach (TARS). It improves accuracy, especially when training data is limited, and is deployed in a real-world setting.
Contribution
The paper presents xTARS, a new ensemble approach that improves medical coding accuracy by integrating BERT-based classification with zero/few-shot learning, addressing both data scarcity and the very large label space.
Findings
Outperforms strong baselines in few-shot scenarios
Effective in real-world deployment at Bayer since 2021
Reproducible code released for research community
Abstract
Medical coding (MC) is an essential pre-requisite for reliable data retrieval and reporting. Given a free-text reported term (RT) such as "pain of right thigh to the knee", the task is to identify the matching lowest-level term (LLT) - in this case "unilateral leg pain" - from a very large and continuously growing repository of standardized medical terms. However, automating this task is challenging due to a large number of LLT codes (as of writing over 80,000), limited availability of training data for long tail/emerging classes, and the general high accuracy demands of the medical domain. With this paper, we introduce the MC task, discuss its challenges, and present a novel approach called xTARS that combines traditional BERT-based classification with a recent zero/few-shot learning approach (TARS). We present extensive experiments that show that our combined approach outperforms…
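To make the task concrete, here is a minimal Python sketch of ranking candidate LLTs for one reported term by blending two scores: a probability from a supervised classifier and a similarity score from a matcher that can also score labels never seen in training. This is not the xTARS method, and the abstract does not say how the paper combines its two models. The token-overlap matcher, the weighted average, the function names, the `weight` parameter, and the toy probabilities are all assumptions made purely for illustration.

```python
from typing import Dict, List, Tuple


def token_overlap(a: str, b: str) -> float:
    """Jaccard similarity of lowercase token sets.

    Hypothetical stand-in for a zero/few-shot matcher: it can score any
    label text, including labels with no training examples.
    """
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def rank_llts(
    reported_term: str,
    llts: List[str],
    classifier_probs: Dict[str, float],
    weight: float = 0.5,
) -> List[Tuple[str, float]]:
    """Rank candidate LLTs by a weighted average of two scores (assumed rule)."""
    scored = []
    for llt in llts:
        # A supervised classifier has no score for labels it never saw.
        clf_score = classifier_probs.get(llt, 0.0)
        match_score = token_overlap(reported_term, llt)
        scored.append((llt, weight * clf_score + (1 - weight) * match_score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data: "knee swelling" is treated as a label unseen during training.
llts = ["unilateral leg pain", "headache", "knee swelling"]
classifier_probs = {"unilateral leg pain": 0.7, "headache": 0.1}  # made-up values

ranking = rank_llts("pain of right thigh to the knee", llts, classifier_probs)
for llt, score in ranking:
    print(f"{llt}: {score:.3f}")
```

In this toy setup the unseen label "knee swelling" still receives a nonzero score through the matcher, which is the general motivation for pairing a classifier with a zero/few-shot component when many labels have little or no training data.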
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Traditional Chinese Medicine Studies
