A Medical Information Extraction Workbench to Process German Clinical Text
Roland Roller, Laura Seiffe, Ammer Ayach, Sebastian Möller, Oliver Marten, Michael Mikhailov, Christoph Alt, Danilo Schmidt, Fabian Halleck, Marcel Naik, Wiebke Duettmann, Klemens Budde

TL;DR
This paper introduces a publicly available workbench of German clinical text processing models trained on de-identified nephrology reports, facilitating research and development in German biomedical NLP.
Contribution
It provides the first collection of German clinical NLP models trained on de-identified data, available for benchmarking and transfer to related biomedical tasks.
Findings
Models achieve promising in-domain results
Models can be applied to other German biomedical texts
Workbench is publicly available for use and benchmarking
Abstract
Background: In the information extraction and natural language processing domain, accessible datasets are crucial to reproduce and compare results. Publicly available implementations and tools can serve as benchmarks and facilitate the development of more complex applications. However, in the context of clinical text processing, accessible datasets are scarce, and so are existing tools. One of the main reasons is the sensitivity of the data. This problem is even more evident for non-English languages. Approach: To address this situation, we introduce a workbench: a collection of German clinical text processing models. The models are trained on a de-identified corpus of German nephrology reports. Result: The presented models provide promising results on in-domain data. Moreover, we show that our models can also be successfully applied to other…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies
