IAM at CLEF eHealth 2018: Concept Annotation and Coding in French Death Certificates
Sébastien Cossin, Vianney Jouhet, Fleur Mougin, Gayo Diallo, Frantz Thiessard

TL;DR
This paper presents a dictionary-based system for automatically assigning ICD-10 codes to French death certificates, reaching an F-score of 0.786 in task 1 (multilingual information extraction) of the CLEF eHealth 2018 challenge.
Contribution
The system normalizes and tokenizes the ICD-10 terms, stores them in a tree data structure, uses the Levenshtein distance to tolerate typos, and detects frequent abbreviations from a small, manually created set.
Findings
Achieved an F-score of 0.786 (precision: 0.794, recall: 0.779), substantially higher than the average score of the participating systems.
Utilized Levenshtein distance for typo detection.
Applied the approach to the French death certificates of the multilingual information extraction task.
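The typo detection mentioned above relies on the Levenshtein distance, which counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The sketch below is a generic implementation, not the authors' code; the helper `closest_token` and the threshold `max_distance=1` are assumptions made for illustration, since the summary does not state which threshold the system used.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


def closest_token(token, vocabulary, max_distance=1):
    """Hypothetical helper: nearest vocabulary entry, or None if too far.

    max_distance is an assumed threshold, not a value from the paper.
    """
    best = min(vocabulary, key=lambda v: levenshtein(token, v), default=None)
    if best is not None and levenshtein(token, best) <= max_distance:
        return best
    return None
```

With this sketch, a misspelled token such as `cardiaqe` would be mapped to `cardiaque` when that word is in the vocabulary, while a token with no close neighbor is left unmatched.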
Abstract
In this paper, we describe the approach and results for our participation in the task 1 (multilingual information extraction) of the CLEF eHealth 2018 challenge. We addressed the task of automatically assigning ICD-10 codes to French death certificates. We used a dictionary-based approach using materials provided by the task organizers. The terms of the ICD-10 terminology were normalized, tokenized and stored in a tree data structure. The Levenshtein distance was used to detect typos. Frequent abbreviations were detected by manually creating a small set of them. Our system achieved an F-score of 0.786 (precision: 0.794, recall: 0.779). These scores were substantially higher than the average score of the systems that participated in the challenge.
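A minimal sketch of the dictionary-based pipeline described in the abstract may help make it concrete: terms are normalized and tokenized, stored in a token-level tree, and matched against certificate text after abbreviation expansion. Everything named here (`normalize`, `TermTree`, the `ABBREVIATIONS` table, the longest-match strategy, and the three sample term/code pairs) is an assumption for illustration and is not taken from the authors' implementation; typo tolerance is left out to keep the example short.

```python
import unicodedata

# Hypothetical abbreviation table; the authors' actual list is not given here.
ABBREVIATIONS = {"avc": "accident vasculaire cerebral"}


def normalize(text):
    """Lowercase, strip accents, split into tokens, expand abbreviations."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() else " " for c in no_accents)
    tokens = []
    for tok in cleaned.split():
        tokens.extend(ABBREVIATIONS.get(tok, tok).split())
    return tokens


class TermTree:
    """Token-level tree: each path from the root spells one dictionary term."""

    def __init__(self):
        self.root = {}

    def add(self, term, code):
        node = self.root
        for tok in normalize(term):
            node = node.setdefault(tok, {})
        # "#code" cannot collide with a token: tokens are purely alphanumeric.
        node["#code"] = code

    def annotate(self, text):
        """Return the codes of the longest terms found, scanning left to right."""
        tokens = normalize(text)
        codes, i = [], 0
        while i < len(tokens):
            node, j, match = self.root, i, None
            while j < len(tokens) and tokens[j] in node:
                node = node[tokens[j]]
                j += 1
                if "#code" in node:
                    match = (node["#code"], j)  # remember the longest match so far
            if match:
                codes.append(match[0])
                i = match[1]  # resume after the matched term
            else:
                i += 1
        return codes


# Illustrative entries only, not the dictionary supplied by the task organizers.
tree = TermTree()
tree.add("Insuffisance cardiaque", "I50.9")
tree.add("Accident vasculaire cérébral", "I64")
tree.add("Insuffisance rénale", "N19")
print(tree.annotate("AVC, puis insuffisance rénale"))
```

Storing terms token by token means that terms sharing a prefix, such as the two "insuffisance" entries, share a branch, so a certificate line can be matched in a single left-to-right pass.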
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Natural Language Processing Techniques
