Evaluation and LLM-Guided Learning of ICD Coding Rationales

Mingyang Li; Viktor Schlegel; Tingting Mu; Wuraola Oyewusi; Kai Kang; Goran Nenadic

arXiv:2508.16777 · cs.AI · March 13, 2026

TL;DR

This paper evaluates the explainability of ICD coding models by systematically analyzing faithfulness and plausibility of different rationale types, introducing a new dataset, and leveraging LLMs for rationale generation and learning.

Contribution

The paper introduces a novel multi-granular, rationale-annotated ICD coding dataset and develops LLM-guided rationale learning methods to improve the explainability of ICD coding models.

Findings

01. LLM-generated rationales exhibit high plausibility.

02. Rationale learning with LLM guidance improves plausibility of explanations.

03. A new dataset enables systematic evaluation of ICD rationale explainability.

Abstract

ICD coding is the process of mapping unstructured text from Electronic Health Records (EHRs) to standardised codes defined by the International Classification of Diseases (ICD) system. In order to promote trust and transparency, existing explorations on the explainability of ICD coding models primarily rely on attention-based rationales and qualitative assessments conducted by physicians, yet lack a systematic evaluation across diverse types of rationales using consistent criteria and high-quality rationale-annotated datasets specifically designed for the ICD coding task. Moreover, dedicated methods explicitly trained to generate plausible rationales remain scarce. In this work, we present evaluations of the explainability of rationales in ICD coding, focusing on two fundamental dimensions: faithfulness and plausibility -- in short how rationales influence model decisions and how…
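To make the two dimensions concrete, the following is a minimal sketch of how faithfulness and plausibility are often scored for extractive rationales. It is not taken from the paper: the excerpt above does not say which metrics the authors use, so the sketch assumes token-level F1 against human-annotated spans for plausibility, and erasure-based comprehensiveness and sufficiency for faithfulness. The function names and the toy classifier `toy_proba` are hypothetical and exist only for illustration.

```python
from typing import Callable, List, Set

# Hypothetical type: maps a token list to the predicted probability of one ICD code.
ProbaFn = Callable[[List[str]], float]


def token_f1(predicted: Set[int], gold: Set[int]) -> float:
    """Plausibility proxy (assumed): F1 overlap between predicted rationale
    token indices and human-annotated rationale token indices."""
    overlap = len(predicted & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


def comprehensiveness(predict_proba: ProbaFn, tokens: List[str], rationale: Set[int]) -> float:
    """Faithfulness proxy (assumed): probability drop when rationale tokens are removed.
    A larger drop suggests the model relied on the rationale."""
    without = [t for i, t in enumerate(tokens) if i not in rationale]
    return predict_proba(tokens) - predict_proba(without)


def sufficiency(predict_proba: ProbaFn, tokens: List[str], rationale: Set[int]) -> float:
    """Faithfulness proxy (assumed): probability drop when only rationale tokens are kept.
    A smaller drop suggests the rationale alone supports the prediction."""
    only = [t for i, t in enumerate(tokens) if i in rationale]
    return predict_proba(tokens) - predict_proba(only)


def toy_proba(tokens: List[str]) -> float:
    """Toy stand-in for an ICD coding model: fires only on the word 'pneumonia'."""
    return 1.0 if "pneumonia" in tokens else 0.0


tokens = ["patient", "admitted", "with", "pneumonia", "and", "fever"]
model_rationale = {3, 5}  # tokens the (hypothetical) model highlights
human_rationale = {3}     # tokens a (hypothetical) annotator marked

plausibility = token_f1(model_rationale, human_rationale)
comp = comprehensiveness(toy_proba, tokens, model_rationale)
suff = sufficiency(toy_proba, tokens, model_rationale)
print(f"plausibility F1={plausibility:.3f} comprehensiveness={comp:.1f} sufficiency={suff:.1f}")
```

In this toy setup the rationale is fully faithful (removing it erases the prediction, and keeping only it preserves the prediction) but only partly plausible, because it highlights one token the annotator did not mark. This shows why the two dimensions are evaluated separately.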
