Beyond Label Attention: Transparency in Language Models for Automated Medical Coding via Dictionary Learning

John Wu; David Wu; Jimeng Sun

arXiv:2411.00173·cs.CL·March 25, 2025

Open Access · 1 Video

TL;DR

This paper introduces a dictionary learning approach for medical coding with language models, improving interpretability by providing human-understandable explanations beyond traditional label attention methods.

Contribution

It presents a novel dictionary learning method that extracts sparse, interpretable features from language models, enhancing transparency in automated medical coding.

Findings

01

Dictionary features elucidate the hidden meanings of upwards of 90% of medically irrelevant tokens

02

Model behavior can be steered using dictionary features

03

Enhanced interpretability over label attention mechanisms

Abstract

Medical coding, the translation of unstructured clinical text into standardized medical codes, is a crucial but time-consuming healthcare practice. Though large language models (LLMs) could automate the coding process and improve the efficiency of such tasks, interpretability remains paramount for maintaining patient trust. Current efforts in interpretability of medical coding applications rely heavily on label attention mechanisms, which often lead to the highlighting of extraneous tokens irrelevant to the ICD code. To facilitate accurate interpretability in medical language models, this paper leverages dictionary learning that can efficiently extract sparsely activated representations from dense language model embeddings in superposition. Compared with common label attention mechanisms, our model goes beyond token-level representations by building an interpretable dictionary which…
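As a rough illustration of what "extracting sparsely activated representations from dense embeddings" can look like, the sketch below runs a toy sparse-autoencoder forward pass in plain Python. It is not the authors' implementation: the function names (`encode`, `decode`, `sae_loss`, `ablate`), the ReLU encoder, the L1 penalty, and all weights and dimensions are assumptions chosen for readability. The last step zeroes out one dictionary feature before decoding, a simple stand-in for the kind of feature-level intervention that steering relies on.

```python
# Illustrative sketch only: a tiny sparse-autoencoder forward pass.
# All names, shapes, and weights are hypothetical, not taken from the paper.

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * a for w, a in zip(row, v)) for row in m]

def encode(x, w_enc, b_enc):
    """Map a dense embedding to non-negative dictionary feature activations."""
    pre = [p + b for p, b in zip(matvec(w_enc, x), b_enc)]
    return [max(0.0, a) for a in pre]  # ReLU keeps only positively firing features

def decode(f, w_dec, b_dec):
    """Reconstruct the embedding as a weighted sum of dictionary directions."""
    return [r + b for r, b in zip(matvec(w_dec, f), b_dec)]

def sae_loss(x, x_hat, f, l1_coeff):
    """Squared reconstruction error plus an L1 penalty that encourages sparsity."""
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return recon + l1_coeff * sum(abs(a) for a in f)

def ablate(f, index):
    """Return a copy of the activations with one feature switched off."""
    return [0.0 if i == index else a for i, a in enumerate(f)]

# Toy setup: a 2-dimensional embedding and an overcomplete dictionary of 3 features.
x = [1.0, 0.0]
w_enc = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]  # 3 features x 2 dims
b_enc = [0.0, 0.0, 0.0]
w_dec = [[1.0, 0.0, -1.0], [0.0, 1.0, 0.0]]    # 2 dims x 3 features
b_dec = [0.0, 0.0]

f = encode(x, w_enc, b_enc)                  # only feature 0 is active
x_hat = decode(f, w_dec, b_dec)              # reconstruction of x
loss = sae_loss(x, x_hat, f, l1_coeff=0.1)
x_steered = decode(ablate(f, 0), w_dec, b_dec)  # embedding with feature 0 removed
```

In a trained dictionary, each active feature would ideally correspond to one human-readable concept, so inspecting which entries of `f` are nonzero for a token gives an explanation that does not depend on attention weights.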

Videos

Beyond Label Attention: Transparency in Language Models for Automated Medical Coding via Dictionary Learning · Underline

Taxonomy

Topics: Natural Language Processing Techniques · Biomedical Text Mining and Ontologies · Linguistics and Terminology Studies

Methods: Softmax · Attention Is All You Need