DILA: Dictionary Label Attention for Mechanistic Interpretability in High-dimensional Multi-label Medical Coding Prediction
John Wu, David Wu, Jimeng Sun

TL;DR
This paper introduces DILA, a novel interpretability module for high-dimensional multi-label medical coding that disentangles dense embeddings into a sparse, concept-based space, improving global interpretability and maintaining performance.
Contribution
The paper proposes DILA, a mechanism that transforms dense embeddings into a sparse, interpretable space with learned medical concepts, enhancing global understanding in medical coding models.
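The dense-to-sparse step described above can be illustrated with a generic sparse-autoencoder-style dictionary. This is only a minimal sketch under stated assumptions: the summary does not give DILA's architecture or training objective, so the ReLU encoder, linear decoder, L1 penalty, and every name here (`make_dictionary`, `encode`, `decode`, `sparse_loss`, `l1_weight`) are hypothetical and chosen for illustration.

```python
import random

def make_dictionary(dense_dim, num_features, seed=0):
    """Randomly initialise hypothetical (untrained) encoder/decoder weights."""
    rng = random.Random(seed)
    w_enc = [[rng.gauss(0.0, 0.1) for _ in range(dense_dim)]
             for _ in range(num_features)]
    w_dec = [[rng.gauss(0.0, 0.1) for _ in range(num_features)]
             for _ in range(dense_dim)]
    b_enc = [0.0] * num_features
    return w_enc, b_enc, w_dec

def encode(x, w_enc, b_enc):
    """Map a dense embedding to non-negative dictionary feature activations (ReLU)."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(w_enc, b_enc)]

def decode(f, w_dec):
    """Reconstruct the dense embedding as a weighted sum of dictionary features."""
    return [sum(w * fi for w, fi in zip(row, f)) for row in w_dec]

def sparse_loss(x, w_enc, b_enc, w_dec, l1_weight=1e-3):
    """Squared reconstruction error plus an L1 penalty on the activations.

    Assumption: an L1 term is one common way to encourage sparsity; the
    source does not state which objective DILA actually uses.
    """
    f = encode(x, w_enc, b_enc)
    x_hat = decode(f, w_dec)
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return recon + l1_weight * sum(abs(v) for v in f)

# Toy usage: a 4-dimensional dense embedding and a 16-feature dictionary.
w_enc, b_enc, w_dec = make_dictionary(dense_dim=4, num_features=16)
x = [0.5, -1.0, 0.25, 2.0]
f = encode(x, w_enc, b_enc)       # dictionary feature activations
x_hat = decode(f, w_dec)          # reconstruction of the dense embedding
total = sparse_loss(x, w_enc, b_enc, w_dec)
```

With untrained random weights the activations are non-negative but not yet meaningfully sparse. In a setup like this, minimising the loss over many embeddings is what would push most activations to zero, leaving a few nonzero features per embedding that could then be inspected as candidate concepts.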
Findings
In human evaluations, the sparse embeddings are at least 50 percent more human-understandable than their dense counterparts.
An automated, LLM-based pipeline identifies thousands of learned medical concepts.
Maintains competitive accuracy and scalability.
Abstract
Predicting high-dimensional or extreme multilabels, such as in medical coding, requires both accuracy and interpretability. Existing works often rely on local interpretability methods, failing to provide comprehensive explanations of the overall mechanism behind each label prediction within a multilabel set. We propose a mechanistic interpretability module called DIctionary Label Attention (DILA) that disentangles uninterpretable dense embeddings into a sparse embedding space, where each nonzero element (a dictionary feature) represents a globally learned medical concept. Through human evaluations, we show that our sparse embeddings are more human understandable than their dense counterparts by at least 50 percent. Our automated dictionary feature identification pipeline, leveraging large language models (LLMs), uncovers thousands of learned medical concepts by examining and…
Taxonomy
Topics: Biomedical Text Mining and Ontologies
Methods: Softmax · Attention Is All You Need
