Diagnosable ColBERT: Debugging Late-Interaction Retrieval Models Using a Learned Latent Space as Reference
François Remy

TL;DR
Diagnosable ColBERT enhances interpretability of biomedical retrieval models by aligning token embeddings with clinical knowledge, facilitating error diagnosis and data curation.
Contribution
It introduces a framework that aligns ColBERT token embeddings to a clinical knowledge-based latent space for better interpretability and debugging.
Findings
Enables inspection of model understanding through aligned embeddings
Facilitates diagnosis of model errors without extensive diagnostic queries
Supports more principled data curation in biomedical retrieval
Abstract
Reliable biomedical and clinical retrieval requires more than strong ranking performance: it requires a practical way to find systematic model failures and curate the training evidence needed to correct them. Late-interaction models such as ColBERT provide a first solution thanks to the interpretable token-level interaction scores they expose between document and query tokens. Yet this interpretability is shallow: it explains a particular document-query pairwise score, but does not reveal whether the model has learned a clinical concept in a stable, reusable, and context-sensitive way across diverse expressions. As a result, these scores provide limited support for diagnosing misunderstandings, identifying unreasonably distant biomedical concepts, or deciding what additional data or feedback is needed to address them. In this short position paper, we propose Diagnosable ColBERT, a…
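To make the token-level interaction scores mentioned in the abstract concrete, here is a minimal sketch of ColBERT-style late-interaction (MaxSim) scoring. It is an illustration under stated assumptions, not the paper's implementation: the function name `maxsim_score` and the toy embeddings are hypothetical, and real token embeddings would come from a trained encoder.

```python
import numpy as np


def maxsim_score(query_emb, doc_emb):
    """ColBERT-style late interaction over token embeddings.

    query_emb: array of shape (n_query_tokens, dim)
    doc_emb:   array of shape (n_doc_tokens, dim)
    Returns (total score, per-query-token maxima, index of the best
    matching document token for each query token).
    """
    # L2-normalise rows so that dot products are cosine similarities.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    d = doc_emb / np.linalg.norm(doc_emb, axis=1, keepdims=True)

    # Pairwise token similarities, shape (n_query_tokens, n_doc_tokens).
    sim = q @ d.T

    # For each query token, keep only its best-matching document token.
    best_doc_token = sim.argmax(axis=1)
    per_query_max = sim.max(axis=1)

    # The document-query score is the sum of those per-token maxima.
    return float(per_query_max.sum()), per_query_max, best_doc_token


# Toy example with hypothetical 2-dimensional embeddings.
query = np.array([[1.0, 0.0], [0.0, 1.0]])
doc = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
score, per_token, matches = maxsim_score(query, doc)
print(score, per_token, matches)
```

The per-token maxima and match indices are what make a single document-query score inspectable. As the abstract notes, they describe one pair only and say nothing about how consistently a concept is represented across different expressions.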
