ViKL: A Mammography Interpretation Framework via Multimodal Aggregation of Visual-knowledge-linguistic Features
Xin Wei, Yaling Tao, Changde Du, Gangming Zhao, Yizhou Yu, Jinpeng Li

TL;DR
ViKL introduces a multimodal framework combining visual, linguistic, and knowledge features for mammography interpretation, improving generalization and interpretability without requiring pathology labels.
Contribution
This paper presents MVKL, a new multimodal mammography dataset, and proposes ViKL, a novel unsupervised pretraining framework that integrates multiple modalities for better breast cancer diagnosis.
Findings
Enhanced pathological classification with multimodal pretraining
Manifestations enable a new hard-negative sample selection strategy
Features transfer effectively across datasets
Abstract
Mammography is the primary imaging tool for breast cancer diagnosis. Despite significant strides in applying deep learning to interpret mammography images, efforts that focus predominantly on visual features often struggle with generalization across datasets. We hypothesize that integrating additional modalities used in radiology practice, notably the linguistic features of reports and manifestation features embodying radiological insights, offers a more powerful, interpretable and generalizable representation. In this paper, we announce MVKL, the first multimodal mammography dataset encompassing multi-view images, detailed manifestations and reports. Based on this dataset, we focus on the challenging task of unsupervised pretraining and propose ViKL, an innovative framework that synergizes Visual, Knowledge, and Linguistic features. This framework relies solely on pairing information…
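The abstract is truncated here, so the exact training objective is not stated. As a rough illustration of what pretraining from pairing information alone can look like, the sketch below applies a generic symmetric InfoNCE contrastive loss to each pair of modalities. It is an assumption-laden example, not the authors' implementation: the function names (`info_nce`, `tri_modal_loss`), the temperature value, and the equal weighting of the three modality pairs are all hypothetical.

```python
import math


def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    Row i of `anchors` is paired with row i of `positives`; every other
    row of `positives` acts as a negative. No class labels are used,
    only the pairing. The temperature is a hypothetical default.
    """
    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of anchor i to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(a)


def tri_modal_loss(visual, knowledge, linguistic, temperature=0.1):
    """Average symmetric InfoNCE over the three modality pairs.

    Equal weighting of the pairs is an assumption made for illustration.
    """
    pairs = [(visual, knowledge), (visual, linguistic), (knowledge, linguistic)]
    total = sum(
        info_nce(x, y, temperature) + info_nce(y, x, temperature)
        for x, y in pairs
    )
    return total / (2 * len(pairs))


# Toy embeddings for two exams: image, manifestation, and report features.
visual = [[1.0, 0.0], [0.0, 1.0]]
knowledge = [[0.9, 0.1], [0.1, 0.9]]
linguistic = [[1.0, 0.2], [0.2, 1.0]]

aligned = tri_modal_loss(visual, knowledge, linguistic)
# Reversing one modality breaks the pairing, so the loss should rise.
shuffled = tri_modal_loss(visual, knowledge[::-1], linguistic)
```

In this toy setup the correctly paired embeddings yield a lower loss than the mismatched ones, which is the signal a contrastive objective of this kind optimizes.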
Taxonomy
Topics: Biomedical Text Mining and Ontologies · AI in cancer detection · Natural Language Processing Techniques
Methods: Focus · Contrastive Learning
