ViKL: A Mammography Interpretation Framework via Multimodal Aggregation of Visual-knowledge-linguistic Features

Xin Wei; Yaling Tao; Changde Du; Gangming Zhao; Yizhou Yu; Jinpeng Li

arXiv:2409.15744·eess.IV·September 25, 2024

TL;DR

ViKL introduces a multimodal framework combining visual, linguistic, and knowledge features for mammography interpretation, improving generalization and interpretability without requiring pathology labels.

Contribution

This paper presents MVKL, a new multimodal mammography dataset, and proposes ViKL, a novel unsupervised pretraining framework that integrates multiple modalities for better breast cancer diagnosis.

Findings

01 Enhanced pathological classification with multimodal pretraining

02 Manifestations enable a new form of hard negative sample selection

03 Features transfer effectively across datasets

Abstract

Mammography is the primary imaging tool for breast cancer diagnosis. Despite significant strides in applying deep learning to interpret mammography images, efforts that focus predominantly on visual features often struggle with generalization across datasets. We hypothesize that integrating additional modalities in the radiology practice, notably the linguistic features of reports and manifestation features embodying radiological insights, offers a more powerful, interpretable and generalizable representation. In this paper, we announce MVKL, the first multimodal mammography dataset encompassing multi-view images, detailed manifestations and reports. Based on this dataset, we focus on the challenging task of unsupervised pretraining and propose ViKL, an innovative framework that synergizes Visual, Knowledge, and Linguistic features. This framework relies solely on pairing information…
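The abstract is cut off before it describes the training objective, so the following is only a minimal sketch of how pairing information alone could drive pretraining across three modalities. It assumes a symmetric InfoNCE-style contrastive loss summed over the three modality pairs. This is a common choice for contrastive learning, not a confirmed description of ViKL. The function names (`info_nce`, `symmetric_info_nce`, `tri_modal_loss`), the temperature value, and the toy embeddings are all hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def info_nce(anchors, targets, temperature=0.1):
    """Mean cross-entropy of matching anchor i to target i among all targets.

    Rows with the same index are treated as a positive pair; every other
    row in the batch serves as a negative. No class labels are needed.
    """
    a = [l2_normalize(v) for v in anchors]
    t = [l2_normalize(v) for v in targets]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of anchor i to every target, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, tj)) / temperature for tj in t]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(a)


def symmetric_info_nce(x, y, temperature=0.1):
    """Average the loss in both directions (x to y and y to x)."""
    return 0.5 * (info_nce(x, y, temperature) + info_nce(y, x, temperature))


def tri_modal_loss(visual, knowledge, linguistic, temperature=0.1):
    """Assumed objective: sum of pairwise losses over the three modality pairs."""
    return (
        symmetric_info_nce(visual, knowledge, temperature)
        + symmetric_info_nce(visual, linguistic, temperature)
        + symmetric_info_nce(knowledge, linguistic, temperature)
    )


# Toy batch of two studies with 2-D embeddings (hypothetical values).
aligned = [[1.0, 0.0], [0.0, 1.0]]
shuffled = [[0.0, 1.0], [1.0, 0.0]]  # same vectors, pairing broken

loss_aligned = tri_modal_loss(aligned, aligned, aligned)
loss_mismatched = tri_modal_loss(aligned, shuffled, aligned)
```

In this toy setup the loss is close to zero when the image, manifestation, and report embeddings of the same study agree, and it grows when the pairing is broken. That is the sense in which pairing information can act as the only supervision signal.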

Code & Models

Repositories

wxwxwwxxx/vikl
PyTorch · Official

Taxonomy

Topics: Biomedical Text Mining and Ontologies · AI in cancer detection · Natural Language Processing Techniques

Methods: Focus · Contrastive Learning