A scoping review on multimodal deep learning in biomedical images and texts

Zhaoyi Sun; Mingquan Lin; Qingqing Zhu; Qianqian Xie; Fei Wang; Zhiyong Lu; Yifan Peng

arXiv:2307.07362 · cs.CV · October 20, 2023

TL;DR

This scoping review summarizes the current state, applications, and research gaps of multimodal deep learning combining biomedical images and texts, aiming to advance diagnostic and interpretative systems.

Contribution

It provides a comprehensive overview of multimodal deep learning in biomedical data, identifying key concepts, study types, and future research directions.

Findings

01

Multimodal deep learning is applied to report generation, visual question answering, and diagnosis.

02

Research gaps include limited integration techniques and evaluation standards.

03

Diverse applications highlight the potential of multimodal deep learning (MDL) in biomedical fields.

Abstract

Computer-assisted diagnostic and prognostic systems of the future should be capable of simultaneously processing multimodal data. Multimodal deep learning (MDL), which involves the integration of multiple sources of data, such as images and text, has the potential to revolutionize the analysis and interpretation of biomedical data. However, it only caught researchers' attention recently. To this end, there is a critical need to conduct a systematic review on this topic, identify the limitations of current work, and explore future directions. In this scoping review, we aim to provide a comprehensive overview of the current state of the field and identify key concepts, types of studies, and research gaps with a focus on biomedical images and texts joint learning, mainly because these two were the most commonly available data types in MDL research. This study reviewed the current uses of…

Taxonomy

Methods Focus · Multimodal Deep Learning