MMSummary: Multimodal Summary Generation for Fetal Ultrasound Video

Xiaoqing Guo, Qianhui Men, and J. Alison Noble

arXiv:2408.03761 · cs.CV · October 31, 2024

TL;DR

MMSummary is an automated multimodal system that generates comprehensive summaries of fetal ultrasound videos, reducing scanning time and aiding clinical workflow through keyframe detection, captioning, and biometric measurement.

Contribution

The paper introduces the first automated multimodal summary generation system for fetal ultrasound videos, integrating keyframe detection, captioning, and biometric analysis in a three-stage pipeline.

Findings

01. Reduces scanning time by approximately 31.5%.
02. Provides comprehensive summaries of fetal ultrasound examinations.
03. Automates keyframe selection, captioning, and biometric measurement.

Abstract

We present the first automated multimodal summary generation system, MMSummary, for medical imaging video, particularly with a focus on fetal ultrasound analysis. Imitating the examination process performed by a human sonographer, MMSummary is designed as a three-stage pipeline, progressing from keyframe detection to keyframe captioning and finally anatomy segmentation and measurement. In the keyframe detection stage, an innovative automated workflow is proposed to progressively select a concise set of keyframes, preserving sufficient video information without redundancy. Subsequently, we adapt a large language model to generate meaningful captions for fetal ultrasound keyframes in the keyframe captioning stage. If a keyframe is captioned as fetal biometry, the segmentation and measurement stage estimates biometric parameters by segmenting the region of interest according to the textual…
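The control flow of the three stages described in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: every name here (`SummaryEntry`, `summarize`, and the four stage callables) is hypothetical, and the toy stand-ins at the bottom exist solely to make the sketch runnable. They do not reflect the actual keyframe detector, language model, or segmentation network.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class SummaryEntry:
    """One line of the multimodal summary (hypothetical structure)."""
    frame_index: int
    caption: str
    measurement: Optional[float] = None  # filled only for biometry keyframes


def summarize(
    frames: Sequence[str],
    detect_keyframes: Callable[[Sequence[str]], List[int]],
    caption_frame: Callable[[str], str],
    is_biometry: Callable[[str], bool],
    measure: Callable[[str, str], float],
) -> List[SummaryEntry]:
    """Chain the stages: detect keyframes, caption each, measure if biometry."""
    entries = []
    for idx in detect_keyframes(frames):          # stage 1: keyframe detection
        caption = caption_frame(frames[idx])      # stage 2: keyframe captioning
        value = None
        if is_biometry(caption):                  # stage 3 runs only for biometry captions
            value = measure(frames[idx], caption)  # caption guides the measurement
        entries.append(SummaryEntry(idx, caption, value))
    return entries


# Toy stand-ins (assumptions for illustration only; frames are plain strings).
frames = ["head", "head", "abdomen", "heart"]

summary = summarize(
    frames,
    # keep a frame only if it differs from the previous one
    detect_keyframes=lambda fs: [i for i, f in enumerate(fs) if i == 0 or f != fs[i - 1]],
    caption_frame=lambda f: f"{f} biometry" if f in {"head", "abdomen"} else f"{f} view",
    is_biometry=lambda c: "biometry" in c,
    # dummy number, not a real biometric parameter
    measure=lambda f, c: float(len(f)),
)

for entry in summary:
    print(entry)
```

Passing the stages in as callables keeps the sketch independent of any particular model; the only structural point it illustrates is that measurement is conditional on the caption produced in the previous stage.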

Taxonomy

Topics: Topic Modeling

Methods: Sparse Evolutionary Training · Focus