ProMRVL-CAD: Proactive Dialogue System with Multi-Round Vision-Language Interactions for Computer-Aided Diagnosis

Xueshen Li; Xinlong Hou; Ziyi Huang; Yu Gan

arXiv:2502.10620·cs.AI·February 18, 2025·2 cites

Open Access

TL;DR

This paper introduces ProMRVL-CAD, a proactive multi-round vision-language dialogue system that enhances computer-aided diagnosis by generating patient-friendly reports through interactive questioning and knowledge integration.

Contribution

It presents a novel proactive dialogue system with two specialized generators, improving diagnostic report quality and robustness, especially with low-quality images, and provides a synthetic dataset for training.

Findings

1. Outperforms existing models in report quality on the MIMIC-CXR and IU-Xray datasets.
2. Demonstrates robustness in scenarios with low image quality.
3. Provides a new synthetic dialogue dataset for medical diagnosis training.

Abstract

Recent advancements in large language models (LLMs) have demonstrated extraordinary comprehension capabilities with remarkable breakthroughs on various vision-language tasks. However, the application of LLMs in generating reliable medical diagnostic reports remains in the early stages. Currently, medical LLMs typically feature a passive interaction model where doctors respond to patient queries with little or no involvement in analyzing medical images. In contrast, some ChatBots simply respond to predefined queries based on visual inputs, lacking interactive dialogue or consideration of medical history. As such, there is a gap between LLM-generated patient-ChatBot interactions and those occurring in actual patient-doctor consultations. To bridge this gap, we develop an LLM-based dialogue system, namely proactive multi-round vision-language interactions for computer-aided diagnosis…

Taxonomy

Topics: Speech and dialogue systems · Robotics and Automated Systems · AI in Service Interactions