Stone Needle: A General Multimodal Large-scale Model Framework towards Healthcare
Weihua Liu, Yong Zuo

TL;DR
Stone Needle is a comprehensive multimodal large-scale model framework designed for healthcare, integrating various data types like text, images, videos, and audio to improve diagnostic accuracy and patient care.
Contribution
It introduces a novel general multimodal model framework tailored for healthcare, capable of multi-round dialogue and integrating diverse modalities for improved medical analysis.
Findings
Outperforms single-modal systems in medical tasks
Effectively integrates multiple data modalities
Enhances diagnosis and treatment recommendations
Abstract
In healthcare, multimodal data, including medical images and clinical reports, is prevalent and must be comprehensively analyzed before diagnostic decisions are made. However, current large-scale artificial intelligence models predominantly focus on single-modal cognitive abilities and neglect the integration of multiple modalities. Therefore, we propose Stone Needle, a general multimodal large-scale model framework tailored explicitly for healthcare applications. Stone Needle serves as a comprehensive medical multimodal model foundation, integrating various modalities such as text, images, videos, and audio to surpass the limitations of single-modal systems. Through the framework components of intent analysis, medical foundation models, prompt manager, and medical language module, our architecture can perform multimodal interaction across multiple rounds of dialogue. Our method is a…
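As a rough illustration of how the four components named in the abstract (intent analysis, medical foundation models, prompt manager, medical language module) might fit together over several dialogue rounds, here is a minimal Python sketch. It is not the authors' implementation: every name in it (`analyze_intent`, `FOUNDATION_MODELS`, `PromptManager`, `language_module`, `Dialogue`) is hypothetical, the extension-based routing is an assumed toy rule, and the models are stubs that return placeholder strings.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


# Hypothetical stand-ins for modality-specific medical foundation models.
def describe_image(path: str) -> str:
    return f"[image findings for {path}]"


def transcribe_audio(path: str) -> str:
    return f"[audio transcript for {path}]"


FOUNDATION_MODELS: Dict[str, Callable[[str], str]] = {
    "image": describe_image,
    "audio": transcribe_audio,
}


def analyze_intent(attachment: Optional[str]) -> str:
    """Toy intent analysis (assumption): pick a modality from the file extension."""
    if attachment is None:
        return "text"
    suffix = attachment.rsplit(".", 1)[-1].lower()
    if suffix in {"png", "jpg", "jpeg", "dcm"}:
        return "image"
    if suffix in {"wav", "mp3"}:
        return "audio"
    return "text"


@dataclass
class PromptManager:
    """Keeps the dialogue history and assembles the prompt for each round."""
    history: List[Tuple[str, str]] = field(default_factory=list)

    def build(self, user_text: str, evidence: Optional[str]) -> str:
        lines = [f"{role}: {text}" for role, text in self.history]
        if evidence:
            lines.append(f"evidence: {evidence}")
        lines.append(f"user: {user_text}")
        return "\n".join(lines)


def language_module(prompt: str) -> str:
    """Placeholder for the medical language module; reports the prompt size only."""
    return f"(reply based on {len(prompt.splitlines())} prompt lines)"


@dataclass
class Dialogue:
    prompts: PromptManager = field(default_factory=PromptManager)

    def turn(self, user_text: str, attachment: Optional[str] = None) -> str:
        modality = analyze_intent(attachment)
        model = FOUNDATION_MODELS.get(modality)
        # Only call a foundation model when there is an attachment it can handle.
        evidence = model(attachment) if model and attachment else None
        prompt = self.prompts.build(user_text, evidence)
        reply = language_module(prompt)
        self.prompts.history.append(("user", user_text))
        self.prompts.history.append(("assistant", reply))
        return reply


if __name__ == "__main__":
    dialogue = Dialogue()
    print(dialogue.turn("What does this scan show?", "chest.png"))
    print(dialogue.turn("Is any follow-up needed?"))
```

In this sketch the second round reuses the history stored by the prompt manager, which is one simple way to carry context across dialogue rounds; a real system would replace each stub with an actual model call.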
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Speech and dialogue systems
Methods: Focus
