Towards Scalable SOAP Note Generation: A Weakly Supervised Multimodal Framework
Sadia Kamal, Tim Oates, Joy Wan

TL;DR
This paper introduces a weakly supervised multimodal framework for generating structured SOAP notes from limited inputs, reducing annotation needs and clinician workload while maintaining high clinical relevance.
Contribution
It presents a novel approach that leverages weak supervision and multimodal data to generate SOAP notes, with new metrics for evaluating clinical quality.
Findings
Achieves performance comparable to GPT-4o, Claude, and DeepSeek Janus Pro across key clinical relevance metrics.
Introduces MedConceptEval and CCS metrics for clinical assessment.
Reduces reliance on manual annotations for SOAP note generation.
Abstract
Skin carcinoma is the most prevalent form of cancer globally, accounting for over $8 billion in annual healthcare expenditures. In clinical settings, physicians document patient visits using detailed SOAP (Subjective, Objective, Assessment, and Plan) notes. However, manually generating these notes is labor-intensive and contributes to clinician burnout. In this work, we propose a weakly supervised multimodal framework to generate clinically structured SOAP notes from limited inputs, including lesion images and sparse clinical text. Our approach reduces reliance on manual annotations, enabling scalable, clinically grounded documentation while alleviating clinician burden and reducing the need for large annotated data. Our method achieves performance comparable to GPT-4o, Claude, and DeepSeek Janus Pro across key clinical relevance metrics. To evaluate clinical quality, we introduce two…
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Machine Learning in Healthcare
