MammoWise: Multi-Model Local RAG Pipeline for Mammography Report Generation
Raiyan Jahangir, Nafiz Imtiaz Khan, Amritanand Sudheerkumar, Vladimir Filkov

TL;DR
MammoWise is a flexible, local multi-model pipeline for mammography report generation and classification built on open-source vision-language models; few-shot prompting and retrieval augmentation improve its accuracy.
Contribution
The paper introduces a versatile, privacy-preserving framework for mammogram reporting that supports multiple models, multiple prompting strategies, retrieval-augmented generation, and fine-tuning.
Findings
Report quality improves with few-shot prompting and RAG.
BI-RADS classification accuracy reaches 0.7545 after fine-tuning.
The pipeline is adaptable to various models and datasets.
Abstract
Screening mammography is high volume, time sensitive, and documentation heavy. Radiologists must translate subtle visual findings into consistent BI-RADS assessments, breast density categories, and structured narrative reports. While recent Vision Language Models (VLMs) enable image-to-text reporting, many rely on closed cloud systems or tightly coupled architectures that limit privacy, reproducibility, and adaptability. We present MammoWise, a local multi-model pipeline that transforms open source VLMs into mammogram report generators and multi-task classifiers. MammoWise supports any Ollama-hosted VLM and mammography dataset, and enables zero-shot, few-shot, and Chain-of-Thought prompting, with optional multimodal Retrieval Augmented Generation (RAG) using a vector database for case-specific context. We evaluate MedGemma, LLaVA-Med, and Qwen2.5-VL on VinDr-Mammo and DMID datasets,…
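The prompting modes and optional retrieval step named in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the `Case` record, the `retrieve` and `build_prompt` helpers, the task wording, and the in-memory cosine-similarity search standing in for a vector database are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass

# Hypothetical task instruction; the paper's actual prompts are not shown here.
TASK = ("Write a mammography report for the attached image. "
        "State the BI-RADS assessment and the breast density category.")
MODES = ("zero-shot", "few-shot", "cot")


@dataclass
class Case:
    """Hypothetical stored case: an embedding plus its reference report."""
    embedding: list
    report: str


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query, cases, k=2):
    """Return the k stored cases most similar to the query embedding.

    Stands in for a vector-database lookup (assumption).
    """
    ranked = sorted(cases, key=lambda c: cosine(query, c.embedding), reverse=True)
    return ranked[:k]


def build_prompt(mode, examples=(), context=()):
    """Assemble a text prompt for one of the three prompting modes.

    `examples` are reference reports used only in few-shot mode;
    `context` holds reports of retrieved similar cases (optional RAG).
    """
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    parts = [TASK]
    if context:
        parts.append("Similar prior cases:\n" + "\n".join(f"- {r}" for r in context))
    if mode == "few-shot":
        parts.extend(f"Example report {i}:\n{r}" for i, r in enumerate(examples, 1))
    if mode == "cot":
        parts.append("Reason step by step before giving the final answer.")
    return "\n\n".join(parts)
```

In a pipeline of this kind, the resulting string would be sent together with the mammogram image to a locally hosted model; that call is omitted here because it depends on the serving setup.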
Taxonomy
Topics: AI in cancer detection · Multimodal Machine Learning Applications · Biomedical Text Mining and Ontologies
