BUSTR: Breast Ultrasound Text Reporting with a Descriptor-Aware Vision-Language Model

Rawa Mohammed; Mina Attin; Bryar Shareef

arXiv:2511.20956·cs.CV·November 27, 2025

TL;DR

BUSTR is a multitask vision-language model that generates breast ultrasound reports from images. It builds reports from structured descriptors and radiomics features, improving report quality and clinical relevance without requiring paired image-report datasets.

Contribution

It introduces a descriptor-aware vision-language framework trained with multitask and alignment losses, enabling report generation without paired image-report supervision.
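
As a rough illustration of what a multitask loss over dataset-specific descriptor sets could look like, the sketch below averages a per-head cross-entropy over only those descriptor heads for which a sample carries a label. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the head names (`birads`, `pathology`, `histology`), the optional per-head weights, and the choice to skip unlabeled heads are all hypothetical.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one sample, computed from raw logits via log-sum-exp."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def multitask_loss(head_logits, labels, weights=None):
    """Weighted mean of per-head cross-entropy over the heads that have a label.

    head_logits: dict mapping a descriptor head name to its list of logits.
    labels:      dict mapping a head name to the target class index; heads
                 missing from this dict are skipped (assumed handling of
                 datasets that do not annotate every descriptor).
    weights:     optional dict of per-head weights (assumption; default 1.0).
    """
    weights = weights or {}
    total, weight_sum = 0.0, 0.0
    for name, logits in head_logits.items():
        if name not in labels:
            continue  # this dataset does not provide the descriptor
        w = weights.get(name, 1.0)
        total += w * cross_entropy(logits, labels[name])
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0


# Hypothetical sample from a dataset that annotates BI-RADS and pathology only.
logits = {
    "birads": [0.2, 1.5, -0.3],
    "pathology": [0.9, -0.9],
    "histology": [0.0, 0.0, 0.0, 0.0],
}
loss = multitask_loss(logits, {"birads": 1, "pathology": 0})
```

Skipping unlabeled heads is one simple way to let a single encoder train on datasets that differ in their available descriptors; other masking or weighting schemes are equally plausible.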

Findings

01 Improves natural language generation metrics on both public datasets, BrEaST and BUS-BRA.

02 Improves clinical efficacy metrics, especially for BI-RADS and pathology.

03 Generates reports without paired image-report supervision.

Abstract

Automated radiology report generation (RRG) for breast ultrasound (BUS) is limited by the lack of paired image-report datasets and the risk of hallucinations from large language models. We propose BUSTR, a multitask vision-language framework that generates BUS reports without requiring paired image-report supervision. BUSTR constructs reports from structured descriptors (e.g., BI-RADS, pathology, histology) and radiomics features, learns descriptor-aware visual representations with a multi-head Swin encoder trained using a multitask loss over dataset-specific descriptor sets, and aligns visual and textual tokens via a dual-level objective that combines token-level cross-entropy with a cosine-similarity alignment loss between input and output representations. We evaluate BUSTR on two public BUS datasets, BrEaST and BUS-BRA, which differ in size and available descriptors. Across both…
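
The dual-level objective described in the abstract can be sketched as a token-level cross-entropy term plus a cosine-similarity alignment term. This is only an illustrative sketch: the `1 - cosine` form of the alignment loss, the weighting factor `lam`, the use of a single pooled vector for each of the input and output representations, and all function names are assumptions, not details taken from the paper.

```python
import math


def token_cross_entropy(token_logits, target_ids):
    """Mean cross-entropy over the tokens of one report.

    token_logits: list of per-token logit lists (one list per position).
    target_ids:   list of target vocabulary indices, same length.
    """
    total = 0.0
    for logits, target in zip(token_logits, target_ids):
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_z - logits[target]
    return total / len(target_ids)


def cosine_alignment_loss(u, v):
    """1 - cosine similarity: 0 when the vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def dual_level_loss(token_logits, target_ids, input_repr, output_repr, lam=1.0):
    """Token-level term plus a representation-level term weighted by lam (assumed)."""
    return (token_cross_entropy(token_logits, target_ids)
            + lam * cosine_alignment_loss(input_repr, output_repr))


# Toy example: two tokens over a three-word vocabulary, 2-D pooled representations.
loss = dual_level_loss(
    token_logits=[[2.0, 0.1, -1.0], [0.0, 1.0, 0.0]],
    target_ids=[0, 1],
    input_repr=[1.0, 0.5],
    output_repr=[0.8, 0.6],
    lam=0.5,
)
```

The token-level term rewards predicting each report token correctly, while the alignment term pulls the input and output representations toward the same direction; how the two are balanced in BUSTR is not stated in the text above.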

Taxonomy

Topics: AI in cancer detection · Multimodal Machine Learning Applications · Radiology practices and education