Automated Radiological Report Generation from Breast Ultrasound Images Using Vision and Language Transformers
Shaheen Khatoon, Azhar Mahmood

TL;DR
This paper introduces a multimodal Transformer framework that automatically generates radiology reports from breast ultrasound images, pairing a Vision Transformer image encoder with biomedical language models.
Contribution
The novel contribution is a multimodal Transformer framework that combines Vision Transformers and biomedical language models for breast ultrasound report generation.
Findings
BioBERT-based models show higher clinical specificity than general-purpose language models.
GPT-2-based decoders enhance the fluency of generated reports.
The proposed framework outperforms prior convolutional–recurrent architectures in report quality.
Abstract
Breast ultrasound imaging is widely used for the detection and characterization of breast abnormalities; however, generating detailed and consistent radiological reports remains a labor-intensive and subjective process. Recent advances in deep learning have demonstrated the potential of automated report generation systems to support clinical workflows, yet most existing approaches focus on chest X-ray imaging and rely on convolutional–recurrent architectures with limited capacity to model long-range dependencies and complex clinical semantics. In this work, we propose a multimodal Transformer-based framework for automatic breast ultrasound report generation that integrates visual and textual information through cross-attention mechanisms. The proposed architecture employs a Vision Transformer (ViT) to extract rich spatial and morphological features from ultrasound images. For textual…
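The cross-attention step mentioned in the abstract can be illustrated with a minimal NumPy sketch in which report-token states (queries) attend to image patch features (keys and values). This is not the authors' implementation: the single attention head, the random weights, the feature dimensions, and the 196-patch grid (a 224×224 image split into 16×16 patches, a common ViT setting) are all assumptions chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(text_states, image_patches, w_q, w_k, w_v):
    """Single-head cross-attention.

    Text tokens provide the queries; image patches provide keys and values,
    so each token gathers a weighted mix of visual features.
    """
    q = text_states @ w_q                       # (num_tokens, d_head)
    k = image_patches @ w_k                     # (num_patches, d_head)
    v = image_patches @ w_v                     # (num_patches, d_head)
    scores = q @ k.T / np.sqrt(q.shape[-1])     # (num_tokens, num_patches)
    weights = softmax(scores, axis=-1)          # each row sums to 1
    return weights @ v, weights


# Hypothetical sizes, not taken from the paper.
rng = np.random.default_rng(0)
num_tokens, num_patches = 6, 196
d_text, d_image, d_head = 32, 48, 16

text_states = rng.normal(size=(num_tokens, d_text))      # stand-in for decoder states
image_patches = rng.normal(size=(num_patches, d_image))  # stand-in for ViT patch features
w_q = rng.normal(size=(d_text, d_head)) * 0.1
w_k = rng.normal(size=(d_image, d_head)) * 0.1
w_v = rng.normal(size=(d_image, d_head)) * 0.1

context, weights = cross_attention(text_states, image_patches, w_q, w_k, w_v)
print(context.shape, weights.shape)
```

Each row of `weights` is a distribution over image patches for one report token, which is what lets a decoder condition each generated word on specific image regions. A full model would use multiple heads, learned projections, and stacked layers.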
Taxonomy
Topics: AI in cancer detection · COVID-19 diagnosis using AI · Multimodal Machine Learning Applications
