Recurrent Visual Feature Extraction and Stereo Attentions for CT Report Generation
Yuanhe Tian, Lei Mao, Yan Song

TL;DR
This paper introduces a novel CT report generation method using recurrent visual feature extraction and stereo attentions, leveraging a vision Transformer and hierarchical modeling to improve accuracy over existing approaches.
Contribution
It proposes a new hierarchical feature modeling approach with recurrent visual extraction and stereo attentions, explicitly capturing inter-slice transformations for better report generation.
Findings
Outperforms baseline models on the M3D-Cap dataset
Achieves state-of-the-art results in CT report generation
Demonstrates effectiveness of hierarchical visual feature modeling
Abstract
Generating reports for computed tomography (CT) images is a challenging task. Although it resembles existing work on medical image report generation, it has unique characteristics, such as the spatial encoding of multiple images and the alignment between image volumes and text. Existing solutions typically use general 2D or 3D image processing techniques to extract features from a CT volume: they first compress the volume and then divide the compressed CT slices into patches for visual encoding. These approaches do not explicitly account for the transformations among CT slices, nor do they effectively integrate multi-level image features, particularly those containing specific organ lesions, to guide CT report generation (CTRG). Considering the strong correlation among consecutive slices in CT scans, we propose a large language model (LLM) based CTRG method…
Taxonomy
Topics: Multimodal Machine Learning Applications · Biomedical Text Mining and Ontologies · Image Retrieval and Classification Techniques
