AResNet-ViT: A Hybrid CNN-Transformer Network for Benign and Malignant Breast Nodule Classification in Ultrasound Images
Xin Zhao, Qianqian Zhu, Jialing Wu

TL;DR
This paper introduces AResNet-ViT, a hybrid CNN-Transformer network that effectively combines local and global feature extraction for improved classification of benign and malignant breast nodules in ultrasound images.
Contribution
The paper proposes a novel dual-branch CNN-Transformer architecture that enhances feature extraction for breast nodule classification in ultrasound images.
Findings
Outperforms the comparison networks on a public dataset
Effectively captures local details and global features
Improves accuracy in benign-malignant classification
Abstract
To address the similarity between lesions and surrounding tissue, the overlapping appearance of some benign and malignant nodules, and the resulting difficulty of classification, a deep learning network that integrates a CNN and a Transformer is proposed for classifying benign and malignant breast lesions in ultrasound images. The network adopts a dual-branch architecture for local-global feature extraction, combining the strength of CNNs in extracting local features with the ability of ViT to extract global features, and thereby strengthens feature extraction for breast nodules. The local feature extraction branch employs a residual network with multiple attention-guided modules, which captures the local details and texture features of breast nodules, enhances sensitivity to subtle changes within the nodules, and thus can aid in accurate…
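The dual-branch idea in the abstract can be illustrated with a toy, dependency-free sketch. It is not the authors' implementation: the layer sizes, the sigmoid gate standing in for the "attention-guided modules", the identity query/key/value projections, the concatenation-based fusion, and every function name below are assumptions chosen for illustration, and the weights are random and untrained.

```python
import math
import random


def conv3x3_residual(img, kernel):
    """Local branch (assumed form): zero-padded 3x3 conv, ReLU, plus a skip connection."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        acc += kernel[di + 1][dj + 1] * img[y][x]
            out[i][j] = max(acc, 0.0) + img[i][j]  # ReLU(conv) + residual
    return out


def spatial_gate(fmap):
    """Hypothetical stand-in for an attention-guided module: sigmoid gating per pixel."""
    return [[v / (1.0 + math.exp(-v)) for v in row] for row in fmap]


def patch_tokens(img, p):
    """Split an image into non-overlapping p x p patches, flattened to token vectors.
    Assumes height and width are divisible by p."""
    h, w = len(img), len(img[0])
    return [
        [img[y][x] for y in range(i, i + p) for x in range(j, j + p)]
        for i in range(0, h, p)
        for j in range(0, w, p)
    ]


def softmax(values):
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Global branch core: single-head scaled dot-product attention.
    Queries, keys, and values are the tokens themselves (no learned projections)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(wt * v[c] for wt, v in zip(weights, tokens)) for c in range(d)])
    return out


def classify(img, p=4, seed=0):
    """Fuse local and global features by concatenation, then apply a linear
    layer and softmax over two classes (benign, malignant)."""
    rng = random.Random(seed)
    kernel = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(3)]

    # Local branch: gated residual conv map, average-pooled per patch.
    local_map = spatial_gate(conv3x3_residual(img, kernel))
    local_feat = [sum(t) / len(t) for t in patch_tokens(local_map, p)]

    # Global branch: attention over patch tokens, mean-pooled across tokens.
    attended = self_attention(patch_tokens(img, p))
    global_feat = [sum(col) / len(col) for col in zip(*attended)]

    fused = local_feat + global_feat  # list concatenation
    weights = [[rng.uniform(-0.1, 0.1) for _ in fused] for _ in range(2)]
    logits = [sum(w * f for w, f in zip(row, fused)) for row in weights]
    return softmax(logits)


if __name__ == "__main__":
    random.seed(1)
    toy_image = [[random.random() for _ in range(8)] for _ in range(8)]
    print(classify(toy_image))  # two untrained class probabilities
```

In this sketch the convolutional branch only sees a 3x3 neighbourhood per pixel, while every attention output mixes information from all patches, which is the local versus global contrast the abstract describes. A practical model would use a deep learning framework, learned projections, and a trained classifier head.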
Taxonomy
Topics: AI in cancer detection
Methods: Attention Is All You Need · Sparse Evolutionary Training · Label Smoothing · Adam · Linear Layer · Byte Pair Encoding · Layer Normalization · Softmax · Position-Wise Feed-Forward Layer · Dense Connections
