A Hybrid Architecture for Benign-Malignant Classification of Mammography ROIs
Mohammed Asad, Mohit Bajpai, Sudhir Singh, Rahul Katarya

TL;DR
This paper introduces a hybrid model that combines EfficientNetV2-M with Vision Mamba, a State Space Model (SSM), for benign-malignant classification of mammography ROIs. The CNN handles local feature extraction and the SSM handles global context modeling.
Contribution
It proposes a novel hybrid architecture that integrates CNNs with a linear-complexity sequence model for efficient global context understanding in mammography classification.
Findings
Achieves strong lesion-level classification performance on the CBIS-DDSM dataset.
Combines a CNN and an SSM to extract both local and global features.
Addresses the quadratic computational cost of Vision Transformer self-attention by using a linear-complexity sequence model for global context.
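To make the quadratic-versus-linear contrast concrete, the following back-of-the-envelope sketch counts token-pair scores in a full self-attention map against recurrence steps in a single sequential scan. The grid sizes and function names are illustrative assumptions, not values taken from the paper, and the counts ignore channel width and state size.

```python
def attention_pair_count(num_tokens: int) -> int:
    """Token-pair scores in one full self-attention map: L * L."""
    return num_tokens * num_tokens


def scan_step_count(num_tokens: int) -> int:
    """Recurrence steps in one sequential state-space scan: L."""
    return num_tokens


# Hypothetical square token grids (side x side); not the paper's settings.
for side in (7, 14, 28):
    tokens = side * side
    print(
        f"grid {side}x{side}: tokens={tokens}, "
        f"attention pairs={attention_pair_count(tokens)}, "
        f"scan steps={scan_step_count(tokens)}"
    )
```

Doubling the grid side quadruples the token count, so the attention term grows sixteenfold while the scan term grows fourfold.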
Abstract
Accurate characterization of suspicious breast lesions in mammography is important for early diagnosis and treatment planning. While Convolutional Neural Networks (CNNs) are effective at extracting local visual patterns, they are less suited to modeling long-range dependencies. Vision Transformers (ViTs) address this limitation through self-attention, but their quadratic computational cost can be prohibitive. This paper presents a hybrid architecture that combines EfficientNetV2-M for local feature extraction with Vision Mamba, a State Space Model (SSM), for efficient global context modeling. The proposed model performs binary classification of abnormality-centered mammography regions of interest (ROIs) from the CBIS-DDSM dataset into benign and malignant classes. By combining a strong CNN backbone with a linear-complexity sequence model, the approach achieves strong lesion-level…
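The data flow described in the abstract can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the authors' implementation: a random array stands in for the EfficientNetV2-M feature map, the state-space layer uses fixed per-channel parameters (Mamba-style models make them input-dependent), the forward-plus-backward scan and mean pooling are assumptions, and every name below is hypothetical.

```python
import math
import random


def flatten_feature_map(fmap):
    """Turn a feature map indexed [row][col][channel] into a row-major token list."""
    return [token for row in fmap for token in row]


def diagonal_ssm_scan(tokens, decay, in_gain, out_gain):
    """Linear-time scan: h_t = decay * h_{t-1} + in_gain * x_t, y_t = out_gain * h_t."""
    channels = len(tokens[0])
    state = [0.0] * channels
    outputs = []
    for x in tokens:  # single pass, so cost grows linearly with sequence length
        state = [decay[c] * state[c] + in_gain[c] * x[c] for c in range(channels)]
        outputs.append([out_gain[c] * state[c] for c in range(channels)])
    return outputs


def bidirectional_scan(tokens, decay, in_gain, out_gain):
    """Sum a forward scan and a backward scan so each token sees both directions."""
    forward = diagonal_ssm_scan(tokens, decay, in_gain, out_gain)
    backward = diagonal_ssm_scan(tokens[::-1], decay, in_gain, out_gain)[::-1]
    return [[f + b for f, b in zip(fr, br)] for fr, br in zip(forward, backward)]


def classify_roi(fmap, decay, in_gain, out_gain, head_weights, head_bias):
    """Return a toy malignancy probability for one ROI feature map."""
    tokens = flatten_feature_map(fmap)
    mixed = bidirectional_scan(tokens, decay, in_gain, out_gain)
    channels = len(mixed[0])
    # Mean-pool over tokens, then apply a logistic head (both are assumptions).
    pooled = [sum(t[c] for t in mixed) / len(mixed) for c in range(channels)]
    logit = sum(w * p for w, p in zip(head_weights, pooled)) + head_bias
    return 1.0 / (1.0 + math.exp(-logit))


# Stand-in for a CNN backbone output: a 4x4 grid with 3 channels of random values.
random.seed(0)
fmap = [[[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(4)] for _ in range(4)]
prob = classify_roi(
    fmap,
    decay=[0.9, 0.8, 0.7],
    in_gain=[1.0, 1.0, 1.0],
    out_gain=[0.5, 0.5, 0.5],
    head_weights=[0.3, -0.2, 0.1],
    head_bias=0.0,
)
print(f"toy malignant probability: {prob:.3f}")
```

The point of the sketch is the ordering of stages: local features first, then a single-pass sequence model over the flattened feature tokens, then a binary head. The weights here are arbitrary, so the printed probability carries no diagnostic meaning.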
