A Hybrid Architecture for Benign-Malignant Classification of Mammography ROIs

Mohammed Asad; Mohit Bajpai; Sudhir Singh; Rahul Katarya

arXiv:2604.12437 · cs.CV · April 15, 2026

TL;DR

This paper introduces a hybrid model that combines EfficientNetV2-M with Vision Mamba, a State Space Model (SSM), for benign-malignant classification of mammography ROIs, pairing local feature extraction with efficient global context modeling.

Contribution

The paper proposes a hybrid architecture that pairs a CNN backbone with a linear-complexity sequence model, so that global context in mammography classification can be modeled efficiently.

Findings

01 Achieves strong lesion-level classification performance on the CBIS-DDSM dataset.

02 Combines a CNN (EfficientNetV2-M) with an SSM (Vision Mamba) to extract local and global features, respectively.

03 Sidesteps the quadratic self-attention cost of Vision Transformers by using a linear-complexity SSM for global context.
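
The scaling argument behind the comparison with Vision Transformers can be sketched with a rough operation count. The snippet below is only an illustration: the function names, the feature width of 256, and the state size of 16 are assumptions chosen for the example, not values from the paper, and the counts ignore constants and projection layers.

```python
def attention_pair_ops(num_tokens: int, dim: int) -> int:
    """Rough cost of self-attention scores: every token is compared with every token."""
    return num_tokens * num_tokens * dim


def ssm_scan_ops(num_tokens: int, dim: int, state_size: int) -> int:
    """Rough cost of a state space scan: one state update per token and channel."""
    return num_tokens * dim * state_size


if __name__ == "__main__":
    dim, state_size = 256, 16  # hypothetical sizes, not taken from the paper
    for side in (16, 32, 64):  # hypothetical feature-map side lengths
        tokens = side * side
        print(
            f"{side}x{side} map, {tokens} tokens: "
            f"attention ~{attention_pair_ops(tokens, dim)} ops, "
            f"SSM scan ~{ssm_scan_ops(tokens, dim, state_size)} ops"
        )
```

Under these assumptions, doubling the number of tokens quadruples the attention count but only doubles the scan count, which is the gap the hybrid design aims to exploit.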

Abstract

Accurate characterization of suspicious breast lesions in mammography is important for early diagnosis and treatment planning. While Convolutional Neural Networks (CNNs) are effective at extracting local visual patterns, they are less suited to modeling long-range dependencies. Vision Transformers (ViTs) address this limitation through self-attention, but their quadratic computational cost can be prohibitive. This paper presents a hybrid architecture that combines EfficientNetV2-M for local feature extraction with Vision Mamba, a State Space Model (SSM), for efficient global context modeling. The proposed model performs binary classification of abnormality-centered mammography regions of interest (ROIs) from the CBIS-DDSM dataset into benign and malignant classes. By combining a strong CNN backbone with a linear-complexity sequence model, the approach achieves strong lesion-level…
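
To make the idea of a linear-complexity sequence model concrete, here is a minimal sketch of a discrete state space recurrence, h_t = a * h_{t-1} + b * x_t with output y_t = c * h_t, run over a sequence in a single pass. It is a deliberately simplified scalar SSM with fixed parameters, not Vision Mamba's selective scan and not the authors' implementation; the name `ssm_scan` and the parameter values are hypothetical.

```python
from typing import List, Sequence


def ssm_scan(x: Sequence[float], a: float, b: float, c: float) -> List[float]:
    """Run a scalar linear state space recurrence over x in one left-to-right pass.

    The hidden state h carries information from all earlier inputs, so each
    output depends on the whole prefix while the total work stays linear in len(x).
    """
    h = 0.0  # initial hidden state
    outputs = []
    for x_t in x:
        h = a * h + b * x_t    # state update: decay the old state, mix in the new input
        outputs.append(c * h)  # readout from the current state
    return outputs


if __name__ == "__main__":
    # An impulse at the first position keeps influencing later outputs.
    print(ssm_scan([1.0, 0.0, 0.0, 0.0], a=0.5, b=1.0, c=1.0))
    # prints [1.0, 0.5, 0.25, 0.125]
```

The impulse decays but still reaches later positions, which is how a scan of this kind propagates context across a token sequence without comparing every pair of tokens.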
