RS-CA-HSICT: A Residual and Spatial Channel Augmented CNN Transformer Framework for Monkeypox Detection
Rashid Iqbal, Saddam Hussain Khan

TL;DR
This paper introduces RS-CA-HSICT, a hybrid CNN-Transformer framework that combines residual, spatial, and attention mechanisms to improve Monkeypox detection with high accuracy and robustness.
Contribution
The novel RS-CA-HSICT architecture integrates CNN and Transformer modules with residual and spatial learning for enhanced feature extraction in Monkeypox detection.
Findings
Achieved up to 98.30% classification accuracy.
Outperformed existing CNN and ViT models.
Demonstrated robustness across diverse datasets.
Abstract
This work proposes a hybrid deep learning approach, the Residual and Spatial Learning based Channel Augmented Integrated CNN-Transformer architecture (RS-CA-HSICT), that leverages the strengths of CNNs and Transformers for enhanced MPox detection. The proposed RS-CA-HSICT framework is composed of an HSICT block, a residual CNN module, a spatial CNN block, and a channel-augmented (CA) component, which together enrich the diverse feature space, capture detailed lesion information, and model long-range dependencies. The new HSICT module first integrates an abstract representation of the stem CNN with customized ICT blocks that combine efficient multihead attention and structured CNN layers with homogeneous (H) and structural (S) operations. The customized ICT blocks learn global contextual interactions and extract local texture. Additionally, the H and S layers learn spatial homogeneity and fine structural details by reducing noise and modeling complex…
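The ideas named in the abstract (attention for global context, a residual path, and channel augmentation) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes that "channel augmentation" means concatenating branch outputs along the channel axis, uses single-head attention with identity projections instead of the paper's multihead ICT blocks, and all function names (`self_attention`, `residual_add`, `channel_augment`) are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: identity projections (Q = K = V = tokens), so there are
    no learned weights. Each output token is a weighted mix of all tokens,
    which is how attention captures global context.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out


def residual_add(inputs, transformed):
    """Skip connection: add the block's input back to its output."""
    return [[x + y for x, y in zip(a, b)] for a, b in zip(inputs, transformed)]


def channel_augment(*branches):
    """Assumed form of channel augmentation: concatenate the feature
    vectors of several branches at each position along the channel axis."""
    n = len(branches[0])
    assert all(len(b) == n for b in branches), "branches must align by position"
    return [sum((list(b[i]) for b in branches), []) for i in range(n)]


# Three toy "patch" tokens with two channels each.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
global_feats = self_attention(tokens)                 # global-context branch
residual_feats = residual_add(tokens, global_feats)   # residual branch
fused = channel_augment(global_feats, residual_feats) # 2 + 2 = 4 channels
```

In this sketch the fused representation keeps one entry per token and doubles the channel count, so a downstream classifier would see both branches side by side.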
Taxonomy
Topics: Poxvirus research and outbreaks · COVID-19 diagnosis using AI · Digital Media Forensic Detection
