Multi-Scale Transformer Architecture for Accurate Medical Image Classification

Jiacheng Hu; Yanlin Xiang; Yang Lin; Junliang Du; Hanchao Zhang; Houze Liu

arXiv:2502.06243 · cs.CV · February 11, 2025

Open Access

TL;DR

This paper presents a multi-scale Transformer architecture that significantly improves skin lesion classification accuracy and interpretability, outperforming existing models on the ISIC 2017 dataset.

Contribution

It introduces a novel multi-scale feature fusion mechanism within a Transformer model tailored for medical image classification, enhancing global and local feature extraction.
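The summary does not spell out how the fusion mechanism is built, so the following is only a generic sketch of the idea of multi-scale feature fusion, not the authors' method. It assumes a single-channel feature map stored as nested lists, non-overlapping average pooling at a few scales, nearest-neighbour upsampling, and per-cell concatenation; the names `avg_pool`, `upsample_nearest`, and `fuse_multi_scale` are hypothetical.

```python
from typing import List, Sequence

Grid = List[List[float]]


def avg_pool(grid: Grid, k: int) -> Grid:
    """Non-overlapping k x k average pooling (assumes H and W are divisible by k)."""
    h, w = len(grid), len(grid[0])
    return [
        [
            sum(grid[i + di][j + dj] for di in range(k) for dj in range(k)) / (k * k)
            for j in range(0, w, k)
        ]
        for i in range(0, h, k)
    ]


def upsample_nearest(grid: Grid, k: int) -> Grid:
    """Repeat every cell k times along both axes to restore the original size."""
    return [[v for v in row for _ in range(k)] for row in grid for _ in range(k)]


def fuse_multi_scale(grid: Grid, scales: Sequence[int] = (1, 2, 4)) -> List[List[List[float]]]:
    """Return an H x W map whose cells hold one value per scale.

    Scale 1 keeps the local value; larger scales summarise a wider neighbourhood.
    This concatenation is an illustrative assumption, not the paper's fusion rule.
    """
    levels = [upsample_nearest(avg_pool(grid, k), k) for k in scales]
    h, w = len(grid), len(grid[0])
    return [[[lvl[i][j] for lvl in levels] for j in range(w)] for i in range(h)]


# Toy 4 x 4 feature map with values 0..15.
grid = [[float(4 * i + j) for j in range(4)] for i in range(4)]
fused = fuse_multi_scale(grid)
print(fused[0][0])  # [0.0, 2.5, 7.5]
```

Each output cell carries its own value alongside progressively coarser neighbourhood averages, which is one simple way to expose local and global context to a later layer at the same time.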

Findings

01. Outperforms ResNet50, VGG19, ResNext, and Vision Transformer on key metrics

02. Achieves higher accuracy, AUC, F1-Score, and Precision than these baselines on the ISIC 2017 dataset

03. Provides interpretable Grad-CAM visualizations that align with lesion sites
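For readers unfamiliar with the metrics named in the findings, here is a minimal sketch of how accuracy, Precision, F1-Score, and AUC can be computed. It assumes a binary labelling (1 = positive, 0 = negative) and a 0.5 decision threshold purely for illustration; the labels and scores are made-up toy values, not results from the paper, and the function names are hypothetical.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 from hard 0/1 predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (illustrative only).
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(binary_metrics(y_true, y_pred))
print(roc_auc(y_true, scores))
```

AUC is computed from the raw scores rather than the thresholded predictions, so it reflects ranking quality independently of the chosen cut-off, while the other three metrics depend on it.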

Abstract

This study introduces an AI-driven skin lesion classification algorithm built on an enhanced Transformer architecture, addressing the challenges of accuracy and robustness in medical image analysis. By integrating a multi-scale feature fusion mechanism and refining the self-attention process, the model effectively extracts both global and local features, enhancing its ability to detect lesions with ambiguous boundaries and intricate structures. Performance evaluation on the ISIC 2017 dataset demonstrates that the improved Transformer surpasses established AI models, including ResNet50, VGG19, ResNext, and Vision Transformer, across key metrics such as accuracy, AUC, F1-Score, and Precision. Grad-CAM visualizations further highlight the interpretability of the model, showcasing strong alignment between the algorithm's focus areas and actual lesion sites. This research underscores the…

Taxonomy

Topics: Brain Tumor Detection and Classification