Convolutional Nets Versus Vision Transformers for Diabetic Foot Ulcer Classification
Adrian Galdran, Gustavo Carneiro, Miguel A. González Ballester

TL;DR
This study compares CNNs and Vision Transformers for diabetic foot ulcer classification. It shows that CNNs outperform Transformers in low-data settings, that SAM optimization improves generalization for both kinds of models, and that CNNs trained with SAM give the best results.
Contribution
The paper provides a comprehensive comparison between CNNs and Vision Transformers for diabetic foot ulcer classification, highlighting the effectiveness of CNNs and the SAM optimizer.
Findings
CNNs outperform Transformers in low-data regimes.
The SAM optimizer improves generalization for both CNNs and Vision Transformers.
CNNs with SAM achieve the best performance.
Abstract
This paper compares well-established Convolutional Neural Networks (CNNs) to recently introduced Vision Transformers for the task of Diabetic Foot Ulcer Classification, in the context of the DFUC 2021 Grand-Challenge, in which this work attained the first position. Comprehensive experiments demonstrate that modern CNNs are still capable of outperforming Transformers in a low-data regime, likely owing to their ability to better exploit spatial correlations. In addition, we empirically demonstrate that the recent Sharpness-Aware Minimization (SAM) optimization algorithm considerably improves the generalization capability of both kinds of models. Our results demonstrate that, for this task, the combination of CNNs and the SAM optimization process outperforms every other approach considered.
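For readers unfamiliar with SAM, the following is a minimal sketch of the general two-pass update it is usually described with: take the gradient at the current weights, step a distance `rho` along that gradient direction to reach a nearby "worst-case" point, then apply the gradient measured there to the original weights. This is not the authors' training code. The toy quadratic loss, the function names (`sam_step`, `loss`, `grad`), and the values of `lr` and `rho` are assumptions chosen only for illustration.

```python
import math

def sam_step(w, grad_fn, lr=0.1, rho=0.05, eps=1e-12):
    """One SAM-style update on a list of weights (illustrative sketch)."""
    # First pass: gradient at the current weights.
    g = grad_fn(w)
    norm = math.sqrt(sum(gi * gi for gi in g))
    # Move a distance rho along the normalized gradient (ascent direction).
    scale = rho / (norm + eps)
    w_adv = [wi + scale * gi for wi, gi in zip(w, g)]
    # Second pass: gradient at the perturbed weights.
    g_adv = grad_fn(w_adv)
    # Apply that gradient to the ORIGINAL weights.
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]

# Hypothetical toy problem: an ill-conditioned quadratic bowl.
def loss(w):
    return 0.5 * (w[0] ** 2 + 10.0 * w[1] ** 2)

def grad(w):
    return [w[0], 10.0 * w[1]]

w = [1.0, 1.0]
initial_loss = loss(w)
for _ in range(100):
    w = sam_step(w, grad, lr=0.05, rho=0.05)
final_loss = loss(w)
```

With `rho=0` the step reduces to plain gradient descent, which makes the extra cost of SAM explicit: one additional gradient evaluation per update. In a real image classifier the same two passes would wrap a forward and backward pass over each mini-batch.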
Taxonomy
Methods: Sharpness-Aware Minimization
