nnMobileNet: Rethinking CNN for Retinopathy Research
Wenhui Zhu, Peijie Qiu, Xiwen Chen, Xin Li, Natasha Lepore, Oana M. Dumitrascu, Yalin Wang

TL;DR
This paper presents an optimized MobileNet CNN architecture that outperforms vision transformer models in retinal disease detection tasks, offering a more efficient alternative for medical diagnostics.
Contribution
We redesigned and optimized MobileNet for retinal disease detection, demonstrating it surpasses ViT models in accuracy and efficiency on multiple benchmarks.
Findings
Optimized MobileNet outperforms ViT models on retinal disease (RD) benchmarks.
Modified CNN achieves higher accuracy in diabetic retinopathy grading.
Code available for reproducibility and further research.
Abstract
Over the past few decades, convolutional neural networks (CNNs) have been at the forefront of the detection and tracking of various retinal diseases (RD). Despite their success, the emergence of vision transformers (ViT) in the 2020s has shifted the trajectory of RD model development. The leading-edge performance of ViT-based models in RD can be largely credited to their scalability, i.e., their ability to improve as more parameters are added. As a result, ViT-based models tend to outshine traditional CNNs in RD applications, albeit at the cost of increased data and computational demands. ViTs also differ from CNNs in their approach to processing images, working with patches rather than local regions, which can complicate the precise localization of small, variably presented lesions in RD. In our study, we revisited and updated the architecture of a CNN model, specifically MobileNet, to…
Taxonomy
Topics: Retinal Imaging and Analysis · Retinal and Optic Conditions · COVID-19 diagnosis using AI
Methods: Multi-Head Attention · Attention Is All You Need · Residual Connection · Linear Layer · Label Smoothing · Dropout · Position-Wise Feed-Forward Layer · Layer Normalization · Byte Pair Encoding · Softmax
