ODFormer: Semantic Fundus Image Segmentation Using Transformer for Optic Nerve Head Detection

Jiayi Wang; Yi-An Mao; Xiaoyu Ma; Sicen Guo; Yuting Shao; Xiao Lv; Wenting Han; Mark Christopher; Linda M. Zangwill; Yanlong Bi; Rui Fan

arXiv:2405.09552·eess.IV·June 4, 2024

TL;DR

This paper introduces ODFormer, a transformer-based network for optic nerve head detection in fundus images, along with a new dataset and benchmark to improve generalizability across diverse datasets and camera types.

Contribution

The paper presents three contributions: ODFormer, a novel transformer-based network; TongjiU-DROD, a large-scale multi-camera dataset; and a comprehensive benchmark for ONH detection. Together they address the discrepancy between existing fundus image datasets.

Findings

1. ODFormer outperforms existing models in accuracy and generalizability.
2. The TongjiU-DROD dataset enhances diversity in fundus imaging data.
3. The benchmark facilitates evaluation across multiple datasets and camera types.

Abstract

Optic nerve head (ONH) detection has been a crucial area of study in ophthalmology for years. However, the significant discrepancy between fundus image datasets, each generated using a single type of fundus camera, poses challenges to the generalizability of ONH detection approaches developed based on semantic segmentation networks. Despite the numerous recent advancements in general-purpose semantic segmentation methods using convolutional neural networks (CNNs) and Transformers, there is currently a lack of benchmarks for these state-of-the-art (SoTA) networks specifically trained for ONH detection. Therefore, in this article, we make contributions from three key aspects: network design, the publication of a dataset, and the establishment of a comprehensive benchmark. Our newly developed ONH detection network, referred to as ODFormer, is based upon the Swin Transformer architecture…
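As a rough illustration of how a semantic segmentation output can be turned into an ONH detection, the sketch below takes a binary mask and reports the centroid and bounding box of the foreground pixels. This is not the paper's method: the function name `onh_location_from_mask`, the mask layout (1 = ONH pixel), and the choice of centroid plus bounding box are all assumptions made for the example.

```python
import numpy as np

def onh_location_from_mask(mask):
    """Summarize a binary segmentation mask as a centroid and a bounding box.

    Hypothetical helper for illustration only. Returns
    ((row, col), (row_min, col_min, row_max, col_max)),
    or None when the mask has no foreground pixels.
    """
    mask = np.asarray(mask, dtype=bool)
    rows, cols = np.nonzero(mask)          # coordinates of foreground pixels
    if rows.size == 0:
        return None                        # nothing was segmented
    centroid = (float(rows.mean()), float(cols.mean()))
    bbox = (int(rows.min()), int(cols.min()), int(rows.max()), int(cols.max()))
    return centroid, bbox

# Toy 8x8 mask with a 3x3 foreground block standing in for a predicted ONH region.
demo = np.zeros((8, 8), dtype=np.uint8)
demo[2:5, 3:6] = 1
result = onh_location_from_mask(demo)
```

In practice the mask would come from thresholding a network's per-pixel prediction, and a real pipeline would likely also handle multiple disconnected regions, which this sketch ignores.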

Taxonomy

Topics: Retinal Imaging and Analysis · Retinal and Optic Conditions · Brain Tumor Detection and Classification

Methods: Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Label Smoothing · Stochastic Depth · Residual Connection · Absolute Position Encodings · Byte Pair Encoding · Adam · Dropout