OU-CoViT: Copula-Enhanced Bi-Channel Multi-Task Vision Transformers with Dual Adaptation for OU-UWF Images

Yang Li; Jianing Deng; Chong Zhong; Danjuan Yang; Meiyan Li; A.H. Welsh; Aiyi Liu; Xingtao Zhou; Catherine C. Liu; Bo Fu

arXiv:2408.09395 · cs.CV · August 20, 2024

Open Access

TL;DR

OU-CoViT introduces a novel transformer-based framework that leverages copula models and dual adaptation to improve multi-task myopia screening from ultra-widefield images, effectively handling mixed data types and interocular asymmetries.

Contribution

It proposes a new copula-enhanced bi-channel transformer architecture with dual adaptation, enabling effective multi-task learning on small medical datasets with mixed discrete and continuous labels.

Findings

01. Significantly outperforms baseline models in prediction accuracy.

02. Effectively models interocular asymmetries and label correlations.

03. Demonstrates adaptability of the architecture to various ViT variants.

Abstract

Myopia screening using cutting-edge ultra-widefield (UWF) fundus imaging, combined with joint modeling of multiple discrete and continuous clinical scores, presents a promising new paradigm for multi-task problems in ophthalmology. The bi-channel framework, which arises from the ophthalmic phenomenon of "interocular asymmetries" between both eyes (OU), calls for new ways of employing state-of-the-art (SOTA) transformer-based models. However, applying copula models to multiple mixed discrete-continuous labels in deep learning (DL) is challenging. Moreover, applying advanced large transformer-based models to small medical datasets is difficult because of overfitting and computational resource constraints. To resolve these challenges, we propose OU-CoViT, a novel Copula-Enhanced Bi-Channel Multi-Task Vision Transformer with Dual Adaptation for OU-UWF images, which can i) incorporate conditional correlation…
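To make the idea of a copula over mixed discrete-continuous labels concrete, here is a minimal sketch of a bivariate Gaussian copula likelihood for one continuous label and one binary label. It is not the loss used in the paper, whose exact form is not given in this excerpt. The function name `mixed_copula_nll`, the Gaussian marginal for the continuous label, and the parameters `mu`, `sigma`, `p`, and `rho` are all assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for pdf, cdf and quantiles


def mixed_copula_nll(y_cont, y_bin, mu, sigma, p, rho):
    """Negative log-likelihood of one (continuous, binary) label pair.

    Illustrative assumptions (not taken from the paper):
      - y_cont has a Gaussian marginal N(mu, sigma^2)
      - y_bin has marginal P(y_bin = 1) = p, with 0 < p < 1
      - a Gaussian copula with latent correlation rho, |rho| < 1, links them
    """
    # Latent normal score of the continuous label.
    z1 = (y_cont - mu) / sigma
    log_f_cont = math.log(STD_NORMAL.pdf(z1) / sigma)

    # The binary label is 1 when a latent normal exceeds a threshold chosen
    # so that the marginal probability is p. Conditioning that latent
    # variable on z1 gives P(y_bin = 1 | y_cont).
    q = STD_NORMAL.inv_cdf(p)
    cond_p1 = STD_NORMAL.cdf((q + rho * z1) / math.sqrt(1.0 - rho * rho))
    cond = cond_p1 if y_bin == 1 else 1.0 - cond_p1

    # Joint log-density = continuous marginal + conditional binary term.
    return -(log_f_cont + math.log(cond))


# With rho = 0 the labels are independent and the loss splits into two terms.
independent = mixed_copula_nll(1.0, 1, mu=0.0, sigma=2.0, p=0.3, rho=0.0)
# A positive rho makes y_bin = 1 more likely when y_cont is above its mean.
correlated = mixed_copula_nll(2.0, 1, mu=0.0, sigma=2.0, p=0.3, rho=0.6)
```

In a multi-task network, `mu` and `p` would typically be per-sample outputs of the regression and classification heads, while `sigma` and `rho` would be estimated or learned. That wiring is likewise an assumption here, not a description of OU-CoViT.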

Taxonomy

Topics: Advanced Neural Network Applications

Methods: Softmax · Linear Layer · Residual Connection · Layer Normalization · Multi-Head Attention · Attention Is All You Need · Dense Connections · Vision Transformer