Copula-enhanced Vision Transformer for high myopia diagnosis through OU UWF fundus images
Chong Zhong, Yunhao Liu, Yang Li, Xiang Fu, Jin Yang, Danjuan Yang, Meiyan Li, Jinfeng Xu, Aiyi Liu, Alan H. Welsh, Xingtao Zhou, Bo Fu, and Catherine C. Liu

TL;DR
This paper introduces a copula-enhanced Vision Transformer for joint high myopia (HM) diagnosis and axial length (AL) prediction from both-eye (OU) ultra-widefield (UWF) fundus images. The model addresses two issues: inter-ocular asymmetry and the dependence among the mixed-type responses.
Contribution
It proposes a residual adapter-based Vision Transformer with a copula loss and an efficient fMCEM algorithm for stable, joint classification and regression in myopia screening.
Findings
The model improves predictive accuracy on an OU fundus image dataset.
Copula parameter estimates remain stable despite overfitting issues.
Joint HM diagnosis and AL prediction performance improves.
Abstract
The advancement of AI-assisted myopia screening necessitates the joint diagnosis of both-eye (OU) high myopia (HM) status and the prediction of axial length (AL). This clinical requirement introduces a complex mixed-type (binary-continuous) multitask learning problem with bi-domain (OU) image covariates, giving rise to two key challenges: i) capturing the inter-ocular asymmetry of OU images within a cutting-edge foundation model; ii) modeling and estimating the conditional dependence structure among mixed-type multivariate responses given image covariates. We address the challenges by: i) imposing residual adapters on the Vision Transformer foundation model to capture the OU similarity and heterogeneity simultaneously; ii) developing a four-dimensional copula loss that is implementable in PyTorch based on a latent variable expression for the Gaussian copula likelihood, and proposing a…
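To make the latent variable idea concrete, here is a minimal sketch of a negative log-likelihood for a single binary-continuous pair (for example, one eye's HM status and AL) under a Gaussian copula. It is a simplified two-dimensional illustration, not the paper's four-dimensional loss: the function names, the probit-style latent threshold, and the Gaussian marginal for the continuous response are all assumptions made here. It uses only the Python standard library for self-containment; a PyTorch version would presumably replace the scalar math with tensor operations.

```python
import math


def std_normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mixed_pair_copula_nll(y, d, mu, sigma, eta, rho):
    """Negative log-likelihood of one (continuous y, binary d) pair.

    Assumed latent model (hypothetical, for illustration only):
      y = mu + sigma * e_y,   d = 1 if eta + e_z > 0 else 0,
      (e_y, e_z) standard bivariate normal with correlation rho.
    In a network, mu and eta would be outputs computed from the images.
    """
    r = (y - mu) / sigma  # standardized residual of the continuous response
    # Log density of y under its Gaussian marginal.
    log_f_y = -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * r * r
    # Conditional on y, the latent eta + e_z is N(eta + rho * r, 1 - rho^2),
    # so P(d = 1 | y) is a normal CDF with a shifted, rescaled argument.
    p1 = std_normal_cdf((eta + rho * r) / math.sqrt(1.0 - rho * rho))
    p = p1 if d == 1 else 1.0 - p1
    p = min(max(p, 1e-12), 1.0)  # guard against log(0)
    return -(log_f_y + math.log(p))


if __name__ == "__main__":
    # Positive dependence: a longer-than-predicted AL makes d = 1 more likely.
    print(mixed_pair_copula_nll(y=27.0, d=1, mu=26.0, sigma=1.0, eta=0.0, rho=0.5))
    print(mixed_pair_copula_nll(y=27.0, d=0, mu=26.0, sigma=1.0, eta=0.0, rho=0.5))
```

Setting `rho = 0` reduces the loss to the sum of an independent Gaussian regression loss and a probit classification loss, which is one way to see what the copula term adds.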
