Reverse Knowledge Distillation: Training a Large Model using a Small One for Retinal Image Matching on Limited Data

Sahar Almahfouz Nasser, Nihar Gupte, and Amit Sethi

arXiv:2307.10698 · cs.CV · July 24, 2023 · 1 citation

Open Access 1 Repo

TL;DR

This paper introduces reverse knowledge distillation, in which a large vision transformer model is trained using a smaller CNN model, to improve retinal image matching on limited data while preventing overfitting.

Contribution

It proposes a novel reverse knowledge distillation approach, introduces architectural improvements to SuperRetina, and provides a new annotated dataset for retinal keypoint detection.

Findings

01 Reverse knowledge distillation enhances model generalization.

02 High-dimensional representation fitting helps prevent overfitting.

03 The approach outperforms traditional training methods on retinal matching tasks.

Abstract

Retinal image matching plays a crucial role in monitoring disease progression and treatment response. However, datasets with matched keypoints between temporally separated pairs of images are not available in abundance to train transformer-based models. We propose a novel approach based on reverse knowledge distillation to train large models with limited data while preventing overfitting. First, we propose architectural modifications to a CNN-based semi-supervised method called SuperRetina that help us improve its results on a publicly available dataset. Then, we train a computationally heavier model based on a vision transformer encoder using the lighter CNN-based model, which is counter-intuitive in the field of knowledge distillation research, where training lighter models based on heavier ones is the norm. Surprisingly, such reverse knowledge distillation improves generalization even…
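To make the direction of the distillation concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a frozen small "teacher" (a fixed linear map standing in for the lighter CNN) and a larger "student" (a wider two-layer linear network standing in for the heavier encoder) that is trained only to reproduce the teacher's outputs under a mean-squared-error loss. The function names, layer sizes, loss, and learning rate are all hypothetical; the excerpt above does not specify the actual loss or training setup.

```python
import random

# Hypothetical toy setup: none of these names or sizes come from the paper.
def teacher(x):
    """Frozen small model (stand-in for the lighter CNN): a fixed linear map."""
    return [0.5 * x[0] - 0.25 * x[1], 0.1 * x[0] + 0.3 * x[1]]

def make_student(n_in, n_hidden, n_out, rng):
    """Larger model (stand-in for the heavier encoder): a wide two-layer linear net."""
    w1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)] for _ in range(n_out)]
    return w1, w2

def forward(w1, w2, x):
    """Return the hidden activations and the student's output for input x."""
    h = [sum(w * xi for w, xi in zip(row, x)) for row in w1]
    y = [sum(w * hi for w, hi in zip(row, h)) for row in w2]
    return h, y

def distill_step(w1, w2, x, lr):
    """One SGD step on the MSE between student and teacher outputs."""
    target = teacher(x)  # the teacher is never updated
    h, y = forward(w1, w2, x)
    err = [yi - ti for yi, ti in zip(y, target)]
    grad_y = [2.0 * e / len(err) for e in err]
    # Back-propagate to the hidden layer before w2 is modified.
    grad_h = [sum(grad_y[k] * w2[k][j] for k in range(len(w2))) for j in range(len(h))]
    for k in range(len(w2)):
        for j in range(len(h)):
            w2[k][j] -= lr * grad_y[k] * h[j]
    for j in range(len(w1)):
        for i in range(len(x)):
            w1[j][i] -= lr * grad_h[j] * x[i]
    return sum(e * e for e in err) / len(err)

def train_student(epochs=100, n_samples=64, lr=0.1, seed=0):
    """Train the student on teacher outputs only; return the mean loss per epoch."""
    rng = random.Random(seed)
    data = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n_samples)]
    w1, w2 = make_student(n_in=2, n_hidden=16, n_out=2, rng=rng)
    losses = []
    for _ in range(epochs):
        epoch_loss = sum(distill_step(w1, w2, x, lr) for x in data)
        losses.append(epoch_loss / n_samples)
    return losses

losses = train_student()
print(f"first epoch loss: {losses[0]:.6f}, last epoch loss: {losses[-1]:.6f}")
```

The only supervision the student sees is the teacher's output, so no ground-truth labels enter the loop. That is the sense in which the smaller model "trains" the larger one in this sketch; a real retinal matching setup would replace both toy models with image networks and the inputs with images.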

Code & Models

Repositories

SaharAlmahfouzNasser/MeDAL-Retina · Official

Taxonomy

Topics: Retinal Imaging and Analysis · Brain Tumor Detection and Classification · Retinal Diseases and Treatments

Methods: Multi-Head Attention · Attention Is All You Need · Softmax · Residual Connection · Layer Normalization · Linear Layer · Dense Connections · Knowledge Distillation · Vision Transformer