Less is More: Selective Reduction of CT Data for Self-Supervised Pre-Training of Deep Learning Models with Contrastive Learning Improves Downstream Classification Performance

Daniel Wolf; Tristan Payer; Catharina Silvia Lisson; Christoph Gerhard Lisson; Meinrad Beer; Michael Götz; Timo Ropinski

arXiv:2410.14524 · eess.IV · October 21, 2024

TL;DR

This paper demonstrates that selectively reducing CT datasets based on information-theoretic strategies enhances contrastive self-supervised pre-training, leading to better downstream classification performance and faster training times in medical imaging.

Contribution

The study introduces a novel dataset reduction approach for contrastive pre-training that improves downstream task accuracy and efficiency in medical image analysis.

Findings

01 Dataset reduction improves AUC scores across multiple medical classification tasks.

02 Dataset reduction shortens pre-training time by up to a factor of nine.

03 Selective dataset reduction makes contrastive learning more effective in medical imaging.

Abstract

Self-supervised pre-training of deep learning models with contrastive learning is a widely used technique in image analysis. Current findings indicate a strong potential for contrastive pre-training on medical images. However, further research is necessary to incorporate the particular characteristics of these images. We hypothesize that the similarity of medical images hinders the success of contrastive learning in the medical imaging domain. To this end, we investigate different strategies based on deep embedding, information theory, and hashing in order to identify and reduce redundancy in medical pre-training datasets. The effect of these different reduction strategies on contrastive learning is evaluated on two pre-training datasets and several downstream classification tasks. In all of our experiments, dataset reduction leads to a considerable performance gain in downstream tasks,…
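To make the idea of hashing-based redundancy reduction concrete, here is a minimal sketch. It is not the authors' method: the function names (`average_hash`, `select_slices`), the choice of a simple average hash, and the `min_hamming` threshold are all assumptions made for illustration. It walks through the slices of one CT volume and keeps a slice only when its hash differs sufficiently from the last kept slice.

```python
import numpy as np


def average_hash(slice_2d, hash_size=8):
    """Return a boolean average hash of a 2D slice (illustrative helper)."""
    h, w = slice_2d.shape
    # Crop so both dimensions are divisible by hash_size.
    h_crop, w_crop = h - h % hash_size, w - w % hash_size
    cropped = slice_2d[:h_crop, :w_crop].astype(np.float64)
    # Downsample to hash_size x hash_size by block averaging.
    blocks = cropped.reshape(
        hash_size, h_crop // hash_size, hash_size, w_crop // hash_size
    )
    small = blocks.mean(axis=(1, 3))
    # One bit per block: brighter than the overall mean or not.
    return (small > small.mean()).ravel()


def select_slices(volume, min_hamming=4, hash_size=8):
    """Greedily keep slices whose hash differs from the last kept slice.

    volume: array of shape (num_slices, height, width).
    min_hamming: assumed threshold; smaller values keep more slices.
    Returns the indices of the kept slices.
    """
    kept = [0]
    last_hash = average_hash(volume[0], hash_size)
    for idx in range(1, volume.shape[0]):
        current = average_hash(volume[idx], hash_size)
        # Hamming distance = number of differing hash bits.
        if np.count_nonzero(current != last_hash) >= min_hamming:
            kept.append(idx)
            last_hash = current
    return kept
```

Calling `select_slices(volume)` on an array of shape `(num_slices, height, width)` returns the indices to retain for pre-training. Comparing against the last kept slice rather than the immediate predecessor prevents slow, gradual changes from being discarded indefinitely.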

Code & Models

Repositories

Wolfda95/Less_is_More (PyTorch, official)

Taxonomy

Methods: Contrastive Learning