Is Dataset Quality Still a Concern in Diagnosis Using Large Foundation Model?

Ziqin Lin; Heng Li; Zinan Li; Huazhu Fu; Jiang Liu

arXiv:2405.12584·eess.IV·May 22, 2024

Open Access

TL;DR

This paper investigates the robustness of large foundation models in medical fundus diagnosis, finding they are more resilient to dataset quality issues than traditional models and that fine-tuning can further mitigate these effects.

Contribution

It demonstrates the robustness of large foundation models to dataset quality issues in medical diagnosis and shows fine-tuning as an effective method to improve their performance.

Findings

1. LFM is more resilient to image quality and dataset bias than CNNs.
2. Fine-tuning significantly improves LFM performance on low-quality datasets.
3. LFM outperforms traditional models in robustness to dataset issues.

Abstract

Recent advancements in pre-trained large foundation models (LFM) have yielded significant breakthroughs across various domains, including natural language processing and computer vision. These models have been particularly impactful in the domain of medical diagnostic tasks. With abundant unlabeled data, an LFM has been developed for fundus images using the Vision Transformer (ViT) and a self-supervised learning framework. This LFM has shown promising performance in fundus disease diagnosis across multiple datasets. On the other hand, deep learning models have long been challenged by dataset quality issues, such as image quality and dataset bias. To investigate the influence of data quality on LFM, we conducted explorations in two fundus diagnosis tasks using datasets of varying quality. Specifically, we explored the following questions: Is LFM more robust to image quality? Is LFM…

Taxonomy

Topics: Artificial Intelligence in Healthcare · Machine Learning in Healthcare · AI in cancer detection

Methods: Linear Layer · Position-Wise Feed-Forward Layer · Label Smoothing · Absolute Position Encodings · Byte Pair Encoding · Adam · Dropout · Softmax · Attention Is All You Need · Adapter