Fake It Till You Make It: Using Synthetic Data and Domain Knowledge for Improved Text-Based Learning for LGE Detection
Athira J Jacob, Puneet Sharma, Daniel Rueckert

TL;DR
This study trains a model to detect hyperenhancement in cardiac LGE MRI images using only clinical report text as supervision, combining synthetic data augmentation and domain knowledge to improve performance on a relatively small cohort of 965 patients.
Contribution
The paper presents a method that combines synthetic data, standardized image orientation, and a captioning loss to detect LGE without fine-grained image annotations.
Findings
Synthetic data augmentation improves detection accuracy.
Standardized image orientation enhances model alignment.
Pretraining the vision encoder boosts performance.
Abstract
Detection of hyperenhancement from cardiac LGE MRI images is a complex task requiring significant clinical expertise. Although deep learning-based models have shown promising results for the task, they require large amounts of data with fine-grained annotations. Clinical reports generated for cardiac MR studies contain rich, clinically relevant information, including the location, extent and etiology of any scars present. Although recently developed CLIP-based training enables pretraining models with image-text pairs, it requires large amounts of data and further finetuning strategies on downstream tasks. In this study, we use various strategies rooted in domain knowledge to train a model for LGE detection solely using text from clinical reports, on a relatively small clinical cohort of 965 patients. We improve performance through the use of synthetic data augmentation, by…
Taxonomy
Topics: Web Applications and Data Management
