Masks and Manuscripts: Advancing Medical Pre-training with End-to-End Masking and Narrative Structuring
Shreyank N Gowda, David A. Clifton

TL;DR
This paper introduces an end-to-end masking and narrative-structuring approach for medical pre-training. It standardizes text reports and applies a new visual masking scheme to improve cross-modal representation, and it reports state-of-the-art results on medical image analysis benchmarks.
Contribution
It proposes a two-step report standardization, first into {Entity, Position, Exist} triplets and then into binary questions, together with a Meijering-based masking technique for medical images, and combines the two in a multimodal contrastive learning framework.
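As a rough illustration of the second standardization step, the sketch below turns an {Entity, Position, Exist} triplet into a binary question with a yes/no answer. The `Triplet` class, the function names, and the question template are hypothetical and chosen only for this example; the summary does not give the authors' exact wording or how questions and answers map onto their "observations" and "verdicts".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    """Hypothetical container for one {Entity, Position, Exist} triplet."""
    entity: str
    position: str
    exist: bool


def to_question(t: Triplet) -> str:
    # Assumed template; the paper's actual phrasing is not given here.
    return f"Is there {t.entity} in the {t.position}?"


def to_answer(t: Triplet) -> str:
    # The Exist flag becomes a binary yes/no answer.
    return "yes" if t.exist else "no"


def standardize(triplets):
    """Map each triplet to a (question, answer) pair."""
    return [(to_question(t), to_answer(t)) for t in triplets]


if __name__ == "__main__":
    report = [
        Triplet("effusion", "left lung", True),
        Triplet("pneumothorax", "right apex", False),
    ]
    for question, answer in standardize(report):
        print(question, answer)
```

Reports written by different authors would then share one surface form, which is the kind of semantic consistency the contribution is aiming for.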
Findings
Achieved new state-of-the-art performance on medical image analysis benchmarks.
Developed a standardized report conversion method improving semantic consistency.
Introduced a novel visual masking technique focusing on local image context.
Abstract
Contemporary medical contrastive learning faces challenges from inconsistent semantics and sample pair morphology, leading to dispersed and converging semantic shifts. The variability in text reports, due to multiple authors, complicates semantic consistency. To tackle these issues, we propose a two-step approach. Initially, text reports are converted into a standardized triplet format, laying the groundwork for our novel concept of "observations" and "verdicts". This approach refines the {Entity, Position, Exist} triplet into binary questions, guiding towards a clear "verdict". We also innovate in visual pre-training with Meijering-based masking, which focuses on features representative of the local context of medical images. By integrating this with our text conversion method, our model advances cross-modal representation in a multimodal contrastive learning framework, setting new…
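To make the idea of filter-guided masking concrete, here is a minimal sketch that masks image patches according to a ridge-response map. It rests on several assumptions not stated in the abstract: that the patches with the highest mean response are the ones masked, that masking means zeroing pixels, and that patches are square and non-overlapping. The function name `mask_top_patches` and its parameters are hypothetical.

```python
import numpy as np


def mask_top_patches(image, response, patch=4, ratio=0.5):
    """Zero out the patches with the highest mean filter response.

    image, response: 2-D arrays of the same shape.
    patch: side length of each square patch (assumed non-overlapping).
    ratio: fraction of patches to mask (an assumed hyperparameter).
    """
    h, w = image.shape
    if h % patch or w % patch:
        raise ValueError("image sides must be multiples of patch")
    gh, gw = h // patch, w // patch

    # Mean response per patch, giving a (gh, gw) score grid.
    scores = response.reshape(gh, patch, gw, patch).mean(axis=(1, 3))

    # Select the n_mask highest-scoring patches.
    n_mask = int(round(ratio * gh * gw))
    top = np.argsort(scores.ravel())[::-1][:n_mask]
    grid = np.zeros(gh * gw, dtype=bool)
    grid[top] = True
    grid = grid.reshape(gh, gw)

    # Expand the patch grid back to pixel resolution and apply it.
    pixel_mask = np.repeat(np.repeat(grid, patch, axis=0), patch, axis=1)
    masked = np.where(pixel_mask, 0, image)
    return masked, grid


if __name__ == "__main__":
    image = np.ones((8, 8))
    response = np.zeros((8, 8))
    response[:4, :4] = 1.0  # pretend the filter fired in the top-left patch
    masked, grid = mask_top_patches(image, response, patch=4, ratio=0.25)
    print(grid)
```

In practice the response map could come from scikit-image's `skimage.filters.meijering`, for example `meijering(image, sigmas=[1, 2], black_ridges=False)`; the sigma values here are placeholders, not the authors' settings.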
Taxonomy
Methods: Contrastive Learning
