Improving Representation of High-frequency Components for Medical Visual Foundation Models

Yuetan Chu; Yilan Zhang; Zhongyi Han; Changchun Yang; Longxi Zhou; Gongning Luo; Chao Huang; Xin Gao

arXiv:2407.14651·eess.IV·April 18, 2025

TL;DR

This paper introduces Frepa, a novel pretraining strategy that enhances high-frequency component representation in medical visual models, significantly improving performance on detailed medical imaging tasks.

Contribution

The paper proposes Frepa, a pretraining method that combines high-frequency masking, low-frequency perturbation, and adversarial learning, and that extends to various architectures and modalities.
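As a rough illustration of the two frequency-domain corruptions named above, the sketch below zeroes a random subset of high-frequency Fourier bins and rescales the low-frequency bins with small multiplicative noise. It is not the authors' implementation: the function names, the radial `cutoff`, `mask_ratio`, and `noise_scale` are all hypothetical choices made for this example, and the adversarial-learning component is omitted entirely.

```python
import numpy as np


def radial_frequency_grid(shape):
    """Radial frequency (cycles per pixel) of each bin in a centered 2D spectrum."""
    h, w = shape
    fy = np.fft.fftshift(np.fft.fftfreq(h))
    fx = np.fft.fftshift(np.fft.fftfreq(w))
    return np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)


def mask_high_perturb_low(image, cutoff=0.1, mask_ratio=0.5, noise_scale=0.1, seed=0):
    """Corrupt a 2D image in the Fourier domain (illustrative assumptions only).

    cutoff      -- hypothetical radial threshold separating low from high frequencies
    mask_ratio  -- fraction of high-frequency bins set to zero
    noise_scale -- std of the multiplicative noise applied to low-frequency bins
    """
    rng = np.random.default_rng(seed)
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    radius = radial_frequency_grid(image.shape)

    high = radius > cutoff
    low = ~high

    # High-frequency masking: zero a random subset of the high-frequency bins.
    drop = high & (rng.random(image.shape) < mask_ratio)
    spectrum[drop] = 0

    # Low-frequency perturbation: rescale each low-frequency bin by (1 + noise).
    noise = 1.0 + noise_scale * rng.standard_normal(image.shape)
    spectrum[low] *= noise[low]

    # Random edits break conjugate symmetry, so keep only the real part.
    return np.fft.ifft2(np.fft.ifftshift(spectrum)).real


# Usage: corrupt a random 64x64 "image"; an encoder-decoder would then be
# trained to reconstruct the original from the corrupted input.
image = np.random.default_rng(42).random((64, 64))
corrupted = mask_high_perturb_low(image)
```

In a masked-autoencoder style setup, the reconstruction target would be the uncorrupted image, so the encoder has to infer the removed high-frequency content from what remains.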

Findings

01

Frepa outperforms existing self-supervised methods without fine-tuning.

02

Achieves up to +15% Dice similarity coefficient (DSC) in retinal vessel segmentation.

03

Enables better high-frequency feature preservation in embeddings.
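For readers unfamiliar with the metric in the second finding, DSC between a predicted mask P and a reference mask T is 2|P ∩ T| / (|P| + |T|). The sketch below computes it for binary masks; the function name and the convention of returning 1.0 when both masks are empty are assumptions of this example, not details taken from the paper.

```python
import numpy as np


def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return float(2.0 * intersection / total)


# Usage: one of the two predicted foreground pixels overlaps the reference.
pred = [[1, 1], [0, 0]]
target = [[1, 0], [0, 0]]
score = dice_coefficient(pred, target)  # 2 * 1 / (2 + 1)
```

Because thin structures such as retinal vessels occupy few pixels, small boundary errors change the overlap substantially, which is why DSC is sensitive to fine-detail quality.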

Abstract

Foundation models have recently attracted significant attention for their impressive generalizability across diverse downstream tasks. However, these models have been shown to be limited in representing high-frequency components and fine-grained details. In many medical imaging tasks, the precise representation of such information is crucial due to the inherently intricate anatomical structures, sub-visual features, and complex boundaries involved. Consequently, the limited representation of prevalent foundation models can result in significant performance degradation or even failure in these tasks. To address these challenges, we propose a novel pretraining strategy, named Frequency-advanced Representation Autoencoder (Frepa). Through high-frequency masking and low-frequency perturbation combined with adversarial learning, Frepa encourages the encoder to effectively…

Code & Models

Repositories

Arturia-Pendragon-Iris/Frepa (PyTorch, official)

Taxonomy

Topics: Medical Imaging Techniques and Applications · Advanced Data Processing Techniques · Engineering Technology and Methodologies

Methods: Attention Is All You Need · Stochastic Depth · Swin Transformer · Byte Pair Encoding · Layer Normalization · Label Smoothing · Linear Layer · Softmax · Position-Wise Feed-Forward Layer · Absolute Position Encodings