Spectral Disentanglement and Enhancement: A Dual-domain Contrastive Framework for Representation Learning
Jinjin Guo, Yexin Li, Zhichao Huang, Jun Fang, Zhiyuan Liu, Chao Liu, Pengzhang Liu, Qixia Jiang

TL;DR
This paper introduces Spectral Disentanglement and Enhancement (SDE), a dual-domain contrastive framework that uses spectral analysis to improve multimodal representation learning by enhancing signal separation and robustness.
Contribution
It proposes a novel spectral disentanglement and enhancement method that adaptively partitions feature dimensions and integrates spectral regularization into contrastive learning.
Findings
Outperforms state-of-the-art methods on large-scale benchmarks.
Improves robustness and generalization of learned representations.
Effectively integrates spectral regularization into contrastive frameworks.
Abstract
Large-scale multimodal contrastive learning has recently achieved impressive success in learning rich and transferable representations, yet it remains fundamentally limited by the uniform treatment of feature dimensions and the neglect of the intrinsic spectral structure of the learned features. Empirical evidence indicates that high-dimensional embeddings tend to collapse into narrow cones, concentrating task-relevant semantics in a small subspace, while the majority of dimensions remain occupied by noise and spurious correlations. Such spectral imbalance and entanglement undermine model generalization. We propose Spectral Disentanglement and Enhancement (SDE), a novel framework that bridges the gap between the geometry of the embedded spaces and their spectral properties. Our approach leverages singular value decomposition to adaptively partition feature dimensions into strong signals…
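The abstract is truncated before it says how the partition is made, so the following is only a minimal sketch of the general idea of splitting feature directions by singular value, not the authors' procedure. The function name `partition_spectrum`, the cumulative-energy cutoff `energy`, and the toy data are all assumptions introduced for illustration.

```python
import numpy as np

def partition_spectrum(embeddings, energy=0.9):
    """Split the singular directions of a centered embedding matrix in two.

    `energy` is a hypothetical cutoff on cumulative squared singular values;
    the source does not state the actual partition rule.
    """
    X = embeddings - embeddings.mean(axis=0, keepdims=True)
    # Thin SVD: X = U @ diag(s) @ Vt, with s sorted in descending order.
    _, s, vt = np.linalg.svd(X, full_matrices=False)
    ratio = np.cumsum(s**2) / np.sum(s**2)
    # Smallest number of leading directions whose energy share reaches `energy`.
    k = int(np.searchsorted(ratio, energy) + 1)
    return s, vt[:k], vt[k:]

rng = np.random.default_rng(0)
# Toy data: 3 high-variance latent factors mixed into 32 dims, plus small noise.
latent = rng.normal(size=(256, 3)) * 5.0
mixing = rng.normal(size=(3, 32))
feats = latent @ mixing + 0.1 * rng.normal(size=(256, 32))

s, strong, rest = partition_spectrum(feats, energy=0.9)
print("strong directions:", strong.shape[0], "remaining:", rest.shape[0])
```

On this toy input almost all variance lies in a handful of directions, which mirrors the spectral imbalance the abstract describes. A fixed energy cutoff is the simplest possible rule and is used here only as a stand-in for whatever adaptive criterion the paper defines.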
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis · Face recognition and analysis
