DiffAug: Enhance Unsupervised Contrastive Learning with Domain-Knowledge-Free Diffusion-based Data Augmentation
Zelin Zang, Hao Luo, Kai Wang, Panpan Zhang, Fan Wang, Stan.Z Li, Yang You

TL;DR
DiffAug introduces a diffusion-based data augmentation method for unsupervised contrastive learning, generating positive samples without domain knowledge or large external datasets, improving representation across diverse data types.
Contribution
The paper presents DiffAug, a novel diffusion model-based augmentation technique that enhances unsupervised contrastive learning without requiring domain knowledge, large-scale external data, or supervision.
Findings
Outperforms existing augmentation methods on multiple datasets
Improves representation quality in unsupervised contrastive learning
Works effectively across DNA, visual, and bio-feature data
Abstract
Unsupervised contrastive learning has gained prominence in fields such as vision and biology, leveraging predefined positive/negative samples for representation learning. Data augmentation, categorized into hand-designed and model-based methods, has been identified as a crucial component for enhancing contrastive learning. However, hand-designed methods require human expertise in domain-specific data and can sometimes distort the meaning of the data. Generative model-based approaches, in contrast, usually require supervision or large-scale external data, which has become a bottleneck constraining model training in many domains. To address these problems, this paper proposes DiffAug, a novel unsupervised contrastive learning technique with diffusion model-based positive data generation. DiffAug consists of a semantic encoder and a conditional diffusion model; the conditional…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Cancer-related molecular mechanisms research · AI in cancer detection
Methods: Contrastive Learning · Diffusion
