DiffuLT: How to Make Diffusion Model Useful for Long-tail Recognition
Jie Shao, Ke Zhu, Hanxiao Zhang, Jianxin Wu

TL;DR
DiffuLT introduces a novel pipeline using a diffusion model trained solely on long-tailed datasets to generate balanced samples, significantly improving long-tail recognition without external data.
Contribution
It pioneers the use of diffusion models for long-tail recognition, generating synthetic balanced data directly from the dataset itself.
Findings
Achieves state-of-the-art results on CIFAR10-LT, CIFAR100-LT, and ImageNet-LT.
Operates without external data or pre-trained models.
Includes extensive ablation studies that make the pipeline interpretable.
Abstract
This paper proposes a new pipeline for long-tail (LT) recognition. Instead of re-weighting or re-sampling, we utilize the long-tailed dataset itself to generate a balanced proxy that can be optimized through cross-entropy (CE). Specifically, a randomly initialized diffusion model, trained exclusively on the long-tailed dataset, is employed to synthesize new samples for underrepresented classes. Then, we utilize the inherent information in the original dataset to filter out harmful samples and keep the useful ones. Our strategy, Diffusion model for Long-Tail recognition (DiffuLT), represents a pioneering utilization of generative models in long-tail recognition. DiffuLT achieves state-of-the-art results on CIFAR10-LT, CIFAR100-LT, and ImageNet-LT, surpassing the best competitors with non-trivial margins. Abundant ablations make our pipeline interpretable, too. The whole generation…
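The balancing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `generation_budget`, `keep_useful`, `score_fn`, and `threshold` are hypothetical, the choice of the largest class as the balancing target is an assumption, and the abstract does not specify the filtering criterion, so a generic score threshold stands in for it.

```python
from collections import Counter


def generation_budget(labels, target=None):
    """Return how many synthetic samples each class needs to reach `target`.

    Assumption: by default every class is filled up to the size of the
    largest class. The source does not state the exact target.
    """
    counts = Counter(labels)
    if target is None:
        target = max(counts.values())
    # Classes already at or above the target need no synthetic samples.
    return {cls: max(target - n, 0) for cls, n in counts.items()}


def keep_useful(samples, score_fn, threshold):
    """Keep generated samples whose score is at least `threshold`.

    `score_fn` is a hypothetical placeholder for whatever signal from the
    original dataset is used to separate useful samples from harmful ones.
    """
    return [s for s in samples if score_fn(s) >= threshold]


# Toy long-tailed label set: class 0 is the head, class 2 is the tail.
labels = [0] * 5 + [1] * 3 + [2] * 1
budget = generation_budget(labels)
print(budget)
```

In this toy case the tail class receives the largest generation budget and the head class receives none. The filtered synthetic samples would then be merged with the original data and used for ordinary cross-entropy training.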
Taxonomy
Topics: Neural Networks and Applications
Methods: Diffusion
