Heterogeneous Decentralized Diffusion Models
Zhiying Jiang, Raihan Seraj, Marcos Villagra, Bidhan Roy

TL;DR
This paper introduces a resource-efficient decentralized training framework for diffusion models that supports heterogeneous objectives, significantly reducing compute and data requirements while improving quality and diversity.
Contribution
It presents a novel heterogeneous decentralized training paradigm, a checkpoint conversion method, and an efficient architecture, enabling flexible, scalable, and resource-efficient diffusion model training.
Findings
Reduced compute from 1176 to 72 GPU-days (about 16x)
Achieved better FID and diversity metrics than homogeneous baselines
Enabled mixed objectives without synchronization, lowering infrastructure needs
Abstract
Training frontier-scale diffusion models often requires substantial computational resources concentrated in tightly coupled clusters, limiting participation to well-resourced institutions. While Decentralized Diffusion Models (DDM) enable training multiple experts in isolation, existing approaches require 1176 GPU-days and homogeneous training objectives across all experts. We present an efficient framework that reduces resource requirements while supporting heterogeneous training objectives. Our approach combines three contributions: (1) a heterogeneous decentralized training paradigm that allows experts to use different objectives (DDPM and Flow Matching), unified at inference time via a deterministic schedule-aware conversion into a common velocity space without retraining; (2) pretrained checkpoint conversion from ImageNet-DDPM to Flow Matching objectives, accelerating convergence…
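The abstract says DDPM experts are mapped into a common velocity space by a deterministic, schedule-aware conversion, but this excerpt does not give the formula. The sketch below shows one standard way such a conversion can be written, and it is not necessarily the paper's exact method. It assumes the DDPM forward process x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps and a linear flow path x_s = (1 - s) * x0 + s * eps, whose velocity is v = eps - x0. The function name eps_to_velocity and its parameters are hypothetical.

```python
import math

def eps_to_velocity(x_t, eps_hat, alpha_bar_t):
    """Convert a DDPM noise prediction into a flow-matching velocity.

    Assumptions (not taken from the paper excerpt):
      DDPM forward: x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps
      Flow path:    x_s = (1 - s) * x0 + s * eps, so v = eps - x0
    """
    if not 0.0 < alpha_bar_t <= 1.0:
        raise ValueError("alpha_bar_t must be in (0, 1]")
    alpha = math.sqrt(alpha_bar_t)        # signal scale from the schedule
    sigma = math.sqrt(1.0 - alpha_bar_t)  # noise scale from the schedule
    # Recover the clean-sample estimate implied by the noise prediction.
    x0_hat = [(x - sigma * e) / alpha for x, e in zip(x_t, eps_hat)]
    # Under the assumed linear path, velocity points from data toward noise.
    return [e - x0 for e, x0 in zip(eps_hat, x0_hat)]
```

Because the mapping depends only on the noise schedule value alpha_bar_t and the expert's output, it needs no retraining, which matches the property the abstract describes. A full sampler would also have to align the DDPM timestep with the flow time, which this sketch leaves out.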
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
