CommonCanvas: An Open Diffusion Model Trained with Creative-Commons Images

Aaron Gokaslan; A. Feder Cooper; Jasmine Collins; Landan Seguin; Austin Jacobson; Mihir Patel; Jonathan Frankle; Cory Stephenson; Volodymyr Kuleshov

arXiv:2310.16825·cs.CV·October 26, 2023·2 cites

Open Access · 1 Repo · 4 Models · 5 Datasets

TL;DR

CommonCanvas introduces a set of open diffusion models trained on a large dataset of Creative-Commons images, using synthetic captions and efficient training techniques to achieve competitive quality with less data and faster training.

Contribution

The paper presents a novel training approach for diffusion models using CC images and synthetic captions, enabling high-quality models with reduced data and computational requirements.

Findings

01 Models achieve comparable quality to Stable Diffusion 2 in human evaluations.

02 Training speed is improved by approximately 3 times through optimization techniques.

03 The dataset of around 70 million CC images is sufficient for training high-quality diffusion models.

Abstract

We assemble a dataset of Creative-Commons-licensed (CC) images, which we use to train a set of open diffusion models that are qualitatively competitive with Stable Diffusion 2 (SD2). This task presents two challenges: (1) high-resolution CC images lack the captions necessary to train text-to-image generative models; (2) CC images are relatively scarce. In turn, to address these challenges, we use an intuitive transfer learning technique to produce a set of high-quality synthetic captions paired with curated CC images. We then develop a data- and compute-efficient training recipe that requires as little as 3% of the LAION-2B data needed to train existing SD2 models, but obtains comparable quality. These results indicate that we have a sufficient number of CC images (~70 million) for training high-quality models. Our training recipe also implements a variety of optimizations that achieve…
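The data figures in the abstract can be sanity-checked with a line of arithmetic. The sketch below is only a back-of-the-envelope check: it assumes "LAION-2B" means a nominal 2 billion image-text pairs (the abstract does not give an exact count), and the names `LAION_2B_NOMINAL`, `CC_IMAGES`, and `fraction_of_laion` are hypothetical, introduced here for illustration.

```python
# Back-of-the-envelope check of the data fraction quoted in the abstract.
# ASSUMPTION: LAION-2B is taken at a nominal 2 billion pairs; the abstract
# does not state the exact size, so the result is approximate.
LAION_2B_NOMINAL = 2_000_000_000
CC_IMAGES = 70_000_000  # "~70 million" CC images, from the abstract


def fraction_of_laion(n_images: int, laion_size: int = LAION_2B_NOMINAL) -> float:
    """Return n_images as a fraction of the assumed LAION-2B size."""
    return n_images / laion_size


if __name__ == "__main__":
    frac = fraction_of_laion(CC_IMAGES)
    print(f"{frac:.1%} of the assumed LAION-2B size")  # prints 3.5%
```

Under this assumption the full ~70 million images come to about 3.5% of LAION-2B, which is in the same range as the "as little as 3%" figure; the exact percentage depends on the true dataset size and on how many images a given training run uses.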

Code & Models

Repositories

mosaicml/diffusion · PyTorch · Official

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis

Methods: Sparse Evolutionary Training · Diffusion