Imagen 3
Imagen-Team-Google: Jason Baldridge, Jakob Bauer, Mukul Bhutani, Nicole Brichtova, Andrew Bunner, Lluis Castrejon, Kelvin Chan, Yichang Chen, Sander Dieleman, Yuqing Du, Zach Eaton-Rosen, Hongliang Fei, Nando de Freitas, Yilin Gao, Evgeny Gladchenko

TL;DR
Imagen 3 is a latent text-to-image diffusion model that was preferred over other state-of-the-art models at the time of evaluation, and whose design incorporates safety and representation considerations.
Contribution
We present Imagen 3, a latent diffusion model that generates high-quality images from text prompts, together with quality and responsibility evaluations and the methods used to reduce potential harm in text-to-image generation.
Findings
Imagen 3 is preferred over other state-of-the-art (SOTA) models in quality evaluations at the time of evaluation
Our safety assessments show reduced potential harm
The model demonstrates strong alignment with text prompts
Abstract
We introduce Imagen 3, a latent diffusion model that generates high-quality images from text prompts. We describe our quality and responsibility evaluations. Imagen 3 is preferred over other state-of-the-art (SOTA) models at the time of evaluation. In addition, we discuss issues around safety and representation, as well as methods we used to minimize the potential harm of our models.
Taxonomy
Topics: Radiomics and Machine Learning in Medical Imaging
Methods: Latent Diffusion Model · Diffusion
