GazeFusion: Saliency-Guided Image Generation

Yunxiang Zhang; Nan Wu; Connor Z. Lin; Gordon Wetzstein; Qi Sun

arXiv:2407.04191 · cs.CV · February 18, 2025

Open Access

TL;DR

GazeFusion introduces a saliency-guided diffusion framework that enables control over viewer attention in generated images, aligning visual focus with user-specified attention distributions.

Contribution

GazeFusion is the first framework to incorporate priors of human visual attention into diffusion-based image generation, giving users explicit control over where viewers look.

Findings

01. Attention-guided images match desired gaze distributions.

02. Eye-tracked studies confirm alignment with user intentions.

03. Saliency models accurately predict viewer attention in generated images.

Abstract

Diffusion models offer unprecedented image generation power given just a text prompt. While emerging approaches for controlling diffusion models have enabled users to specify the desired spatial layouts of the generated content, they cannot predict or control where viewers will pay more attention due to the complexity of human vision. Recognizing the significance of attention-controllable image generation in practical applications, we present a saliency-guided framework to incorporate the data priors of human visual attention mechanisms into the generation process. Given a user-specified viewer attention distribution, our control module conditions a diffusion model to generate images that attract viewers' attention toward the desired regions. To assess the efficacy of our approach, we performed an eye-tracked user study and a large-scale model-based saliency analysis. The results…
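
To make the idea of a "user-specified viewer attention distribution" and a "model-based saliency analysis" more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes that an attention distribution can be represented as a normalized 2D map built from Gaussian blobs, and that agreement between a target map and a predicted saliency map can be scored with the Pearson correlation coefficient (CC) and the KL divergence, two measures commonly used to compare saliency maps. The abstract does not say which metrics the paper uses, and the function names, input format, and parameters below are hypothetical.

```python
import math


def gaussian_attention_map(width, height, centers, sigma):
    """Build a normalized attention map from Gaussian blobs.

    centers: list of (x, y, weight) tuples in pixel coordinates
             (a hypothetical input format chosen for illustration).
    Returns a height x width nested list whose entries sum to 1.
    """
    grid = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for cx, cy, w in centers:
                d2 = (x - cx) ** 2 + (y - cy) ** 2
                grid[y][x] += w * math.exp(-d2 / (2.0 * sigma ** 2))
    total = sum(sum(row) for row in grid)
    return [[v / total for v in row] for row in grid]


def _flatten(m):
    return [v for row in m for v in row]


def pearson_cc(a, b):
    """Pearson correlation between two maps of identical shape."""
    fa, fb = _flatten(a), _flatten(b)
    n = len(fa)
    ma, mb = sum(fa) / n, sum(fb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(fa, fb))
    va = sum((x - ma) ** 2 for x in fa)
    vb = sum((y - mb) ** 2 for y in fb)
    return cov / math.sqrt(va * vb)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two normalized maps; eps guards against log(0)."""
    fp, fq = _flatten(p), _flatten(q)
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(fp, fq))


# Target: the user wants attention in the upper-left region.
target = gaussian_attention_map(32, 32, [(8, 8, 1.0)], sigma=4.0)
# Stand-ins for saliency predicted on two generated images (assumed data):
# one attracts attention near the target, the other in the opposite corner.
near = gaussian_attention_map(32, 32, [(10, 9, 1.0)], sigma=4.0)
far = gaussian_attention_map(32, 32, [(24, 24, 1.0)], sigma=4.0)

print("CC near:", pearson_cc(target, near), "CC far:", pearson_cc(target, far))
print("KL near:", kl_divergence(target, near), "KL far:", kl_divergence(target, far))
```

Under these assumptions, a generated image whose predicted saliency sits close to the target region yields a higher CC and a lower KL divergence than one whose saliency lands elsewhere. In practice the `near` and `far` maps would come from a saliency predictor or from eye-tracking data rather than from synthetic Gaussians.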

Taxonomy

Topics: Visual Attention and Saliency Detection · Virtual Reality Applications and Impacts

Methods: Diffusion · ALIGN