Predicting Visual Attention in Graphic Design Documents
Souradeep Chakraborty, Zijun Wei, Conor Kelton, Seoyoung Ahn, Aruna Balasubramanian, Gregory J. Zelinsky, Dimitris Samaras

TL;DR
This paper introduces a deep learning model that predicts both static visual saliency and dynamic gaze sequences in graphic design documents, especially webpages, outperforming existing methods and generalizing well across various visual media.
Contribution
The paper presents the first attempt to jointly predict spatial saliency and the temporal order of fixations in graphic design documents, using a two-stage deep learning approach.
Findings
Model outperforms existing saliency and scanpath prediction models.
Large dataset of 450 webpages collected for evaluation.
Good generalization to comics, posters, UIs, and natural images.
Abstract
We present a model for predicting visual attention during the free viewing of graphic design documents. While existing works on this topic have aimed at predicting static saliency of graphic designs, our work is the first attempt to predict both spatial attention and dynamic temporal order in which the document regions are fixated by gaze using a deep learning based model. We propose a two-stage model for predicting dynamic attention on such documents, with webpages being our primary choice of document design for demonstration. In the first stage, we predict the saliency maps for each of the document components (e.g. logos, banners, texts, etc. for webpages) conditioned on the type of document layout. These component saliency maps are then jointly used to predict the overall document saliency. In the second stage, we use these layout-specific component saliency maps as the state…
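The first stage described above, in which per-component saliency maps are jointly used to produce one document-level map, can be illustrated with a toy sketch. This is not the authors' network: the softmax-weighted sum, the function name `combine_component_maps`, and the `layout_scores` input are all assumptions introduced here purely to show the idea of blending component maps under layout-dependent weights.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_component_maps(component_maps, layout_scores):
    """Blend per-component saliency maps into one document-level map.

    component_maps: dict mapping a component name (e.g. "logo", "text")
        to a 2D list (height x width) of non-negative saliency values.
    layout_scores: dict mapping the same names to a float score.
        Hypothetical: stands in for whatever layout-conditioned weighting
        a real model would learn; it is not taken from the paper.
    """
    names = sorted(component_maps)
    weights = softmax([layout_scores[n] for n in names])

    height = len(component_maps[names[0]])
    width = len(component_maps[names[0]][0])
    combined = [[0.0] * width for _ in range(height)]

    # Weighted sum of the component maps (an assumed combination rule).
    for name, w in zip(names, weights):
        cmap = component_maps[name]
        for y in range(height):
            for x in range(width):
                combined[y][x] += w * cmap[y][x]

    # Normalize so the result sums to 1, like a fixation density.
    total = sum(sum(row) for row in combined)
    if total > 0:
        combined = [[v / total for v in row] for row in combined]
    return combined


if __name__ == "__main__":
    maps = {
        "logo": [[1.0, 0.0], [0.0, 0.0]],
        "text": [[0.0, 0.0], [0.0, 1.0]],
    }
    # A higher score for "logo" shifts predicted saliency toward its region.
    print(combine_component_maps(maps, {"logo": 2.0, "text": 0.0}))
```

In the paper the combination is learned rather than hand-specified; the sketch only makes concrete what it means for component maps to contribute jointly to a single normalized saliency map.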
Taxonomy
Methods: Softmax · Attention Is All You Need
