PathGAN: Visual Scanpath Prediction with Generative Adversarial Networks

Marc Assens; Xavier Giro-i-Nieto; Kevin McGuinness; Noel E. O'Connor

arXiv:1809.00567·cs.CV·September 5, 2018·19 cites

TL;DR

PathGAN is a deep neural network that uses adversarial training to predict human visual scanpaths. The authors report improvements over previous methods on the iSUN and Salient360! datasets.

Contribution

Introduces PathGAN, a generator and a discriminator that both extract image features with off-the-shelf networks and train recurrent layers on top of them, using adversarial training to produce realistic scanpath predictions.

Findings

01 Outperforms previous models on the iSUN dataset

02 Achieves state-of-the-art results on the Salient360! dataset

03 Demonstrates the effectiveness of adversarial training for scanpath prediction

Abstract

We introduce PathGAN, a deep neural network for visual scanpath prediction trained adversarially. A visual scanpath is the sequence of fixation points that a human observer's gaze traces over an image. PathGAN is composed of two parts, the generator and the discriminator. Both parts extract features from images using off-the-shelf networks, and train recurrent layers to generate or discriminate scanpaths accordingly. In scanpath prediction, the stochastic nature of the data makes it very difficult to generate realistic predictions using supervised learning strategies, so we adopt adversarial training as a suitable alternative. Our experiments show that PathGAN improves the state of the art of visual scanpath prediction on the iSUN and Salient360! datasets. Source code and models are available at https://imatge-upc.github.io/pathgan/
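To make the generator/discriminator split in the abstract concrete, here is a minimal Python sketch. It is not the authors' implementation. The names `toy_generator` and `toy_discriminator` are hypothetical stand-ins with no image features and no recurrent layers, and the losses are the textbook GAN objective with a non-saturating generator loss, which is an assumption: the abstract does not state the exact loss PathGAN uses.

```python
import math
import random
from typing import List, Tuple

# A scanpath: an ordered sequence of fixation points (x, y),
# here normalised to the unit square (an assumption for this sketch).
Scanpath = List[Tuple[float, float]]


def toy_generator(num_fixations: int, rng: random.Random) -> Scanpath:
    """Hypothetical generator stand-in: a clipped random walk from the image centre."""
    x, y = 0.5, 0.5
    path = []
    for _ in range(num_fixations):
        x = min(1.0, max(0.0, x + rng.gauss(0.0, 0.1)))
        y = min(1.0, max(0.0, y + rng.gauss(0.0, 0.1)))
        path.append((x, y))
    return path


def toy_discriminator(path: Scanpath) -> float:
    """Hypothetical discriminator stand-in: returns a score in (0, 1).

    It maps the mean step length between consecutive fixations through a
    sigmoid; a real discriminator would be a trained network.
    Requires at least two fixations.
    """
    steps = [math.dist(a, b) for a, b in zip(path, path[1:])]
    mean_step = sum(steps) / len(steps)
    return 1.0 / (1.0 + math.exp(-(1.0 - 10.0 * mean_step)))


def discriminator_loss(d_real: float, d_fake: float) -> float:
    """Binary cross-entropy: push scores for real paths to 1 and generated paths to 0."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake: float) -> float:
    """Non-saturating generator loss: reward generated paths the discriminator scores as real."""
    return -math.log(d_fake)


if __name__ == "__main__":
    rng = random.Random(0)
    human_like = [(0.50, 0.50), (0.52, 0.49), (0.55, 0.53), (0.54, 0.58)]  # made-up data
    generated = toy_generator(4, rng)
    d_real = toy_discriminator(human_like)
    d_fake = toy_discriminator(generated)
    print("discriminator loss:", discriminator_loss(d_real, d_fake))
    print("generator loss:", generator_loss(d_fake))
```

In an actual training loop the two losses would be minimised in alternation, each updating only its own network's parameters; the sketch only evaluates them once on fixed stand-ins.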

Code & Models

Repositories

imatge-upc/pathgan · tf · Official

Taxonomy

Topics: Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications