SalNet360: Saliency Maps for omni-directional images with CNN

Rafael Monroy; Sebastian Lutz; Tejo Chalasani; Aljosa Smolic

arXiv:1709.06505·cs.CV·May 11, 2018

TL;DR

This paper introduces SalNet360, an extension of CNN architectures that adapts traditional 2D saliency prediction methods to omnidirectional images, enhancing visual attention modeling for VR media.

Contribution

It proposes an end-to-end CNN-based extension specifically designed for accurate saliency prediction in omnidirectional images, a novel adaptation for VR content analysis.

Findings

01. Improved saliency map accuracy on ground truth data

02. Effective adaptation of 2D CNNs to 360-degree images

03. Pipeline steps enhance prediction quality

Abstract

The prediction of Visual Attention data from any kind of media is valuable to content creators and is used to drive encoding algorithms efficiently. With the current trend in the Virtual Reality (VR) field, adapting known techniques to this new kind of media is starting to gain momentum. In this paper, we present an architectural extension to any Convolutional Neural Network (CNN) to fine-tune traditional 2D saliency prediction to Omnidirectional Images (ODIs) in an end-to-end manner. We show that each step in the proposed pipeline works towards making the generated saliency map more accurate with respect to ground truth data.
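The abstract does not say how accuracy with respect to ground truth is measured. As a minimal sketch, assuming the saliency maps are stored in equirectangular format and that a Pearson correlation coefficient (CC) is an acceptable score, a predicted map could be compared against a ground-truth map as follows. The function names `latitude_weights` and `weighted_cc` are hypothetical, and the cos(latitude) weighting is an illustrative choice to compensate for the oversampling of polar regions, not something taken from the paper.

```python
import numpy as np

def latitude_weights(height, width):
    """Per-pixel weights for an equirectangular image (assumed layout).

    Row centres are mapped to latitudes in (-pi/2, pi/2); cos(latitude)
    approximates the relative solid angle covered by each pixel.
    """
    lat = (0.5 - (np.arange(height) + 0.5) / height) * np.pi
    return np.repeat(np.cos(lat)[:, None], width, axis=1)

def weighted_cc(pred, gt, weights=None):
    """Weighted Pearson correlation between two saliency maps of equal shape.

    With weights=None this reduces to the ordinary correlation coefficient.
    Constant maps have zero variance and are not handled here.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    if weights is None:
        weights = np.ones_like(pred)
    w = weights / weights.sum()            # normalise weights to sum to 1
    mean_pred = (w * pred).sum()
    mean_gt = (w * gt).sum()
    cov = (w * (pred - mean_pred) * (gt - mean_gt)).sum()
    var_pred = (w * (pred - mean_pred) ** 2).sum()
    var_gt = (w * (gt - mean_gt) ** 2).sum()
    return cov / np.sqrt(var_pred * var_gt)
```

A call such as `weighted_cc(pred, gt, latitude_weights(*gt.shape))` returns a value in [-1, 1], where higher means the predicted map agrees more closely with the ground truth.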
