ASOD60K: An Audio-Induced Salient Object Detection Dataset for Panoramic Videos

Yi Zhang

arXiv:2107.11629·cs.CV·November 15, 2021·5 cites

TL;DR

This paper introduces ASOD60K, a large-scale dataset for audio-induced salient object detection in panoramic videos, along with benchmarks to advance research in this emerging area.

Contribution

The paper presents ASOD60K, the first large-scale dataset for audio-induced salient object detection in panoramic videos, with detailed annotations and benchmark evaluations.

Findings

01

Existing SOD models face challenges with panoramic video data.

02

Audio cues significantly influence human attention in panoramic scenes.

03

Benchmark results highlight the need for specialized models for PV-SOD.

Abstract

Exploring what humans pay attention to in dynamic panoramic scenes is useful for many fundamental applications, including augmented reality (AR) in retail, AR-powered recruitment, and visual language navigation. With this goal in mind, we propose PV-SOD, a new task that aims to segment salient objects from panoramic videos. In contrast to existing fixation-/object-level saliency detection tasks, we focus on audio-induced salient object detection (SOD), where the salient objects are labeled with the guidance of audio-induced eye movements. To support this task, we collect the first large-scale dataset, named ASOD60K, which contains 4K-resolution video frames annotated with a six-level hierarchy, distinguishing it by its richness, diversity, and quality. Specifically, each sequence is marked with both its super-class and sub-class, with objects of each sub-class being further annotated with…

Code & Models

Repositories

PanoAsh/ASOD60K
PyTorch · Official

Taxonomy

Topics: Visual Attention and Saliency Detection · Gaze Tracking and Assistive Technology · Olfactory and Sensory Function Studies