Global and Local Sensitivity Guided Key Salient Object Re-augmentation for Video Saliency Detection

Ziqi Zhou; Zheng Wang; Huchuan Lu; Song Wang; Meijun Sun

arXiv:1811.07480 · cs.CV · November 20, 2018 · 1 citation

Open Access

TL;DR

This paper introduces KSORA, a novel video saliency detection method that combines local feature selection and global object ranking to enhance key salient object detection in dynamic scenes, outperforming existing methods.

Contribution

The paper proposes KSORA, a new approach that integrates top-down and bottom-up strategies to improve key salient object detection in videos, addressing the limitation that previous still-image methods let all extracted features contribute equally.

Findings

01

KSORA achieves higher detection accuracy on benchmark datasets.

02

The method runs at 17 FPS on modern GPUs, demonstrating real-time capability.

03

KSORA outperforms ten state-of-the-art algorithms in complex scene detection.

Abstract

Existing deep learning based saliency methods designed for still images do not weight or highlight the features extracted from different layers; all features contribute equally to the final saliency decision. Such methods detect all "potentially significant regions" evenly and are unable to highlight the key salient object, resulting in detection failures in dynamic scenes. In this paper, based on the fact that salient areas in videos are relatively small and concentrated, we propose a key salient object re-augmentation method (KSORA) using top-down semantic knowledge and bottom-up feature guidance to improve detection accuracy in video scenes. KSORA includes two sub-modules (WFE and KOS): WFE performs local salient feature selection using a bottom-up strategy, while KOS ranks each object globally using top-down statistical knowledge, and chooses the…

Taxonomy

Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings