PSNet: Parallel Symmetric Network for Video Salient Object Detection
Runmin Cong, Weiyu Song, Jianjun Lei, Guanghui Yue, Yao Zhao, and Sam Kwong

TL;DR
PSNet introduces a parallel symmetric network architecture for video salient object detection, effectively integrating appearance and motion modalities through specialized modules for improved performance across diverse scenarios.
Contribution
The paper proposes a novel parallel symmetric network with modules for cross-modality refinement and adaptive feature fusion, advancing the integration of appearance and motion information in VSOD.
Findings
Achieves competitive performance on four benchmark datasets.
Effectively models the importance of different modalities in various scenarios.
Demonstrates the effectiveness of the parallel symmetry and specialized modules.
Abstract
For the video salient object detection (VSOD) task, how to excavate the information from the appearance modality and the motion modality has always been a topic of great concern. The two-stream structure, including an RGB appearance stream and an optical flow motion stream, has been widely used as a typical pipeline for VSOD tasks, but the existing methods usually only use motion features to unidirectionally guide appearance features or adaptively but blindly fuse two modality features. However, these methods underperform in diverse scenarios due to the uncomprehensive and unspecific learning schemes. In this paper, following a more secure modeling philosophy, we deeply investigate the importance of appearance modality and motion modality in a more comprehensive way and propose a VSOD network with up and down parallel symmetry, named PSNet. Two parallel branches with different dominant…
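The two existing strategies the abstract contrasts (motion features unidirectionally guiding appearance features, and adaptive fusion of the two modalities) can be illustrated with a toy sketch. This is not PSNet's architecture or any module from the paper: the function names, the sigmoid gating, and the softmax weighting over mean activations are all assumptions chosen for illustration, and features are flattened to plain lists of floats.

```python
import math


def sigmoid(x):
    """Logistic function used as a soft gate."""
    return 1.0 / (1.0 + math.exp(-x))


def unidirectional_guidance(appearance, motion):
    """One-way guidance (assumed form): motion gates appearance, with a residual.

    Each appearance value a is rescaled by sigmoid(m) and added back to itself,
    so motion can emphasize appearance features but is never refined in return.
    """
    return [a * sigmoid(m) + a for a, m in zip(appearance, motion)]


def adaptive_fusion(appearance, motion):
    """Adaptive fusion (assumed form): one softmax weight per modality.

    The weights come from each modality's mean activation, so the fusion adapts
    to the input but does not know which modality is actually reliable.
    """
    score_a = sum(appearance) / len(appearance)
    score_m = sum(motion) / len(motion)
    peak = max(score_a, score_m)  # subtract the max for numerical stability
    exp_a = math.exp(score_a - peak)
    exp_m = math.exp(score_m - peak)
    w_a = exp_a / (exp_a + exp_m)
    w_m = exp_m / (exp_a + exp_m)
    fused = [w_a * a + w_m * m for a, m in zip(appearance, motion)]
    return fused, (w_a, w_m)


if __name__ == "__main__":
    rgb_features = [1.0, 2.0]   # hypothetical appearance features
    flow_features = [0.0, 0.0]  # hypothetical motion features (static scene)
    print(unidirectional_guidance(rgb_features, flow_features))
    print(adaptive_fusion(rgb_features, flow_features))
```

In the static-scene example the motion features carry no signal, yet the one-way scheme still lets them gate appearance. That kind of scenario dependence is what the abstract gives as motivation for treating each modality as dominant in its own branch.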
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques · Face Recognition and Perception
Methods: Diffusion
