GAPNet: A Lightweight Framework for Image and Video Salient Object Detection via Granularity-Aware Paradigm

Yu-Huan Wu; Wei Liu; Zi-Xuan Zhu; Zizhou Wang; Yong Liu; Liangli Zhen

arXiv:2508.07585·cs.CV·September 3, 2025

TL;DR

GAPNet is a lightweight, granularity-aware network for image and video salient object detection (SOD) that achieves state-of-the-art performance among lightweight models through efficient multi-scale feature fusion and granularity-matched supervision.

Contribution

The paper introduces a granularity-aware paradigm, supported by granular pyramid convolution (GPC) and cross-scale attention (CSA) modules, for efficient multi-scale feature fusion in a lightweight SOD framework.

Findings

01. Achieves state-of-the-art results among lightweight SOD models

02. Uses granular pyramid convolution (GPC) and cross-scale attention (CSA) modules for feature fusion

03. Maintains high accuracy at low computational cost

Abstract

Recent salient object detection (SOD) models predominantly rely on heavyweight backbones, incurring substantial computational cost and hindering their practical application in various real-world settings, particularly on edge devices. This paper presents GAPNet, a lightweight network built on the granularity-aware paradigm for both image and video SOD. We assign saliency maps of different granularities to supervise the multi-scale decoder side-outputs: coarse object locations for high-level outputs and fine-grained object boundaries for low-level outputs. Specifically, our decoder is built with granularity-aware connections which fuse high-level features of low granularity and low-level features of high granularity, respectively. To support these connections, we design granular pyramid convolution (GPC) and cross-scale attention (CSA) modules for efficient fusion of low-scale and…
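The granularity-aware supervision described in the abstract (coarse object locations for high-level side-outputs, fine-grained boundaries for low-level side-outputs) can be illustrated with a minimal sketch. The excerpt does not say how the targets of each granularity are derived, so the max-pooling rule for the coarse target and the 4-neighbour rule for the boundary target are assumptions made for illustration, as are the function names `coarse_location_target` and `boundary_target`. This is not the authors' implementation.

```python
from typing import List

Mask = List[List[int]]  # binary saliency mask: 1 = salient, 0 = background


def coarse_location_target(mask: Mask, block: int) -> Mask:
    """Assumed coarse target: max-pool the mask over block x block cells.

    Each output cell is 1 if any pixel in the corresponding cell is salient,
    so the result says roughly where the object is but not what shape it has.
    """
    h, w = len(mask), len(mask[0])
    return [
        [
            max(
                mask[y][x]
                for y in range(by, min(by + block, h))
                for x in range(bx, min(bx + block, w))
            )
            for bx in range(0, w, block)
        ]
        for by in range(0, h, block)
    ]


def boundary_target(mask: Mask) -> Mask:
    """Assumed fine target: salient pixels with a background 4-neighbour.

    Pixels outside the image are treated as background.
    """
    h, w = len(mask), len(mask[0])

    def is_background(y: int, x: int) -> bool:
        return not (0 <= y < h and 0 <= x < w) or mask[y][x] == 0

    neighbours = ((-1, 0), (1, 0), (0, -1), (0, 1))
    return [
        [
            int(
                mask[y][x] == 1
                and any(is_background(y + dy, x + dx) for dy, dx in neighbours)
            )
            for x in range(w)
        ]
        for y in range(h)
    ]


# Toy 8x8 ground truth with a 3x3 salient square at rows 1-3, columns 1-3.
mask = [[1 if 1 <= y <= 3 and 1 <= x <= 3 else 0 for x in range(8)] for y in range(8)]

coarse = coarse_location_target(mask, block=4)  # would supervise a high-level output
edges = boundary_target(mask)                   # would supervise a low-level output
```

In a training loop, each decoder side-output would be compared against the target whose granularity matches its level; the choice of loss and the number of levels are left open here because the excerpt does not state them.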
