Rapid Salient Object Detection with Difference Convolutional Neural Networks

Zhuo Su; Li Liu; Matthias Müller; Jiehua Zhang; Diana Wofk; Ming-Ming Cheng; Matti Pietikäinen

arXiv:2507.01182·cs.CV·July 3, 2025

TL;DR

This paper introduces efficient CNN-based models for real-time salient object detection on resource-limited devices, using pixel difference convolutions to improve speed and accuracy on images and videos.

Contribution

The paper proposes Pixel Difference Convolutions and a reparameterization strategy to embed contrast cues into CNNs, enabling fast and accurate salient object detection on embedded devices.
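
The idea of folding a difference convolution into a standard one can be illustrated with a small sketch. It assumes the simplest "central" form of a pixel difference convolution, in which each kernel weight multiplies the difference between a neighbor and the patch center; the paper's actual PDC variants and its DCR procedure may differ. The helper names (`conv2d_valid`, `central_pdc`, `fold_central_pdc`) are hypothetical and written in plain Python for illustration only.

```python
import random

def conv2d_valid(x, w):
    """Plain 'valid' 2D cross-correlation of one image with one kernel."""
    kh, kw = len(w), len(w[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[i + a][j + b] * w[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(ow)] for i in range(oh)]

def central_pdc(x, w):
    """Assumed central PDC: weights act on (neighbor - patch center)."""
    kh, kw = len(w), len(w[0])
    ca, cb = kh // 2, kw // 2
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum((x[i + a][j + b] - x[i + ca][j + cb]) * w[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(ow)] for i in range(oh)]

def fold_central_pdc(w):
    """Return a standard kernel equivalent to central_pdc with weights w.

    sum_i w_i * (x_i - x_c) = sum_i w_i * x_i - x_c * sum_i w_i,
    so subtracting the total weight from the center entry is enough.
    """
    total = sum(sum(row) for row in w)
    w_eq = [row[:] for row in w]
    w_eq[len(w) // 2][len(w[0]) // 2] -= total
    return w_eq

def max_abs_diff(a, b):
    """Largest absolute element-wise difference between two 2D lists."""
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))

random.seed(0)
x = [[random.random() for _ in range(6)] for _ in range(6)]
w = [[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(3)]

# The difference operator and the folded standard kernel should agree
# up to floating-point rounding.
diff = max_abs_diff(central_pdc(x, w), conv2d_valid(x, fold_central_pdc(w)))
print("max abs difference:", diff)
```

Because convolution is linear, a folded kernel like this can also be added element-wise to the kernel of a parallel standard convolution, which is one plausible way a difference branch could cost nothing extra at inference time.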

Findings

01

With fewer than 1M parameters, the models run at 46 FPS on streamed images and 150 FPS on streamed videos on a Jetson Orin device.

02

The models are more than 2x and 3x faster than competing lightweight models on images and videos, respectively.

03

The models achieve superior accuracy on real-time image and video SOD tasks.

Abstract

This paper addresses the challenge of deploying salient object detection (SOD) on resource-constrained devices with real-time performance. While recent advances in deep neural networks have improved SOD, the top-performing models are computationally expensive. We propose an efficient network design that combines traditional wisdom on SOD with the representation power of modern CNNs. Like biologically inspired classical SOD methods, which compute contrast cues to determine the saliency of image regions, our model leverages Pixel Difference Convolutions (PDCs) to encode feature contrasts. Unlike those methods, PDCs are incorporated into a CNN architecture so that the valuable contrast cues are extracted from rich feature maps. For efficiency, we introduce a difference convolution reparameterization (DCR) strategy that embeds PDCs into standard convolutions, eliminating computation and…

Taxonomy

Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Multimodal Machine Learning Applications