Co-Salient Object Detection with Semantic-Level Consensus Extraction and Dispersion
Peiran Xu, Yadong Mu

TL;DR
This paper introduces a novel co-salient object detection method using hierarchical Transformer modules for semantic consensus extraction and dispersion, achieving state-of-the-art results without auxiliary losses.
Contribution
It proposes a Transformer-based framework that extracts semantic-level group consensus and disperses that consensus to each image in an image-specific way, improving over methods that represent the consensus with local features.
Findings
Achieves state-of-the-art performance on three CoSOD datasets.
Utilizes a hierarchical Transformer for comprehensive consensus extraction.
Employs a dispersion module to adapt to object variation across images.
Abstract
Given a group of images, co-salient object detection (CoSOD) aims to highlight the common salient object in each image. There are two factors closely related to the success of this task, namely consensus extraction, and the dispersion of consensus to each image. Most previous works represent the group consensus using local features, while we instead utilize a hierarchical Transformer module for extracting semantic-level consensus. Therefore, it can obtain a more comprehensive representation of the common object category, and exclude interference from other objects that share local similarities with the target object. In addition, we propose a Transformer-based dispersion module that takes into account the variation of the co-salient object in different scenes. It distributes the consensus to the image feature maps in an image-specific way while making full use of interactions within the…
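The two stages named in the abstract, consensus extraction and dispersion, can be illustrated with a toy sketch. This is not the authors' architecture: it swaps the hierarchical Transformer and the Transformer-based dispersion module for a single scaled dot-product attention step each, and the function names, the fixed `query` vector, and the hand-made token features are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return pooled, weights


def extract_consensus(group_features, query):
    """Pool tokens from every image in the group into one consensus vector.

    `query` stands in for a learned consensus token (hypothetical here).
    """
    tokens = [tok for image in group_features for tok in image]
    consensus, _ = attend(query, tokens, tokens)
    return consensus


def disperse_consensus(image_features, consensus):
    """Return a coarse per-image response map.

    Weights are normalised within a single image, so the same consensus
    vector produces a different map for each image.
    """
    _, weights = attend(consensus, image_features, image_features)
    return weights


# Toy group: two images, each a short list of 4-d token features.
# The shared "object" token is [3, 0, 0, 0]; the rest is background.
group = [
    [[3.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]],
    [[0.0, 0.0, 1.0, 0.0], [3.0, 0.0, 0.0, 0.0]],
]
query = [1.0, 0.0, 0.0, 0.0]  # assumed; a real model would learn this

consensus = extract_consensus(group, query)
maps = [disperse_consensus(image, consensus) for image in group]
```

In this toy setting each entry of `maps` is a probability distribution over one image's tokens, and the shared token receives the largest weight in both images. A real model would operate on dense backbone feature maps and decode the response into a pixel-level mask.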
Taxonomy
Methods: Attention Is All You Need · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Linear Layer · Residual Connection · Adam · Byte Pair Encoding · Softmax · Layer Normalization · Dropout
