Crowd-SAM: SAM as a Smart Annotator for Object Detection in Crowded Scenes

Zhi Cai; Yingjie Gao; Yaoyan Zheng; Nan Zhou; Di Huang

arXiv:2407.11464 · cs.CV · July 22, 2024 · 1 citation

TL;DR

Crowd-SAM is a SAM-based framework that adds an efficient prompt sampler and a part-whole discrimination network to improve object detection in crowded scenes. It needs only a few labeled images and rivals state-of-the-art fully supervised methods.

Contribution

The paper introduces Crowd-SAM, a framework that enhances SAM's performance in crowded scenes using an efficient prompt sampler and a part-whole discrimination network.

Findings

1. Rivals state-of-the-art fully supervised methods on CrowdHuman and CityPersons.
2. Uses minimal labeled images and few learnable parameters.
3. Achieves high accuracy in crowded and occluded scenes.

Abstract

In computer vision, object detection is an important task with applications in many scenarios. However, obtaining extensive labels can be challenging, especially in crowded scenes. Recently, the Segment Anything Model (SAM) has been proposed as a powerful zero-shot segmenter, offering a novel approach to instance segmentation tasks. However, the accuracy and efficiency of SAM and its variants are often compromised when handling objects in crowded and occluded scenes. In this paper, we introduce Crowd-SAM, a SAM-based framework designed to enhance SAM's performance in crowded and occluded scenes at the cost of only a few learnable parameters and minimal labeled images. We introduce an efficient prompt sampler (EPS) and a part-whole discrimination network (PWD-Net), enhancing mask selection and accuracy in crowded scenes. Despite its simplicity, Crowd-SAM rivals state-of-the-art…
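To make the mask-selection problem mentioned in the abstract concrete, here is a minimal sketch of a generic score-then-suppress selection over candidate masks. It is not the paper's EPS or PWD-Net: the scores are hand-set hypothetical numbers standing in for whatever quality score a model would produce, and `mask_iou`, `select_masks`, and `rect` are illustrative helper names, not functions from the Crowd-SAM repository.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int]  # (row, col)


def mask_iou(a: Set[Pixel], b: Set[Pixel]) -> float:
    """Intersection-over-union of two masks stored as pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def select_masks(masks: List[Set[Pixel]], scores: List[float],
                 iou_thresh: float = 0.5) -> List[int]:
    """Greedy selection: visit masks from highest to lowest score and
    keep one only if it overlaps every already-kept mask by at most
    iou_thresh. Returns the indices of the kept masks."""
    order = sorted(range(len(masks)), key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        if all(mask_iou(masks[i], masks[j]) <= iou_thresh for j in kept):
            kept.append(i)
    return kept


def rect(r0: int, c0: int, r1: int, c1: int) -> Set[Pixel]:
    """Hypothetical helper: a filled rectangle as a pixel set."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}


# Toy scene: mask 1 is a near-duplicate of mask 0, while mask 2 is a
# neighboring object that only slightly overlaps mask 0.
candidates = [rect(0, 0, 4, 4), rect(0, 0, 4, 3), rect(0, 3, 4, 7)]
scores = [0.9, 0.6, 0.8]  # assumed quality scores, for illustration only

print(select_masks(candidates, scores))
```

In this toy case the near-duplicate is suppressed while the slightly overlapping neighbor survives. In a real crowd, heavily occluded neighbors can overlap far more, which is why the quality of the per-mask score matters so much for selection.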

Code & Models

Repositories

felixcaae/crowdsam (PyTorch, official)

Taxonomy

Topics: Mobile Crowdsensing and Crowdsourcing · Anomaly Detection Techniques and Applications · Video Surveillance and Tracking Methods

Methods: Segment Anything Model