Custom Object Detection via Multi-Camera Self-Supervised Learning

Yan Lu; Yuanchao Shu

arXiv:2102.03442·cs.CV·February 9, 2021

TL;DR

This paper introduces MCSSL, a self-supervised learning method for custom object detection in multi-camera networks, leveraging epipolar geometry and reID algorithms to improve detection accuracy without extensive manual labeling.

Contribution

The paper presents MCSSL, a novel self-supervised approach that associates multi-camera bounding boxes and generates pseudo-labels for improved object detection.

Findings

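01 MCSSL improves average mAP by 5.44% on WildTrack over legacy self-training methods

02 MCSSL improves average mAP by 6.76% on CityFlow over legacy self-training methods

03 A reID-like pretext task with consistency loss enables effective training on pseudo-labels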

Abstract

This paper proposes MCSSL, a self-supervised learning approach for building custom object detection models in multi-camera networks. MCSSL associates bounding boxes between cameras with overlapping fields of view by leveraging epipolar geometry and state-of-the-art tracking and reID algorithms, and prudently generates two sets of pseudo-labels to fine-tune the backbone and detection networks, respectively, of an object detection model. To train effectively on pseudo-labels, a powerful reID-like pretext task with consistency loss is constructed for model customization. Our evaluation shows that, compared with legacy self-training methods, MCSSL improves average mAP by 5.44% and 6.76% on the WildTrack and CityFlow datasets, respectively.
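
To make the epipolar-geometry idea in the abstract concrete, here is a minimal sketch of cross-camera box association by point-to-epipolar-line distance. It is not the authors' implementation. It assumes a known fundamental matrix F between camera A and camera B, uses box centers as the matched points, and uses a greedy nearest-match rule with a hypothetical pixel threshold `max_dist`. All function names are illustrative. The tracking and reID cues that MCSSL combines with geometry are omitted.

```python
import math

def epipolar_line(F, point):
    """Map pixel (u, v) in camera A to line (a, b, c) in camera B, i.e. l' = F x."""
    u, v = point
    x = (u, v, 1.0)
    return tuple(sum(F[r][k] * x[k] for k in range(3)) for r in range(3))

def epipolar_distance(F, point_a, point_b):
    """Pixel distance from point_b (camera B) to the epipolar line of point_a."""
    a, b, c = epipolar_line(F, point_a)
    norm = math.hypot(a, b)
    if norm == 0.0:
        return math.inf  # degenerate line, treat as no match
    u, v = point_b
    return abs(a * u + b * v + c) / norm

def box_center(box):
    """Center of an (x1, y1, x2, y2) box. Using the center is an assumption."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def associate(F, boxes_a, boxes_b, max_dist=5.0):
    """Greedily pair each box in A with the closest unused box in B."""
    pairs, used = [], set()
    for i, box_a in enumerate(boxes_a):
        best_j, best_d = None, max_dist
        for j, box_b in enumerate(boxes_b):
            if j in used:
                continue
            d = epipolar_distance(F, box_center(box_a), box_center(box_b))
            if d <= best_d:
                best_j, best_d = j, d
        if best_j is not None:
            used.add(best_j)
            pairs.append((i, best_j))
    return pairs

# Toy setup: a rectified stereo pair, where epipolar lines are image rows.
F_RECTIFIED = [[0.0, 0.0, 0.0],
               [0.0, 0.0, -1.0],
               [0.0, 1.0, 0.0]]
boxes_cam_a = [(0, 0, 10, 10), (0, 100, 10, 120)]
boxes_cam_b = [(50, 100, 60, 120), (40, 0, 50, 10)]
pairs = associate(F_RECTIFIED, boxes_cam_a, boxes_cam_b)
```

In this toy setup the first box in camera A pairs with the second box in camera B and vice versa, because their centers share the same image row. A single epipolar line can pass through several candidate boxes, so geometry alone is ambiguous in crowded scenes. This is presumably why the abstract pairs it with tracking and reID.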

Taxonomy

Topics: Video Surveillance and Tracking Methods · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques