Self-supervised Social Relation Representation for Human Group Detection
Jiacheng Li, Ruize Han, Haomin Yan, Zekun Qian, Wei Feng, Song Wang

TL;DR
This paper introduces a self-supervised, two-stage framework for human group detection that learns social relation features and improves detection accuracy with minimal labeled data.
Contribution
It proposes a novel two-stage multi-head framework utilizing self-supervised learning for social relation embedding and group detection.
Findings
Achieves remarkable performance on PANDA and JRDB-Group benchmarks.
Effective with very few labeled training samples.
Provides a self-supervised approach to social relation representation.
Abstract
Human group detection, which splits a crowd of people into groups, is an important step for video-based human social activity analysis. The core of human group detection is the representation and division of human social relations. In this paper, we propose a new two-stage multi-head framework for human group detection. In the first stage, we propose a human behavior simulator head to learn the social relation feature embedding, which is trained in a self-supervised manner by leveraging the socially grounded multi-person behavior relationship. In the second stage, based on the social relation embedding, we develop a self-attention-inspired network for human group detection. Remarkable performance on two state-of-the-art large-scale benchmarks, i.e., PANDA and JRDB-Group, verifies the effectiveness of the proposed framework. Benefiting from the self-supervised social relation embedding, our method can…
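To make the "representation, then division" idea in the abstract concrete, here is a minimal sketch of the division step only. It is not the paper's method: the learned self-attention-inspired network is replaced by plain cosine similarity between per-person embeddings, and groups are taken as connected components of the thresholded affinity graph. The function names, the toy embeddings, and the 0.8 threshold are all hypothetical choices made for illustration.

```python
import math
from typing import List


def pairwise_affinity(embeddings: List[List[float]]) -> List[List[float]]:
    """Cosine similarity between every pair of person embeddings.

    Assumption: stands in for a learned relation score; the paper's
    network is not reproduced here.
    """
    def norm(v: List[float]) -> float:
        # Fall back to 1.0 for an all-zero vector to avoid division by zero.
        return math.sqrt(sum(x * x for x in v)) or 1.0

    n = len(embeddings)
    aff = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(embeddings[i], embeddings[j]))
            aff[i][j] = dot / (norm(embeddings[i]) * norm(embeddings[j]))
    return aff


def detect_groups(embeddings: List[List[float]],
                  threshold: float = 0.8) -> List[List[int]]:
    """Split people into groups: link pairs whose affinity reaches the
    (hypothetical) threshold, then return connected components."""
    n = len(embeddings)
    aff = pairwise_affinity(embeddings)
    parent = list(range(n))  # union-find forest over person indices

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if aff[i][j] >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy example: five people with made-up 2-D relation embeddings.
toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [-1.0, -1.0]]
print(detect_groups(toy))
```

In this toy input, people 0 and 1 point in nearly the same direction, as do people 2 and 3, while person 4 is dissimilar to everyone and ends up alone. A learned pipeline would replace both the hand-made embeddings and the fixed threshold.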
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Human Pose and Action Recognition · Video Surveillance and Tracking Methods
