Associative Embedding: End-to-End Learning for Joint Detection and Grouping
Alejandro Newell, Zhiao Huang, Jia Deng

TL;DR
Associative embedding is an end-to-end learning method in which a neural network simultaneously outputs detections and group assignments. It reaches state-of-the-art results on multi-person pose estimation and is also applied to instance segmentation.
Contribution
The paper introduces associative embedding, an approach to joint detection and grouping that can be integrated into any network architecture that produces pixel-wise predictions.
Findings
Achieves state-of-the-art multi-person pose estimation results on MPII and MS-COCO datasets.
Simplifies the detection and grouping pipeline by eliminating multi-stage processes.
Applies the same method to two tasks: multi-person pose estimation and instance segmentation.
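To illustrate what grouping without a separate pipeline stage can look like at inference time, here is a minimal sketch. It assumes each detection carries a scalar tag read from the network output at the same pixel, and that detections with similar tags belong to the same person. The Detection class, the group_by_tag function, the greedy matching order, and the tag_threshold value are all hypothetical choices made for this illustration, not details taken from the summary above.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    joint: str    # joint type, e.g. "head"
    x: float      # pixel coordinates of the detection
    y: float
    score: float  # detection confidence
    tag: float    # scalar embedding predicted at the same pixel (assumed)


def group_by_tag(detections, tag_threshold=1.0):
    """Greedily assign each detection to the group with the closest mean tag.

    Returns a list of groups; each group maps joint type -> Detection.
    """
    groups = []
    # Process the most confident detections first.
    for det in sorted(detections, key=lambda d: d.score, reverse=True):
        best, best_dist = None, tag_threshold
        for group in groups:
            if det.joint in group:
                continue  # allow one joint of each type per person
            mean_tag = sum(d.tag for d in group.values()) / len(group)
            dist = abs(det.tag - mean_tag)
            if dist < best_dist:
                best, best_dist = group, dist
        if best is None:
            groups.append({det.joint: det})  # start a new person
        else:
            best[det.joint] = det
    return groups


detections = [
    Detection("head", 10, 10, 0.9, 0.1),
    Detection("head", 50, 12, 0.8, 5.0),
    Detection("neck", 11, 20, 0.7, 0.2),
    Detection("neck", 51, 22, 0.6, 4.9),
]
people = group_by_tag(detections)
```

In this toy input the two heads have clearly different tags, so they seed two groups, and each neck joins the group whose mean tag is nearest.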
Abstract
We introduce associative embedding, a novel method for supervising convolutional neural networks for the task of detection and grouping. A number of computer vision problems can be framed in this manner including multi-person pose estimation, instance segmentation, and multi-object tracking. Usually the grouping of detections is achieved with multi-stage pipelines; instead, we propose an approach that teaches a network to simultaneously output detections and group assignments. This technique can be easily integrated into any state-of-the-art network architecture that produces pixel-wise predictions. We show how to apply this method to both multi-person pose estimation and instance segmentation and report state-of-the-art performance for multi-person pose on the MPII and MS-COCO datasets.
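One way to supervise group assignments is a pull/push loss on per-detection tags: tags within a group are pulled toward the group mean, and the means of different groups are pushed apart. The sketch below is an illustration under stated assumptions, not the formulation from the paper. The scalar tags, the Gaussian push term, the sigma parameter, the normalisation, and the grouping_loss name are all assumed here.

```python
import math


def grouping_loss(tags_per_group, sigma=1.0):
    """Pull/push loss over scalar tags (illustrative form, assumed).

    tags_per_group: non-empty list of non-empty lists, one inner list of
    predicted tags per ground-truth group (e.g. per person).
    """
    means = [sum(tags) / len(tags) for tags in tags_per_group]
    n = len(means)

    # Pull term: squared distance of each tag to its own group mean.
    pull = sum(
        (tag - mean) ** 2
        for tags, mean in zip(tags_per_group, means)
        for tag in tags
    ) / n

    # Push term: penalise pairs of distinct groups whose means are close.
    push = sum(
        math.exp(-((means[i] - means[j]) ** 2) / (2 * sigma ** 2))
        for i in range(n)
        for j in range(n)
        if i != j
    ) / (n * n)

    return pull + push


separated = grouping_loss([[0.0, 0.0], [10.0, 10.0]])
mixed = grouping_loss([[0.0, 10.0], [0.0, 10.0]])
```

With well-separated, internally consistent tags the loss is close to zero, while tags that mix the two groups give a much larger value, which is the behaviour a grouping objective of this kind is meant to reward.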
Taxonomy
Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Anomaly Detection Techniques and Applications
