BBoxMaskPose v2: Expanding Mutual Conditioning to 3D
Miroslav Purkrabek, Constantin Kolomiiets, Jiri Matas

TL;DR
This paper introduces BMPv2, a 2D human pose estimation method that combines mutual conditioning with SAM-based mask refinement to improve performance in crowded scenes, and shows that prompting a 3D model with its 2D output improves 3D pose estimation.
Contribution
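The paper presents PMPose, a top-down 2D pose estimator with a probabilistic formulation and mask conditioning, and BMPv2, which integrates PMPose with an enhanced SAM-based mask refinement module. BMPv2 achieves state-of-the-art results on COCO and OCHuman and improves 3D pose estimation in crowded scenes.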
Findings
BMPv2 surpasses the state of the art by 1.5 AP on COCO and 6 AP on OCHuman, and is the first method to exceed 50 AP on OCHuman.
Improved 2D pose quality enhances 3D pose estimation accuracy.
Multi-person pose estimation performance depends more on pose prediction accuracy than on detection.
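The AP figures above are keypoint AP. On COCO-style benchmarks this metric is built on Object Keypoint Similarity (OKS), which is thresholded (0.50 to 0.95 in steps of 0.05 on COCO) before precision is averaged. The following is a minimal sketch of OKS for a single predicted/ground-truth pair, not the paper's evaluation code; the function name `oks` and the `sigmas` values used in any call are illustrative assumptions.

```python
import math

def oks(pred, gt, visible, area, sigmas):
    """Object Keypoint Similarity for one predicted and one ground-truth pose.

    pred, gt : lists of (x, y) keypoint coordinates in pixels
    visible  : per-keypoint flags; values > 0 mean the keypoint is labeled
    area     : ground-truth object area in pixels^2 (the scale term)
    sigmas   : per-keypoint falloff constants (hypothetical values in tests)
    """
    total, count = 0.0, 0
    for (px, py), (gx, gy), v, sigma in zip(pred, gt, visible, sigmas):
        if v <= 0:
            continue  # unlabeled keypoints do not contribute
        d2 = (px - gx) ** 2 + (py - gy) ** 2  # squared pixel distance
        k = 2.0 * sigma                       # COCO uses k_i = 2 * sigma_i
        total += math.exp(-d2 / (2.0 * area * k ** 2))
        count += 1
    # Average similarity over labeled keypoints; 0.0 if none are labeled.
    return total / count if count else 0.0
```

A perfect prediction yields an OKS of 1.0, and the score decays smoothly as keypoints drift, faster for small objects (small `area`) and for keypoints with small `sigma`.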
Abstract
Most 2D human pose estimation benchmarks are nearly saturated, with the exception of crowded scenes. We introduce PMPose, a top-down 2D pose estimator that incorporates a probabilistic formulation and mask conditioning. PMPose improves crowded pose estimation without sacrificing performance on standard scenes. Building on this, we present BBoxMaskPose v2 (BMPv2), which integrates PMPose and an enhanced SAM-based mask refinement module. BMPv2 surpasses the state of the art by 1.5 average precision (AP) points on COCO and 6 AP points on OCHuman, becoming the first method to exceed 50 AP on OCHuman. We demonstrate that BMP's 2D prompting of a 3D model improves 3D pose estimation in crowded scenes and that advances in 2D pose quality directly benefit 3D estimation. Results on the new OCHuman-Pose dataset show that multi-person performance is more affected by pose prediction accuracy than by…
Taxonomy
Topics: Human Pose and Action Recognition · Robot Manipulation and Learning · Human Motion and Animation
