Rethinking the Data Annotation Process for Multi-view 3D Pose Estimation with Active Learning and Self-Training
Qi Feng, Kun He, He Wen, Cem Keskin, Yuting Ye

TL;DR
This paper introduces an active learning framework with self-training for multi-view 3D pose estimation, significantly reducing annotation time and cost while improving accuracy on large-scale benchmarks.
Contribution
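It presents a framework for extending existing single-view active learning strategies to the multi-view setting, proposes two novel strategies that exploit multi-view geometry, and integrates pseudo-labeling (a form of self-training) to improve annotation efficiency for 3D pose estimation.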
Findings
Reduces annotation time by 60% on CMU Panoptic Studio.
Decreases annotation cost by 80% compared to traditional methods.
Outperforms simulated annotation baselines in 3D body and hand pose estimation on large-scale benchmarks.
Abstract
Pose estimation of the human body and hands is a fundamental problem in computer vision, and learning-based solutions require a large amount of annotated data. In this work, we improve the efficiency of the data annotation process for 3D pose estimation problems with Active Learning (AL) in a multi-view setting. AL selects examples with the highest value to annotate under limited annotation budgets (time and cost), but choosing the selection strategy is often nontrivial. We present a framework to efficiently extend existing single-view AL strategies. We then propose two novel AL strategies that make full use of multi-view geometry. Moreover, we demonstrate additional performance gains by incorporating pseudo-labels computed during the AL process, which is a form of self-training. Our system significantly outperforms simulated annotation baselines in 3D body and hand pose estimation on…
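The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the disagreement score, the function names `multiview_disagreement` and `select_frames`, and the `pseudo_threshold` parameter are all assumptions made for illustration. The sketch assumes each camera view yields a 3D pose estimate in a shared world frame, scores each frame by how much the views disagree, sends the most inconsistent frames to annotators up to a fixed budget, and keeps the most consistent remaining frames as pseudo-label candidates.

```python
import math

def multiview_disagreement(views):
    """Hypothetical score: mean distance of each view's joints from the
    cross-view mean joint position.

    views: list of per-view poses; each pose is a list of (x, y, z) joints
    expressed in a shared world frame.
    """
    n_views = len(views)
    n_joints = len(views[0])
    total = 0.0
    for j in range(n_joints):
        # Cross-view consensus position for joint j.
        mean = [sum(v[j][k] for v in views) / n_views for k in range(3)]
        for v in views:
            total += math.dist(v[j], mean)
    return total / (n_views * n_joints)

def select_frames(scores, budget, pseudo_threshold):
    """Return (to_annotate, pseudo_labeled) frame indices.

    The `budget` frames with the highest disagreement go to human
    annotators; remaining frames whose score is at or below
    `pseudo_threshold` (an assumed cutoff) are kept as pseudo-labels.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    to_annotate = order[:budget]
    pseudo = [i for i in order[budget:] if scores[i] <= pseudo_threshold]
    return to_annotate, pseudo

def make_frame(offset):
    # Toy frame: three views, two joints; only the first view deviates,
    # by `offset` along the x axis.
    return [
        [(offset, 0.0, 0.0), (offset, 1.0, 0.0)],
        [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
        [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    ]

frames = [make_frame(o) for o in (0.0, 0.3, 3.0, 0.03)]
scores = [multiview_disagreement(f) for f in frames]
to_annotate, pseudo = select_frames(scores, budget=1, pseudo_threshold=0.05)
```

In this toy setup the frame with the largest cross-view deviation is chosen for annotation, and the two near-consistent frames become pseudo-label candidates. A real system would retrain the pose model after each round and recompute the scores.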
Taxonomy
Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Robot Manipulation and Learning
