Human Pose Estimation from RGB Input Using Synthetic Training Data
Oscar Danielsson, Omid Aghazadeh

TL;DR
This paper presents a method for human pose estimation from RGB images using synthetic training data and a novel objective function for random forests, improving generalization to real images.
Contribution
It introduces a new training objective for random forests that leverages weakly labeled real data to enhance generalization from synthetic to real images.
Findings
Significant performance improvement over baseline classifiers.
Effective use of synthetic data for training without depth information.
Demonstrated on a public dataset with improved accuracy.
Abstract
We address the problem of estimating the pose of humans using RGB image input. More specifically, we use a random forest classifier to classify pixels into joint-based body part categories, much like the famous Kinect pose estimator [11], [12]. However, we use pure RGB input, i.e., no depth. Since the random forest requires a large number of training examples, we use synthetic training data generated with computer graphics. In addition, we assume that we have access to a large number of real images with bounding box labels, extracted for example by a pedestrian detector or a tracking system. We propose a new objective function for random forest training that uses the weakly labeled data from the target domain to encourage the learner to select features that generalize from the synthetic source domain to the real target domain. We demonstrate on a publicly available…
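The abstract does not give the form of the proposed objective, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes a split is scored by the information gain on labeled synthetic pixels, minus a penalty when the split partitions real pixels (for example, pixels inside detector bounding boxes) very differently from synthetic ones. The names `split_score`, `left_fraction`, and the weight `lam` are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting labeled samples at `threshold`."""
    left = [y for v, y in zip(values, labels) if v < threshold]
    right = [y for v, y in zip(values, labels) if v >= threshold]
    n = len(labels)
    children = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - children


def left_fraction(values, threshold):
    """Fraction of samples whose feature response falls below `threshold`."""
    return sum(v < threshold for v in values) / len(values)


def split_score(syn_values, syn_labels, real_values, threshold, lam=1.0):
    """Hypothetical objective (an assumption, not the paper's): reward label
    purity on synthetic data, penalize a split that routes synthetic and
    real samples to the left child in different proportions."""
    gain = information_gain(syn_values, syn_labels, threshold)
    gap = abs(left_fraction(syn_values, threshold) - left_fraction(real_values, threshold))
    return gain - lam * gap


# Toy feature responses for four synthetic pixels with body part labels.
syn_values = [0.1, 0.2, 0.8, 0.9]
syn_labels = ["head", "head", "torso", "torso"]

# Real pixels whose responses are distributed like the synthetic ones.
real_similar = [0.15, 0.25, 0.75, 0.85]
# Real pixels whose responses are shifted, so the feature transfers poorly.
real_shifted = [0.6, 0.7, 0.8, 0.9]

score_similar = split_score(syn_values, syn_labels, real_similar, threshold=0.5)
score_shifted = split_score(syn_values, syn_labels, real_shifted, threshold=0.5)
```

Under this assumed scoring, a feature that separates the synthetic labels equally well in both cases is ranked lower when the real data responds to it differently, which is one simple way to favor features that carry over from the source domain to the target domain.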
Taxonomy
Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Anomaly Detection Techniques and Applications
