Real-time Convolutional Networks for Depth-based Human Pose Estimation

Angel Martínez-González; Michael Villamizar; Olivier Canévet; Jean-Marc Odobez

arXiv:1910.13911·cs.CV·October 31, 2019

TL;DR

This paper introduces a fast, efficient depth-based CNN model for multi-person human pose estimation, leveraging synthetic data for training and demonstrating strong real-world performance.

Contribution

The paper presents a novel residual network architecture for depth images and a new synthetic dataset, and shows that a model trained from scratch on synthetic data performs well on real-world pose estimation.

Findings

01

The RPM network achieves accurate pose estimation from depth images.

02

Training on synthetic data yields results comparable to those of pre-trained models.

03

The approach is suitable for real-time human-robot interaction applications.

Abstract

We propose to combine recent Convolutional Neural Network (CNN) models with depth imaging to obtain a reliable and fast multi-person pose estimation algorithm applicable to Human-Robot Interaction (HRI) scenarios. Our hypothesis is that depth images contain less structure and are easier to process than RGB images while keeping the information required for human detection and pose inference, thus allowing the use of simpler networks for the task. Our contributions are threefold: (i) we propose a fast and efficient network based on residual blocks (called RPM) for body landmark localization from depth images; (ii) we created a public dataset, DIH, comprising more than 170k synthetic images of human bodies with various shapes and viewpoints, as well as real (annotated) data for evaluation; (iii) we show that our model trained on synthetic data from scratch can perform well on real data,…
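
The abstract names residual blocks as the building unit of the network but this summary does not give the layer layout. As a rough illustration only, the sketch below shows what a generic residual block does to a single-channel depth map: it adds the output of a small convolutional branch back onto its input. Everything here is an assumption made for illustration (the function names `conv3x3`, `relu`, and `residual_block`, the two-convolution branch, the absence of normalization, and the hand-picked kernels); it is not the RPM architecture from the paper.

```python
from typing import List

Grid = List[List[float]]  # a single-channel depth map as nested lists


def conv3x3(image: Grid, kernel: Grid) -> Grid:
    """Zero-padded 3x3 cross-correlation; output has the same size as input."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    # Pixels outside the image count as zero (zero padding).
                    if 0 <= rr < h and 0 <= cc < w:
                        acc += image[rr][cc] * kernel[dr + 1][dc + 1]
            out[r][c] = acc
    return out


def relu(image: Grid) -> Grid:
    """Element-wise max(0, v)."""
    return [[max(0.0, v) for v in row] for row in image]


def residual_block(x: Grid, k1: Grid, k2: Grid) -> Grid:
    """Hypothetical block: y = relu(x + conv(relu(conv(x, k1)), k2))."""
    branch = conv3x3(relu(conv3x3(x, k1)), k2)
    # Skip connection: add the unmodified input to the branch output.
    summed = [[a + b for a, b in zip(row_x, row_b)]
              for row_x, row_b in zip(x, branch)]
    return relu(summed)


# Toy 2x2 "depth image" with non-negative values.
depth = [[0.0, 1.0], [2.0, 3.0]]
zero_kernel = [[0.0] * 3 for _ in range(3)]
identity_kernel = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]

passthrough = residual_block(depth, zero_kernel, zero_kernel)
doubled = residual_block(depth, identity_kernel, identity_kernel)
```

With all-zero kernels the branch contributes nothing and the block passes the input through unchanged, which is the property that makes stacks of such blocks easy to train. A real implementation would use a deep learning framework with multi-channel learned filters rather than nested lists.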
