Efficient Convolutional Neural Networks for Depth-Based Multi-Person Pose Estimation

Angel Martínez-González, Michael Villamizar, Olivier Canévet, Jean-Marc Odobez

arXiv:1912.00711·cs.CV·December 3, 2019

TL;DR

This paper develops fast, lightweight CNN architectures for multi-person 2D pose estimation using depth images, combining synthetic data, domain adaptation, and knowledge distillation to achieve accurate results efficiently.

Contribution

It introduces novel lightweight CNN designs for depth-based pose estimation, leveraging synthetic data, domain adaptation, and knowledge distillation to improve accuracy and speed.

Findings

01

Lightweight CNN architectures achieve competitive accuracy.

02

Synthetic and real data experiments validate the approach.

03

Knowledge distillation enhances model performance.
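
The knowledge distillation finding can be illustrated with a minimal sketch. It assumes a common heatmap-regression setup in which a lightweight student network is trained against both the ground-truth landmark heatmaps and the predictions of a larger teacher network. The summary above does not give the paper's exact loss, so the blend below, the weight `alpha`, and the names `mse` and `distillation_loss` are hypothetical and for illustration only.

```python
from typing import List

# One 2D confidence map for a single body landmark (rows of pixel scores).
Heatmap = List[List[float]]


def mse(a: Heatmap, b: Heatmap) -> float:
    """Mean squared error between two heatmaps of identical shape."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for x, y in zip(row_a, row_b):
            total += (x - y) ** 2
            count += 1
    return total / count


def distillation_loss(
    student: Heatmap,
    teacher: Heatmap,
    target: Heatmap,
    alpha: float = 0.5,
) -> float:
    """Blend the supervised loss with a teacher-matching loss.

    Assumption: a simple convex combination weighted by `alpha`;
    this is not confirmed to be the loss used in the paper.
    """
    supervised = mse(student, target)   # fit the annotated landmarks
    imitation = mse(student, teacher)   # mimic the larger teacher network
    return alpha * supervised + (1.0 - alpha) * imitation


if __name__ == "__main__":
    student = [[0.0, 1.0]]
    teacher = [[0.0, 0.5]]
    target = [[0.0, 1.0]]
    print(distillation_loss(student, teacher, target, alpha=0.5))
```

With `alpha=1.0` the teacher term vanishes and the student is trained on ground truth alone; lower values lean more on the teacher's predictions.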

Abstract

Achieving robust multi-person 2D body landmark localization and pose estimation is essential for understanding human behavior and interaction, as encountered, for instance, in human-robot interaction (HRI) settings. Accurate methods have been proposed recently, but they usually rely on rather deep Convolutional Neural Network (CNN) architectures, thus requiring large computational and training resources. In this paper, we investigate different architectures and methodologies to address these issues and achieve fast and accurate multi-person 2D pose estimation. To foster speed, we propose to work with depth images, whose structure contains sufficient information about body landmarks while being simpler than textured color images, and thus potentially requiring less complex CNNs for processing. In this context, we make the following contributions. i) we study several CNN architecture designs combining pose machines…

Taxonomy

Methods: Knowledge Distillation