3D Human Pose Estimation with Siamese Equivariant Embedding

Márton Véges; Viktor Varga; András Lőrincz

arXiv:1809.07217 · cs.CV · February 19, 2019 · 1 cites

TL;DR

This paper introduces a siamese neural network architecture that learns a rotation-equivariant hidden representation for monocular 3D human pose estimation. The approach reduces overfitting to the camera positions seen in training and achieves a state-of-the-art cross-camera error rate among methods that use only estimated 2D joint coordinates.

Contribution

The authors propose a novel siamese network with rotation-equivariant embeddings that enhances 3D pose estimation robustness across different camera views.

Findings

01 Consistent error reduction across multiple datasets

02 State-of-the-art cross-camera error rate among algorithms that use estimated 2D joint coordinates only

03 Effective with various base networks

Abstract

In monocular 3D human pose estimation, a common setup is to first detect 2D joint positions and then lift the detections into 3D coordinates. Many algorithms suffer from overfitting to the camera positions in the training set. We propose a siamese architecture that learns a rotation-equivariant hidden representation to reduce the need for data augmentation. Our method is evaluated on multiple databases with different base networks and shows a consistent improvement of error metrics. It achieves a state-of-the-art cross-camera error rate among algorithms that use estimated 2D joint coordinates only.
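The rotation-equivariance idea in the abstract can be illustrated with a small sketch. It is not taken from the authors' repository: it assumes the hidden representation is a set of 3D points, that the same pose is seen from two cameras, and that the relative rotation between the cameras is known. The names `rotation_z` and `equivariance_loss` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np


def rotation_z(angle):
    """Return a 3x3 rotation matrix about the z axis (angle in radians)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])


def equivariance_loss(emb_a, emb_b, rot_a_to_b):
    """Mean squared distance between R * emb_a and emb_b.

    Assumption for this sketch: each embedding is a (k, 3) array of 3D
    points, and rot_a_to_b is the known 3x3 rotation from camera A to
    camera B. A rotation-equivariant embedding makes this loss zero.
    """
    # Rotate every row vector p of emb_a: (R p)^T == p^T R^T.
    rotated = emb_a @ rot_a_to_b.T
    return float(np.mean(np.sum((rotated - emb_b) ** 2, axis=1)))


# Toy data standing in for the outputs of the two siamese branches.
rng = np.random.default_rng(0)
emb_a = rng.normal(size=(16, 3))
rot = rotation_z(np.pi / 3)

# An equivariant pair: the second embedding is the rotated first one.
print(equivariance_loss(emb_a, emb_a @ rot.T, rot))
# A non-equivariant pair: the second branch ignored the camera change.
print(equivariance_loss(emb_a, emb_a, rot))
```

In a siamese training setup of this kind, a term like this would presumably be added to the usual 3D pose regression loss of both branches, so that the shared weights are pushed toward an embedding that rotates together with the camera.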

Code & Models

Repositories

vegesm/siamese-pose-estimation (tf, Official)

Taxonomy

Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Advanced Vision and Imaging