From Sparse to Dense: Spatio-Temporal Fusion for Multi-View 3D Human Pose Estimation with DenseWarper

Ling Li; Changjie Chen; Yuyan Wang; Jiaqing Lyu; Kenglun Chang; Yiyun Chen; Zhidong Deng

arXiv:2605.14525·cs.CV·May 15, 2026

TL;DR

This paper introduces a spatio-temporal fusion method for multi-view 3D human pose estimation. It takes sparse interleaved inputs, in which each camera view is captured at a different time point, to raise temporal resolution and performance, and its DenseWarper model draws on epipolar geometry.

Contribution

The paper proposes the sparse interleaved input approach and the DenseWarper model, which together improve 3D pose estimation by capturing rich spatio-temporal information and raising the output pose frame rate.

Findings

01 Outperforms traditional dense multi-view methods on the Human3.6M and MPI-INF-3DHP datasets.

02 Achieves state-of-the-art performance with sparse interleaved inputs.

03 Theoretically increases the output pose frame rate by a factor of N with N cameras.

Abstract

In multi-view 3D human pose estimation, models typically rely on images captured simultaneously from different camera views to predict the pose at a specific moment. While this traditional approach provides accurate spatial information, it often overlooks the rich temporal dependencies between adjacent frames. To address this, we propose a novel input method for 3D human pose estimation: the sparse interleaved input. This method uses images captured from different camera views at different time points (e.g., View 1 at time $t$ and View 2 at time $t + \delta$), allowing our model to capture rich spatio-temporal information and effectively boost performance. More importantly, this approach offers two key advantages. First, it can theoretically increase the output pose frame rate by a factor of $N$ with $N$ cameras, thereby breaking through single-view frame rate limitations and enhancing the temporal…
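
The frame-rate claim in the abstract can be illustrated with a small timing sketch. It is not taken from the paper or its repository: the function names are hypothetical, the 4-camera, 50 fps setup is an arbitrary example, and it assumes the cameras are staggered uniformly, i.e. the offset between adjacent views is δ = 1 / (N × per-camera frame rate), which the abstract does not state. Under that assumption, producing one pose per captured frame in the merged stream gives N times the single-camera rate.

```python
from fractions import Fraction


def interleaved_schedule(num_cameras, camera_fps, num_cycles):
    """Return (timestamp, view_index) pairs for uniformly staggered cameras.

    Assumption (not from the paper): view k fires k * delta after view 0,
    with delta = 1 / (num_cameras * camera_fps).
    """
    period = Fraction(1, camera_fps)   # time between frames of one camera
    delta = period / num_cameras       # assumed stagger between adjacent views
    schedule = []
    for cycle in range(num_cycles):
        for view in range(num_cameras):
            schedule.append((cycle * period + view * delta, view))
    return schedule


def effective_fps(schedule):
    """Pose rate if one pose is produced per frame of the merged stream."""
    times = [t for t, _ in schedule]
    gaps = {later - earlier for earlier, later in zip(times, times[1:])}
    assert len(gaps) == 1, "schedule is not uniformly spaced"
    return 1 / gaps.pop()


# Hypothetical setup: 4 cameras, each running at 50 fps.
schedule = interleaved_schedule(num_cameras=4, camera_fps=50, num_cycles=3)
print(effective_fps(schedule))  # 200
```

Exact fractions are used instead of floats so that the uniform-spacing check is not affected by rounding.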

Code & Models

Repositories

lingli1724/DenseWarper-ICLR2026
github

Videos

From Sparse to Dense: Spatio-Temporal Fusion for Multi-View 3D Human Pose Estimation with DenseWarper · slideslive