DPMix: Mixture of Depth and Point Cloud Video Experts for 4D Action Segmentation

Yue Zhang; Hehe Fan; Yi Yang; Mohan Kankanhalli

arXiv:2307.16803·cs.CV·August 1, 2023·2 cites

Open Access

TL;DR

This paper introduces DPMix, a method combining depth and point cloud video models to enhance 4D action segmentation, leveraging traditional video techniques for better temporal modeling of long point cloud videos.

Contribution

The paper proposes DPMix, a novel ensemble approach that integrates depth and point cloud video methods, achieving state-of-the-art results in 4D action segmentation.

Findings

01. DPMix achieved first place in the HOI4D Challenge 2023.

02. Ensembling depth and point cloud methods significantly improves accuracy.

03. Traditional video models effectively handle long point cloud videos.
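
The ensembling finding can be illustrated with a minimal sketch. It assumes each expert (for example, one depth video model and one point cloud video model) outputs per-frame class logits, and that the experts are fused by a weighted average of softmax probabilities. The fusion rule, the function names, and the weights are illustrative assumptions; the summary above does not state how DPMix combines its experts.

```python
import math
from typing import List, Optional, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one frame's class logits."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def ensemble_frame_labels(
    expert_logits: Sequence[Sequence[Sequence[float]]],
    weights: Optional[Sequence[float]] = None,
) -> List[int]:
    """Fuse experts by averaging per-frame class probabilities.

    expert_logits[e][t][c] is the logit of expert e for frame t and class c.
    Hypothetical fusion rule: weighted mean of softmax outputs, then argmax.
    """
    num_experts = len(expert_logits)
    if weights is None:
        # Assumption: equal trust in every expert.
        weights = [1.0 / num_experts] * num_experts

    num_frames = len(expert_logits[0])
    labels = []
    for t in range(num_frames):
        probs = [softmax(expert[t]) for expert in expert_logits]
        num_classes = len(probs[0])
        fused = [
            sum(w * p[c] for w, p in zip(weights, probs))
            for c in range(num_classes)
        ]
        # Predicted action label for this frame.
        labels.append(max(range(num_classes), key=fused.__getitem__))
    return labels


# Toy example: two experts, two frames, two action classes.
depth_logits = [[2.0, 0.0], [0.0, 0.1]]
point_logits = [[0.0, 0.5], [0.0, 3.0]]
labels = ensemble_frame_labels([depth_logits, point_logits])
```

In this toy input the depth expert is confident on the first frame and the point cloud expert on the second, so the averaged prediction follows whichever expert is more certain for each frame.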

Abstract

In this technical report, we present our findings from research conducted on the Human-Object Interaction 4D (HOI4D) dataset for the egocentric action segmentation task. As a relatively novel research area, point cloud video methods might not be good at temporal modeling, especially for long point cloud videos (e.g., 150 frames). In contrast, traditional video understanding methods are well developed, and their effectiveness at temporal modeling has been widely verified on many large-scale video datasets. Therefore, we convert point cloud videos into depth videos and employ traditional video modeling methods to improve 4D action segmentation. By ensembling depth and point cloud video methods, the accuracy is significantly improved. The proposed method, named Mixture of Depth and Point cloud video experts (DPMix), achieved first place in the 4D Action Segmentation Track of the HOI4D…
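
The conversion from a point cloud frame to a depth frame can be sketched as follows. This is a minimal illustration assuming a pinhole camera model with known intrinsics; the parameter names (`fx`, `fy`, `cx`, `cy`) and the nearest-point rule are assumptions for the example, since the abstract does not describe the exact conversion procedure.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float, float]


def point_cloud_to_depth(
    points: Iterable[Point],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    width: int,
    height: int,
) -> List[List[float]]:
    """Project 3D points (camera coordinates) onto a depth image.

    Assumes a pinhole camera: u = fx * x / z + cx, v = fy * y / z + cy.
    Pixels that receive no point stay at 0.0 (treated as "no measurement").
    """
    depth = [[0.0] * width for _ in range(height)]
    for x, y, z in points:
        if z <= 0:
            # Points behind the camera cannot be projected.
            continue
        u = int(round(fx * x / z + cx))
        v = int(round(fy * y / z + cy))
        if 0 <= u < width and 0 <= v < height:
            # Keep the nearest point when several fall on the same pixel.
            if depth[v][u] == 0.0 or z < depth[v][u]:
                depth[v][u] = z
    return depth


# Toy frame: two points on the optical axis, one to the right, one behind.
frame = [(0.0, 0.0, 1.0), (0.0, 0.0, 2.0), (1.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
depth_map = point_cloud_to_depth(frame, fx=1.0, fy=1.0, cx=1.0, cy=1.0,
                                 width=3, height=3)
```

Applying such a projection to every frame would turn a point cloud video into a depth video that a conventional 2D video model can consume.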

Taxonomy

Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · 3D Shape Modeling and Analysis