TL;DR
This paper introduces a synthetic data generation framework and a transformer-based model, Human3D, for segmenting humans in cluttered 3D indoor scenes, and reports results that advance the state of the art.
Contribution
The paper presents Human3D, the first end-to-end transformer model that segments multiple human instances and their body parts in a unified manner, trained with the help of synthetic humans placed in real 3D scenes.
Findings
Pre-training on synthetic data enhances segmentation accuracy.
Human3D outperforms existing state-of-the-art methods.
Synthetic data effectively captures diverse human-scene interactions.
Abstract
Segmenting humans in 3D indoor scenes has become increasingly important with the rise of human-centered robotics and AR/VR applications. To this end, we propose the task of joint 3D human semantic segmentation, instance segmentation and multi-human body-part segmentation. Few works have attempted to directly segment humans in cluttered 3D scenes, which is largely due to the lack of annotated training data of humans interacting with 3D scenes. We address this challenge and propose a framework for generating training data of synthetic humans interacting with real 3D scenes. Furthermore, we propose a novel transformer-based model, Human3D, which is the first end-to-end model for segmenting multiple human instances and their body-parts in a unified manner. The key advantage of our synthetic data generation framework is its ability to generate diverse and realistic human-scene interactions,…
Videos
3D Segmentation of Humans in Point Clouds with Synthetic Data (YouTube)
