Class-Partitioned VQ-VAE and Latent Flow Matching for Point Cloud Scene Generation
Dasith de Silva Edirimuni, Ajmal Saeed Mian

TL;DR
This paper introduces a class-partitioned VQ-VAE and latent flow matching approach for generating complex 3D point cloud scenes directly, improving scene plausibility and accuracy without external object databases.
Contribution
The paper proposes a novel class-partitioned VQ-VAE with class-aware codebook updates and a latent flow matching model for direct point cloud scene generation.
Findings
Achieves up to 70.4% reduction in Chamfer error on complex scenes.
Effectively decodes class-specific point clouds without external databases.
Reliable scene recovery demonstrated on complex living room scenes.
Abstract
Most 3D scene generation methods are limited to generating only object bounding box parameters, while newer diffusion methods also generate class labels and latent features. Using object size or latent features, they then retrieve objects from a predefined database. For complex scenes of varied, multi-categorical objects, diffusion-based latents cannot be effectively decoded by current autoencoders into the correct point cloud objects that agree with target classes. We introduce a Class-Partitioned Vector Quantized Variational Autoencoder (CPVQ-VAE) that is trained to effectively decode object latent features by employing a pioneering class-partitioned codebook where codevectors are labeled by class. To address the problem of codebook collapse, we propose a running average update which reinitializes dead codevectors within each partition.…
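To make the codebook idea in the abstract concrete, here is a minimal NumPy sketch of a codebook whose codevectors are labeled by class, with a running-average update that reinitializes dead codevectors from latents of the same class. It is an illustration under stated assumptions, not the authors' implementation: the class name ClassPartitionedCodebook, the decay and dead_threshold parameters, the equal partition sizes, and the rule of resampling a dead codevector from a random same-class latent are all hypothetical choices made for this example.

```python
import numpy as np


class ClassPartitionedCodebook:
    """Illustrative codebook split into equal-sized per-class partitions (assumed layout)."""

    def __init__(self, num_classes, codes_per_class, dim,
                 decay=0.99, dead_threshold=1e-3, seed=0):
        self.rng = np.random.default_rng(seed)
        self.decay = decay                    # assumed running-average decay
        self.dead_threshold = dead_threshold  # assumed usage level below which a code is "dead"
        num_codes = num_classes * codes_per_class
        self.codes = self.rng.normal(size=(num_codes, dim))
        # Class label attached to every codevector: [0, 0, ..., 1, 1, ...]
        self.code_class = np.repeat(np.arange(num_classes), codes_per_class)
        # Running average of how often each codevector is selected
        self.usage = np.ones(num_codes)

    def quantize(self, z, labels):
        """Map each latent to its nearest codevector inside its own class partition."""
        dist = ((z[:, None, :] - self.codes[None, :, :]) ** 2).sum(axis=-1)
        # Forbid codevectors that belong to a different class
        dist[labels[:, None] != self.code_class[None, :]] = np.inf
        idx = dist.argmin(axis=1)
        return idx, self.codes[idx]

    def update(self, z, labels):
        """Running-average update, then reinitialize dead codes per partition."""
        idx, _ = self.quantize(z, labels)
        counts = np.bincount(idx, minlength=len(self.codes)).astype(float)
        sums = np.zeros_like(self.codes)
        np.add.at(sums, idx, z)  # sum of latents assigned to each codevector

        self.usage = self.decay * self.usage + (1.0 - self.decay) * counts
        used = counts > 0
        means = sums[used] / counts[used][:, None]
        # Move selected codevectors toward the mean of their assigned latents
        self.codes[used] = self.decay * self.codes[used] + (1.0 - self.decay) * means

        # Replace rarely used codevectors with a random latent of the same class
        for k in np.flatnonzero(self.usage < self.dead_threshold):
            pool = z[labels == self.code_class[k]]
            if len(pool) > 0:
                self.codes[k] = pool[self.rng.integers(len(pool))]
                self.usage[k] = 1.0
        return idx


if __name__ == "__main__":
    codebook = ClassPartitionedCodebook(num_classes=3, codes_per_class=4, dim=2)
    data_rng = np.random.default_rng(1)
    latents = data_rng.normal(size=(30, 2))  # stand-in for encoder outputs
    classes = np.arange(30) % 3              # stand-in for object class labels
    assigned = codebook.update(latents, classes)
    # Every latent is assigned a codevector carrying its own class label
    print(np.array_equal(codebook.code_class[assigned], classes))
```

Restricting the nearest-neighbour search to one partition means a quantized latent can only map to a codevector of its own class, which is the property the abstract relies on for decoding objects that agree with their target classes. Reinitializing within the partition keeps each class's share of the codebook in use.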
Taxonomy
Topics: 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis · Face recognition and analysis
