Multi-Frequency-Aware Patch Adversarial Learning for Neural Point Cloud Rendering
Jay Karhade, Haiyue Zhu, Ka-Shing Chung, Rajesh Tripathy, Wei Lin, Marcelo H. Ang Jr

TL;DR
This paper introduces a multi-frequency-aware patch adversarial learning framework for neural point cloud rendering, significantly improving realism by aligning spectral and spatial features, and enhancing training stability.
Contribution
The paper proposes a novel multi-discriminator scheme combining spectral and spatial domain discriminators, and a noise-resistant voxelisation method for improved neural point cloud rendering.
Findings
Achieves state-of-the-art rendering quality
Improves convergence speed and stability
Effectively captures spectral distributions of real images
Abstract
We present a neural point cloud rendering pipeline built on a novel multi-frequency-aware patch adversarial learning framework. The proposed approach aims to improve rendering realism by minimizing the spectrum discrepancy between real and synthesized images, especially in the high-frequency, localized sharpness information whose loss makes images look blurry. Specifically, a patch multi-discriminator scheme is proposed for the adversarial learning, which combines spectral domain (Fourier Transform and Discrete Wavelet Transform) discriminators with a spatial (RGB) domain discriminator to force the generator to capture the global and local spectral distributions of the real images. The proposed multi-discriminator scheme not only helps to improve rendering realism, but also enhances the convergence speed and stability of adversarial learning. Moreover, we introduce a…
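As a rough illustration of the three views the patch multi-discriminator scheme operates on, the sketch below builds a spatial, a Fourier, and a wavelet representation for each image patch. It is not the authors' implementation: the function names, the non-overlapping patch layout, the single-channel input, the log-amplitude spectrum, and the choice of a one-level Haar wavelet are all assumptions made for illustration, and the discriminator networks themselves are omitted.

```python
import numpy as np


def extract_patches(image, size):
    """Split a 2D image into non-overlapping size x size patches (assumed layout)."""
    h, w = image.shape
    return [
        image[i:i + size, j:j + size]
        for i in range(0, h - size + 1, size)
        for j in range(0, w - size + 1, size)
    ]


def fft_log_amplitude(patch):
    """Centered log-amplitude Fourier spectrum of a 2D patch."""
    spectrum = np.fft.fftshift(np.fft.fft2(patch))
    return np.log1p(np.abs(spectrum))


def haar_dwt2(patch):
    """One-level orthonormal 2D Haar DWT of a patch with even side lengths.

    Returns the low-pass band followed by three detail bands
    (sub-band naming conventions vary between libraries).
    """
    a = patch[0::2, 0::2]  # top-left sample of each 2x2 block
    b = patch[0::2, 1::2]  # top-right
    c = patch[1::2, 0::2]  # bottom-left
    d = patch[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # local average (low frequencies)
    lh = (a + b - c - d) / 2.0  # difference between rows
    hl = (a - b + c - d) / 2.0  # difference between columns
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, lh, hl, hh


def discriminator_inputs(image, size=8):
    """Build the per-patch views that three hypothetical discriminators would see."""
    views = []
    for patch in extract_patches(image, size):
        views.append({
            "spatial": patch,                    # spatial-domain view
            "fft": fft_log_amplitude(patch),     # Fourier-domain view
            "dwt": np.stack(haar_dwt2(patch)),   # wavelet sub-bands, shape (4, size/2, size/2)
        })
    return views
```

In an adversarial setup of this kind, each view would be passed to its own discriminator for both real and rendered patches, and the generator loss would combine the three discriminator losses; the weighting between them is not stated in the abstract.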
Taxonomy
Topics: 3D Shape Modeling and Analysis · Computer Graphics and Visualization Techniques · Advanced Vision and Imaging
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
