Geometric Neural Process Fields

Wenzhe Yin; Zehao Xiao; Jiayi Shen; Yunlu Chen; Cees G. M. Snoek; Jan-Jakob Sonke; Efstratios Gavves

arXiv:2502.02338·cs.CV·February 5, 2025

TL;DR

This paper introduces Geometric Neural Process Fields, a probabilistic framework that enhances neural radiance fields' ability to generalize to new signals by explicitly modeling uncertainty and incorporating geometric structure.

Contribution

We propose G-NPF, a novel probabilistic approach that combines geometric bases with hierarchical latent variables to improve NeF generalization and uncertainty estimation.

Findings

1. G-NPF outperforms existing methods in novel-view synthesis.
2. It effectively captures uncertainty in 3D scene reconstructions.
3. The hierarchical model improves generalization to unseen signals.

Abstract

This paper addresses the challenge of Neural Field (NeF) generalization, where models must efficiently adapt to new signals given only a few observations. To tackle this, we propose Geometric Neural Process Fields (G-NPF), a probabilistic framework for neural radiance fields that explicitly captures uncertainty. We formulate NeF generalization as a probabilistic problem, enabling direct inference of NeF function distributions from limited context observations. To incorporate structural inductive biases, we introduce a set of geometric bases that encode spatial structure and facilitate the inference of NeF function distributions. Building on these bases, we design a hierarchical latent variable model, allowing G-NPF to integrate structural information across multiple spatial levels and effectively parameterize INR functions. This hierarchical approach improves generalization to novel…
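As a reading aid, the probabilistic formulation described in the abstract can be related to the generic latent-variable neural process. The expressions below are the standard neural process predictive distribution and its variational training bound, not the paper's exact hierarchical objective. The symbols $C$ (context set), $(x_T, y_T)$ (target inputs and outputs), $z$ (a single latent variable), and $q$ (approximate posterior) are assumed notation, not taken from the paper.

$$
p(y_T \mid x_T, C) = \int p(y_T \mid x_T, z)\, p(z \mid C)\, dz
$$

$$
\log p(y_T \mid x_T, C) \;\ge\; \mathbb{E}_{q(z \mid C \cup T)}\big[\log p(y_T \mid x_T, z)\big] - \mathrm{KL}\big(q(z \mid C \cup T)\,\|\,p(z \mid C)\big)
$$

In a hierarchical variant such as the one the abstract describes, $z$ would presumably be split into latent variables at several spatial levels, with the geometric bases entering the conditioning in place of the raw context set.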

Peer Reviews

Decision·Submitted to ICLR 2025

Reviewer 01 · Rating 3 · Confidence 4

Strengths

- The work effectively frames the generalization of Neural Radiance Fields (NeRF) as a probabilistic modeling problem, allowing for the integration of uncertainty and enabling the model to adapt to new scenes with limited observations.
- The introduction of geometric bases addresses the challenge of information misalignment between 2D context images and 3D structures.
- The incorporation of hierarchical latent variables allows for effective modulation of the INR function at multiple spatial

Weaknesses

- Comparisons with state-of-the-art generalizable approaches [1,2] are missing. The PixelNeRF method was published at ICCV 2021, which makes it a very old baseline.
- I'm confused by the goal of this work. It seems that the method tries to train a generalizable INR that can leverage multiple input signals, but the experiments largely focus on predicting an INR from a single signal (such as a single view). To me, these are two different topics (generation vs. reconstruction), and INR is designed to correctly store

Reviewer 02 · Rating 6 · Confidence 4

Strengths

1. **Paper-Writing and Presentation**: The paper is well-written and presents its content clearly and comprehensibly, making it easy to follow.
2. **Geometric Bases module**: The authors introduce a method that models the structure of an object using a mixture of 3D Gaussians. A learnable encoder based on a transformer architecture predicts the parameters for these Gaussians. This approach leverages the continuous properties of Gaussians. Table 3 shows an ablation that analyzes the sensitivity
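To make the "mixture of 3D Gaussians" idea in the review above concrete, here is a minimal NumPy sketch of how a set of Gaussian bases could softly assign 3D query points to bases and pool per-basis features. It is an illustration under stated assumptions, not the authors' implementation: the function names `gaussian_basis_weights` and `aggregate_features`, the axis-aligned (diagonal) covariance, and the fixed arrays standing in for the transformer encoder's predictions are all hypothetical.

```python
import numpy as np

def gaussian_basis_weights(points, means, scales):
    """Soft assignment of 3D points to Gaussian bases.

    points: (N, 3) query locations.
    means:  (M, 3) basis centers (assumed to come from an encoder).
    scales: (M, 3) per-axis standard deviations (diagonal covariance, an assumption).
    Returns an (N, M) array whose rows sum to 1.
    """
    # Standardized offsets between every point and every basis: (N, M, 3).
    diff = (points[:, None, :] - means[None, :, :]) / scales[None, :, :]
    # Log-density of each diagonal Gaussian, up to a shared constant.
    log_density = -0.5 * np.sum(diff ** 2, axis=-1) - np.sum(np.log(scales), axis=-1)[None, :]
    # Subtract the row maximum before exponentiating for numerical stability.
    log_density = log_density - log_density.max(axis=1, keepdims=True)
    weights = np.exp(log_density)
    return weights / weights.sum(axis=1, keepdims=True)

def aggregate_features(points, means, scales, features):
    """Pool per-basis features (M, D) into per-point features (N, D)."""
    return gaussian_basis_weights(points, means, scales) @ features

# Toy usage: two well-separated bases and three query points.
means = np.array([[0.0, 0.0, 0.0], [10.0, 10.0, 10.0]])
scales = np.ones((2, 3))
features = np.array([[1.0, 0.0], [0.0, 1.0]])
points = np.array([[0.0, 0.0, 0.0], [10.0, 10.0, 10.0], [5.0, 5.0, 5.0]])
pooled = aggregate_features(points, means, scales, features)
```

In this toy setup, points that sit on a basis center receive essentially that basis's feature, while the midpoint receives an even blend of the two. In a learned model the pooled features would presumably condition the latent variables that modulate the field.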

Weaknesses

1. **Missing Comparison with several baselines**: The proposed method solves novel-view synthesis given few images. However, it does not compare with some popular methods such as Splatter-Image [W1], pixelSplat [W2], MVSplat [W3]. Further, it will also be interesting to compare this method with LRM [W4], a feedforward method to generate 3D from a single image.
2. **Evaluation on popular 3D datasets**: Recent methods such as Splatter-Image [W1] show comparisons on Objaverse, Google-Scanned Objec

Reviewer 03 · Rating 6 · Confidence 4

Strengths

- The paper uses geometric bases to maintain alignment between the 2D context views and the 3D target points and to induce prior structure.
- Geometric neural processes with hierarchical latent variables are used to encode spatially specific information.
- The method shows superior results on the ShapeNet and DTU MVS datasets.
- The paper is well presented and easy to follow.

Weaknesses

- The authors have only compared with pixelNeRF on the DTU-MVS dataset. There are many other SOTA methods. Comparison with more recent methods is necessary.
- Since the method uses a probabilistic approach, it can be resource-intensive and may require more memory and computation than simpler, deterministic NeRF models. It is necessary to report the extra computational cost relative to other methods that use a probabilistic approach (perhaps the baseline methods shown in Table 1).
-

Taxonomy

Topics: Neural Networks and Applications

Methods: Sparse Evolutionary Training