NOVUM: Neural Object Volumes for Robust Object Classification

Artur Jesslen; Guofeng Zhang; Angtian Wang; Wufei Ma; Alan Yuille; Adam Kortylewski

arXiv:2305.14668 · cs.CV · August 29, 2024 · 1 citation

TL;DR

NOVUM introduces a 3D compositional neural object volume model that significantly improves robustness and interpretability in object classification, especially under out-of-distribution conditions, while maintaining real-time performance.

Contribution

The paper presents NOVUM, a novel architecture integrating 3D Gaussian-based object volumes into deep networks for enhanced robustness and interpretability in image classification.

Findings

01 Superior robustness to out-of-distribution shifts
02 Enhanced interpretability of object representations
03 Maintains real-time inference with competitive accuracy

Abstract

Discriminative models for object classification typically learn image-based representations that do not capture the compositional and 3D nature of objects. In this work, we show that explicitly integrating 3D compositional object representations into deep networks for image classification leads to a largely enhanced generalization in out-of-distribution scenarios. In particular, we introduce a novel architecture, referred to as NOVUM, that consists of a feature extractor and a neural object volume for every target object class. Each neural object volume is a composition of 3D Gaussians that emit feature vectors. This compositional object representation allows for a highly robust and fast estimation of the object class by independently matching the features of the 3D Gaussians of each category to features extracted from an input image. Additionally, the object pose can be estimated via…
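
The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the scoring rule (mean of each Gaussian's best cosine similarity against the image features), the function names, and the toy feature vectors are all assumptions made for illustration. The only idea taken from the abstract is that each class keeps its own set of Gaussian feature vectors and is scored independently against the features extracted from the image.

```python
import math


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def class_score(gaussian_feats, image_feats):
    """Hypothetical score: each Gaussian feature is matched to its most
    similar image feature (cosine similarity), and the best matches are
    averaged. This rule is an assumption, not the paper's exact rule."""
    g = [normalize(f) for f in gaussian_feats]
    im = [normalize(f) for f in image_feats]
    best = [max(dot(gf, f) for f in im) for gf in g]
    return sum(best) / len(best)


def classify(volumes, image_feats):
    """Score every class volume independently and return the best class
    together with all scores."""
    scores = {name: class_score(feats, image_feats)
              for name, feats in volumes.items()}
    return max(scores, key=scores.get), scores


# Toy data (made up): two classes with two Gaussian features each,
# and three feature vectors standing in for an extracted feature map.
volumes = {
    "car": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "boat": [[0.0, 0.0, 1.0], [-1.0, 0.0, 0.0]],
}
image_feats = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.1]]

label, scores = classify(volumes, image_feats)
print(label, scores)
```

Because each class is scored on its own, the per-class loop in this sketch could run in parallel, which is consistent with the abstract's claim that the estimation is fast. How the actual model weights or aggregates matches is not stated in the text above.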

Code & Models

Repositories

genintel/novum (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · 3D Shape Modeling and Analysis · Human Pose and Action Recognition