INRet: A General Framework for Accurate Retrieval of INRs for Shapes
Yushi Guan, Daniel Kwan, Ruofan Liang, Selvakumar Panneer, Nilesh Jain, Nilesh Ahuja, Nandita Vijaykumar

TL;DR
This paper introduces INRet, a versatile framework for accurately retrieving similar implicit neural representations (INRs) of shapes from a data store. It supports various INR architectures and implicit functions, and it outperforms existing methods.
Contribution
INRet is a novel, generalizable method for INR shape retrieval that supports multiple architectures and functions, improving accuracy over prior approaches.
Findings
INRet outperforms existing INR retrieval methods in accuracy.
Supports diverse INR architectures like octree grids, triplanes, and hash grids.
Avoids conversion overhead by directly working with INRs.
Abstract
Implicit neural representations (INRs) have become an important method for encoding various data types, such as 3D objects or scenes, images, and videos. They have proven to be particularly effective at representing 3D content, e.g., 3D scene reconstruction from 2D images, novel 3D content creation, as well as the representation, interpolation, and completion of 3D shapes. With the widespread generation of 3D data in an INR format, there is a need to support effective organization and retrieval of INRs saved in a data store. A key aspect of retrieval and clustering of INRs in a data store is the formulation of similarity between INRs that would, for example, enable retrieval of similar INRs using a query INR. In this work, we propose INRet, a method for determining similarity between INRs that represent shapes, thus enabling accurate retrieval of similar shape INRs from an INR data…
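The retrieval step described in the abstract can be illustrated with a minimal sketch. It assumes each INR has already been mapped to a fixed-length embedding vector and that similarity is measured by cosine similarity; both are assumptions made for illustration, and the function name `retrieve_top_k` and the toy vectors are hypothetical, not taken from the paper.

```python
import numpy as np

def retrieve_top_k(query, store, k=3):
    """Return indices and similarities of the k store embeddings closest to the query.

    query: 1D array, embedding of the query INR (assumed precomputed).
    store: 2D array, one row per INR embedding in the data store.
    Cosine similarity is an illustrative choice, not necessarily the paper's metric.
    """
    query = np.asarray(query, dtype=float)
    store = np.asarray(store, dtype=float)
    # Normalize so that the dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    s = store / np.linalg.norm(store, axis=1, keepdims=True)
    sims = s @ q
    # Sort by descending similarity and keep the first k entries.
    order = np.argsort(-sims)[:k]
    return order, sims[order]

# Toy data store with three hypothetical 2D embeddings.
store = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
indices, scores = retrieve_top_k(np.array([1.0, 0.1]), store, k=2)
print(indices.tolist())
```

In this toy example the query is nearly parallel to the first stored vector, so that entry ranks first, followed by the diagonal vector.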
Peer Reviews
Decision·Submitted to ICLR 2024
The method achieves a significantly higher retrieval accuracy (> 10%) than existing state-of-the-art methods, including inr2vec, on the ShapeNet and Pix3D datasets. The authors also show retrieval results for methods such as PointNeXt and View-GCN. The method supports multiple INR architectures as well as different implicit function representations, and is thus general and flexible. They design their loss cleverly to include two terms. One term is an L2 loss that minimizes the pairwise dif
The paper does not propose novel architectures or representations or even new metrics for similarity between the implicit representations. Instead, the authors make use of existing architectures and embeddings (separately for the MLP and the feature grid) as well as L2 losses for the pairwise dissimilarity. In equation (4), the two terms operate on different types of representations when they are compared. While one can trivially assume a Euclidean embedding (which the authors assume here), this may
1. The task, retrieving shapes represented as INRs, is quite interesting. This paper handles multiple implicit functions, such as SDF, UDF and Occ, which covers the commonly used implicit representations. 2. NGLOD and instantNGP are quite popular backbones, so it is good to see that INRet supports these methods. 3. The evaluations seem good.
1. The matching among SDF, UDF and Occ may lead to errors. Both SDF and Occ can only represent watertight shapes, but researchers leverage UDF to represent open surfaces and multi-layer structures. What if I input an INR of a multi-layer car or a shirt with open surfaces? Matching the SDF or Occ of a watertight mesh of the car or the shirt against the UDF would be wrong. 2. There are no visualizations for comparison or illustration, so the paper lacks qualitative analysis. I al
originality: compared to inr2vec [1], this paper proposes an additional module to handle the latest architecture of INR (i.e., octree-based and hash grid-based). This paper also describes how to handle different implicit functions (e.g., SDF, UDF, Occ) and to map the INRs into a common latent space, and proposes suitable regularization terms to minimize the domain gap between different implicit functions. quality: the paper is technically sound. clarity: the paper is well-organized and easy
This work designed a framework for retrieval with INRs and proposed regularization terms. However, the analysis of the proposed regularization terms only shows the performance comparison. It would be better if the paper could further analyze why these terms are so important. A potential weakness of this work is that all data must first be encoded into the INR space. Given an unseen test input that is a 3D model, if one wants to query similar INRs, how long does it take to convert it into an INR? Since we ca
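The reviews above mention an L2 term on pairwise differences and regularization that maps INRs of different implicit functions (SDF, UDF, Occ) into a common latent space. A minimal sketch of one such term follows, assuming the same shape has one embedding per implicit function and that the regularizer is the mean squared L2 distance over all pairs. This is an illustrative guess at the general idea, not the paper's equation (4); the function name `pairwise_l2_regularizer` and the toy values are hypothetical.

```python
from itertools import combinations

import numpy as np

def pairwise_l2_regularizer(embeddings):
    """Mean squared L2 distance between embeddings of one shape.

    embeddings: dict mapping an implicit function name (e.g. "sdf", "udf",
    "occ") to the embedding of the same shape under that function.
    Assumption for illustration: pulling these embeddings together reduces
    the gap between implicit function types in the shared latent space.
    """
    names = sorted(embeddings)
    pairs = list(combinations(names, 2))
    total = 0.0
    for a, b in pairs:
        diff = np.asarray(embeddings[a], dtype=float) - np.asarray(embeddings[b], dtype=float)
        total += float(np.sum(diff ** 2))
    return total / len(pairs)

# Hypothetical embeddings of one shape under three implicit functions.
example = {
    "sdf": np.array([0.0, 0.0]),
    "udf": np.array([1.0, 0.0]),
    "occ": np.array([0.0, 1.0]),
}
print(pairwise_l2_regularizer(example))
```

The term is zero when all embeddings of a shape coincide, which is the situation in which a query in one implicit function would retrieve the same shape stored in another.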
Taxonomy
Topics: Image Processing and 3D Reconstruction · Handwritten Text Recognition Techniques
