RendNet: Unified 2D/3D Recognizer With Latent Space Rendering
Ruoxi Shi, Xinyang Jiang, Caihua Shan, Yansen Wang, Dongsheng Li

TL;DR
RendNet is a unified recognition architecture that uses both vector graphics (VG) and raster graphics (RG) representations, linking them through the VG-to-RG rendering process to improve 2D and 3D object recognition accuracy.
Contribution
The paper introduces a unified model that combines vector and raster graphics recognition through a differentiable rendering process, improving recognition performance.
Findings
Achieves state-of-the-art results on multiple VG datasets.
Effectively combines VG and RG information for improved recognition.
Demonstrates robustness across 2D and 3D recognition tasks.
Abstract
Vector graphics (VG) are ubiquitous in daily life, with vast applications in engineering, architecture, design, etc. Most existing methods recognize VG by first rendering it into raster graphics (RG) and then performing recognition on the RG format. However, this procedure discards the geometric structure and loses the high resolution of VG. Recently, another category of algorithms has been proposed that recognizes directly from the original VG format, but these algorithms are affected by topological errors that RG rendering can filter out. Instead of looking at one format, a good solution is to use the VG and RG formats together to avoid these shortcomings. In addition, we argue that the VG-to-RG rendering process is essential for effectively combining VG and RG information. By specifying the rules on how to transfer VG primitives to RG pixels, the rendering…
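The idea of a rule that transfers VG primitives to RG pixels can be illustrated with a small soft rasterizer for line segments. This is a minimal sketch and not RendNet's renderer: the function names, the Gaussian falloff, and the `sigma` parameter are all assumptions chosen for illustration. Each pixel's intensity depends smoothly on its distance to the nearest segment, which is the kind of mapping that lets pixel values be traced back to the primitives that produced them.

```python
import math


def point_segment_distance(px, py, ax, ay, bx, by):
    """Euclidean distance from point (px, py) to the segment from a to b."""
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - ax, py - ay)
    # Project the point onto the segment's line and clamp to the endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def soft_rasterize(segments, width, height, sigma=0.5):
    """Render (ax, ay, bx, by) segments to a height x width intensity grid.

    Hypothetical rule: a pixel's intensity is a Gaussian of its distance to
    the nearest segment, so values lie in (0, 1] near strokes and decay to 0.
    """
    image = [[0.0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            cx, cy = col + 0.5, row + 0.5  # pixel center
            for ax, ay, bx, by in segments:
                d = point_segment_distance(cx, cy, ax, ay, bx, by)
                value = math.exp(-(d * d) / (2.0 * sigma * sigma))
                image[row][col] = max(image[row][col], value)
    return image


# Usage: a horizontal stroke running through the centers of the top pixel row.
stroke = [(0.0, 0.5, 4.0, 0.5)]
image = soft_rasterize(stroke, width=4, height=2)
```

In this sketch the top row of `image` is fully lit because the stroke passes through those pixel centers, while the second row receives only the Gaussian tail. A hard rasterizer would threshold the distance instead, which discards the gradient information that a differentiable pipeline relies on.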
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques · Advanced Vision and Imaging
