Generative Model with Coordinate Metric Learning for Object Recognition Based on 3D Models
Yida Wang, Weihong Deng

TL;DR
This paper introduces a generative model with metric learning that leverages synthetic images from 3D models to improve object recognition, reducing data collection efforts and enhancing recognition accuracy across varied conditions.
Contribution
The paper proposes a novel generative model combining Bayesian rendering and metric learning with a coordinate training strategy for effective synthetic-to-real object recognition.
Findings
Achieved over 50% accuracy on the ShapeNet database.
Reduced data collection workload using synthetic images.
Enhanced recognition robustness across different poses and lighting.
Abstract
Given a large amount of real photos for training, convolutional neural networks show excellent performance on object recognition tasks. However, collecting data is tedious and the available backgrounds are limited, which makes it hard to establish a perfect database. In this paper, our generative model, trained with synthetic images rendered from 3D models, reduces the workload of data collection and the limitation of conditions. Our structure is composed of two sub-networks: a semantic foreground object reconstruction network based on Bayesian inference, and a classification network based on a multi-triplet cost function. The latter avoids over-fitting on monotone surfaces and fully utilizes pose information by establishing a sphere-like distribution of descriptors in each category, which is helpful for recognition on regular photos according to poses, lighting condition, background and…
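The abstract names a multi-triplet cost function but does not give its exact form here. As a rough illustration of the general idea only, the following sketch applies a standard hinge-style triplet loss to one anchor, one positive, and several negatives at once. The function name, the squared Euclidean distance, the margin value, and the choice to sum over negatives are all assumptions made for this example, not the authors' formulation.

```python
import numpy as np

def multi_triplet_loss(anchor, positive, negatives, margin=1.0):
    """Hinge triplet loss summed over several negatives.

    Hypothetical illustration: the paper's actual cost function may differ.
    anchor, positive: 1-D descriptor vectors of the same length.
    negatives: 2-D array with one negative descriptor per row.
    """
    anchor = np.asarray(anchor, dtype=float)
    positive = np.asarray(positive, dtype=float)
    negatives = np.asarray(negatives, dtype=float)

    # Squared Euclidean distance from the anchor to the positive sample.
    d_pos = np.sum((anchor - positive) ** 2)
    # Squared Euclidean distance from the anchor to each negative sample.
    d_neg = np.sum((negatives - anchor) ** 2, axis=1)
    # Each negative contributes only if it is not at least `margin`
    # farther from the anchor than the positive is.
    return float(np.sum(np.maximum(0.0, margin + d_pos - d_neg)))


if __name__ == "__main__":
    a = [0.0, 0.0]
    p = [0.0, 1.0]
    n = [[3.0, 0.0], [1.0, 0.0]]
    print(multi_triplet_loss(a, p, n))
```

In this form, a negative that already lies well outside the margin contributes nothing, so only the harder negatives drive the descriptor updates.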
