TL;DR
This paper introduces a method that uses two wide-baseline, calibrated RGB cameras to localize container-like objects and estimate their shape and dimensions, and reports outperforming depth-based deep learning approaches across varying degrees of transparency and lighting conditions.
Contribution
The paper presents an approach that combines generative 3D sampling of sparse circumferences, iterative shape fitting, and image re-projection checked against semantic segmentation masks to estimate the shape of transparent objects using only RGB images.
Findings
Outperforms depth-based deep learning methods in localization success
Achieves higher accuracy in dimension estimation across diverse transparency levels
Effective under different backgrounds and illumination conditions
Abstract
The 3D localisation of an object and the estimation of its properties, such as shape and dimensions, are challenging under varying degrees of transparency and lighting conditions. In this paper, we propose a method for jointly localising container-like objects and estimating their dimensions using two wide-baseline, calibrated RGB cameras. Under the assumption of circular symmetry along the vertical axis, we estimate the dimensions of an object with a generative 3D sampling model of sparse circumferences, iterative shape fitting and image re-projection to verify the sampling hypotheses in each camera using semantic segmentation masks. We evaluate the proposed method on a novel dataset of objects with different degrees of transparency and captured under different backgrounds and illumination conditions. Our method, which is based on RGB images only, outperforms in terms of localisation…
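The sampling-and-verification idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`sample_circle`, `project`, `inside_mask`, `fit_radius`), the initial radius, the shrink step, and the use of plain 3x4 projection matrices are all assumptions made for illustration. The sketch samples a sparse horizontal circumference at a given height, re-projects it into each calibrated camera, and shrinks the radius until every projected point falls inside the segmentation mask of every view.

```python
import numpy as np

def sample_circle(center_xy, z, radius, n=16):
    """Sample n 3D points on a horizontal circle (hypothetical helper)."""
    theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    x = center_xy[0] + radius * np.cos(theta)
    y = center_xy[1] + radius * np.sin(theta)
    return np.stack([x, y, np.full(n, float(z))], axis=1)

def project(P, pts):
    """Project Nx3 world points to Nx2 pixels with a 3x4 camera matrix P."""
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    uvw = homog @ P.T
    return uvw[:, :2] / uvw[:, 2:3]

def inside_mask(mask, uv):
    """True if every projected point lands on a foreground mask pixel."""
    cols = np.round(uv[:, 0]).astype(int)
    rows = np.round(uv[:, 1]).astype(int)
    h, w = mask.shape
    in_bounds = (rows >= 0) & (rows < h) & (cols >= 0) & (cols < w)
    if not in_bounds.all():
        return False
    return bool(mask[rows, cols].all())

def fit_radius(center_xy, z, cameras, masks, r_init=0.10, step=0.002, n=16):
    """Shrink the radius until the circle re-projects inside all masks.

    r_init and step are in metres and are assumed values, not from the paper.
    Returns 0.0 if no tested radius is consistent with every view.
    """
    for r in np.arange(r_init, 0.0, -step):
        pts = sample_circle(center_xy, z, r, n)
        if all(inside_mask(m, project(P, pts)) for P, m in zip(cameras, masks)):
            return float(r)
    return 0.0
```

Repeating `fit_radius` at several heights along the vertical axis would give a stack of circumferences that approximates a circularly symmetric shape; how the heights and the object centre are chosen is left open here because the truncated abstract does not specify it.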
