Multi-View Object Pose Refinement With Differentiable Renderer
Ivan Shugurov, Ivan Pavlov, Sergey Zakharov, Slobodan Ilic

TL;DR
This paper presents a multi-view 6 DoF object pose refinement method using a differentiable renderer, improving accuracy with synthetic training data and demonstrating robustness with few frames and noisy calibrations.
Contribution
The paper introduces a multi-view 6 DoF object pose refinement approach that combines a differentiable renderer with an ICP-like loss. The full detection and refinement pipeline is trained solely on synthetic data and can also be used to auto-label real data.
Findings
Reports excellent performance relative to state-of-the-art methods on the LineMOD, Occlusion, Homebrewed and YCB-V datasets.
Requires only a few frames for effective refinement.
Robust to camera noise and close viewpoints.
Abstract
This paper introduces a novel multi-view 6 DoF object pose refinement approach focusing on improving methods trained on synthetic data. It is based on the DPOD detector, which produces dense 2D-3D correspondences between the model vertices and the image pixels in each frame. We have opted for the use of multiple frames with known relative camera transformations, as it allows introduction of geometrical constraints via an interpretable ICP-like loss function. The loss function is implemented with a differentiable renderer and is optimized iteratively. We also demonstrate that a full detection and refinement pipeline, which is trained solely on synthetic data, can be used for auto-labeling real data. We perform quantitative evaluation on LineMOD, Occlusion, Homebrewed and YCB-V datasets and report excellent performance in comparison to the state-of-the-art methods trained on the synthetic…
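The multi-view constraint described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the differentiable renderer and the DPOD correspondences are replaced by 3D target points supplied directly for each view, and all names (`make_pose`, `multi_view_loss`, `relative_cams`, and so on) are hypothetical. It only shows how a single object pose, expressed in a reference camera, is carried into every other view through the known relative camera transformations and scored with an ICP-like point distance.

```python
import numpy as np


def make_pose(rotation, translation):
    """Build a 4x4 rigid transform from a 3x3 rotation and a 3-vector."""
    pose = np.eye(4)
    pose[:3, :3] = rotation
    pose[:3, 3] = translation
    return pose


def rot_z(angle):
    """Rotation about the z axis by `angle` radians."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def transform_points(pose, points):
    """Apply a 4x4 rigid transform to an (N, 3) array of points."""
    return points @ pose[:3, :3].T + pose[:3, 3]


def multi_view_loss(object_pose_ref, relative_cams, model_points, observed_per_view):
    """ICP-like loss: mean point distance, averaged over all views.

    object_pose_ref   : object pose in the reference camera (4x4)
    relative_cams     : known transforms from the reference camera to each view
    observed_per_view : per-view (N, 3) target points in that view's camera frame
                        (an assumption standing in for rendered correspondences)
    """
    total = 0.0
    for cam_from_ref, observed in zip(relative_cams, observed_per_view):
        # One shared pose is propagated to every view via the calibration.
        pose_in_view = cam_from_ref @ object_pose_ref
        predicted = transform_points(pose_in_view, model_points)
        total += np.mean(np.linalg.norm(predicted - observed, axis=1))
    return total / len(relative_cams)


# Toy setup: random model points, a ground-truth pose, and three camera views.
rng = np.random.default_rng(0)
model_points = rng.uniform(-0.1, 0.1, size=(50, 3))
true_rotation = rot_z(0.3)
true_translation = np.array([0.0, 0.0, 1.0])
true_pose = make_pose(true_rotation, true_translation)

relative_cams = [
    np.eye(4),                                        # reference view
    make_pose(rot_z(0.2), np.array([0.1, 0.0, 0.0])),
    make_pose(rot_z(-0.4), np.array([0.0, 0.2, 0.0])),
]
observed_per_view = [
    transform_points(cam @ true_pose, model_points) for cam in relative_cams
]

# A pose hypothesis that is off by 5 cm along x in the reference frame.
perturbed_pose = make_pose(true_rotation, true_translation + np.array([0.05, 0.0, 0.0]))

loss_at_truth = multi_view_loss(true_pose, relative_cams, model_points, observed_per_view)
loss_perturbed = multi_view_loss(perturbed_pose, relative_cams, model_points, observed_per_view)
print(loss_at_truth, loss_perturbed)
```

In this toy setting the loss vanishes at the ground-truth pose and grows with the pose error in every view at once, which is the property an iterative optimizer would exploit. In the paper the loss is evaluated through a differentiable renderer so that gradients reach the pose parameters; that part is deliberately left out here.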
