Deep Multicameral Decoding for Localizing Unoccluded Object Instances from a Single RGB Image
Matthieu Grard (imagine), Emmanuel Dellandréa (imagine), Liming Chen (imagine)

TL;DR
This paper introduces a novel deep multicameral decoding architecture for localizing unoccluded objects in dense, homogeneous layouts from a single RGB image, along with a synthetic dataset to improve occlusion-aware segmentation.
Contribution
It proposes a multicameral encoder-decoder design with subtask-specific units and introduces Mikado, a synthetic dataset for dense object layouts, enhancing occlusion-aware segmentation performance.
Findings
Multicameral, subtask-specific units outperform traditional design patterns.
Mikado dataset enables transfer learning to real images.
The proposed method improves localization of unoccluded instances in densely occluded scenes.
Abstract
Occlusion-aware instance-sensitive segmentation is a complex task generally split into region-based segmentations, by approximating instances as their bounding box. We address the showcase scenario of dense homogeneous layouts in which this approximation does not hold. In this scenario, outlining unoccluded instances by decoding a deep encoder becomes difficult, due to the translation invariance of convolutional layers and the lack of complexity in the decoder. We therefore propose a multicameral design composed of subtask-specific lightweight decoder and encoder-decoder units, coupled in cascade to encourage subtask-specific feature reuse and enforce a learning path within the decoding process. Furthermore, the state-of-the-art datasets for occlusion-aware instance segmentation contain real images with few instances and occlusions mostly due to objects occluding the background, unlike…
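The cascade coupling described in the abstract, where each subtask-specific unit reuses the features of the units before it, can be illustrated with a minimal NumPy sketch. This is a toy under stated assumptions, not the authors' implementation: pointwise (1x1) channel mixing stands in for real decoder layers, there is no upsampling or training, and the names `Unit`, `build_cascade`, `cascade_decode`, and the three subtask labels are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def conv1x1(x, w):
    """Pointwise convolution: x is (C_in, H, W), w is (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", w, x)


def relu(x):
    return np.maximum(x, 0.0)


class Unit:
    """Toy subtask unit: one hidden pointwise layer, then a 1-channel prediction."""

    def __init__(self, in_channels, hidden=8):
        self.w1 = rng.standard_normal((hidden, in_channels)) * 0.1
        self.w2 = rng.standard_normal((1, hidden)) * 0.1

    def __call__(self, x):
        feats = relu(conv1x1(x, self.w1))
        return feats, conv1x1(feats, self.w2)


def build_cascade(enc_channels, names, hidden=8):
    """Create one unit per subtask; each later unit accepts more input channels."""
    units, in_ch = [], enc_channels
    for name in names:
        units.append((name, Unit(in_ch, hidden)))
        in_ch += hidden  # the next unit also receives this unit's features
    return units


def cascade_decode(encoded, units):
    """Run units in order; each sees encoder features plus all earlier unit features."""
    inputs, outputs = encoded, {}
    for name, unit in units:
        feats, pred = unit(inputs)
        outputs[name] = pred
        inputs = np.concatenate([inputs, feats], axis=0)  # feature reuse
    return outputs


# Hypothetical subtask order, chosen only for illustration.
names = ["boundaries", "occlusions", "unoccluded"]
encoded = rng.standard_normal((16, 4, 4))  # stand-in for encoder output
outputs = cascade_decode(encoded, build_cascade(16, names))
```

Because the units are chained, the order of `names` fixes the order in which subtasks are decoded, which is one simple way to read the "learning path" mentioned in the abstract.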
