TL;DR
This paper introduces a depth-adaptive neural network with a novel convolution layer that dynamically adjusts receptive fields based on depth information, improving semantic segmentation accuracy across varying object distances.
Contribution
The paper proposes the depth-adaptive multiscale (DaM) convolution layer, enabling neural networks to adapt receptive fields at each neuron using depth data, enhancing segmentation performance.
Findings
Outperforms state-of-the-art methods on RGB-D datasets
Effective in hand-object interaction segmentation
No additional layers or pre/post-processing needed
Abstract
In this work, we present a depth-adaptive deep neural network that uses a depth map for semantic segmentation. Typical deep neural networks receive inputs at predetermined locations regardless of the distance from the camera. This fixed receptive field makes it difficult for a network to generalize the features of objects at various distances. Specifically, the predetermined receptive fields are too small for objects at a short distance and too large for objects at a long distance. To overcome this challenge, we develop a neural network that adapts the receptive field not only for each layer but also for each neuron at each spatial location. To adjust the receptive field, we propose the depth-adaptive multiscale (DaM) convolution layer, which consists of the adaptive perception neuron and the in-layer multiscale neuron. The adaptive perception neuron is to adjust the receptive field at each spatial location using…
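The idea of adapting the receptive field per spatial location can be illustrated with a minimal NumPy sketch. This is not the authors' DaM layer. It assumes a single-channel input, a 3x3 kernel, and a simple inverse-depth rule in which nearer pixels sample their neighbors farther apart. The function names, `ref_depth`, the dilation bounds, and the nearest-pixel sampling are all hypothetical choices made for illustration.

```python
import numpy as np


def depth_to_dilation(depth, ref_depth=2.0, min_dil=1, max_dil=4):
    """Map depth to an integer per-pixel dilation (assumed rule: ref_depth / depth).

    Nearer pixels (small depth) get a larger dilation, so their 3x3 taps
    cover a wider area of the image.
    """
    dil = np.rint(ref_depth / np.maximum(depth, 1e-6))
    return np.clip(dil, min_dil, max_dil).astype(int)


def depth_adaptive_conv3x3(image, depth, kernel, ref_depth=2.0, min_dil=1, max_dil=4):
    """Apply a 3x3 kernel whose tap spacing varies per pixel with depth.

    image, depth: arrays of shape (H, W); kernel: array of shape (3, 3).
    Taps that fall outside the image are clamped to the nearest border pixel.
    """
    h, w = image.shape
    dil = depth_to_dilation(depth, ref_depth, min_dil, max_dil)
    ys, xs = np.mgrid[0:h, 0:w]
    out = np.zeros((h, w), dtype=float)
    for ky in (-1, 0, 1):
        for kx in (-1, 0, 1):
            # Each output pixel reads its neighbor at an offset scaled by
            # that pixel's own dilation.
            sy = np.clip(ys + ky * dil, 0, h - 1)
            sx = np.clip(xs + kx * dil, 0, w - 1)
            out += kernel[ky + 1, kx + 1] * image[sy, sx]
    return out
```

With a uniform depth map this reduces to an ordinary dilated 3x3 filter with border clamping. With a real depth map, each pixel effectively chooses its own dilation, which is the per-location adaptation the abstract describes in general terms.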
Taxonomy
Methods: Convolution
