Densely connected multidilated convolutional networks for dense prediction tasks
Naoya Takahashi, Yuki Mitsufuji

TL;DR
This paper introduces D3Net, a novel CNN architecture that densely models multiresolution features using multidilated convolutions, improving dense prediction tasks like semantic segmentation and audio separation.
Contribution
The paper proposes D3Net, which combines multidilated convolutions with DenseNet to model multiple resolutions simultaneously without aliasing, enhancing dense prediction performance.
Findings
D3Net outperforms state-of-the-art methods on Cityscapes semantic segmentation.
D3Net achieves superior results on MUSDB18 audio source separation.
The architecture effectively models local and global patterns in high-resolution tasks.
Abstract
Tasks that involve high-resolution dense prediction require a modeling of both local and global patterns in a large input field. Although the local and global structures often depend on each other and their simultaneous modeling is important, many convolutional neural network (CNN)-based approaches interchange representations in different resolutions only a few times. In this paper, we claim the importance of a dense simultaneous modeling of multiresolution representation and propose a novel CNN architecture called densely connected multidilated DenseNet (D3Net). D3Net involves a novel multidilated convolution that has different dilation factors in a single layer to model different resolutions simultaneously. By combining the multidilated convolution with the DenseNet architecture, D3Net incorporates multiresolution learning with an exponentially growing receptive field in almost all…
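The multidilated convolution described in the abstract can be illustrated with a minimal 1D sketch in plain Python. It is not the authors' implementation. It assumes that each group of input channels is assigned its own dilation factor and that the per-group results are summed into one output. The doubling schedule `[1, 2, 4]`, the all-ones 3-tap kernels, and the helper names `dilated_conv1d` and `multidilated_conv1d` are hypothetical choices made only for this illustration.

```python
def dilated_conv1d(x, kernel, dilation):
    """Zero-padded, same-length 1D dilated convolution (cross-correlation form)."""
    center = len(kernel) // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t + (j - center) * dilation  # taps are spaced `dilation` apart
            if 0 <= idx < len(x):
                acc += w * x[idx]
        out.append(acc)
    return out


def multidilated_conv1d(channel_groups, kernels, dilations):
    """One layer in which each input channel group uses its own dilation factor.

    Assumption for this sketch: the per-group outputs are simply summed.
    """
    assert len(channel_groups) == len(kernels) == len(dilations)
    out = [0.0] * len(channel_groups[0])
    for x, kernel, d in zip(channel_groups, kernels, dilations):
        y = dilated_conv1d(x, kernel, d)
        out = [a + b for a, b in zip(out, y)]
    return out


# Three hypothetical input channels, each holding a unit impulse at position 8.
length = 17
impulse = [1.0 if t == 8 else 0.0 for t in range(length)]
groups = [impulse, impulse, impulse]
kernels = [[1.0, 1.0, 1.0]] * 3   # illustrative weights, not learned
dilations = [1, 2, 4]             # assumed doubling schedule

y = multidilated_conv1d(groups, kernels, dilations)
touched = [t for t, v in enumerate(y) if v != 0.0]
print(touched)
```

In this toy setup a single layer responds at offsets of 1, 2, and 4 samples around the impulse, which is the sense in which one layer can cover several resolutions at once. A real network would use learned 2D kernels and many channels per group.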
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Speech Recognition and Synthesis
Methods: Batch Normalization · Concatenated Skip Connection · Dense Block · Max Pooling · Dense Connections · Kaiming Initialization · Dropout · Softmax · Average Pooling
