Densely connected multidilated convolutional networks for dense prediction tasks

Naoya Takahashi; Yuki Mitsufuji

arXiv:2011.11844·cs.CV·June 10, 2021·6 cites

TL;DR

This paper introduces D3Net, a novel CNN architecture that densely models multiresolution features using multidilated convolutions, improving dense prediction tasks like semantic segmentation and audio separation.

Contribution

The paper proposes D3Net, which combines multidilated convolutions with DenseNet to model multiple resolutions simultaneously without aliasing, enhancing dense prediction performance.

Findings

1. D3Net outperforms state-of-the-art methods on Cityscapes semantic segmentation.

2. D3Net achieves superior results on MUSDB18 audio source separation.

3. The architecture effectively models local and global patterns in high-resolution tasks.

Abstract

Tasks that involve high-resolution dense prediction require a modeling of both local and global patterns in a large input field. Although the local and global structures often depend on each other and their simultaneous modeling is important, many convolutional neural network (CNN)-based approaches interchange representations in different resolutions only a few times. In this paper, we claim the importance of a dense simultaneous modeling of multiresolution representation and propose a novel CNN architecture called densely connected multidilated DenseNet (D3Net). D3Net involves a novel multidilated convolution that has different dilation factors in a single layer to model different resolutions simultaneously. By combining the multidilated convolution with the DenseNet architecture, D3Net incorporates multiresolution learning with an exponentially growing receptive field in almost all…
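
To make the idea of a multidilated convolution concrete, here is a minimal 1D sketch in plain Python. It is not the authors' implementation: the helper names (`dilated_conv1d`, `multidilated_conv1d`), the zero padding, and the choice of dilations 1, 2, 4 for successive input groups are assumptions made for illustration. The only property taken from the abstract is that a single layer applies different dilation factors to different parts of its input.

```python
def dilated_conv1d(x, kernel, dilation):
    """Zero-padded, same-length 1D cross-correlation with a given dilation."""
    n = len(x)
    center = len(kernel) // 2
    out = []
    for t in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t + (j - center) * dilation  # taps are spaced `dilation` apart
            if 0 <= idx < n:                   # out-of-range taps read zero padding
                acc += w * x[idx]
        out.append(acc)
    return out


def multidilated_conv1d(channel_groups, kernels, dilations):
    """One layer, several dilations: each input group gets its own dilation
    factor, and the per-group results are summed into one output channel."""
    out = [0.0] * len(channel_groups[0])
    for x, kernel, d in zip(channel_groups, kernels, dilations):
        y = dilated_conv1d(x, kernel, d)
        out = [a + b for a, b in zip(out, y)]
    return out


# Hypothetical setup: three input groups, all carrying the same unit impulse,
# convolved with all-ones kernels at dilations 1, 2 and 4 (assumed schedule).
impulse = [0.0] * 9
impulse[4] = 1.0
groups = [impulse, impulse, impulse]
kernels = [[1.0, 1.0, 1.0]] * 3
dilations = [2 ** i for i in range(3)]

response = multidilated_conv1d(groups, kernels, dilations)
print(response)
```

With these assumed settings the impulse response is nonzero from index 0 to index 8, so three taps per group already cover a 9-sample span in one layer, which is the kind of fast receptive-field growth the abstract refers to.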

Code & Models

Repositories

sony/ai-research-code
PyTorch · Official

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Speech Recognition and Synthesis

Methods: Batch Normalization · Concatenated Skip Connection · Dense Block · Max Pooling · Dense Connections · Kaiming Initialization · Dropout · Softmax · Average Pooling