Dual-stream Network for Visual Recognition
Mingyuan Mao, Renrui Zhang, Honghui Zheng, Peng Gao, Teli Ma, Yan Peng, Errui Ding, Baochang Zhang, Shumin Han

TL;DR
This paper introduces a Dual-stream Network (DS-Net) that combines local and global features for image classification, object detection, and segmentation, and reports gains over baselines such as DeiT-Small and ResNet-50.
Contribution
The paper proposes a novel dual-stream architecture with intra-scale propagation and inter-scale alignment modules to enhance local-global feature integration in vision tasks.
Findings
DS-Net outperforms DeiT-Small by 2.4% top-1 accuracy on ImageNet-1k.
DS-Net-Small surpasses ResNet-50 by 6.4% mAP on MSCOCO 2017.
The authors report state-of-the-art results on image classification, object detection, and segmentation benchmarks.
Abstract
Transformers with remarkable global representation capacities achieve competitive results for visual tasks, but fail to consider high-level local pattern information in input images. In this paper, we present a generic Dual-stream Network (DS-Net) to fully explore the representation capacity of local and global pattern features for image classification. Our DS-Net can simultaneously calculate fine-grained and integrated features and efficiently fuse them. Specifically, we propose an Intra-scale Propagation module to process two different resolutions in each block and an Inter-Scale Alignment module to perform information interaction across features at dual scales. Besides, we also design a Dual-stream FPN (DS-FPN) to further enhance contextual information for downstream dense predictions. Without bells and whistles, the proposed DS-Net outperforms DeiT-Small by 2.4% in terms of top-1…
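To make the two modules named in the abstract more concrete, here is a minimal NumPy sketch of one dual-stream block. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how each stream is processed, so the sketch assumes a 3x3 mean filter as a stand-in for a local convolution, single-head self-attention with identity projections for the global stream, cross-attention in both directions for the alignment step, and nearest-neighbor upsampling plus channel concatenation for the final fusion. All function names (`dual_stream_block`, `local_filter`, and so on) are hypothetical.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def avg_pool2(x):
    """2x2 average pooling on a (C, H, W) array; H and W must be even."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def local_filter(x):
    """Assumed stand-in for a depthwise 3x3 convolution: per-channel 3x3 mean."""
    c, h, w = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)), mode="edge")
    out = np.zeros_like(x)
    for dy in range(3):
        for dx in range(3):
            out += padded[:, dy:dy + h, dx:dx + w]
    return out / 9.0

def attention(queries, keys_values):
    """Single-head dot-product attention with identity projections on (N, C) tokens."""
    scale = np.sqrt(queries.shape[1])
    weights = softmax(queries @ keys_values.T / scale, axis=-1)
    return weights @ keys_values

def to_tokens(x):
    # (C, H, W) feature map -> (H*W, C) token matrix.
    c, h, w = x.shape
    return x.reshape(c, h * w).T

def to_map(tokens, h, w):
    # (H*W, C) token matrix -> (C, H, W) feature map.
    return tokens.T.reshape(-1, h, w)

def dual_stream_block(x):
    """One illustrative block; input and output are both (C, H, W) with C, H, W even."""
    c, h, w = x.shape
    half = c // 2
    local = x[:half]                   # high-resolution stream, (C/2, H, W)
    global_ = avg_pool2(x[half:])      # low-resolution stream, (C/2, H/2, W/2)

    # Intra-scale step: each stream is processed at its own resolution.
    local = local + local_filter(local)
    g_tok = to_tokens(global_)
    g_tok = g_tok + attention(g_tok, g_tok)

    # Inter-scale step (assumed form): cross-attention in both directions.
    l_tok = to_tokens(local)
    l_aligned = l_tok + attention(l_tok, g_tok)
    g_aligned = g_tok + attention(g_tok, l_tok)

    # Fusion (assumed form): upsample the global stream and concatenate channels.
    local = to_map(l_aligned, h, w)
    global_ = to_map(g_aligned, h // 2, w // 2)
    global_up = global_.repeat(2, axis=1).repeat(2, axis=2)
    return np.concatenate([local, global_up], axis=0)
```

Calling `dual_stream_block` on an array of shape `(8, 8, 8)` returns an array of the same shape, so blocks of this form can be stacked. A real model would add learned projections, normalization, and a channel-mixing layer such as a 1x1 convolution after the concatenation.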
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
Methods: 1x1 Convolution · Convolution
