Wise-SrNet: A Novel Architecture for Enhancing Image Classification by Learning Spatial Resolution of Feature Maps

Mohammad Rahimzadeh; AmirAli Askari; Soroush Parvin; Elnaz Safi; Mohammad Reza Mohammadi

arXiv:2104.12294 · cs.CV · March 12, 2024 · 6 cites


Open Access 2 Repos

TL;DR

This paper introduces Wise-SrNet, a new architecture that preserves spatial resolution in feature maps to improve image classification accuracy and convergence speed without increasing computational cost.

Contribution

Wise-SrNet replaces the Global Average Pooling layer with a novel, efficient architecture inspired by depthwise convolution, enhancing spatial resolution processing in CNNs.
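To make the idea concrete, here is a minimal NumPy sketch of how a depthwise-style, per-channel weighted sum over the whole feature map generalizes Global Average Pooling. It is not the authors' implementation: the function names, the `(H, W, C)` layout, and the 7x7x8 shape are assumptions chosen for illustration, and the actual Wise-SrNet design may include components not described on this page.

```python
import numpy as np


def global_average_pool(feature_map):
    """Collapse an (H, W, C) feature map to a (C,) vector by averaging."""
    return feature_map.mean(axis=(0, 1))


def depthwise_spatial_pool(feature_map, kernel):
    """Collapse (H, W, C) to (C,) with one weight per position per channel.

    Hypothetical helper: each channel is reduced by its own H x W kernel,
    as in a depthwise convolution whose kernel covers the whole map.
    """
    if feature_map.shape != kernel.shape:
        raise ValueError("kernel must have the same (H, W, C) shape as the feature map")
    return np.einsum("hwc,hwc->c", feature_map, kernel)


# Illustrative shapes only (assumed, not taken from the paper).
rng = np.random.default_rng(0)
fmap = rng.standard_normal((7, 7, 8))

# With uniform weights 1/(H*W), the weighted pool reduces to GAP.
uniform_kernel = np.full(fmap.shape, 1.0 / (7 * 7))
gap_vector = global_average_pool(fmap)
pooled_vector = depthwise_spatial_pool(fmap, uniform_kernel)
```

In this sketch GAP is the special case where every spatial position gets the same fixed weight. Making the kernel trainable lets each channel weight positions differently, which is one way a layer could retain spatial information that plain averaging discards.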

Findings

1. Increases Top-1 accuracy by 2% to 8% on various datasets and models.

2. Improves accuracy by 3% to 26% on high-resolution images.

3. Enhances convergence speed and classification performance.

Abstract

One of the main challenges since the advancement of convolutional neural networks has been how to connect the extracted feature map to the final classification layer. VGG models used two sets of fully connected layers for the classification part of their architectures, which significantly increased the number of model weights. ResNet and subsequent deep convolutional models used the Global Average Pooling (GAP) layer to compress the feature map and feed it to the classification layer. Although the GAP layer reduces the computational cost, it also discards the spatial resolution of the feature map, which decreases learning efficiency. In this paper, we aim to tackle this problem by replacing the GAP layer with a new architecture called Wise-SrNet. It is inspired by the idea of depthwise convolution and is designed for processing spatial resolution while not increasing…
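The weight-count contrast the abstract draws between fully connected heads and GAP can be checked with simple arithmetic. The sketch below assumes the commonly cited shapes: a VGG16-style head (a flattened 7x7x512 map followed by two 4096-unit layers and a 1000-class output) and a ResNet-50-style head (a 2048-channel GAP vector followed by a 1000-class output). The helper name and the shapes are illustrative assumptions, not values taken from this page.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


# Assumed VGG16-style classifier: flatten(7*7*512) -> 4096 -> 4096 -> 1000.
vgg_head = (
    dense_params(7 * 7 * 512, 4096)
    + dense_params(4096, 4096)
    + dense_params(4096, 1000)
)

# Assumed ResNet-50-style classifier: GAP over 2048 channels -> 1000.
gap_head = dense_params(2048, 1000)

print(f"fully connected head: {vgg_head:,} parameters")
print(f"GAP head:             {gap_head:,} parameters")
```

Under these assumptions the fully connected head holds roughly 60 times as many parameters as the GAP head, which is the cost that motivated the move to GAP in the first place.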


Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Remote Sensing and LiDAR Applications · Advanced Vision and Imaging

Methods: 1x1 Convolution · Batch Normalization · Kaiming Initialization · Bottleneck Residual Block · Residual Connection · Average Pooling · Dense Connections · Residual Block · Global Average Pooling