Deepening Neural Networks Implicitly and Locally via Recurrent Attention Strategy

Shanshan Zhong; Wushao Wen; Jinghui Qin; Zhongzhan Huang

arXiv:2210.15676·cs.LG·October 31, 2022



Open Access

TL;DR

This paper introduces a Recurrent Attention Strategy (RAS) that implicitly deepens neural networks using lightweight attention modules with local parameter sharing, enhancing performance with minimal additional computational cost.

Contribution

RAS increases the effective depth of a neural network implicitly, so the backbone does not have to be deepened explicitly, which keeps the added computation and parameter count small.

Findings

01. RAS improves neural network performance on three widely used benchmark datasets.

02. RAS performs favorably against existing well-known attention modules while adding few extra parameters.

03. Experiments show these gains come with only a slight increase in parameter size and computation.

Abstract

A growing body of empirical and theoretical evidence shows that deepening neural networks can effectively improve their performance under suitable training settings. However, deepening the backbone of a neural network inevitably and significantly increases computation and parameter size. To mitigate these problems, we propose a simple yet effective Recurrent Attention Strategy (RAS), which implicitly increases the depth of neural networks with lightweight attention modules by local parameter sharing. Extensive experiments on three widely used benchmark datasets demonstrate that RAS improves the performance of neural networks with only a slight increase in parameter size and computation, and that it performs favorably against other well-known attention modules.
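The idea of reusing one lightweight attention module several times, rather than adding several independent ones, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the attention module, so the squeeze-and-excitation-style channel gate, the `reduction` ratio, and all function names below (`make_gate`, `apply_gate`, `recurrent_attention`, `count_params`) are assumptions chosen for illustration only. The sketch uses plain Python lists so that it runs without any deep learning library.

```python
import math
import random


def make_gate(channels, reduction=4, seed=0):
    """Hypothetical lightweight channel gate (SE-style): two small dense layers."""
    rng = random.Random(seed)
    hidden = max(1, channels // reduction)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(channels)] for _ in range(hidden)]
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(channels)]
    return {"w1": w1, "w2": w2}


def count_params(gate):
    """Number of scalar weights stored in the gate."""
    return sum(len(row) for w in gate.values() for row in w)


def apply_gate(gate, features):
    """Rescale each channel by a learned-style weight in (0, 1).

    `features` is a list of channels, each a list of activations.
    """
    # Squeeze: global average per channel.
    squeezed = [sum(ch) / len(ch) for ch in features]
    # Excite: dense -> ReLU -> dense -> sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in gate["w1"]]
    scales = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
              for row in gate["w2"]]
    return [[a * s for a in ch] for ch, s in zip(features, scales)]


def recurrent_attention(gate, features, steps):
    """Apply the SAME gate `steps` times: more depth, no new parameters."""
    out = features
    for _ in range(steps):
        out = apply_gate(gate, out)
    return out


features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
shared = make_gate(channels=4)
out = recurrent_attention(shared, features, steps=3)

shared_params = count_params(shared)        # one gate reused three times
unshared_params = 3 * count_params(shared)  # three independent gates
print(shared_params, unshared_params)
```

In this toy setting the three recurrent applications add three layers of attention to the computation path, while the stored parameter count stays that of a single gate. Three independent gates would store three times as many weights.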


Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Neural Networks and Applications