ViR: the Vision Reservoir

Xian Wei; Bin Wang; Mingsong Chen; Ji Yuan; Hai Lan; Jiehuang Shi; Xuan Tang; Bo Jin; Guozhang Chen; Dongping Yang

arXiv:2112.13545·cs.CV·December 30, 2021

TL;DR

The paper introduces Vision Reservoir (ViR), a novel image classification method that replaces Transformer modules with a reservoir computing approach, reducing complexity and improving performance without pre-training.

Contribution

Proposes ViR, a reservoir computing-based alternative to ViT, addressing high computation and overfitting issues in image classification.

Findings

01. ViR outperforms ViT in accuracy without pre-training.

02. ViR has significantly fewer parameters and lower memory usage.

03. ViR demonstrates superior performance on multiple benchmarks.

Abstract

The past year has witnessed the success of applying the Vision Transformer (ViT) to image classification. However, there is still evidence that ViT often suffers in two respects: i) the high computation and memory burden of applying multiple Transformer layers for pre-training on a large-scale dataset, and ii) over-fitting when training on small datasets from scratch. To address these problems, a novel method, Vision Reservoir computing (ViR), is proposed here for image classification as a parallel to ViT. By splitting each image into a sequence of tokens with fixed length, the ViR constructs a pure reservoir with a nearly fully connected topology to replace the Transformer module in ViT. Two kinds of deep ViR models are subsequently proposed to enhance the network performance. Comparative experiments between the ViR and the ViT are carried…
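
The tokenize-then-reservoir idea in the abstract can be illustrated with a generic echo state network. The sketch below is not the authors' implementation: the leaky tanh update, the uniform weight initialisation, the spectral-radius rescaling, and every name and default value (`image_to_tokens`, `make_reservoir`, `run_reservoir`, `density=0.9`, `leak=0.3`) are assumptions chosen for illustration. It only shows how an image could be cut into fixed-length tokens and fed through a fixed, densely connected recurrent reservoir.

```python
import numpy as np


def image_to_tokens(image, patch):
    """Split an (H, W, C) image into a sequence of flattened patch tokens."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "patch size must divide H and W"
    # Group pixels into (row block, col block, patch row, patch col, channel).
    blocks = image.reshape(h // patch, patch, w // patch, patch, c)
    blocks = blocks.transpose(0, 2, 1, 3, 4)
    # One row per patch, in row-major patch order.
    return blocks.reshape(-1, patch * patch * c)


def make_reservoir(n_in, n_res, spectral_radius=0.9, density=0.9, seed=0):
    """Build fixed random input and recurrent weights (hypothetical defaults)."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
    w = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
    # Keep most connections so the topology is "nearly fully connected".
    w = w * (rng.random((n_res, n_res)) < density)
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    radius = np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w * (spectral_radius / radius)


def run_reservoir(tokens, w_in, w, leak=0.3):
    """Feed tokens one by one through a leaky tanh reservoir; return final state."""
    state = np.zeros(w.shape[0])
    for u in tokens:
        candidate = np.tanh(w_in @ u + w @ state)
        state = (1.0 - leak) * state + leak * candidate
    return state


if __name__ == "__main__":
    image = np.random.default_rng(1).random((32, 32, 3))
    tokens = image_to_tokens(image, patch=8)      # 16 tokens of length 192
    w_in, w = make_reservoir(n_in=tokens.shape[1], n_res=64)
    features = run_reservoir(tokens, w_in, w)
    print(tokens.shape, features.shape)
```

In standard reservoir computing the recurrent weights stay fixed after initialisation and only a readout on the reservoir states is trained, so a classifier would typically be fitted on `features`; how ViR itself trains and stacks its reservoirs is not specified in the text above.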

Taxonomy

Topics: Neural Networks and Reservoir Computing · Advanced Memory and Neural Computing · Ferroelectric and Negative Capacitance Devices

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dropout · Label Smoothing · Byte Pair Encoding · Softmax · Dense Connections · Position-Wise Feed-Forward Layer · Adam