Learning Robust and Lightweight Model through Separable Structured Transformations

Xian Wei; Yanhui Huang; Yangyu Xu; Mingsong Chen; Hai Lan; Yuanxiang Li; Zhongfeng Wang; Xuan Tang

arXiv:2112.13551 · cs.CV · December 30, 2021 · 1 citation

TL;DR

This paper introduces a method to create lightweight, robust deep learning models by applying separable structured transformations to fully-connected layers, significantly reducing parameters while maintaining high accuracy and robustness.

Contribution

The paper proposes a novel separable structural transformation of fully-connected layers combined with sparsity and condition number constraints to enhance model efficiency and robustness.
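To see why constraints placed on the small factors can control the full layer, consider a two-factor sketch. This is an illustration only, not necessarily the paper's exact formulation: the symbols W, A, B and the restriction to square, invertible factors are assumptions introduced here. For a Kronecker (tensor) product, both the number of nonzero entries and the 2-norm condition number factorize:

```latex
W = A \otimes B
\qquad
\operatorname{nnz}(A \otimes B) = \operatorname{nnz}(A)\,\operatorname{nnz}(B)
\qquad
\kappa_2(A \otimes B) = \kappa_2(A)\,\kappa_2(B)
```

Under these assumptions, making A and B sparse makes W sparse, and bounding the condition number of each factor bounds the condition number of W, without ever forming the large matrix.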

Findings

01 Reduced network parameters by up to 90%.

02 Kept the loss in robust accuracy below 1.5%.

03 Achieved high compression rates, e.g., 200 times.

Abstract

With the proliferation of mobile devices and the Internet of Things, deep learning models are increasingly deployed on devices with limited computing resources and memory, where they are also exposed to adversarial noise. Such devices need deep models that are both lightweight and robust. However, current deep learning approaches struggle to learn a model with both properties without degrading one or the other. It is well known that fully-connected layers contribute most of the parameters of convolutional neural networks. We perform a separable structural transformation of the fully-connected layer to reduce its parameters: the large weight matrix of the layer is decoupled into the tensor product of several small separable matrices. Note that data, such as images, no longer need to be flattened before being…
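The decoupling described in the abstract can be sketched in a few lines. The example below is a minimal two-factor illustration, not the authors' implementation: the function names, the shapes (an 8×8 input mapped to a 4×4 output), and the use of exactly two factors are all assumptions made here. It checks that applying two small matrices directly to an unflattened 2-D input gives the same result as multiplying the flattened input by the full Kronecker-product weight matrix, and compares the parameter counts.

```python
import random

def matmul(P, Q):
    """Plain-Python matrix product of two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def transpose(M):
    return [list(col) for col in zip(*M)]

def kron(A, B):
    """Kronecker (tensor) product of A and B as a dense list of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def separable_fc(X, A, B):
    """Structured layer on an unflattened 2-D input: Y = A X B^T."""
    return matmul(matmul(A, X), transpose(B))

def dense_fc(X, W):
    """Reference layer: flatten X row by row, then multiply by the full matrix W."""
    x = [v for row in X for v in row]
    return [sum(w * v for w, v in zip(w_row, x)) for w_row in W]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
n1, n2 = 8, 8  # input size (hypothetical)
m1, m2 = 4, 4  # output size (hypothetical)
X = rand_matrix(n1, n2, rng)
A = rand_matrix(m1, n1, rng)  # small factor acting on rows
B = rand_matrix(m2, n2, rng)  # small factor acting on columns

Y = separable_fc(X, A, B)          # no flattening, no large matrix
y_dense = dense_fc(X, kron(A, B))  # same map through the full weight matrix
y_flat = [v for row in Y for v in row]
max_diff = max(abs(u - v) for u, v in zip(y_flat, y_dense))

dense_params = (m1 * m2) * (n1 * n2)  # entries of the full weight matrix
separable_params = m1 * n1 + m2 * n2  # entries of the two small factors
print(max_diff < 1e-9, dense_params, separable_params)
```

With these toy shapes the full matrix has 1024 entries while the two factors together have 64. The saving grows quickly with layer size, which is consistent in spirit with the compression rates reported above, although the actual figures depend on the factorization used in the paper.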

Taxonomy

Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · Anomaly Detection Techniques and Applications

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Label Smoothing · Absolute Position Encodings · Residual Connection · Dropout · Softmax · Position-Wise Feed-Forward Layer · Adam