Learning Robust and Lightweight Model through Separable Structured Transformations
Xian Wei, Yanhui Huang, Yangyu Xu, Mingsong Chen, Hai Lan, Yuanxiang Li, Zhongfeng Wang, Xuan Tang

TL;DR
This paper introduces a method to create lightweight, robust deep learning models by applying separable structured transformations to fully-connected layers, significantly reducing parameters while maintaining high accuracy and robustness.
Contribution
The paper proposes a novel separable structural transformation of fully-connected layers combined with sparsity and condition number constraints to enhance model efficiency and robustness.
Findings
Reduced network parameters by up to 90%.
Kept the loss in robust accuracy below 1.5%.
Achieved high compression rates, e.g., 200 times.
Abstract
With the proliferation of mobile devices and the Internet of Things, deep learning models are increasingly deployed on devices with limited computing resources and memory, and are exposed to the threat of adversarial noise. Such devices need deep models that are both lightweight and robust. However, current deep learning solutions struggle to learn a model that possesses both properties without degrading one or the other. As is well known, the fully-connected layers contribute most of the parameters of convolutional neural networks. We perform a separable structural transformation of the fully-connected layer to reduce the parameters, where the large-scale weight matrix of the fully-connected layer is decoupled into the tensor product of several separable small-sized matrices. Note that data, such as images, no longer need to be flattened before being…
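The decoupling described in the abstract can be illustrated with a minimal sketch. It assumes the simplest case of two factors and uses the standard Kronecker-product identity: multiplying a row-major flattened input X by W = A ⊗ B gives the same result as computing A X Bᵀ on the unflattened matrix. The matrices `A`, `B`, `X`, their sizes, and the helper functions are hypothetical choices for illustration; the paper's actual factorization, number of factors, and training constraints may differ.

```python
# Illustrative sketch only: sizes, values, and helper names are assumptions,
# not taken from the paper. Uses only the Python standard library.

def matmul(A, B):
    """Matrix product of two lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(M):
    """Transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*M)]

def kron(A, B):
    """Kronecker (tensor) product of A (p1 x m1) and B (p2 x m2)."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def flatten(M):
    """Row-major flattening of a matrix into a vector."""
    return [x for row in M for x in row]

# A small 3 x 2 input kept in matrix form (standing in for an image).
X = [[1, 2], [3, 4], [5, 6]]
A = [[1, 0, 2], [0, 1, -1]]    # 2 x 3 factor, acts on the row dimension of X
B = [[2, 1], [1, 3], [0, 1]]   # 3 x 2 factor, acts on the column dimension of X

# Dense path: flatten X and multiply by the full 6 x 6 weight matrix W = A (x) B.
W = kron(A, B)
x_flat = flatten(X)
dense_out = [sum(w * x for w, x in zip(row, x_flat)) for row in W]

# Separable path: no flattening of the input, just two small products A X B^T.
separable_out = flatten(matmul(matmul(A, X), transpose(B)))

# Parameter counts for the two paths.
dense_params = len(W) * len(W[0])                             # 6 * 6 = 36
separable_params = len(A) * len(A[0]) + len(B) * len(B[0])    # 6 + 6 = 12

print(dense_out == separable_out, dense_params, separable_params)
```

In this toy case the separable form stores 12 weights instead of 36. The gap widens quickly with layer size, since the dense count grows with the product of the factor sizes while the separable count grows with their sum.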
Taxonomy
Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · Anomaly Detection Techniques and Applications
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Label Smoothing · Absolute Position Encodings · Residual Connection · Dropout · Softmax · Position-Wise Feed-Forward Layer · Adam
