TAOTF: A Two-stage Approximately Orthogonal Training Framework in Deep Neural Networks

Taoyong Cui; Jianze Li; Yuhan Dong; Li Liu

arXiv:2211.13902 · cs.CV · December 13, 2022

TL;DR

TAOTF is a two-stage training framework for deep neural networks that balances orthogonality and task performance, improving robustness to noisy data in image classification tasks.

Contribution

The paper introduces a novel two-stage orthogonal training framework with a polar decomposition-based initialization and soft orthogonal constraints, enhancing robustness and performance.
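
To make the polar decomposition idea concrete: for a weight matrix W with thin SVD W = U S V^T, the polar factor Q = U V^T is the nearest matrix with orthonormal columns in the Frobenius norm (when W has at least as many rows as columns). The following is a minimal sketch of that projection only. It assumes that an orthogonal initialization can be obtained by replacing a weight matrix with its polar factor; the function name `polar_orthogonal_factor`, the matrix shape, and the use of NumPy are illustrative choices, not taken from the paper, and the paper's PDOI algorithm may differ in its details.

```python
import numpy as np


def polar_orthogonal_factor(w):
    """Return the (semi-)orthogonal polar factor of a 2-D weight matrix.

    Hypothetical helper for illustration; not the paper's PDOI algorithm.
    With the thin SVD w = U S V^T, the polar factor is Q = U V^T.
    """
    u, _, vt = np.linalg.svd(w, full_matrices=False)
    return u @ vt


# Illustrative tall weight matrix (6 x 4); the shape is an assumption.
rng = np.random.default_rng(0)
w = rng.standard_normal((6, 4))

q = polar_orthogonal_factor(w)

# Q has orthonormal columns, so Q^T Q should equal the identity.
orth_error = np.linalg.norm(q.T @ q - np.eye(w.shape[1]))
```

For a wide matrix (more columns than rows) the same call returns a factor with orthonormal rows instead, so the check would use `q @ q.T`.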

Findings

01. Achieves superior accuracy on natural and medical image datasets.

02. Provides stable training with improved robustness to noisy data.

03. Outperforms existing orthogonal constraint methods.

Abstract

Orthogonality constraints, both hard and soft, have been used to normalize the weight matrices of Deep Neural Network (DNN) models, especially Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs), to reduce parameter redundancy and improve training stability. However, the robustness of these constrained models to noisy data is not always satisfactory. In this work, we propose a novel two-stage approximately orthogonal training framework (TAOTF) that seeks a trade-off between the orthogonal solution space and the main task solution space to address this problem in noisy data scenarios. In the first stage, we propose a novel algorithm called polar decomposition-based orthogonal initialization (PDOI) to find a good initialization for the orthogonal optimization. In the second stage, unlike other existing methods, we apply soft orthogonal constraints…
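
The abstract is cut off before the second stage is described, so the paper's exact regularizer is not shown here. As a hedged illustration of what a soft orthogonal constraint commonly looks like, the sketch below uses the widely used penalty strength * ||W^T W - I||_F^2, which is added to the task loss instead of forcing W to be exactly orthogonal. The function names, the `strength` parameter, and its default value are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def soft_orthogonality_penalty(w, strength=1e-2):
    """Penalty strength * ||W^T W - I||_F^2 (illustrative, not the paper's).

    The value is zero exactly when the columns of w are orthonormal.
    """
    residual = w.T @ w - np.eye(w.shape[1])
    return strength * np.sum(residual ** 2)


def soft_orthogonality_gradient(w, strength=1e-2):
    """Gradient of the penalty above with respect to w.

    Since the residual R = W^T W - I is symmetric, d/dW ||R||_F^2 = 4 W R.
    """
    residual = w.T @ w - np.eye(w.shape[1])
    return 4.0 * strength * (w @ residual)


# One plain gradient step on the penalty alone (no task loss), to show that
# the penalty pulls a random matrix toward orthonormal columns.
rng = np.random.default_rng(0)
w0 = 0.5 * rng.standard_normal((5, 3))
w1 = w0 - 0.05 * soft_orthogonality_gradient(w0, strength=1.0)

before = soft_orthogonality_penalty(w0, strength=1.0)
after = soft_orthogonality_penalty(w1, strength=1.0)
```

In a real training loop the penalty would be summed over the constrained layers and added to the main task loss, with `strength` controlling the trade-off between orthogonality and task performance.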

Taxonomy

Topics: Advanced Neural Network Applications · Medical Image Segmentation Techniques · Industrial Vision Systems and Defect Detection

Methods: Multi-Head Attention · Attention Is All You Need · Adam · Softmax · Dropout · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Label Smoothing · Absolute Position Encodings · Linear Layer