Co-training $2^L$ Submodels for Visual Recognition

Hugo Touvron; Matthieu Cord; Maxime Oquab; Piotr Bojanowski; Jakob Verbeek; Hervé Jégou

arXiv:2212.04884 · cs.CV · December 12, 2022

TL;DR

This paper introduces submodel co-training (cosub), a regularization technique in which two stochastic-depth submodels of the same network, sharing one set of weights, teach each other during training. It improves accuracy on recognition tasks across several architectures without a pre-trained external model.

Contribution

The paper proposes cosub, a co-training method that, for each sample, instantiates two stochastic-depth submodels of a single network and uses each as a soft teacher for the other, on top of the regular loss from the one-hot label.

Findings

01 Improves accuracy on image classification and semantic segmentation.

02 Compatible with multiple architectures, including RegNet, ViT, PiT, XCiT, Swin and ConvNext.

03 Raises ViT performance to 87.4% top-1 accuracy on ImageNet.

Abstract

We introduce submodel co-training, a regularization method related to co-training, self-distillation and stochastic depth. Given a neural network to be trained, for each sample we implicitly instantiate two altered networks, "submodels", with stochastic depth: we activate only a subset of the layers. Each network serves as a soft teacher to the other, by providing a loss that complements the regular loss provided by the one-hot label. Our approach, dubbed cosub, uses a single set of weights, and does not involve a pre-trained external model or temporal averaging. Experimentally, we show that submodel co-training is effective to train backbones for recognition tasks such as image classification and semantic segmentation. Our approach is compatible with multiple architectures, including RegNet, ViT, PiT, XCiT, Swin and ConvNext. Our training strategy improves their results in…
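The mechanism described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the toy residual network, the names `sample_layer_mask`, `forward` and `cosub_loss`, the keep probability, the softmax cross-entropy, and the weighting `lam` between the label loss and the co-training loss are all assumptions made for illustration. Each layer is either kept or skipped, so a network with L layers has 2^L possible submodels, and two of them are sampled for the same input.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, probs):
    """H(target, probs) = -sum_k target_k * log(probs_k)."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


def sample_layer_mask(num_layers, keep_prob, rng):
    """One keep/drop decision per layer: 2**num_layers possible submodels."""
    return [rng.random() < keep_prob for _ in range(num_layers)]


def forward(x, layer_weights, head, mask):
    """Toy residual stack (hypothetical); a dropped layer is the identity."""
    h = list(x)
    for w, keep in zip(layer_weights, mask):
        if keep:
            update = [math.tanh(sum(wij * hj for wij, hj in zip(row, h)))
                      for row in w]
            h = [hi + ui for hi, ui in zip(h, update)]
    # Linear classification head producing one logit per class.
    return [sum(wkj * hj for wkj, hj in zip(row, h)) for row in head]


def cosub_loss(logits_a, logits_b, one_hot, lam=0.5):
    """Label loss plus a symmetric soft-teacher loss (weighting is assumed)."""
    p_a, p_b = softmax(logits_a), softmax(logits_b)
    label_loss = 0.5 * (cross_entropy(one_hot, p_a) + cross_entropy(one_hot, p_b))
    # Assumption: in an autograd framework the teacher distribution would be
    # treated as a constant (no gradient flows through the teacher side).
    cotrain_loss = 0.5 * (cross_entropy(p_b, p_a) + cross_entropy(p_a, p_b))
    return (1.0 - lam) * label_loss + lam * cotrain_loss


# Both submodels share the same weights; only the layer masks differ.
rng = random.Random(0)
dim, num_classes, num_layers = 4, 3, 6
layer_weights = [[[rng.uniform(-0.5, 0.5) for _ in range(dim)]
                  for _ in range(dim)] for _ in range(num_layers)]
head = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(num_classes)]
x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
one_hot = [0.0, 1.0, 0.0]

mask_a = sample_layer_mask(num_layers, 0.8, rng)
mask_b = sample_layer_mask(num_layers, 0.8, rng)
loss = cosub_loss(forward(x, layer_weights, head, mask_a),
                  forward(x, layer_weights, head, mask_b),
                  one_hot)
print("masks:", mask_a, mask_b)
print("cosub loss:", loss)
```

In this sketch, setting `lam=0.0` recovers plain supervised training on two stochastic-depth passes, while larger values put more weight on agreement between the two submodels. The training recipe actually used in the paper may differ in the loss form and weighting.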

Code & Models

Repositories

facebookresearch/deit (PyTorch)

Taxonomy

Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning