RRR-Net: Reusing, Reducing, and Recycling a Deep Backbone Network

Haozhe Sun (LISN, TAU, Inria), Isabelle Guyon (LISN, TAU, Inria), Felix Mohr, Hedi Tabia (IBISC)

arXiv:2310.01157 · cs.LG · October 3, 2023

TL;DR

This paper proposes techniques to reuse and compress large pre-trained backbone networks like ResNet152, creating smaller, faster models that maintain or improve performance across diverse image classification tasks.

Contribution

It introduces methods to reduce, split, and ensemble pre-trained backbones, achieving significant size and speed improvements without sacrificing accuracy.

Findings

01

Reduced ResNet152 from 51 to 5 blocks with minimal performance loss.

02

Split the reduced network into an ensemble of sub-networks while keeping the total number of parameters and FLOPs unchanged.

03

Achieved comparable or better performance with smaller, faster models.

Abstract

It has become mainstream in computer vision and other machine learning domains to reuse backbone networks pre-trained on large datasets as preprocessors. Typically, the last layer is replaced by a shallow learning machine of sorts; the newly-added classification head and (optionally) deeper layers are fine-tuned on a new task. Due to its strong performance and simplicity, a common pre-trained backbone network is ResNet152. However, ResNet152 is relatively large and induces inference latency. In many cases, a compact and efficient backbone with similar performance would be preferable over a larger, slower one. This paper investigates techniques to reuse a pre-trained backbone with the objective of creating a smaller and faster model. Starting from a large ResNet152 backbone pre-trained on ImageNet, we first reduce it from 51 blocks to 5 blocks, reducing its number of parameters and FLOPs…
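To make the block counts in the abstract concrete, the following is a minimal sketch that enumerates the blocks of a standard ResNet152 and counts their parameters in plain Python. Two things here are assumptions rather than statements from the paper: counting the stem convolution as one block (which reproduces the figure of 51 alongside the 50 bottleneck blocks), and the rule used to pick 5 blocks (the stem plus the first bottleneck of each stage). The text above does not say which blocks are kept, so the selection is purely illustrative. The classification head is excluded from the counts.

```python
# Standard ResNet152 layout: (number of bottleneck blocks, base width) per stage.
STAGES = [(3, 64), (8, 128), (36, 256), (3, 512)]
EXPANSION = 4  # a bottleneck outputs EXPANSION * width channels


def conv_bn_params(c_in, c_out, k):
    """Parameters of a bias-free k x k convolution plus BatchNorm scale and shift."""
    return c_in * c_out * k * k + 2 * c_out


def bottleneck_params(c_in, width, has_projection):
    """Parameters of one bottleneck block (1x1, 3x3, 1x1) with an optional 1x1 shortcut."""
    c_out = EXPANSION * width
    total = (conv_bn_params(c_in, width, 1)
             + conv_bn_params(width, width, 3)
             + conv_bn_params(width, c_out, 1))
    if has_projection:
        total += conv_bn_params(c_in, c_out, 1)
    return total


def resnet152_blocks():
    """Return (name, opens_stage, parameter_count) for the stem and every bottleneck."""
    # ASSUMPTION: the stem (7x7 convolution + BatchNorm) is counted as one block.
    blocks = [("stem", True, conv_bn_params(3, 64, 7))]
    c_in = 64
    for stage_idx, (depth, width) in enumerate(STAGES, start=1):
        for block_idx in range(depth):
            first = block_idx == 0  # the first block of a stage has a projection shortcut
            name = f"stage{stage_idx}.block{block_idx}"
            blocks.append((name, first, bottleneck_params(c_in, width, first)))
            c_in = EXPANSION * width
    return blocks


full = resnet152_blocks()

# ASSUMPTION (hypothetical selection rule, not taken from the paper):
# keep the stem and the first bottleneck of each of the four stages.
reduced = [block for block in full if block[1]]

full_params = sum(params for _, _, params in full)
reduced_params = sum(params for _, _, params in reduced)

print(len(full), "blocks,", full_params, "parameters")
print(len(reduced), "blocks,", reduced_params, "parameters")
```

Keeping the first block of each stage is convenient for a sketch because those blocks carry the projection shortcuts that change the channel count, so the retained blocks still chain together without any adapter layers. Whether the paper uses this rule is not stated in the text above.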
