How to Train Your Multi-Exit Model? Analyzing the Impact of Training Strategies

Piotr Kubaty; Bartosz Wójcik; Bartłomiej Krzepkowski; Monika Michaluk; Tomasz Trzciński; Jary Pomponi; Kamil Adamczewski

arXiv:2407.14320 · cs.LG · June 24, 2025

TL;DR

This paper analyzes how different training strategies affect multi-exit neural networks, introduces metrics for analysis, and proposes a mixed training approach that improves performance and efficiency.

Contribution

The paper introduces a set of metrics for analyzing early-exit training dynamics and proposes a mixed training strategy for multi-exit models, which it reports improves on the conventional joint and disjoint regimes.

Findings

01 Conventional joint and disjoint training strategies are suboptimal.

02 The proposed mixed training strategy improves performance and efficiency.

03 Comprehensive evaluations validate the effectiveness of the new approach.

Abstract

Early exits enable the network's forward pass to terminate early by attaching trainable internal classifiers to the backbone network. Existing early-exit methods typically adopt either a joint training approach, where the backbone and exit heads are trained simultaneously, or a disjoint approach, where the heads are trained separately. However, the implications of this choice are often overlooked, with studies typically adopting one approach without adequate justification. This choice influences training dynamics and its impact remains largely unexplored. In this paper, we introduce a set of metrics to analyze early-exit training dynamics and guide the choice of training strategy. We demonstrate that conventionally used joint and disjoint regimes yield suboptimal performance. To address these limitations, we propose a mixed training strategy: the backbone is trained first, followed by…
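The early-exit mechanism described in the abstract can be illustrated with a minimal sketch. It assumes a confidence-threshold exit rule (stop at the first internal classifier whose top softmax probability reaches a threshold), which is a common choice but is not stated in the abstract. The names `early_exit_forward`, `blocks`, `heads`, and `threshold` are hypothetical, and the toy blocks and heads stand in for trained layers; none of this is taken from the authors' code.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Layer = Callable[[Vector], Vector]


def softmax(logits: Sequence[float]) -> Vector:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_forward(
    x: Vector,
    blocks: Sequence[Layer],
    heads: Sequence[Layer],
    threshold: float = 0.9,
) -> Tuple[int, int]:
    """Run backbone blocks in order; after each one, query its exit head.

    Returns (predicted_class, exit_index). The pass stops at the first
    head whose top probability reaches `threshold` (an assumed exit
    rule), or at the last head if none is confident enough.
    """
    if len(blocks) != len(heads):
        raise ValueError("expected one exit head per backbone block")
    h = x
    for i, (block, head) in enumerate(zip(blocks, heads)):
        h = block(h)                 # backbone computation up to exit i
        probs = softmax(head(h))     # internal classifier at exit i
        confidence = max(probs)
        is_last = i == len(blocks) - 1
        if confidence >= threshold or is_last:
            return probs.index(confidence), i
    raise ValueError("model must have at least one block")


# Toy stand-ins (hypothetical): each block doubles the features and each
# head uses the features directly as logits, so confidence grows with depth.
blocks = [lambda h: [2.0 * v for v in h]] * 3
heads = [lambda h: list(h)] * 3

easy = early_exit_forward([3.0, 0.0], blocks, heads)  # exits at the first head
hard = early_exit_forward([0.0, 1.0], blocks, heads)  # needs a second block
```

In this toy setup the easy input leaves at exit 0 while the harder one continues to exit 1, which is the compute saving that early exits are meant to provide. How the backbone and heads are trained (jointly, disjointly, or with the mixed strategy) is a separate question that this sketch does not address.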

Code & Models

Repositories

kamadforge/early-exit-benchmark (PyTorch, official)

Videos

How to Train Your Multi-Exit Model? Analyzing the Impact of Training Strategies (SlidesLive)

Taxonomy

Topics: Economic Policies and Impacts