Gatekeeper: Improving Model Cascades Through Confidence Tuning

Stephan Rabanser; Nathalie Rauschmayr; Achin Kulshrestha; Petra Poklukar; Wittawat Jitkrittum; Sean Augenstein; Congchao Wang; Federico Tombari

arXiv:2502.19335 · cs.LG · October 24, 2025

TL;DR

Gatekeeper introduces a novel loss function to calibrate smaller models in cascades, enabling better task handling and deferral to larger models, thereby improving resource efficiency across diverse tasks.

Contribution

The paper proposes a new loss function called Gatekeeper for calibrating small models in cascades, enhancing deferral accuracy without architectural changes.

Findings

01. Significant improvement in deferral performance across tasks.

02. Applicable to various architectures and domains.

03. Broadly improves resource efficiency in model cascades.

Abstract

Large-scale machine learning models deliver strong performance across a wide range of tasks but come with significant computational and resource constraints. To mitigate these challenges, local smaller models are often deployed alongside larger models, relying on routing and deferral mechanisms to offload complex tasks. However, existing approaches inadequately balance the capabilities of these models, often resulting in unnecessary deferrals or sub-optimal resource usage. In this work we introduce a novel loss function called Gatekeeper for calibrating smaller models in cascade setups. Our approach fine-tunes the smaller model to confidently handle tasks it can perform correctly while deferring complex tasks to the larger model. Moreover, it incorporates a mechanism for managing the trade-off between model performance and deferral accuracy, and is broadly applicable across various…
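
To make the cascade setup concrete, here is a minimal sketch of confidence-based deferral at inference time. It is not the Gatekeeper loss itself, which this summary does not spell out. It only illustrates the routing step that a better-calibrated small model would feed into. The deferral rule (maximum softmax probability compared against a fixed threshold), the function names `softmax` and `cascade_predict`, and the toy models are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cascade_predict(
    x: object,
    small_model: Callable[[object], Sequence[float]],
    large_model: Callable[[object], int],
    threshold: float = 0.8,
) -> Tuple[int, bool]:
    """Return (predicted_label, deferred).

    Assumed rule: accept the small model's prediction when its top softmax
    probability reaches `threshold`; otherwise defer to the large model.
    """
    probs = softmax(small_model(x))
    confidence = max(probs)
    if confidence >= threshold:
        return probs.index(confidence), False
    return large_model(x), True


# Hypothetical stand-ins: the small model is confident only on "easy" inputs.
def toy_small(x: object) -> Sequence[float]:
    return [4.0, 0.0] if x == "easy" else [0.1, 0.0]


def toy_large(x: object) -> int:
    return 1


easy_result = cascade_predict("easy", toy_small, toy_large)
hard_result = cascade_predict("hard", toy_small, toy_large)
```

Under this rule, the quality of the cascade depends on how well the small model's confidence separates inputs it gets right from inputs it gets wrong, which is the property the abstract says the fine-tuning targets. The threshold then sets how many inputs are sent to the larger model.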

Taxonomy

Topics: Machine Learning and Data Classification