Improving Prediction Certainty Estimation for Reliable Early Exiting via Null Space Projection

Jianing He; Qi Zhang; Duoqian Miao; Yi Kun; Shufeng Hao; Hongyun Zhang; Zhihua Wei

arXiv:2506.17249·cs.LG·June 24, 2025

TL;DR

This paper introduces a novel early exiting method for pre-trained language models that improves prediction certainty estimation by considering class-irrelevant information, leading to faster inference with minimal performance loss.

Contribution

The paper proposes the NSP score, which estimates prediction certainty more accurately, and the CAP score, which builds on it for more reliable early exiting. The resulting method outperforms state-of-the-art early exiting methods on the GLUE benchmark.

Findings

01 Achieves 2.19x speed-up on average across GLUE tasks.

02 Surpasses SOTA ConsistentEE by 28% in efficiency.

03 Maintains negligible performance degradation.

Abstract

Early exiting has demonstrated great potential in accelerating the inference of pre-trained language models (PLMs) by enabling easy samples to exit at shallow layers, eliminating the need for executing deeper layers. However, existing early exiting methods primarily rely on class-relevant logits to formulate their exiting signals for estimating prediction certainty, neglecting the detrimental influence of class-irrelevant information in the features on prediction certainty. This leads to an overestimation of prediction certainty, causing premature exiting of samples with incorrect early predictions. To remedy this, we define an NSP score to estimate prediction certainty by considering the proportion of class-irrelevant information in the features. On this basis, we propose a novel early exiting method based on the Certainty-Aware Probability (CAP) score, which integrates insights from…
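
The idea of measuring class-irrelevant information can be illustrated with a small sketch. It assumes a linear classifier at each exit, so that any part of a feature lying in the null space of the classifier's weight matrix cannot change the logits and is treated here as class-irrelevant. The function names, the norm-ratio definition, and the rule that multiplies softmax confidence by the class-relevant proportion are all illustrative assumptions; they are not the paper's exact NSP or CAP definitions, which the truncated abstract does not give.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def row_space_basis(weight, tol=1e-10):
    """Orthonormal basis of the classifier's row space (Gram-Schmidt)."""
    basis = []
    for row in weight:
        r = list(row)
        for b in basis:
            c = dot(r, b)
            r = [x - c * y for x, y in zip(r, b)]
        norm = math.sqrt(dot(r, r))
        if norm > tol:  # skip rows that are linearly dependent
            basis.append([x / norm for x in r])
    return basis


def null_space_proportion(feature, weight):
    """Share of the feature norm lying in the null space of `weight`.

    Assumed definition for illustration: ||null-space part|| / ||feature||.
    """
    residual = list(feature)
    for b in row_space_basis(weight):
        c = dot(residual, b)
        residual = [x - c * y for x, y in zip(residual, b)]
    total = math.sqrt(dot(feature, feature))
    if total == 0.0:
        return 0.0
    return math.sqrt(dot(residual, residual)) / total


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


def should_exit(feature, weight, threshold):
    """Hypothetical exit rule: discount softmax confidence by the
    class-irrelevant proportion, then compare with a threshold."""
    logits = [dot(row, feature) for row in weight]
    confidence = max(softmax(logits))
    certainty = confidence * (1.0 - null_space_proportion(feature, weight))
    return certainty >= threshold


# Toy 2-class classifier over 3-dimensional features.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
print(null_space_proportion([3.0, 0.0, 4.0], W))
print(should_exit([3.0, 0.0, 4.0], W, threshold=0.5))
print(should_exit([3.0, 4.0, 0.0], W, threshold=0.5))
```

In this toy setup the first feature has a large softmax confidence, yet most of its norm lies in the null space, so the discounted certainty stays below the threshold and the sample continues to deeper layers. A purely logit-based signal would have let it exit.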
