Reliable Inference in Edge-Cloud Model Cascades via Conformal Alignment
Jiayi Huang, Sangwoo Park, Nicola Paoletti, Osvaldo Simeone

TL;DR
This paper introduces a conformal alignment-based (CAb) cascading method for edge-cloud models. It certifies conditional coverage for the prediction sets returned at the edge, reduces offloading to the cloud, and lets users trade off coverage, deferral rate, and prediction-set size in low-latency AI applications.
Contribution
The paper formalizes conditional coverage in edge-cloud cascades with respect to the cloud predictive distribution, and develops a conformal alignment mechanism with statistical guarantees and a user-controlled risk level, enabling reliable and efficient edge inference.
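One way to write the coverage requirement is sketched below. The notation is assumed here for illustration and may differ from the paper's: p_c(y | x) is the cloud predictive distribution, Γ_e(x) is the prediction set returned at the edge for input x, 1 − α is the user-specified target, and the label space is taken to be discrete.

```latex
\Pr_{y \sim p_{\mathrm{c}}(\cdot \mid x)}\bigl[\, y \in \Gamma_{\mathrm{e}}(x) \,\bigr]
  \;=\; \sum_{y \in \Gamma_{\mathrm{e}}(x)} p_{\mathrm{c}}(y \mid x)
  \;\ge\; 1 - \alpha
```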
Findings
Maintains the target conditional coverage for predictions returned at the edge.
Reduces offloading to the cloud significantly.
Offers a tunable trade-off among coverage, deferral rate, and prediction-set size.
Abstract
Edge intelligence enables low-latency inference via compact on-device models, but assuring reliability remains challenging. We study edge-cloud cascades that must preserve conditional coverage: whenever the edge returns a prediction set, it should contain the true label with a user-specified probability, as if produced by the cloud model. We formalize conditional coverage with respect to the cloud predictive distribution, and introduce a conformal alignment-based (CAb) cascading mechanism that certifies this property with user control over the risk level. Our method casts escalation from edge to cloud models as a multiple-hypothesis testing (MHT) problem, tailoring conformal alignment (CA) to select which inputs can be safely handled at the edge. The proposed CAb model cascading method yields statistical guarantees on the average fraction of edge decisions that satisfy cloud-level…
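The selection step described in the abstract can be illustrated with a minimal sketch of generic conformal alignment: conformal p-values computed from a calibration set, followed by the Benjamini-Hochberg procedure. This is not the paper's exact algorithm. The function names, the score arrays, and the binary "aligned" labels (whether an edge prediction set met the cloud-level coverage condition on a calibration input) are all assumptions made for illustration.

```python
import numpy as np

def conformal_pvalues(cal_scores, cal_aligned, test_scores):
    """P-values for the null hypothesis 'this test input is NOT aligned'.

    cal_scores:  predicted alignment scores on calibration inputs (hypothetical)
    cal_aligned: True where the edge output was actually aligned (hypothetical)
    test_scores: predicted alignment scores on new inputs
    """
    cal_scores = np.asarray(cal_scores, dtype=float)
    cal_aligned = np.asarray(cal_aligned, dtype=bool)
    test_scores = np.asarray(test_scores, dtype=float)
    n = cal_scores.size
    # Scores of calibration inputs on which the edge output was misaligned.
    null_scores = cal_scores[~cal_aligned]
    # For each test input, count misaligned calibration inputs scoring at least as high.
    counts = (null_scores[None, :] >= test_scores[:, None]).sum(axis=1)
    return (1.0 + counts) / (n + 1.0)

def benjamini_hochberg(pvals, alpha):
    """Boolean mask of hypotheses rejected by the BH procedure at level alpha."""
    pvals = np.asarray(pvals, dtype=float)
    m = pvals.size
    order = np.argsort(pvals)
    thresholds = alpha * np.arange(1, m + 1) / m
    below = pvals[order] <= thresholds
    selected = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()  # largest rank passing its threshold
        selected[order[: k + 1]] = True
    return selected

def route(cal_scores, cal_aligned, test_scores, alpha=0.1):
    """True = answer at the edge, False = defer to the cloud."""
    pvals = conformal_pvalues(cal_scores, cal_aligned, test_scores)
    return benjamini_hochberg(pvals, alpha)

if __name__ == "__main__":
    cal_scores = [0.9, 0.8, 0.2, 0.1]
    cal_aligned = [True, True, False, False]
    print(route(cal_scores, cal_aligned, [0.95, 0.05], alpha=0.5))
```

In this sketch, rejecting the null for an input means keeping it at the edge, and every other input is deferred to the cloud. Under exchangeability of calibration and test inputs, this style of selection is known to control the expected fraction of selected inputs that are misaligned at level alpha. A smaller alpha makes the edge more conservative and increases the deferral rate.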
