SAGE: Training-Free Semantic Evidence Composition for Edge-Cloud Inference under Hard Uplink Budgets

Inhyeok Choi; Hyuncheol Park

arXiv:2604.19623 · cs.LG · April 22, 2026

TL;DR

SAGE is a training-free method for edge-cloud inference that improves accuracy under strict uplink budgets by combining importance filtering with diversity sampling, outperforming importance-only approaches.

Contribution

The paper introduces SAGE, a training-free evidence selection method that combines importance and diversity to improve edge-cloud inference under hard bandwidth constraints.

Findings

1. SAGE reaches 93% of the server's accuracy on ImageNet-1K while transmitting fewer than half of the evidence units.

2. Replacing high-importance units with diverse, low-importance ones improves accuracy.

3. Spatial coverage alone provides competitive accuracy at moderate budgets.

Abstract

Edge-cloud hybrid inference offloads difficult inputs to a powerful remote model, but the uplink channel imposes hard per-request constraints on the number of bits that can be transmitted. We show that selecting transmitted content based solely on attention-based importance, the standard approach in collaborative inference, is inherently limited under hard budgets. Two findings support this claim. First, replacing high-importance units with low-importance but complementary ones improves server accuracy. This shows that what matters is not individual importance but how well the transmitted set covers diverse aspects of the input. Second, spatially uniform selection without any content information achieves competitive accuracy at moderate budgets. This confirms that spatial coverage alone carries independent value. Based on this analysis, we propose SAGE (Semantic Attention-Guided…
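The abstract is cut off before it describes the method, so the following is only an illustrative sketch of the general idea named in the TL;DR (importance filtering followed by diversity sampling), not the authors' algorithm. The function name `select_evidence`, the `keep_ratio` parameter, the per-unit `features` vectors, and the use of greedy farthest-point sampling with Euclidean distance are all assumptions made for this example.

```python
import math


def select_evidence(importance, features, budget, keep_ratio=0.5):
    """Pick `budget` unit indices: importance filter, then greedy diversity.

    Hypothetical sketch, not the paper's method. `keep_ratio` and the
    Euclidean farthest-point rule are illustrative assumptions.
    """
    n = len(importance)
    if budget >= n:
        return list(range(n))
    if budget <= 0:
        return []

    # Stage 1: importance filter. Keep the top-scoring candidates, but never
    # fewer than the budget, so stage 2 can always fill it.
    pool_size = max(budget, math.ceil(keep_ratio * n))
    ranked = sorted(range(n), key=lambda i: importance[i], reverse=True)
    pool = ranked[:pool_size]

    # Stage 2: greedy farthest-point sampling inside the pool, seeded with
    # the single most important unit.
    selected = [pool[0]]
    # min_dist[i] is the distance from candidate i to its nearest selected unit.
    min_dist = {i: math.dist(features[i], features[pool[0]]) for i in pool[1:]}
    while len(selected) < budget:
        nxt = max(min_dist, key=min_dist.get)
        selected.append(nxt)
        del min_dist[nxt]
        for i in min_dist:
            min_dist[i] = min(min_dist[i], math.dist(features[i], features[nxt]))
    return selected


# Toy input: units 0-2 are important but nearly identical; unit 3 is
# low-importance but far away in feature space.
importance = [0.9, 0.8, 0.7, 0.1, 0.05, 0.02]
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.0, 0.0), (0.0, 5.0)]
print(select_evidence(importance, features, budget=2, keep_ratio=0.75))  # [0, 3]
```

With a budget of two, importance-only selection would send units 0 and 1, which are near-duplicates. The sketch instead sends unit 0 and the distant unit 3, mirroring the abstract's first finding that low-importance but complementary units can be more useful than redundant high-importance ones.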
