Compressed Models are NOT Trust-equivalent to Their Large Counterparts
Rohit Raj Rai, Chirag Kothari, Siddhesh Shelke, Amit Awekar

TL;DR
This paper introduces a framework to evaluate trust-equivalence between large models and their compressed versions, revealing that similar accuracy does not imply similar interpretability or calibration, thus challenging assumptions about deploying compressed models.
Contribution
The paper proposes a novel two-dimensional trust-equivalence framework using interpretability and calibration measures, highlighting the need for comprehensive evaluation beyond accuracy.
Findings
Compressed models show low interpretability alignment with large models.
Calibration differs significantly between the large models and their compressed versions.
Accuracy parity does not imply trust-equivalence in compressed models.
Abstract
Large Deep Learning models are often compressed before being deployed in a resource-constrained environment. Can we trust the prediction of compressed models just as we trust the prediction of the original large model? Existing work has keenly studied the effect of compression on accuracy and related performance measures. However, performance parity does not guarantee trust-equivalence. We propose a two-dimensional framework for trust-equivalence evaluation. First, interpretability alignment measures whether the models base their predictions on the same input features. We use LIME and SHAP tests to measure the interpretability alignment. Second, calibration similarity measures whether the models exhibit comparable reliability in their predicted probabilities. It is assessed via ECE, MCE, Brier Score, and reliability diagrams. We conducted experiments using BERT-base as the large model…
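The calibration measures named in the abstract (ECE, MCE, Brier Score) can be illustrated with a minimal sketch. It assumes equal-width confidence bins, top-label confidence, and the multiclass form of the Brier score. The function name `calibration_metrics`, the `n_bins` parameter, and the toy inputs are hypothetical and are not taken from the paper, whose exact protocol is not given in this excerpt.

```python
import numpy as np


def calibration_metrics(probs, labels, n_bins=10):
    """Return ECE, MCE and Brier score for predicted class probabilities.

    probs:  array of shape (n_samples, n_classes), rows summing to 1
    labels: integer class labels of shape (n_samples,)
    Assumption: equal-width bins over the top-label confidence.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n_classes = probs.shape[1]

    conf = probs.max(axis=1)                      # confidence of the predicted class
    pred = probs.argmax(axis=1)                   # predicted class
    correct = (pred == labels).astype(float)      # 1.0 where the prediction is right

    # Assign each sample to a bin (lo, hi] using the interior bin edges.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.digitize(conf, edges[1:-1], right=True)

    ece, mce = 0.0, 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue                              # empty bins contribute nothing
        gap = abs(correct[mask].mean() - conf[mask].mean())
        ece += mask.mean() * gap                  # gap weighted by the bin's share of samples
        mce = max(mce, gap)                       # worst gap over non-empty bins

    # Multiclass Brier score: mean squared distance to the one-hot label.
    onehot = np.eye(n_classes)[labels]
    brier = float(np.mean(np.sum((probs - onehot) ** 2, axis=1)))

    return {"ece": float(ece), "mce": float(mce), "brier": brier}


# Toy example: four samples predicted with confidence 0.8, three of them correct.
toy_probs = [[0.8, 0.2]] * 4
toy_labels = [0, 0, 0, 1]
toy_metrics = calibration_metrics(toy_probs, toy_labels)
```

Running the same function on the outputs of a large model and of its compressed version over the same evaluation set gives two sets of numbers that can be compared side by side. How the paper aggregates such differences into a calibration-similarity verdict is not stated in this excerpt.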
