Aggregate Representation Measure for Predictive Model Reusability

Vishwesh Sangarya; Richard Bradford; Jung-Eun Kim

arXiv:2405.09600 · cs.LG · May 17, 2024 · 1 cites

TL;DR

This paper introduces the Aggregated Representation Measure (ARM), a predictive index that estimates retraining costs of models under distribution shifts, facilitating cost-effective model reuse and sustainability.

Contribution

The paper presents ARM, a novel measure that predicts retraining costs before actual retraining, aiding in efficient and sustainable model reuse under distribution shifts.

Findings

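01

In the reported experiments, ARM reasonably predicts retraining costs across varying noise intensities.

02

ARM enables comparisons among multiple model architectures to identify the most cost-effective and sustainable option.

03

ARM supports sustainable AI practices by estimating, before retraining, the epochs, energy, and carbon emissions that retraining would require.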

Abstract

In this paper, we propose a predictive quantifier to estimate the retraining cost of a trained model under distribution shifts. The proposed Aggregated Representation Measure (ARM) quantifies the change in the model's representation from the old to the new data distribution. Before the model is actually retrained, it provides a single concise index of the resources (epochs, energy, and carbon emissions) required for retraining. This enables reuse of a model at a much lower cost than training a new model from scratch. The experimental results indicate that ARM reasonably predicts retraining costs for varying noise intensities and enables comparisons among multiple model architectures to determine the most cost-effective and sustainable option.
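
The abstract does not give ARM's formula, so the following is only an illustrative sketch of the general idea of scoring how far a model's representation moves between an old and a new data distribution. The function name `representation_shift`, the use of cosine distance between per-layer mean activations, and the toy activation values are all assumptions made here for illustration; they are not the paper's definition of ARM.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def cosine_distance(u, v):
    """1 - cosine similarity; values near 0.0 mean the vectors are aligned."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def representation_shift(old_layers, new_layers):
    """Hypothetical proxy, NOT the paper's ARM formula.

    Each argument is a list of layers; each layer is a list of activation
    vectors (one per sample). The score is the average, over layers, of the
    cosine distance between mean activations on old and new data.
    """
    distances = [
        cosine_distance(mean_vector(old), mean_vector(new))
        for old, new in zip(old_layers, new_layers)
    ]
    return sum(distances) / len(distances)


# Toy activations: 2 layers, 2 samples each (made-up numbers for illustration).
old_acts = [[[1.0, 0.0], [0.8, 0.2]], [[0.5, 0.5], [0.6, 0.4]]]
mild_shift = [[[0.9, 0.1], [0.8, 0.3]], [[0.5, 0.6], [0.6, 0.4]]]
strong_shift = [[[0.1, 0.9], [0.2, 0.8]], [[0.1, 0.9], [0.0, 1.0]]]

print("mild shift score:  ", representation_shift(old_acts, mild_shift))
print("strong shift score:", representation_shift(old_acts, strong_shift))
```

In this toy setup, a larger score means the representation moved further from the one learned on the old data. How such a score would map to epochs, energy, or emissions is left to the paper's experiments and is not modeled here.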

Taxonomy

Topics: Scientific Computing and Data Management · Machine Learning and Data Classification · Semantic Web and Ontologies