CheapET-3: Cost-Efficient Use of Remote DNN Models
Michael Weiss

TL;DR
CheapET-3 introduces a cost-efficient architecture combining small local DNNs with remote large models, reducing prediction costs by up to 50% while maintaining accuracy.
Contribution
The paper presents a new software architecture for client-side applications that pairs a small local DNN with a large-scale remote DNN model, so that the paid remote model is used only where it is needed.
Findings
Prediction cost reduced by up to 50%.
Maintains system accuracy despite cost reduction.
Demonstrates practical feasibility of cost-efficient remote DNN use.
Abstract
On complex problems, state-of-the-art prediction accuracy of Deep Neural Networks (DNNs) can be achieved using very large-scale models consisting of billions of parameters. Such models can only be run on dedicated servers, typically provided by a third-party service, which leads to a substantial monetary cost for every prediction. We propose a new software architecture for client-side applications, where a small local DNN is used alongside a remote large-scale model, aiming to make easy predictions locally at negligible monetary cost, while still leveraging the benefits of a large model for challenging inputs. In a proof of concept, we reduce prediction cost by up to 50% without negatively impacting system accuracy.
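The routing idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how an input is judged "easy" or "challenging", so the confidence threshold used here is an assumption, not the paper's method. The names `route_predictions`, `toy_local`, `toy_remote`, the threshold of 0.9, and the per-call price of 0.01 are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Hypothetical interfaces (not from the paper):
# a local model returns (label, confidence in [0, 1]); a remote model returns a label.
LocalModel = Callable[[str], Tuple[str, float]]
RemoteModel = Callable[[str], str]


@dataclass
class RoutingResult:
    predictions: List[str]
    remote_calls: int
    cost: float


def route_predictions(
    inputs: Sequence[str],
    local_model: LocalModel,
    remote_model: RemoteModel,
    threshold: float = 0.9,     # assumed confidence cut-off
    remote_cost: float = 0.01,  # assumed price per remote prediction
) -> RoutingResult:
    """Answer locally when the local model is confident, else ask the remote model."""
    predictions: List[str] = []
    remote_calls = 0
    for x in inputs:
        label, confidence = local_model(x)
        if confidence < threshold:
            # Treated as a challenging input: pay for the large remote model.
            label = remote_model(x)
            remote_calls += 1
        predictions.append(label)
    return RoutingResult(predictions, remote_calls, remote_calls * remote_cost)


# Toy stand-ins for the two models, for demonstration only.
def toy_local(x: str) -> Tuple[str, float]:
    # Pretend the small model is confident only on short inputs.
    return ("short", 0.95) if len(x) <= 5 else ("short", 0.4)


def toy_remote(x: str) -> str:
    return "long" if len(x) > 5 else "short"


samples = ["hi", "hello", "a much longer input", "another long one"]
result = route_predictions(samples, toy_local, toy_remote)

# Baseline: every input is sent to the remote model.
remote_only_cost = len(samples) * 0.01
savings = 1 - result.cost / remote_only_cost
```

In this toy run, two of the four inputs are sent to the remote model, so the cost is half of the remote-only baseline. The actual saving in any real system depends on how many inputs the local model can handle reliably, and the threshold trades cost against the risk of accepting a wrong local prediction.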
