ACME: Adaptive Customization of Large Models via Distributed Systems

Ziming Dai; Chao Qiu; Fei Gao; Yunfeng Zhao; Xiaofei Wang

arXiv:2507.14802 · cs.DC · July 22, 2025

TL;DR

ACME introduces an adaptive distributed system approach for customizing large Transformer models, reducing costs and improving accuracy for personalized virtual assistants while addressing data privacy and resource constraints.

Contribution

The paper proposes a bidirectional distributed customization framework that tailors large models to heterogeneous user data and resource constraints.

Findings

01 Data transmission volume is reduced to 6% of that of centralized methods.

02 Average accuracy improves by 10% over baselines.

03 Cost-efficient models are obtained under model-size constraints.

Abstract

Pre-trained Transformer-based large models have revolutionized personal virtual assistants, but their deployment in cloud environments faces challenges related to data privacy and response latency. Deploying large models closer to the data and users has become a key research area to address these issues. However, applying these models directly often entails significant difficulties, such as model mismatching, resource constraints, and energy inefficiency. Automated design of customized models is necessary, but it faces three key challenges: the high cost of centralized model customization, imbalanced performance caused by user heterogeneity, and suboptimal performance caused by data heterogeneity. In this paper, we propose ACME, an adaptive customization approach for Transformer-based large models via distributed systems. To avoid the low cost-efficiency of centralized methods, ACME…
