Federating to Grow Transformers with Constrained Resources without Model Sharing
Shikun Shen, Yifei Zou, Yuan Yuan, Yanwei Zheng, Peng Li, Xiuzhen Cheng, Dongxiao Yu

TL;DR
This paper introduces Fed-Grow, a federated framework that enables resource-constrained users to collaboratively scale pre-trained small models into large transformers without sharing models, thus preserving privacy and reducing resource use.
Contribution
The paper proposes a federated approach built on a Dual-LiGO architecture, in which participants cooperatively grow a transformer from their heterogeneous pre-trained models, with an emphasis on privacy and resource efficiency.
Findings
Outperforms state-of-the-art methods in accuracy and precision
Reduces computational and communication resource consumption
Maintains user privacy by avoiding model sharing
Abstract
The high resource consumption of large-scale models discourages resource-constrained users from developing their customized transformers. To this end, this paper considers a federated framework named Fed-Grow for multiple participants to cooperatively scale a transformer from their pre-trained small models. Under Fed-Grow, a Dual-LiGO (Dual Linear Growth Operator) architecture is designed to help participants expand their pre-trained small models to a transformer. In Dual-LiGO, the Local-LiGO part is used to address the heterogeneity problem caused by the various pre-trained models, and the Global-LiGO part is shared to exchange the implicit knowledge from the pre-trained models, local data, and training process of participants. Sharing only the Global-LiGO, instead of the models themselves, strengthens the privacy of our approach. Compared with several state-of-the-art methods in…
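The split between Local-LiGO and Global-LiGO described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simplified, width-only growth of a single square weight matrix, W_large = G (L W_small L^T) G^T, and a plain FedAvg-style mean over the shared operators. The names `grow`, `average_global`, `local_op`, `global_op`, and all dimensions are hypothetical and chosen only for illustration.

```python
import numpy as np


def grow(w_small, local_op, global_op):
    """Expand a small square weight matrix with two linear maps (assumed form)."""
    # Local operator: participant-specific width -> common intermediate width.
    mid = local_op @ w_small @ local_op.T
    # Global operator: common intermediate width -> target (large) width.
    return global_op @ mid @ global_op.T


def average_global(global_ops):
    """Server step (assumption): element-wise mean of the shared operators only."""
    return np.mean(np.stack(global_ops), axis=0)


rng = np.random.default_rng(0)
small_dims = [4, 6, 8]          # heterogeneous pre-trained widths (hypothetical)
mid_dim, large_dim = 10, 16     # common intermediate and target widths (hypothetical)

participants = []
for d in small_dims:
    participants.append({
        "w_small": rng.normal(size=(d, d)),              # stays on the device
        "local_op": rng.normal(size=(mid_dim, d)),       # stays on the device
        "global_op": rng.normal(size=(large_dim, mid_dim)),  # the only shared part
    })

# One communication round: only the global operators leave the participants.
shared = average_global([p["global_op"] for p in participants])
for p in participants:
    p["global_op"] = shared.copy()

grown_shapes = [
    grow(p["w_small"], p["local_op"], p["global_op"]).shape for p in participants
]
```

In this sketch the local operators have different shapes because the small models differ in width, while the global operators share one shape, which is what makes averaging them possible. The small model weights and local operators never appear in the exchange step.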
Taxonomy
Topics: Cryptography and Data Security · Optimization and Search Problems · Blockchain Technology Applications and Security
