Position: Federated Foundation Language Model Post-Training Should Focus on Open-Source Models
Nikita Agrawal, Simon Mertel, Ruben Mayer

TL;DR
This paper argues that federated post-training of foundation language models should prioritize open-source models over black-box models to better align with privacy and autonomy principles in federated learning.
Contribution
The paper critically analyzes the limitations of using black-box models in federated post-training and advocates for focusing on open-source models to enhance privacy and transparency.
Findings
Black-box models conflict with federated learning principles
Open-source models support privacy and model transparency
Federated post-training benefits from open-source model access
Abstract
Post-training of foundation language models has emerged as a promising research domain in federated learning (FL), with the goal of enabling privacy-preserving model improvements and adaptation to users' downstream tasks. Recent advances in this area adopt centralized post-training approaches that build upon black-box foundation language models, for which model weights and architecture details are inaccessible. Although the use of black-box models has been successful in centralized post-training, their blind replication in FL raises several concerns. Our position is that using black-box models in FL contradicts the core principles of federation, such as data privacy and autonomy. In this position paper, we critically analyze the usage of black-box models in federated post-training and provide a detailed account of various aspects of openness and their implications for FL.
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Ethics and Social Impacts of AI · Artificial Intelligence in Healthcare and Education
Methods: ADaptive gradient method with the OPTimal convergence rate
