Federated Learning from Vision-Language Foundation Models: Theoretical Analysis and Method
Bikang Pan, Wei Huang, Ye Shi

TL;DR
This paper provides a theoretical framework for prompt-based federated learning with vision-language models, analyzing signal and noise evolution, and introduces a prompt portfolio approach inspired by portfolio optimization to improve generalization and personalization.
Contribution
The paper develops a theoretical analysis framework for prompt-based federated learning and proposes a prompt portfolio method inspired by portfolio optimization.
Findings
The prompt portfolio approach is intended to improve both generalization and personalization in federated learning.
Theoretical analysis ties performance to the ratio of task-relevant to task-irrelevant coefficients.
Empirical results validate the theoretical claims.
Abstract
Integrating pretrained vision-language foundation models like CLIP into federated learning has attracted significant attention for enhancing generalization across diverse tasks. Typically, federated learning of vision-language models employs prompt learning to reduce communication and computational costs, i.e., prompt-based federated learning. However, there is limited theoretical analysis to understand the performance of prompt-based federated learning. In this work, we construct a theoretical analysis framework for prompt-based federated learning via feature learning theory. Specifically, we monitor the evolution of signal learning and noise memorization in prompt-based federated learning, demonstrating that performance can be assessed by the ratio of task-relevant to task-irrelevant coefficients. Furthermore, we draw an analogy between income and risk in portfolio optimization and…
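The portfolio analogy in the abstract can be illustrated with a toy sketch. This is not the paper's method: the functions `mix_prompts` and `signal_noise_ratio`, the mixing weight `theta`, and the three-dimensional vectors are all hypothetical and chosen only for illustration. The sketch assumes a prompt can be split into a component along a known task-relevant direction (signal) and an orthogonal remainder (noise), and shows how a convex mix of two prompts can raise the signal-to-noise ratio when their noise components partly cancel, much as diversification lowers risk in a portfolio.

```python
import math


def mix_prompts(global_prompt, local_prompt, theta):
    """Convex combination of two prompt vectors (hypothetical helper).

    theta is the weight on the global prompt; 1 - theta goes to the local one.
    """
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    if len(global_prompt) != len(local_prompt):
        raise ValueError("prompts must have the same length")
    return [theta * g + (1.0 - theta) * l
            for g, l in zip(global_prompt, local_prompt)]


def signal_noise_ratio(prompt, signal_dir):
    """Ratio of the component along signal_dir to the norm of the remainder."""
    norm = math.sqrt(sum(s * s for s in signal_dir))
    unit = [s / norm for s in signal_dir]
    # Projection onto the assumed task-relevant direction.
    signal = sum(p * u for p, u in zip(prompt, unit))
    # Everything orthogonal to that direction is treated as noise.
    residual = [p - signal * u for p, u in zip(prompt, unit)]
    noise = math.sqrt(sum(r * r for r in residual))
    return abs(signal) / noise if noise > 0 else math.inf


# Toy data (assumed): both prompts share the signal, their noise differs.
signal_dir = [1.0, 0.0, 0.0]
global_prompt = [1.0, 0.5, 0.0]
local_prompt = [1.0, -0.5, 0.2]

mixed = mix_prompts(global_prompt, local_prompt, theta=0.5)

for name, p in [("global", global_prompt), ("local", local_prompt), ("mixed", mixed)]:
    print(name, round(signal_noise_ratio(p, signal_dir), 3))
```

In this toy setup the mixed prompt has a higher ratio than either input because the opposing noise entries cancel. With correlated noise the gain would shrink or vanish, which is why the choice of mixing weight matters.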
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Blockchain Technology Applications and Security
Methods: Softmax · Attention Is All You Need · Contrastive Language-Image Pre-training
