Replacing Parameters with Preferences: Federated Alignment of Heterogeneous Vision-Language Models
Shule Lu, Yujing Wang, Hainan Zhang, Xiaoshan Yang, Hongwei Zheng, Yongxin Tong, Changsheng Xu, Zhiming Zheng

TL;DR
This paper introduces MoR, a federated alignment framework for heterogeneous vision-language models that uses preferences and reward models to improve privacy, scalability, and performance in federated settings.
Contribution
It proposes a federated alignment method that replaces parameter sharing with preference sharing, enabling scalable, privacy-preserving alignment of heterogeneous VLMs.
Findings
MoR outperforms baselines in generalization and robustness.
The routing-based fusion effectively aggregates heterogeneous client rewards.
Experiments demonstrate improved cross-client adaptability.
Abstract
Vision-language models (VLMs) have broad potential in privacy-sensitive domains such as healthcare and finance, yet strict data-sharing constraints render centralized training infeasible. Federated learning (FL) mitigates this issue by enabling decentralized training, but practical deployments face challenges due to client heterogeneity in computational resources, application requirements, and model architectures. We argue that while replacing data with model parameters characterizes the present of FL, replacing parameters with preferences represents a more scalable and privacy-preserving future. Motivated by this perspective, we propose MoR, a federated alignment framework based on Group Relative Policy Optimization (GRPO) with Mixture-of-Rewards for heterogeneous VLMs. MoR initializes a visual foundation model as a KL-regularized reference, while each client locally trains a reward model from local preference annotations, capturing specific evaluation signals…
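To make the idea in the abstract concrete, the following is a minimal sketch of how scores from several client reward models might be fused and then turned into GRPO-style group-relative advantages. It is not the paper's implementation: the abstract is truncated before it describes the routing mechanism, so the softmax gate, the function names (`mix_rewards`, `group_relative_advantages`), and all numeric values are assumptions chosen purely for illustration. The KL regularization against the reference model is omitted.

```python
import math
from statistics import mean, pstdev


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mix_rewards(client_scores, routing_logits):
    """Fuse per-client reward scores for ONE response into a scalar.

    Assumption: the router emits one logit per client reward model and the
    fused reward is the softmax-weighted sum. The paper's actual routing
    rule is not given in the abstract.
    """
    weights = softmax(routing_logits)
    return sum(w * s for w, s in zip(weights, client_scores))


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantage: standardize each reward within its sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical data: 4 sampled responses to one prompt, each scored by
# 3 client reward models.
group_scores = [
    [0.9, 0.2, 0.5],
    [0.1, 0.4, 0.3],
    [0.6, 0.6, 0.6],
    [0.3, 0.8, 0.1],
]
routing_logits = [1.0, 0.0, -1.0]  # hypothetical router output for this prompt

fused = [mix_rewards(scores, routing_logits) for scores in group_scores]
advantages = group_relative_advantages(fused)
```

In this sketch only reward scores cross the client boundary, which matches the "preferences instead of parameters" framing: raw data and model weights stay local, and responses that score above the group mean receive a positive advantage in the policy update.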
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Graph Neural Networks
