FedVLA: Federated Vision-Language-Action Learning with Dual Gating Mixture-of-Experts for Robotic Manipulation

Cui Miao; Tao Chang; Meihan Wu; Hongbin Xu; Chun Li; Ming Li; Xiaodong Wang

arXiv:2508.02190·cs.RO·August 5, 2025

Open Access

TL;DR

FedVLA is a privacy-preserving federated learning framework for vision-language-action models in robotics. It uses a dual gating mixture-of-experts and task-aware mechanisms to reach task success rates comparable to centralized training.

Contribution

This work is the first to develop a federated VLA learning framework with dual gating mixture-of-experts and expert-driven aggregation for robotic manipulation.

Findings

01. DGMoE improves computational efficiency over vanilla MoE.

02. FedVLA achieves task success rates comparable to centralized training.

03. The framework effectively preserves data privacy in robotic VLA tasks.

Abstract

Vision-language-action (VLA) models have significantly advanced robotic manipulation by enabling robots to interpret language instructions for task execution. However, training these models often relies on large-scale user-specific data, raising concerns about privacy and security, which in turn limits their broader adoption. To address this, we propose FedVLA, the first federated VLA learning framework, enabling distributed model training that preserves data privacy without compromising performance. Our framework integrates task-aware representation learning, adaptive expert selection, and expert-driven federated aggregation, enabling efficient and privacy-preserving training of VLA models. Specifically, we introduce an Instruction Oriented Scene-Parsing mechanism, which decomposes and enhances object-level features based on task instructions, improving contextual understanding. To…
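
The distributed training described in the abstract can be illustrated with a minimal sketch of plain federated averaging (FedAvg), in which a server combines client parameters weighted by local dataset size while raw data stays on each client. This is the standard baseline and not FedVLA's expert-driven aggregation, whose details are not given on this page. The names federated_average, client_sizes, and expert_0 are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Each client's parameters: a mapping from parameter name to a flat list of floats.
Params = Dict[str, List[float]]


def federated_average(client_params: List[Params], client_sizes: List[int]) -> Params:
    """Standard FedAvg: average client parameters, weighted by local sample count.

    Assumption: this is the generic baseline, NOT the expert-driven
    aggregation proposed in FedVLA.
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Params = {}
    for name in client_params[0]:
        length = len(client_params[0][name])
        averaged[name] = [
            sum(p[name][i] * n for p, n in zip(client_params, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Two hypothetical clients; only parameters are shared, never raw data.
clients = [
    {"expert_0": [1.0, 2.0]},
    {"expert_0": [3.0, 4.0]},
]
global_params = federated_average(clients, client_sizes=[1, 3])
print(global_params)
```

In this sketch the second client holds three times as many samples as the first, so its parameters dominate the average. An expert-aware scheme would presumably replace the sample-count weights with weights derived from expert usage, but that is a guess and is not confirmed by the text above.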

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications