Federated Cross-Modal Retrieval with Missing Modalities via Semantic Routing and Adapter Personalization

Hefeng Zhou; Xuan Liu; Sicheng Chen; Wutong Zhang; Wu Yan; Jiong Lou; Chentao Wu; Guangtao Xue; Wei Zhao; Jie Li

arXiv:2604.22885 · cs.CV · April 28, 2026

TL;DR

This paper introduces RCSR, a federated learning framework for cross-modal retrieval that handles missing modalities and client heterogeneity using semantic routing and adapter personalization.

Contribution

The paper proposes a federated approach that combines semantic routing, prototype anchoring, and adapters on a frozen CLIP backbone to improve retrieval accuracy and personalization.

Findings

01 RCSR improves global retrieval accuracy on benchmarks.

02 It enhances client-level retrieval performance, especially with incomplete modalities.

03 The framework stabilizes training under heterogeneous client data.

Abstract

Federated cross-modal retrieval faces severe challenges from heterogeneous client data, particularly non-IID semantic distributions and missing modalities. Under such heterogeneity, a single global model is often insufficient to capture both shared cross-modal knowledge and client-specific characteristics. We propose RCSR, a personalization-friendly federated framework that integrates prototype anchoring, retrieval-centric semantic routing, and optional client-specific adapters. Built on a frozen CLIP backbone, RCSR leverages lightweight shared adapters for global knowledge transfer while supporting efficient local personalization. Prototype anchoring helps unimodal clients align with global cross-modal semantics, and a server-side semantic router adaptively assigns aggregation weights based on retrieval consistency to mitigate alignment drift during heterogeneous updates. Extensive…
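
The server-side routing step described in the abstract can be illustrated with a minimal sketch. The abstract says only that aggregation weights are assigned from retrieval consistency; it does not say how consistency is measured or how scores become weights. Everything below is therefore an assumption made for illustration: the function names `softmax` and `route_and_aggregate`, the `temperature` parameter, the use of a softmax to normalize scores, and the representation of each client's shared-adapter update as a flat list of floats. It is not the authors' implementation.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float], temperature: float = 1.0) -> List[float]:
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_and_aggregate(
    client_updates: List[List[float]],
    consistency_scores: List[float],
    temperature: float = 1.0,
) -> Tuple[List[float], List[float]]:
    """Weighted average of client adapter updates.

    client_updates: one flat parameter vector per client (hypothetical layout).
    consistency_scores: one retrieval-consistency score per client; how the
        score is computed is not specified in the abstract, so it is taken
        as given here.
    Returns the aggregation weights and the aggregated update.
    """
    if len(client_updates) != len(consistency_scores):
        raise ValueError("need exactly one score per client update")
    weights = softmax(consistency_scores, temperature)
    aggregated = [0.0] * len(client_updates[0])
    for weight, update in zip(weights, client_updates):
        for i, value in enumerate(update):
            aggregated[i] += weight * value
    return weights, aggregated
```

With equal scores this reduces to a plain unweighted average of the client updates; as one client's score grows relative to the others, the aggregate moves toward that client's update. A lower `temperature` makes the weighting sharper.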

Code & Models

Repositories

RezinChow/RCSR-Retrieval-Centric-Semantic-Routing (GitHub)