Unleashing MLLMs on the Edge: A Unified Framework for Cross-Modal ReID via Adaptive SVD Distillation
Hongbo Jiang, Jie Li, Xinqi Cai, Tianyu Xie, Yunhang Shen, Pingyang Dai, Liujuan Cao

TL;DR
This paper introduces MLLMEmbed-ReID, a unified cloud-edge framework that adapts large multimodal models for cross-modal re-identification, enabling effective deployment on resource-limited edge devices with state-of-the-art results.
Contribution
The paper presents a novel cloud-edge architecture that combines instruction-guided MLLM adaptation with a low-rank-based knowledge distillation method for edge deployment.
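To make the idea of low-rank distillation concrete, here is a minimal NumPy sketch. It is not the authors' method, whose details this summary does not give. It assumes that teacher and student embeddings share the same dimension and that the student is matched to the teacher only inside the teacher's top-k singular subspace. The names `top_k_subspace` and `low_rank_distill_loss` are hypothetical.

```python
import numpy as np

def top_k_subspace(teacher_feats: np.ndarray, k: int) -> np.ndarray:
    """Return a (d, k) orthonormal basis of the teacher's dominant feature directions."""
    # Center the features so the SVD captures variance, not the mean offset.
    centered = teacher_feats - teacher_feats.mean(axis=0, keepdims=True)
    # With full_matrices=False, vt has shape (min(n, d), d); rows are right singular vectors.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T

def low_rank_distill_loss(teacher_feats: np.ndarray,
                          student_feats: np.ndarray,
                          k: int) -> float:
    """Mean squared error between teacher and student inside the top-k subspace.

    Assumption: both feature matrices have shape (n, d) with the same d.
    """
    basis = top_k_subspace(teacher_feats, k)
    teacher_proj = teacher_feats @ basis   # (n, k)
    student_proj = student_feats @ basis   # (n, k)
    return float(np.mean((teacher_proj - student_proj) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    teacher = rng.normal(size=(32, 16))                    # stand-in for cloud embeddings
    student = teacher + 0.1 * rng.normal(size=(32, 16))    # stand-in for edge embeddings
    print(low_rank_distill_loss(teacher, student, k=4))
```

Restricting the loss to a few dominant directions lets a small student spend its capacity on the structure that carries most of the teacher's variance. In a real pipeline the student would usually need a learned projection to reach the teacher's dimension, which this sketch omits.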
Findings
Achieves state-of-the-art performance on multiple CM-ReID benchmarks.
Effectively transfers knowledge from cloud MLLMs to lightweight edge models.
Demonstrates the practicality of deploying unified multimodal models on edge devices.
Abstract
Practical cloud-edge deployment of Cross-Modal Re-identification (CM-ReID) faces challenges due to maintaining a fragmented ecosystem of specialized cloud models for diverse modalities. While Multi-Modal Large Language Models (MLLMs) offer strong unification potential, existing approaches fail to adapt them into a single end-to-end backbone and lack effective knowledge distillation strategies for edge deployment. To address these limitations, we propose MLLMEmbed-ReID, a unified framework based on a powerful cloud-edge architecture. First, we adapt a foundational MLLM into a state-of-the-art cloud model. We leverage instruction-based prompting to guide the MLLM in generating a unified embedding space across RGB, infrared, sketch, and text modalities. This model is then trained efficiently with a hierarchical Low-Rank Adaptation finetuning (LoRA-SFT) strategy, optimized under a holistic…
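For readers unfamiliar with Low-Rank Adaptation, the sketch below shows the standard LoRA update on a single linear layer in NumPy. It illustrates plain LoRA only. The hierarchical variant mentioned in the abstract is not described in this excerpt and is not reproduced here. The class name `LoRALinear` and the values of `rank` and `alpha` are illustrative assumptions.

```python
import numpy as np

class LoRALinear:
    """Frozen weight W plus a trainable low-rank update scale * B @ A."""

    def __init__(self, weight: np.ndarray, rank: int, alpha: float = 16.0, seed: int = 0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = weight.shape
        self.weight = weight                                    # frozen, shape (out_dim, in_dim)
        self.A = rng.normal(scale=0.01, size=(rank, in_dim))    # trainable down-projection
        self.B = np.zeros((out_dim, rank))                      # trainable, zero-initialized
        self.scale = alpha / rank

    def __call__(self, x: np.ndarray) -> np.ndarray:
        # Equivalent to x @ (W + scale * B @ A).T without forming the full update matrix.
        return x @ self.weight.T + self.scale * (x @ self.A.T) @ self.B.T

    def trainable_params(self) -> int:
        return self.A.size + self.B.size

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    W = rng.normal(size=(64, 128))
    layer = LoRALinear(W, rank=4)
    x = rng.normal(size=(2, 128))
    print(layer(x).shape, layer.trainable_params(), W.size)
```

Because `B` starts at zero, the adapted layer initially reproduces the frozen layer exactly, and only `rank * (in_dim + out_dim)` parameters are updated during finetuning instead of the full `in_dim * out_dim`.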
Taxonomy
Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI)
