HyDRA: Hierarchical and Dynamic Rank Adaptation for Mobile Vision Language Model

Yuanhao Xi; Xiaohuan Bing; Ramin Yahyapour

arXiv:2512.20674 · cs.LG · December 25, 2025

Open Access

TL;DR

HyDRA introduces a hierarchical and dynamic rank adaptation framework for mobile vision language models, enabling efficient fine-tuning that outperforms baselines and sometimes surpasses full-parameter methods without increasing trainable parameters.

Contribution

The paper proposes HyDRA, a novel parameter-efficient fine-tuning framework with hierarchical and dynamic rank scheduling for mobile VLMs, improving performance without extra trainable parameters.

Findings

01 Achieves a 4.7% performance improvement over baselines.

02 Outperforms full-parameter fine-tuning on some tasks.

03 Is effective across various model sizes.

Abstract

Vision Language Models (VLMs) have undergone significant advancements, particularly with the emergence of mobile-oriented VLMs, which offer a wide range of application scenarios. However, the substantial computational requirements for training these models present a significant obstacle to their practical application. To address this issue, Low-Rank Adaptation (LoRA) has been proposed. Nevertheless, the standard LoRA with a fixed rank lacks sufficient capability for training mobile VLMs that process both text and image modalities. In this work, we introduce HyDRA, a parameter-efficient fine-tuning framework designed to implement hierarchical and dynamic rank scheduling for mobile VLMs. This framework incorporates two essential optimization strategies: (1) hierarchical optimization, which involves a coarse-grained approach that assigns different ranks to various layers, as well as a…
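To make the contrast between fixed-rank LoRA and layer-wise rank assignment concrete, here is a minimal NumPy sketch. It is not the authors' implementation and does not reproduce HyDRA's scheduling: the layer names, the rank values, the scaling factor `alpha`, and the helper functions are all hypothetical, chosen only to show that ranks can be redistributed across layers while the trainable parameter count stays the same.

```python
import numpy as np

def init_lora(d_out, d_in, rank, rng):
    """Create LoRA factors B (d_out x rank) and A (rank x d_in)."""
    A = rng.normal(0.0, 0.01, size=(rank, d_in))
    B = np.zeros((d_out, rank))  # zero init: the low-rank update starts at 0
    return B, A

def lora_forward(x, W, B, A, alpha):
    """Frozen weight W plus the scaled low-rank update (alpha / rank) * B @ A."""
    rank = A.shape[0]
    return x @ W.T + (alpha / rank) * (x @ A.T) @ B.T

def trainable_params(layer_ranks, d_out, d_in):
    """Each adapted layer trains rank * (d_out + d_in) parameters."""
    return sum(r * (d_out + d_in) for r in layer_ranks.values())

rng = np.random.default_rng(0)
d_in = d_out = 64

# Hypothetical rank schedules (assumptions for illustration only).
uniform_ranks = {"layer0": 8, "layer1": 8, "layer2": 8, "layer3": 8}
layerwise_ranks = {"layer0": 4, "layer1": 4, "layer2": 8, "layer3": 16}

# Build one adapter per layer using the layer-wise schedule.
adapters = {name: init_lora(d_out, d_in, r, rng)
            for name, r in layerwise_ranks.items()}

# Forward pass through one adapted layer with a frozen random weight.
W = rng.normal(size=(d_out, d_in))
x = rng.normal(size=(2, d_in))
B, A = adapters["layer3"]
y = lora_forward(x, W, B, A, alpha=16.0)

uniform_count = trainable_params(uniform_ranks, d_out, d_in)
layerwise_count = trainable_params(layerwise_ranks, d_out, d_in)
```

In this sketch both schedules sum to the same total rank, so `uniform_count` and `layerwise_count` are equal; only the distribution of capacity across layers changes. How ranks are actually chosen per layer, and how they change during training, is the part the paper addresses and is not modeled here.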

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications