Collaborative Multi-LoRA Experts with Achievement-based Multi-Tasks Loss for Unified Multimodal Information Extraction

Li Yuan; Yi Cai; Xudong Shen; Qing Li; Qingbao Huang; Zikun Deng; Tao Wang

arXiv:2505.06303·cs.LG·May 13, 2025

Open Access

TL;DR

This paper introduces C-LoRAE, a collaborative multi-LoRA expert framework with an achievement-based multi-task loss for unified multimodal information extraction. It reports better performance and training efficiency than traditional fine-tuning and LoRA methods.

Contribution

It proposes a collaborative multi-LoRA experts architecture with an achievement-based loss to enhance multi-task learning in multimodal information extraction.

Findings

01. Outperforms traditional fine-tuning and LoRA methods on seven benchmark datasets.

02. Uses a number of trainable parameters comparable to LoRA while achieving better results.

03. Balances multi-task training effectively despite data imbalance.

Abstract

Multimodal Information Extraction (MIE) has gained attention for extracting structured information from multimedia sources. Traditional methods tackle MIE tasks separately, missing opportunities to share knowledge across tasks. Recent approaches unify these tasks into a generation problem using instruction-based T5 models with visual adaptors, optimized through full-parameter fine-tuning. However, this method is computationally intensive, and multi-task fine-tuning often faces gradient conflicts, limiting performance. To address these challenges, we propose collaborative multi-LoRA experts with achievement-based multi-task loss (C-LoRAE) for MIE tasks. C-LoRAE extends the low-rank adaptation (LoRA) method by incorporating a universal expert to learn shared multimodal knowledge from cross-MIE tasks and task-specific experts to learn specialized instructional task features. This…
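The expert layout described in the abstract can be illustrated with a minimal sketch: a frozen linear layer plus one universal LoRA expert shared by all tasks and one LoRA expert per task. The abstract is truncated before it says how the experts are combined, so the plain summation below is an assumption, not the paper's method. The class names, the task names `task_a` and `task_b`, and the rank and alpha values are all hypothetical. The achievement-based loss is not sketched because its definition does not appear in the text above.

```python
import random


def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


def rand_matrix(rows, cols, rng, scale=0.02):
    """Matrix with small Gaussian entries."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class LoRAExpert:
    """One low-rank adapter: delta(x) = (alpha / rank) * B (A x)."""

    def __init__(self, d_in, d_out, rank, alpha, rng):
        self.A = rand_matrix(rank, d_in, rng)          # down-projection, random init
        self.B = [[0.0] * rank for _ in range(d_out)]  # up-projection, zero init (standard LoRA)
        self.scale = alpha / rank

    def delta(self, x):
        return [self.scale * v for v in matvec(self.B, matvec(self.A, x))]

    def parameter_count(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


class MultiLoRALinear:
    """Frozen weight W plus a universal expert and one expert per task.

    ASSUMPTION: the outputs are combined by simple addition; the source
    text does not specify the combination rule.
    """

    def __init__(self, d_in, d_out, tasks, rank=2, alpha=4.0, seed=0):
        rng = random.Random(seed)
        self.W = rand_matrix(d_out, d_in, rng, scale=0.1)  # frozen, never updated
        self.universal = LoRAExpert(d_in, d_out, rank, alpha, rng)
        self.task_experts = {
            task: LoRAExpert(d_in, d_out, rank, alpha, rng) for task in tasks
        }

    def forward(self, x, task):
        base = matvec(self.W, x)                      # frozen backbone path
        shared = self.universal.delta(x)              # knowledge shared across tasks
        specific = self.task_experts[task].delta(x)   # task-specific features
        return [b + s + t for b, s, t in zip(base, shared, specific)]

    def trainable_parameter_count(self):
        experts = [self.universal, *self.task_experts.values()]
        return sum(e.parameter_count() for e in experts)
```

Calling `MultiLoRALinear(4, 3, ["task_a", "task_b"]).forward(x, "task_a")` routes the input through the frozen weight, the universal expert, and only the `task_a` expert. Because every `B` starts at zero, the layer initially reproduces the frozen layer's output, and only the low-rank matrices would be updated during training.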

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Topic Modeling