UniRGB-IR: A Unified Framework for Visible-Infrared Semantic Tasks via Adapter Tuning

Maoxun Yuan; Bo Cui; Tianyi Zhao; Jiayi Wang; Shan Fu; Xue Yang; Xingxing Wei

arXiv:2404.17360·cs.CV·October 14, 2025

TL;DR

UniRGB-IR introduces a scalable adapter-based framework that enhances pre-trained RGB foundation models for diverse RGB-IR semantic tasks, achieving state-of-the-art results with improved generalization.

Contribution

The paper proposes a novel adapter mechanism with MFP and SFI modules to effectively incorporate multi-modal features into frozen pre-trained models for RGB-IR tasks.
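To make the general idea of adapter tuning concrete, here is a minimal pure-Python sketch. It is not the paper's MFP or SFI design: `FrozenBlock`, `Adapter`, the zero-initialised `gate`, and all dimensions are hypothetical choices made for illustration. It only shows the pattern of keeping a pre-trained backbone fixed while a small trainable side branch injects infrared features into it.

```python
import random


def linear(x, w, b):
    """Compute W @ x + b for a vector x, using plain Python lists."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def init_linear(dim, rng, scale=0.5):
    """Random square weight matrix and zero bias (illustrative initialisation)."""
    w = [[rng.uniform(-scale, scale) for _ in range(dim)] for _ in range(dim)]
    return w, [0.0] * dim


class FrozenBlock:
    """Hypothetical stand-in for one pre-trained backbone layer; never updated."""

    def __init__(self, dim, rng):
        self.w, self.b = init_linear(dim, rng)

    def __call__(self, x):
        h = linear(x, self.w, self.b)
        return [xi + max(0.0, hi) for xi, hi in zip(x, h)]  # residual + ReLU

    def num_params(self):
        return len(self.w) * len(self.w[0]) + len(self.b)


class Adapter:
    """Hypothetical trainable side branch that adds projected IR features."""

    def __init__(self, dim, rng):
        self.w, self.b = init_linear(dim, rng)
        self.gate = 0.0  # assumption: zero init, so the backbone starts unchanged

    def __call__(self, x, ir_feat):
        extra = linear(ir_feat, self.w, self.b)
        return [xi + self.gate * ei for xi, ei in zip(x, extra)]

    def num_params(self):
        return len(self.w) * len(self.w[0]) + len(self.b) + 1  # +1 for the gate


def forward(blocks, adapters, rgb_feat, ir_feat):
    """Run the backbone, letting each adapter inject IR features before its block."""
    x = rgb_feat
    for block, adapter in zip(blocks, adapters):
        if adapter is not None:
            x = adapter(x, ir_feat)
        x = block(x)
    return x


rng = random.Random(0)
dim, depth = 4, 3
blocks = [FrozenBlock(dim, rng) for _ in range(depth)]
adapters = [Adapter(dim, rng) for _ in range(depth)]
rgb = [rng.uniform(-1, 1) for _ in range(dim)]
ir = [rng.uniform(-1, 1) for _ in range(dim)]

# With zero gates the adapted model reproduces the frozen backbone exactly.
baseline = forward(blocks, [None] * depth, rgb, ir)
with_adapters = forward(blocks, adapters, rgb, ir)

# Simulate a training update that touches only adapter parameters.
for adapter in adapters:
    adapter.gate = 0.1
tuned = forward(blocks, adapters, rgb, ir)

frozen_params = sum(block.num_params() for block in blocks)
trainable_params = sum(adapter.num_params() for adapter in adapters)
```

In this toy setup only the adapter weights and gates would receive gradient updates, while the backbone weights stay as loaded. In a PyTorch codebase the same effect is commonly obtained by disabling gradients on the backbone module, though how the official repository does this is not stated here.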

Findings

1. Achieves state-of-the-art performance on various RGB-IR semantic tasks.
2. Effectively incorporates multi-scale features via adapter modules.
3. Maintains high scalability and generalization across tasks.

Abstract

Semantic analysis of visible (RGB) and infrared (IR) images has gained significant attention due to its enhanced accuracy and robustness under challenging conditions such as low illumination and adverse weather. However, because no foundation models have been pre-trained on large-scale infrared image datasets, existing methods tend to design task-specific frameworks and directly fine-tune them with pre-trained foundation models on their RGB-IR semantic relevance datasets, which results in poor scalability and limited generalization. To address these limitations, we propose UniRGB-IR, a scalable and efficient framework for RGB-IR semantic tasks that introduces a novel adapter mechanism to effectively incorporate rich multi-modal features into pre-trained RGB-based foundation models. Our framework comprises three key components: a vision transformer (ViT) foundation model, a…

Code & Models

Repositories

potsui99/unirgb-ir (PyTorch, official)

Models

tsuipo99/UniRGB-IR (Hugging Face model)

Taxonomy

Topics: Infrared Target Detection Methodologies · CCD and CMOS Imaging Sensors

Methods: Attention Is All You Need · Linear Layer · Dense Connections · Layer Normalization · Multi-Head Attention · Residual Connection · Softmax · Adapter · Vision Transformer