Leveraging Multimodal Large Language Models for All-in-One Image Restoration via a Mixture of Frequency Experts
Eunho Lee, Rei Kawakami, and Youngbae Hwang

TL;DR
This paper introduces an image restoration framework guided by a multimodal large language model (MLLM). It combines a mixture of frequency experts with relational routing to handle diverse and composite degradations within a single model.
Contribution
The paper proposes an MLLM-guided restoration approach that adds a mixture-of-frequency-experts (MoFE) module and relational routing to an encoder-decoder network for all-in-one image restoration.
Findings
Achieves state-of-the-art performance on the CDD11 dataset.
Outperforms previous methods by up to 1.35 dB in restoration quality.
Demonstrates strong results across multiple restoration benchmarks.
Abstract
All-in-one image restoration seeks to recover clean images from inputs affected by diverse and unknown degradations using a unified framework. Recent methods have shown strong performance by identifying degradation characteristics to guide the restoration process. However, many of them treat degradations as discrete categories, which limits their ability to model the continuous relational structure that arises in composite degradations. To address this issue, we propose a multimodal large language model (MLLM)-guided image restoration framework that exploits multimodal embeddings as guidance for low-level restoration. Specifically, MLLM-derived features are injected into an encoder-decoder architecture through an MLLM-guided fusion block (MGFB) to enhance degradation-aware representations. In addition, we incorporate a mixture-of-frequency-experts (MoFE) module that adaptively combines…
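The abstract is cut off before it says what the MoFE module combines, so the following is only a rough illustration of the general idea of routing over frequency-specific experts, not the authors' implementation. It assumes, purely for illustration, that the spectrum is split into concentric radial bands, that each "expert" is a scalar gain on one band, and that a linear router maps a guidance vector (a stand-in for an MLLM embedding) to softmax weights. All function and parameter names (`radial_band_masks`, `mixture_of_frequency_experts`, `gate_w`, `expert_gains`) are hypothetical.

```python
import numpy as np


def softmax(x):
    # Numerically stable softmax over a 1-D vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()


def radial_band_masks(h, w, num_bands):
    # Assumption: partition the 2-D frequency plane into concentric bands
    # by radial frequency. The masks are disjoint and sum to 1 everywhere.
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    edges = np.linspace(0.0, radius.max(), num_bands + 1)
    masks = []
    for i in range(num_bands):
        lo, hi = edges[i], edges[i + 1]
        if i == num_bands - 1:
            mask = (radius >= lo) & (radius <= hi)  # last band includes its upper edge
        else:
            mask = (radius >= lo) & (radius < hi)
        masks.append(mask.astype(np.float64))
    return masks


def mixture_of_frequency_experts(feat, guidance, gate_w, expert_gains):
    # feat:         (H, W) real-valued feature map
    # guidance:     (D,) embedding, a stand-in for an MLLM-derived feature
    # gate_w:       (K, D) hypothetical linear router
    # expert_gains: (K,) hypothetical per-band scalar "experts"
    weights = softmax(gate_w @ guidance)
    spectrum = np.fft.fft2(feat)
    masks = radial_band_masks(feat.shape[0], feat.shape[1], len(expert_gains))
    out = np.zeros(feat.shape, dtype=np.float64)
    for w_k, g_k, m_k in zip(weights, expert_gains, masks):
        # The radial masks are symmetric, so each band is real up to rounding.
        band = np.fft.ifft2(spectrum * m_k).real
        out += w_k * g_k * band
    return out, weights


rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 8))
guidance = rng.standard_normal(4)
gate_w = rng.standard_normal((3, 4))
out, weights = mixture_of_frequency_experts(feat, guidance, gate_w, np.ones(3))
print(out.shape, weights)
```

In this toy setup the router only rescales fixed bands. A learned version would presumably replace the scalar gains with small networks, but that goes beyond what the truncated abstract states.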
