MiM-DiT: MoE in MoE with Diffusion Transformers for All-in-One Image Restoration
Lingshun Kong, Jiawei Zhang, Zhengpeng Duan, Xiaohe Wu, Yueqi Yang, Xiaotao Wang, Dongqing Zou, Lei Lei, Jinshan Pan

TL;DR
This paper introduces MiM-DiT, a unified image restoration framework that combines a dual-level Mixture-of-Experts (MoE) architecture with a pretrained diffusion model to handle diverse degradation types in a single model.
Contribution
It presents a novel dual-level MoE architecture integrated with a pretrained diffusion model for all-in-one image restoration.
Findings
Outperforms state-of-the-art methods on multiple restoration tasks.
Effectively handles diverse degradation types with high specialization.
Achieves both coarse and fine-grained adaptation within a unified model.
Abstract
All-in-one image restoration is challenging because different degradation types, such as haze, blur, noise, and low-light, impose diverse requirements on restoration strategies, making it difficult for a single model to handle them effectively. In this paper, we propose a unified image restoration framework that integrates a dual-level Mixture-of-Experts (MoE) architecture with a pretrained diffusion model. The framework operates at two levels: the Inter-MoE layer adaptively combines expert groups to handle major degradation types, while the Intra-MoE layer further selects specialized sub-experts to address fine-grained variations within each type. This design enables the model to achieve coarse-grained adaptation across diverse degradation categories while performing fine-grained modulation for specific intra-class variations, ensuring both high specialization in handling complex,…
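The two routing levels described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the gating functions, expert shapes, or how the layers attach to the diffusion transformer. The sketch assumes a softmax gate that blends expert groups at the Inter-MoE level and a top-k gate that selects sub-experts at the Intra-MoE level, with random linear maps standing in for experts. All class names, parameters, and sizes (`InterMoE`, `IntraMoE`, `top_k`, `dim`) are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    """Random weights standing in for learned parameters (assumption)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


class IntraMoE:
    """One expert group: selects the top-k sub-experts for a feature vector."""

    def __init__(self, dim, num_sub_experts, top_k, rng):
        self.gate = rand_matrix(num_sub_experts, dim, rng)
        self.experts = [rand_matrix(dim, dim, rng) for _ in range(num_sub_experts)]
        self.top_k = top_k

    def route(self, x):
        # Score every sub-expert, keep the k best, renormalize their weights.
        scores = matvec(self.gate, x)
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        chosen = order[: self.top_k]
        weights = softmax([scores[i] for i in chosen])
        return chosen, weights

    def forward(self, x):
        chosen, weights = self.route(x)
        out = [0.0] * len(x)
        for i, w in zip(chosen, weights):
            y = matvec(self.experts[i], x)
            out = [o + w * yi for o, yi in zip(out, y)]
        return out


class InterMoE:
    """Top level: softly combines the outputs of all expert groups."""

    def __init__(self, dim, num_groups, num_sub_experts, top_k, seed=0):
        rng = random.Random(seed)
        self.gate = rand_matrix(num_groups, dim, rng)
        self.groups = [
            IntraMoE(dim, num_sub_experts, top_k, rng) for _ in range(num_groups)
        ]

    def group_weights(self, x):
        # One weight per expert group (e.g. one group per major degradation type).
        return softmax(matvec(self.gate, x))

    def forward(self, x):
        weights = self.group_weights(x)
        out = [0.0] * len(x)
        for w, group in zip(weights, self.groups):
            y = group.forward(x)
            out = [o + w * yi for o, yi in zip(out, y)]
        return out


# Usage: route a toy 4-dimensional feature through 3 groups of 4 sub-experts.
model = InterMoE(dim=4, num_groups=3, num_sub_experts=4, top_k=2)
feature = [0.5, -1.0, 0.25, 2.0]
restored = model.forward(feature)
```

In this sketch the outer gate is dense, so every group contributes, while the inner gate is sparse, so only `top_k` sub-experts run per group. That is one plausible reading of "adaptively combines" versus "selects" in the abstract; the paper may route differently.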
Taxonomy
Topics: Advanced Image Processing Techniques · Image Enhancement Techniques · Image and Video Quality Assessment
