Do We Need to Design Specific Diffusion Models for Different Tasks? Try ONE-PIC
Ming Tao, Bing-Kun Bao, Yaowei Wang, Changsheng Xu

TL;DR
This paper introduces ONE-PIC, a simple and efficient method for fine-tuning pretrained diffusion models across various tasks without additional modules, by using in-visual-context tuning and a masking strategy.
Contribution
It proposes a novel fine-tuning approach that enhances diffusion models' adaptability without extra networks, simplifying the process and reducing costs.
Findings
Achieves competitive performance on multiple tasks
Reduces fine-tuning costs and complexity
Streamlines adaptation process
Abstract
Large pretrained diffusion models have demonstrated impressive generation capabilities and have been adapted to various downstream tasks. However, unlike Large Language Models (LLMs), which can learn multiple tasks in a single model based on instructed data, diffusion models always require additional branches, task-specific training strategies, and losses for effective adaptation to different downstream tasks. This task-specific fine-tuning approach brings two drawbacks. 1) The task-specific additional networks create gaps between pretraining and fine-tuning, which hinders the transfer of pretrained knowledge. 2) It necessitates careful design of the additional networks, raising the barrier to learning and implementation and making it less user-friendly. Thus, a question arises: Can we achieve a simple, efficient, and general approach to fine-tune diffusion models? To this end, we propose…
Taxonomy
Topics: Model Reduction and Neural Networks · Neural Networks and Applications
Methods: Diffusion
