TL;DR
This paper introduces PHDiffusion, a diffusion-based model for painterly image harmonization that efficiently stylizes foreground objects while preserving fine content details, outperforming previous methods in style transfer and content retention.
Contribution
The paper proposes a novel diffusion model with a lightweight adaptive encoder and dual encoder fusion for painterly harmonization, enabling better style transfer and content preservation.
Findings
Balances style transfer and content preservation better than prior state-of-the-art methods
Stylizes foreground objects efficiently while retaining fine content details
Produces artistically coherent composite images
Abstract
Painterly image harmonization aims to insert photographic objects into paintings and obtain artistically coherent composite images. Previous methods for this task mainly rely on inference optimization or generative adversarial networks, but they are either very time-consuming or struggle with fine control of the foreground objects (e.g., texture and content details). To address these issues, we propose a novel Painterly Harmonization stable Diffusion model (PHDiffusion), which includes a lightweight adaptive encoder and a Dual Encoder Fusion (DEF) module. Specifically, the adaptive encoder and the DEF module first stylize foreground features within each encoder. Then, the stylized foreground features from both encoders are combined to guide the harmonization process. During training, besides the noise loss in the diffusion model, we additionally employ content loss and two style losses,…
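The two-step idea in the abstract (stylize foreground features within each encoder, then combine them) can be illustrated with a rough sketch. This is not the paper's implementation: the abstract does not say how the features are stylized or combined, so the sketch assumes an AdaIN-style statistic match against the background as a stand-in for stylization and a plain sum as a stand-in for fusion. All function names here are hypothetical.

```python
import numpy as np

def masked_mean_std(feat, mask, eps=1e-5):
    """Per-channel mean/std of feat (C, H, W) over pixels where mask (H, W) is 1."""
    w = mask[None, :, :]
    n = max(w.sum(), 1.0)
    mean = (feat * w).sum(axis=(1, 2), keepdims=True) / n
    var = (((feat - mean) ** 2) * w).sum(axis=(1, 2), keepdims=True) / n
    return mean, np.sqrt(var + eps)

def stylize_foreground(feat, fg_mask):
    """Assumed stand-in for stylization: match foreground feature statistics
    to the background (painting) statistics, AdaIN-style."""
    fg_mean, fg_std = masked_mean_std(feat, fg_mask)
    bg_mean, bg_std = masked_mean_std(feat, 1.0 - fg_mask)
    stylized = (feat - fg_mean) / fg_std * bg_std + bg_mean
    m = fg_mask[None, :, :]
    # Only the foreground region is changed; background features pass through.
    return m * stylized + (1.0 - m) * feat

def fuse_dual_encoders(feat_a, feat_b, fg_mask):
    """Stylize the foreground within each encoder's features, then combine.
    Summation is an assumption; the source does not specify the fusion rule."""
    return stylize_foreground(feat_a, fg_mask) + stylize_foreground(feat_b, fg_mask)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat_a = rng.normal(size=(4, 8, 8))   # features from one encoder (toy data)
    feat_b = rng.normal(size=(4, 8, 8))   # features from the other encoder (toy data)
    fg_mask = np.zeros((8, 8))
    fg_mask[2:6, 2:6] = 1.0               # square foreground region
    fused = fuse_dual_encoders(feat_a, feat_b, fg_mask)
    print(fused.shape)
```

In this sketch the foreground mask keeps the painting's background features untouched, so only the inserted object is restyled before the two feature maps are merged.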
Taxonomy
Methods: Diffusion
