MFTF: Mask-free Training-free Object Level Layout Control Diffusion Model
Shan Yang

TL;DR
MFTF introduces a novel diffusion-based method that enables precise object layout control in generated images without masks or retraining, supporting multi-object positioning and semantic editing.
Contribution
It presents a training-free, mask-free diffusion model that allows accurate object position adjustments during image generation, a significant advance over existing methods.
Findings
Supports both single and multi-object positional adjustments
Enables simultaneous layout control and semantic editing
Achieves precise object placement without additional masks or retraining
Abstract
Text-to-image generation models have revolutionized content creation, but diffusion-based vision-language models still face challenges in precisely controlling the shape, appearance, and positional placement of objects in generated images using text guidance alone. Existing global image editing models rely on additional masks or images as guidance to achieve layout control, often requiring retraining of the model. While local object-editing models allow modifications to object shapes, they lack the capability to control object positions. To address these limitations, we propose the Mask-free Training-free Object-Level Layout Control Diffusion Model (MFTF), which provides precise control over object positions without requiring additional masks or images. The MFTF model supports both single-object and multi-object positional adjustments, such as translation and rotation, while enabling…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsManufacturing Process and Optimization · Iterative Learning Control Systems · Advanced Control Systems Optimization
MethodsSoftmax · Attention Is All You Need · Diffusion
