T$^3$-S2S: Training-free Triplet Tuning for Sketch to Scene Synthesis in Controllable Concept Art Generation
Zhenhong Sun, Yifu Wang, Yonhon Ng, Yongzhi Xu, Daoyi Dong, Hongdong Li, Pan Ji

TL;DR
This paper introduces T3-S2S, a training-free method that enhances sketch-to-scene generation in concept art by refining control over multi-instance scene synthesis and terrain layout, improving detail and structure fidelity.
Contribution
It proposes a novel training-free triplet tuning scheme that revitalizes ControlNet for detailed multi-instance scene generation from sketches, with modules for prompt balance, feature emphasis, and contour refinement.
Findings
Consistently produces detailed multi-instance 2D scenes from sketches
Effectively captures terrain layout and scene structure
Improves alignment with input prompts
Abstract
2D concept art generation for 3D scenes is a crucial yet challenging task in computer graphics, as creating natural intuitive environments still demands extensive manual effort in concept design. While generative AI has simplified 2D concept design via text-to-image synthesis, it struggles with complex multi-instance scenes and offers limited support for structured terrain layout. In this paper, we propose a Training-free Triplet Tuning for Sketch-to-Scene (T3-S2S) generation after reviewing the entire cross-attention mechanism. This scheme revitalizes the ControlNet model for detailed multi-instance generation via three key modules: Prompt Balance ensures keyword representation and minimizes the risk of missing critical instances; Characteristic Priority emphasizes sketch-based features by highlighting TopK indices in feature channels; and Dense Tuning refines contour details within…
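As a rough illustration of the Characteristic Priority idea described in the abstract (emphasizing the TopK indices in feature channels), the sketch below ranks channels and amplifies the K strongest ones. It is not the authors' implementation: the function name `emphasize_topk_channels`, the mean-absolute-activation ranking criterion, and the multiplicative `scale` factor are all assumptions made for illustration, and the abstract does not specify how the paper selects or boosts channels.

```python
from typing import List


def emphasize_topk_channels(
    features: List[List[float]], k: int, scale: float = 2.0
) -> List[List[float]]:
    """Amplify the k channels with the largest mean absolute activation.

    `features` holds one flat, non-empty list of activations per channel.
    The ranking criterion and the `scale` factor are illustrative
    assumptions, not details taken from the paper.
    """
    if not 0 < k <= len(features):
        raise ValueError("k must be between 1 and the number of channels")

    # Score each channel by its mean absolute activation (assumed criterion).
    scores = [sum(abs(v) for v in channel) / len(channel) for channel in features]

    # Indices of the k highest-scoring channels.
    ranked = sorted(range(len(features)), key=lambda i: scores[i], reverse=True)
    topk = set(ranked[:k])

    # Scale the selected channels; copy the remaining channels unchanged.
    return [
        [v * scale for v in channel] if i in topk else list(channel)
        for i, channel in enumerate(features)
    ]


if __name__ == "__main__":
    # Three toy channels with two activations each.
    toy = [[0.1, -0.2], [1.0, 2.0], [0.5, 0.5]]
    print(emphasize_topk_channels(toy, k=1))
```

In a real ControlNet setting the features would be tensors inside the network, and the selection would more likely use a tensor top-k operation; plain lists are used here only to keep the sketch dependency-free.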
Taxonomy
Topics: Interactive and Immersive Displays · Music Technology and Sound Studies · Augmented Reality Applications
Methods: Softmax · Attention Is All You Need
