RealCamo: Boosting Real Camouflage Synthesis with Layout Controls and Textual-Visual Guidance

Chunyuan Chen; Yunuo Cai; Shujuan Li; Weiyun Liang; Bin Wang; Jing Xu

arXiv:2512.22974 · cs.CV · January 16, 2026

TL;DR

RealCamo is an out-painting-based framework that adds layout controls and textual-visual guidance to camouflaged image generation, producing more realistic and semantically coherent camouflaged images for augmenting camouflaged object detection (COD) training data.

Contribution

RealCamo introduces a controllable out-painting framework with layout controls and multimodal textual-visual guidance, improving both the realism of generated camouflaged images and the semantic consistency between foreground objects and generated backgrounds.

Findings

01. Improved visual fidelity and realism in generated images.

02. Enhanced semantic coherence between foreground and background.

03. Effective camouflage quality measurement through a new divergence metric.
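
This summary does not define the paper's divergence metric, so the following is only a rough illustration of how a divergence can score camouflage, not the authors' method. It assumes a Jensen-Shannon divergence between the intensity histograms of the foreground and the background, where a lower value means the object blends in more. The function names, the bin count, and the grayscale input are all hypothetical choices made for this sketch.

```python
import numpy as np


def intensity_histogram(pixels, bins=16):
    """Normalized intensity histogram over [0, 1] (hypothetical helper)."""
    counts, _ = np.histogram(pixels, bins=bins, range=(0.0, 1.0))
    counts = counts.astype(np.float64) + 1e-8  # smooth empty bins to avoid log(0)
    return counts / counts.sum()


def js_divergence(p, q):
    """Jensen-Shannon divergence in base 2, bounded in [0, 1]."""
    m = 0.5 * (p + q)
    kl_pm = np.sum(p * np.log2(p / m))
    kl_qm = np.sum(q * np.log2(q / m))
    return 0.5 * (kl_pm + kl_qm)


def fg_bg_divergence(image, mask, bins=16):
    """Assumed stand-in score, not the paper's metric.

    image: HxW float array with values in [0, 1].
    mask:  HxW boolean array, True where the foreground object is.
    """
    fg_hist = intensity_histogram(image[mask], bins)
    bg_hist = intensity_histogram(image[~mask], bins)
    return js_divergence(fg_hist, bg_hist)


# Toy data: a noisy background with a square foreground region.
rng = np.random.default_rng(0)
mask = np.zeros((64, 64), dtype=bool)
mask[24:40, 24:40] = True

camouflaged = rng.uniform(0.0, 1.0, size=(64, 64))  # foreground matches background statistics
salient = camouflaged.copy()
salient[mask] = 0.95                                 # foreground is a flat bright patch

print("camouflaged:", fg_bg_divergence(camouflaged, mask))
print("salient:    ", fg_bg_divergence(salient, mask))
```

On this toy data the camouflaged image should score well below the salient one, which is the behaviour one would want from any such measure. Intensity histograms ignore texture and spatial structure, so a practical metric would likely compare richer features.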

Abstract

Camouflaged image generation (CIG) has recently emerged as an efficient alternative for acquiring high-quality training data for camouflaged object detection (COD). However, existing CIG methods still suffer from a substantial gap to real camouflaged imagery: generated images either lack sufficient camouflage due to weak visual similarity, or exhibit cluttered backgrounds that are semantically inconsistent with foreground targets. To address these limitations, we propose RealCamo, a novel out-painting-based framework for controllable realistic camouflaged image generation. RealCamo explicitly introduces additional layout controls to regulate global image structure, thereby improving semantic coherence between foreground objects and generated backgrounds. Moreover, we construct a multimodal textual-visual condition by combining a unified fine-grained textual task description with…

Taxonomy

Topics: Visual Attention and Saliency Detection · Generative Adversarial Networks and Image Synthesis · Image Enhancement Techniques