APRNet: Attention-based Pixel-wise Rendering Network for Photo-Realistic Text Image Generation
Yangming Shi, Haisong Ding, Kai Chen, Qiang Huo

TL;DR
APRNet is an attention-based pixel-wise rendering network that transfers the background and foreground color patterns of a style image onto a content image to generate photo-realistic text images, improving synthesis quality on Chinese handwriting datasets.
Contribution
It presents a novel combination of attention mechanisms and pixel-wise style modulation for style transfer in text image generation.
Findings
Enhanced photo-realistic text image quality
Effective style transfer of background and foreground colors
Improved synthesis results on Chinese handwriting datasets
Abstract
Style-guided text image generation aims to synthesize a text image that imitates a reference image's appearance while keeping the text content unaltered. Text image appearance has many aspects. In this paper, we focus on transferring the style image's background and foreground color patterns to the content image to generate a photo-realistic text image. To achieve this goal, we propose 1) a content-style cross-attention-based pixel sampling approach to roughly mimic the style text image's background; 2) a pixel-wise style modulation technique to transfer the varying color patterns of the style image to the content image in a spatially adaptive manner; 3) a cross-attention-based multi-scale style fusion approach to solve the text foreground misalignment issue between style and content images; 4) an image patch shuffling strategy to create style, content, and ground-truth image tuples for training.…
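The first component in the abstract, cross-attention-based pixel sampling, can be illustrated with a generic scaled dot-product attention sketch. This is not the authors' implementation: the function name `cross_attention_sample`, the flat list-of-vectors layout, and the choice of using raw style RGB pixels as the attention values are all assumptions made for illustration. Each content position forms a query, each style position supplies a key, and the output pixel is a softmax-weighted mix of style pixels.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_sample(content_feats, style_feats, style_pixels):
    """Hypothetical sketch: sample style pixels with content-style attention.

    content_feats: N query vectors, one per content position
    style_feats:   M key vectors, one per style position
    style_pixels:  M RGB tuples used as attention values (an assumption)
    Returns N RGB tuples, each a weighted average of the style pixels.
    """
    dim = len(content_feats[0])
    scale = 1.0 / math.sqrt(dim)  # standard scaled dot-product factor
    output = []
    for query in content_feats:
        # Similarity between this content position and every style position.
        scores = [
            scale * sum(q * k for q, k in zip(query, key))
            for key in style_feats
        ]
        weights = softmax(scores)
        # Blend the style pixels channel by channel using the weights.
        pixel = tuple(
            sum(w * p[c] for w, p in zip(weights, style_pixels))
            for c in range(3)
        )
        output.append(pixel)
    return output


# Toy usage: two content positions, two style positions (made-up features).
content = [[10.0, 0.0], [0.0, 10.0]]
style = [[10.0, 0.0], [0.0, 10.0]]
pixels = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]  # red and blue style pixels
sampled = cross_attention_sample(content, style, pixels)
```

In this toy setup each content position picks almost exclusively the style pixel whose feature matches its own, which is the sense in which attention lets the content image "sample" background colors from the style image. A real model would learn the feature extractors and operate on feature maps rather than hand-written vectors.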
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Image Retrieval and Classification Techniques · Handwritten Text Recognition Techniques
