XMP-Font: Self-Supervised Cross-Modality Pre-training for Few-Shot Font Generation

Wei Liu; Fangyue Liu; Fei Ding; Qian He; Zili Yi

arXiv:2204.05084 · cs.CV · May 6, 2022 · 5 cites

Open Access

TL;DR

This paper introduces XMP-Font, a self-supervised cross-modality pre-training approach with a transformer encoder for few-shot font generation, effectively capturing complex style features without fine-tuning.

Contribution

The paper proposes a novel self-supervised pre-training strategy and a cross-modality transformer encoder to improve style representation in few-shot font generation.

Findings

01. Successfully transfers styles at all scales.

02. Requires only one reference glyph.

03. Achieves 28% fewer bad cases than state-of-the-art methods.

Abstract

Generating a new font library is a very labor-intensive and time-consuming job for glyph-rich scripts. Few-shot font generation is thus required, as it requires only a few glyph references without fine-tuning during test. Existing methods follow the style-content disentanglement paradigm and expect novel fonts to be produced by combining the style codes of the reference glyphs and the content representations of the source. However, these few-shot font generation methods either fail to capture content-independent style representations, or employ localized component-wise style representations, which is insufficient to model many Chinese font styles that involve hyper-component features such as inter-component spacing and "connected-stroke". To resolve these drawbacks and make the style representations more reliable, we propose a self-supervised cross-modality pre-training strategy and a…

Taxonomy

Topics: Video Analysis and Summarization · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis