Beyond Patches: Global-aware Autoregressive Model for Multimodal Few-Shot Font Generation
Haonan Cai, Yuxuan Luo, Zhouhui Lian

TL;DR
GAR-Font is a novel autoregressive framework for multimodal few-shot font generation that captures global style patterns and allows flexible style control via language, outperforming existing methods.
Contribution
Introduces GAR-Font, a global-aware autoregressive model with a multimodal style encoder and post-refinement for improved font synthesis from limited references.
Findings
Outperforms existing FFG methods in style fidelity and quality.
Effectively captures global stylistic patterns with a new tokenizer.
Enables style control through lightweight language-style adaptation.
Abstract
Manual font design is an intricate process that transforms a stylistic visual concept into a coherent glyph set. This challenge persists in automated Few-shot Font Generation (FFG), where models often struggle to preserve both the structural integrity and stylistic fidelity from limited references. While autoregressive (AR) models have demonstrated impressive generative capabilities, their application to FFG is constrained by conventional patch-level tokenization, which neglects global dependencies crucial for coherent font synthesis. Moreover, existing FFG methods remain within the image-to-image paradigm, relying solely on visual references and overlooking the role of language in conveying stylistic intent during font design. To address these limitations, we propose GAR-Font, a novel AR framework for multimodal few-shot font generation. GAR-Font introduces a global-aware tokenizer…
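As a rough illustration of the patch-level tokenization the abstract criticizes, the sketch below quantizes a glyph image patch by patch against a small codebook. It is not GAR-Font's tokenizer: the function name `patch_tokenize`, the random codebook, and the 16x16 image with 4x4 patches are all assumptions chosen for brevity. It only shows that each token is computed from its own patch, so no token carries information about the glyph as a whole.

```python
import numpy as np

def patch_tokenize(image, codebook, patch=4):
    """Assign each non-overlapping patch the index of its nearest codebook vector.

    Hypothetical helper for illustration; a generic VQ-style patch tokenizer,
    not the tokenizer proposed in the paper.
    """
    h, w = image.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly into patches"
    gh, gw = h // patch, w // patch
    # Split into a (gh, gw) grid of patches, then flatten each patch to a vector.
    patches = (
        image.reshape(gh, patch, gw, patch)
        .transpose(0, 2, 1, 3)
        .reshape(gh * gw, patch * patch)
    )
    # Squared Euclidean distance from every patch to every codebook entry.
    dists = ((patches[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1).reshape(gh, gw)

rng = np.random.default_rng(0)
glyph = rng.random((16, 16))      # stand-in for a grayscale glyph image
codebook = rng.random((8, 16))    # 8 assumed code vectors, one per 4x4 patch shape
tokens = patch_tokenize(glyph, codebook)

# Edit only the top-left patch and re-tokenize: every other token is unchanged,
# because each token depends on its own patch alone.
edited = glyph.copy()
edited[:4, :4] = 1.0 - edited[:4, :4]
tokens_edited = patch_tokenize(edited, codebook)
others = np.ones(tokens.shape, dtype=bool)
others[0, 0] = False
```

Under these assumptions, a global property such as stroke weight or slant is spread across many local tokens, so an AR model has to recover it from the sequence rather than reading it from any one token.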
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · Multimodal Machine Learning Applications
