Scalable Font Reconstruction with Dual Latent Manifolds
Nikita Srivatsan, Si Wu, Jonathan T. Barron, Taylor Berg-Kirkpatrick

TL;DR
This paper introduces a scalable deep generative model for font reconstruction that learns disentangled style and character shape manifolds, enabling generalization to unseen characters and improving image naturalness without adversarial training.
Contribution
The authors develop a novel model that scales font reconstruction to many character types by disentangling style and shape, and introduce a new likelihood-based loss for more natural images.
Findings
Outperforms existing style transfer methods on font reconstruction tasks
Successfully generalizes to unseen characters in multiple languages
Produces more natural images without using a discriminator
Abstract
We propose a deep generative model that performs typography analysis and font reconstruction by learning disentangled manifolds of both font style and character shape. Our approach enables us to massively scale up the number of character types we can effectively model compared to previous methods. Specifically, we infer separate latent variables representing character and font via a pair of inference networks which take as input sets of glyphs that either all share a character type, or belong to the same font. This design allows our model to generalize to characters that were not observed during training time, an important task in light of the relative sparsity of most fonts. We also put forward a new loss, adapted from prior work that measures likelihood using an adaptive distribution in a projected space, resulting in more natural images without requiring a discriminator. We evaluate…
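The dual-inference idea in the abstract can be illustrated with a toy NumPy sketch. It is not the authors' architecture: the linear encoders, the mean pooling over each glyph set, the latent sizes, and every name below (`encode_sets`, `decode`, `W_font`, `W_char`, `W_dec`) are assumptions made for illustration. It only shows how a font latent can be inferred from glyphs sharing a font, a character latent from glyphs sharing a character type, and how pairing the two yields a glyph for any (font, character) cell, including unobserved ones.

```python
import numpy as np

rng = np.random.default_rng(0)

F, C, H, W = 4, 6, 8, 8       # fonts, character types, glyph height/width (toy sizes)
D_FONT, D_CHAR = 5, 3         # latent sizes (hypothetical)

glyphs = rng.random((F, C, H, W))       # stand-in for a font-by-character glyph grid
observed = rng.random((F, C)) > 0.3     # fonts are sparse: many cells are missing
observed[:, 0] = True                   # keep at least one glyph per font ...
observed[0, :] = True                   # ... and at least one per character type

# Untrained random weights; a real model would learn these.
W_font = rng.normal(size=(H * W, D_FONT)) * 0.1
W_char = rng.normal(size=(H * W, D_CHAR)) * 0.1
W_dec = rng.normal(size=(D_FONT + D_CHAR, H * W)) * 0.1


def encode_sets(glyphs, observed, weights, axis):
    """Embed every glyph, then mean-pool the observed ones along `axis`.

    axis=1 pools over characters (one latent per font);
    axis=0 pools over fonts (one latent per character type).
    """
    flat = glyphs.reshape(*glyphs.shape[:2], -1)
    emb = np.tanh(flat @ weights)                  # (F, C, D)
    mask = observed[..., None].astype(float)       # (F, C, 1)
    return (emb * mask).sum(axis=axis) / mask.sum(axis=axis)


def decode(z_font, z_char):
    """Pair every font latent with every character latent and render a glyph."""
    n_fonts, n_chars = z_font.shape[0], z_char.shape[0]
    pairs = np.concatenate(
        [
            np.repeat(z_font[:, None, :], n_chars, axis=1),
            np.repeat(z_char[None, :, :], n_fonts, axis=0),
        ],
        axis=-1,
    )                                              # (F, C, D_FONT + D_CHAR)
    logits = pairs @ W_dec
    return (1.0 / (1.0 + np.exp(-logits))).reshape(n_fonts, n_chars, H, W)


z_font = encode_sets(glyphs, observed, W_font, axis=1)   # (F, D_FONT)
z_char = encode_sets(glyphs, observed, W_char, axis=0)   # (C, D_CHAR)
recon = decode(z_font, z_char)                           # (F, C, H, W)
```

Because each latent is pooled over a set, the order of glyphs within a font or within a character type does not affect the result, and `recon` contains an output for every cell of the grid whether or not that glyph was observed.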
