Parameter-Efficient Fine-Tuning of DINOv2 for Large-Scale Font Classification
Daniel Chen, Zaria Zinn, Marcus Lowe

TL;DR
This paper presents GoogleFontsBench, a new benchmark for classifying open-source web fonts, and shows that parameter-efficient fine-tuning of DINOv2 with LoRA reaches 99.0% top-1 accuracy while training only 1% of the model's parameters.
Contribution
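It introduces a public web font classification benchmark covering 394 font variants across 32 Google Fonts families, and establishes baselines for six fine-tuning strategies on a DINOv2 Vision Transformer backbone, with LoRA as the parameter-efficient option.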
Findings
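LoRA fine-tuning achieves 99.0% top-1 accuracy on font classification while training only 1% of the model's 87.2M parameters.
The benchmark includes 394 font variants across 32 Google Fonts families and a reproducible synthetic data pipeline (~575 images per variant, ~226K total).
The benchmark, all trained models, and the full training pipeline are released as open-source resources.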
Abstract
We introduce GoogleFontsBench, the first public benchmark for classifying open-source web fonts, addressing a gap left by existing benchmarks that cover only commercial typefaces. GoogleFontsBench comprises 394 font variants across 32 Google Fonts families, a reproducible synthetic data generation pipeline (~575 images per variant, ~226K total), and a typographically grounded evaluation metric (SWER) that weights errors by visual severity. We establish baselines using six fine-tuning strategies on a DINOv2 Vision Transformer backbone. Parameter-efficient adaptation with LoRA achieves 99.0% top-1 accuracy while training only 1% of the model's 87.2M parameters, with errors 140x less severe than random guessing. We release the benchmark, all trained models, and the full training pipeline as open-source resources.
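The figures quoted in the abstract can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the dataset and parameter totals come from the abstract, while the LoRA configuration (hidden size 768, 12 transformer blocks, two adapted projections per block, rank 16) is an assumption chosen to resemble a ViT-B-sized backbone and is not taken from the paper. The helper names are hypothetical.

```python
# Figures quoted in the abstract.
NUM_VARIANTS = 394
IMAGES_PER_VARIANT = 575        # approximate ("~575")
TOTAL_PARAMS = 87_200_000       # 87.2M backbone parameters
TRAINABLE_PERCENT = 1           # "only 1%"


def dataset_size(num_variants: int, images_per_variant: int) -> int:
    """Total number of synthetic images in the benchmark."""
    return num_variants * images_per_variant


def lora_param_count(hidden_dim: int, rank: int,
                     num_layers: int, matrices_per_layer: int) -> int:
    """Trainable parameters added by LoRA on square weight matrices.

    Each adapted (hidden_dim x hidden_dim) matrix gains two low-rank
    factors: A (rank x hidden_dim) and B (hidden_dim x rank).
    """
    return num_layers * matrices_per_layer * 2 * hidden_dim * rank


total_images = dataset_size(NUM_VARIANTS, IMAGES_PER_VARIANT)
trainable_budget = TOTAL_PARAMS * TRAINABLE_PERCENT // 100

# ASSUMPTION (not from the paper): ViT-B-like shape, LoRA applied to two
# projections per block, rank 16.
example_lora_params = lora_param_count(
    hidden_dim=768, rank=16, num_layers=12, matrices_per_layer=2
)

print(f"total images:          {total_images:,}")
print(f"1% parameter budget:   {trainable_budget:,}")
print(f"example LoRA adapters: {example_lora_params:,}")
print(f"chance top-1 accuracy: {1 / NUM_VARIANTS:.4%}")
```

Under these assumed settings the adapters fit within the 1% budget, which is consistent with the abstract's claim but does not confirm the authors' actual rank or choice of target modules.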
