Parameter-Efficient Fine-Tuning of DINOv2 for Large-Scale Font Classification

Daniel Chen; Zaria Zinn; Marcus Lowe

arXiv:2602.13889·cs.CV·April 6, 2026

TL;DR

This paper presents GoogleFontsBench, a benchmark for classifying open-source web fonts, and shows that parameter-efficient fine-tuning of DINOv2 with LoRA reaches 99.0% top-1 accuracy while training only 1% of the model's parameters.

Contribution

It introduces the first public benchmark for open-source web font classification, with an evaluation metric (SWER) that weights errors by visual severity, and shows that LoRA fine-tuning of DINOv2 is both accurate and inexpensive to train.

Findings

01

LoRA fine-tuning achieves 99.0% top-1 accuracy while training only 1% of the model's 87.2M parameters.

02

The benchmark covers 394 font variants across 32 Google Fonts families, with about 226K images produced by a reproducible synthetic data pipeline.

03

The benchmark, all trained models, and the full training pipeline are released as open source.

Abstract

We introduce GoogleFontsBench, the first public benchmark for classifying open-source web fonts, addressing a gap left by existing benchmarks that cover only commercial typefaces. GoogleFontsBench comprises 394 font variants across 32 Google Fonts families, a reproducible synthetic data generation pipeline (~575 images per variant, ~226K total), and a typographically grounded evaluation metric (SWER) that weights errors by visual severity. We establish baselines using six fine-tuning strategies on a DINOv2 Vision Transformer backbone. Parameter-efficient adaptation with LoRA achieves 99.0% top-1 accuracy while training only 1% of the model's 87.2M parameters, with errors 140x less severe than random guessing. We release the benchmark, all trained models, and the full training pipeline as open-source resources.
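
To make the scale of the numbers in the abstract concrete, the following is a rough back-of-the-envelope sketch of how a LoRA trainable-parameter budget and the dataset size could be tallied. Only the 394 variants, ~575 images per variant, and 87.2M total parameters come from the abstract. The hidden width, depth, LoRA rank, and choice of adapted projections are illustrative assumptions, not the authors' reported configuration, and the `LoraBudget` class is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class LoraBudget:
    """Hypothetical LoRA budget; defaults are assumptions unless noted."""
    hidden_dim: int = 768        # assumed ViT-B-sized hidden width
    num_layers: int = 12         # assumed ViT-B-sized depth
    rank: int = 16               # assumed LoRA rank r
    adapted_per_layer: int = 2   # assumed: two square projections per layer
    num_classes: int = 394       # font variants, from the abstract

    def adapter_params(self) -> int:
        # Each adapted d x d weight gains A (r x d) and B (d x r),
        # which adds 2 * r * d trainable parameters.
        per_matrix = 2 * self.rank * self.hidden_dim
        return self.num_layers * self.adapted_per_layer * per_matrix

    def head_params(self) -> int:
        # Linear classification head: weight matrix plus bias vector.
        return self.hidden_dim * self.num_classes + self.num_classes

    def trainable_fraction(self, total_params: float) -> float:
        # Share of the backbone-sized parameter count that would be trained.
        trainable = self.adapter_params() + self.head_params()
        return trainable / total_params


budget = LoraBudget()
fraction = budget.trainable_fraction(87.2e6)  # 87.2M, from the abstract
dataset_size = 394 * 575                      # variants x images per variant

print(f"adapter parameters: {budget.adapter_params():,}")
print(f"head parameters:    {budget.head_params():,}")
print(f"trainable fraction: {fraction:.2%}")
print(f"approx. images:     {dataset_size:,}")
```

Changing `rank` or `adapted_per_layer` shows how quickly the trainable share moves, which is the main design lever in LoRA-style adaptation. The figures printed here describe the assumed configuration only.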
