Contrastive-to-Self-Supervised: A Two-Stage Framework for Script Similarity Learning

Claire Roman; Philippe Meyer

arXiv:2603.06180·cs.CV·March 9, 2026

TL;DR

This paper introduces a two-stage framework combining contrastive learning and teacher-student distillation to learn script similarity, enabling effective recognition and clustering of diverse writing systems without explicit evolutionary labels.

Contribution

It presents a novel two-stage method that leverages supervised contrastive learning and unsupervised distillation to discover script similarities and improve glyph recognition.

Findings

01 Effective few-shot glyph recognition achieved

02 Meaningful script clustering demonstrated

03 Bridges supervised and unsupervised learning for script analysis

Abstract

Learning similarity metrics for glyphs and writing systems faces a fundamental challenge: while individual graphemes within invented alphabets can be reliably labeled, the historical relationships between different scripts remain uncertain and contested. We propose a two-stage framework that addresses this epistemological constraint. First, we train an encoder with contrastive loss on labeled invented alphabets, establishing a teacher model with robust discriminative features. Second, we extend to historically attested scripts through teacher-student distillation, where the student learns unsupervised representations guided by the teacher's knowledge but free to discover latent cross-script similarities. The asymmetric setup enables the student to learn deformation-invariant embeddings while inheriting discriminative structure from clean examples. Our approach bridges supervised…
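The two stages described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact losses, so everything below is an assumption chosen for illustration: stage one uses a SupCon-style supervised contrastive loss over labeled glyph embeddings, and stage two uses a cosine-distance distillation term in which the student embeds a deformed glyph and the teacher embeds the clean version. The function names (`supervised_contrastive_loss`, `distillation_loss`) and the `temperature` parameter are hypothetical and are not taken from the paper.

```python
import math


def normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Stage 1 (assumed form): pull same-label glyphs together, push others apart.

    For each anchor, averages the negative log-probability of its positives
    under a softmax over cosine similarities to all other samples.
    """
    z = [normalize(e) for e in embeddings]
    n = len(z)
    total, counted = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no positive contributes nothing
        logits = [dot(z[i], z[j]) / temperature for j in range(n) if j != i]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        log_probs = [dot(z[i], z[j]) / temperature - log_denom for j in positives]
        total += -sum(log_probs) / len(positives)
        counted += 1
    return total / counted if counted else 0.0


def distillation_loss(student_embeddings, teacher_embeddings):
    """Stage 2 (assumed form): mean cosine distance between student and teacher.

    In the asymmetric setup, student_embeddings would come from deformed
    glyphs and teacher_embeddings from the matching clean glyphs.
    """
    distances = [
        1.0 - dot(normalize(s), normalize(t))
        for s, t in zip(student_embeddings, teacher_embeddings)
    ]
    return sum(distances) / len(distances)


if __name__ == "__main__":
    glyphs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(supervised_contrastive_loss(glyphs, [0, 0, 1, 1]))
    print(distillation_loss([[1.0, 0.0]], [[2.0, 0.0]]))
```

In a real training loop the teacher would be frozen after stage one and only the student would receive gradients from the distillation term; this sketch computes the loss values only and omits the encoders and the optimizer.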

Taxonomy

Topics: Image Processing and 3D Reconstruction · Handwritten Text Recognition Techniques · Topic Modeling