CTPD: Cross Tokenizer Preference Distillation

Truong Nguyen; Phi Van Dat; Ngan Nguyen; Linh Ngo Van; Trung Le; Thanh Hong Nguyen

arXiv:2601.11865 · cs.CL · January 21, 2026

TL;DR

This paper introduces CTPD, a novel framework for transferring human preferences between language models with different tokenizers, enabling more effective and flexible model alignment.

Contribution

CTPD is the first unified approach for preference distillation across heterogeneous tokenizers, incorporating aligned span projection, cross-tokenizer importance sampling, and teacher-anchored references.

Findings

1. Significant performance improvements over existing methods.
2. Effective preference transfer across diverse tokenization schemes.
3. Theoretical grounding in importance sampling.

Abstract

While knowledge distillation has seen widespread use in pre-training and instruction tuning, its application to aligning language models with human preferences remains underexplored, particularly in the more realistic cross-tokenizer setting. The incompatibility of tokenization schemes between teacher and student models has largely prevented fine-grained, white-box distillation of preference information. To address this gap, we propose Cross-Tokenizer Preference Distillation (CTPD), the first unified framework for transferring human-aligned behavior between models with heterogeneous tokenizers. CTPD introduces three key innovations: (1) Aligned Span Projection, which maps teacher and student tokens to shared character-level spans for precise supervision transfer; (2) a cross-tokenizer adaptation of Token-level Importance Sampling (TIS-DPO) for improved credit assignment; and (3) a…
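The idea of mapping teacher and student tokens to shared character-level spans can be illustrated with a minimal sketch. This is not the paper's implementation: the function names (`token_offsets`, `aligned_spans`, `span_sums`) are hypothetical, and the sketch assumes that both tokenizations are non-empty string pieces that concatenate to exactly the same text. It groups tokens between character boundaries that both tokenizers share, then sums per-token scores (for example log-probabilities) within each group so the two models can be compared span by span.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # half-open [start, end) character offsets


def token_offsets(tokens: List[str]) -> List[Span]:
    """Character offsets of each token, assuming tokens concatenate to the text."""
    offsets, pos = [], 0
    for tok in tokens:
        offsets.append((pos, pos + len(tok)))
        pos += len(tok)
    return offsets


def aligned_spans(
    teacher_tokens: List[str], student_tokens: List[str]
) -> List[Tuple[Span, List[int], List[int]]]:
    """Group token indices by the smallest character spans both tokenizers share.

    Assumption (not from the paper): tokens are non-empty and both sequences
    spell out the same string.
    """
    if "".join(teacher_tokens) != "".join(student_tokens):
        raise ValueError("teacher and student tokens must cover the same text")

    t_off = token_offsets(teacher_tokens)
    s_off = token_offsets(student_tokens)

    # A boundary is shared when some token ends there in both tokenizations.
    shared_ends = sorted({e for _, e in t_off} & {e for _, e in s_off})

    spans, start = [], 0
    for end in shared_ends:
        t_idx = [k for k, (a, b) in enumerate(t_off) if start <= a and b <= end]
        s_idx = [k for k, (a, b) in enumerate(s_off) if start <= a and b <= end]
        spans.append(((start, end), t_idx, s_idx))
        start = end
    return spans


def span_sums(token_scores: List[float], groups: List[List[int]]) -> List[float]:
    """Sum per-token scores (e.g. log-probabilities) inside each span."""
    return [sum(token_scores[k] for k in group) for group in groups]


if __name__ == "__main__":
    teacher = ["un", "believ", "able"]
    student = ["unbe", "liev", "able"]
    for span, t_idx, s_idx in aligned_spans(teacher, student):
        print(span, t_idx, s_idx)
```

In this toy case the two tokenizers disagree inside "unbeliev" but agree on the boundary before "able", so the first span covers two tokens on each side and the second covers one. Summing log-probabilities within a span is a natural choice because the log-probability of a span is the sum of its token log-probabilities; how CTPD actually aggregates or weights the projected signals is not specified in the text above.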

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Intelligent Tutoring Systems and Adaptive Learning