Who Taught You That? Tracing Teachers in Model Distillation
Somin Wadhwa, Chantal Shaib, Silvio Amir, Byron C. Wallace

TL;DR
This paper investigates whether a student model's outputs can reveal which teacher model it was distilled from, a question with practical implications for detecting distillation that may violate a proprietary model's terms of service.
Contribution
It introduces methods to identify teacher models from student outputs using lexical features, highlighting the limitations of n-gram similarity and the potential of PoS patterns.
Findings
n-gram similarity is unreliable for teacher identification
PoS templates in student outputs can mirror those of their teacher
discriminative models can partially infer teacher models from outputs
Abstract
Model distillation -- using outputs from a large teacher model to teach a small student model -- is a practical means of creating efficient models for a particular task. We ask: Can we identify a student's teacher based on its outputs? Such "footprints" left by teacher LLMs would be interesting artifacts. Beyond this, reliable teacher inference may have practical implications as actors seek to distill specific capabilities of massive proprietary LLMs into deployed smaller LMs, potentially violating terms of service. We consider practical task distillation targets including summarization, question answering, and instruction-following. We assume a finite set of candidate teacher models, which we treat as blackboxes. We design discriminative models that operate over lexical features. We find that n-gram similarity alone is unreliable for identifying teachers, but part-of-speech (PoS)…
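To make the n-gram baseline mentioned in the abstract concrete, here is a minimal sketch of attributing a student to one of a finite set of candidate teachers by n-gram overlap. It is an illustration only, not the authors' implementation: the function names (`ngrams`, `overlap_score`, `guess_teacher`), the whitespace tokenization, and the overlap definition are all assumptions. The abstract reports that this kind of signal alone is unreliable, so the sketch shows the baseline, not a recommended detector.

```python
from collections import Counter


def ngrams(text, n=2):
    """Count word n-grams in a lowercased, whitespace-tokenized text (assumed tokenization)."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def overlap_score(student_texts, teacher_texts, n=2):
    """Fraction of student n-gram occurrences that also occur in the teacher's outputs."""
    student = Counter()
    for text in student_texts:
        student.update(ngrams(text, n))
    teacher = Counter()
    for text in teacher_texts:
        teacher.update(ngrams(text, n))
    total = sum(student.values())
    if total == 0:
        return 0.0
    shared = sum(count for gram, count in student.items() if gram in teacher)
    return shared / total


def guess_teacher(student_texts, candidates, n=2):
    """Pick the candidate teacher with the highest overlap; also return all scores."""
    scores = {
        name: overlap_score(student_texts, texts, n)
        for name, texts in candidates.items()
    }
    return max(scores, key=scores.get), scores


# Toy, made-up outputs for two hypothetical candidate teachers.
candidates = {
    "teacher_a": ["the cat sat on the mat"],
    "teacher_b": ["a dog ran in the park"],
}
best, scores = guess_teacher(["the cat sat on a rug"], candidates)
print(best, scores)
```

In practice the candidate outputs would come from prompting each blackbox teacher on the same task inputs. A PoS-based variant would replace the word tokens with tags from a part-of-speech tagger before counting.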
Taxonomy
Topics: Educator Training and Historical Pedagogy · Teacher Education and Leadership Studies · Educational Assessment and Improvement
Methods: travel james · Sparse Evolutionary Training
