Model Stealing for Any Low-Rank Language Model

Allen Liu; Ankur Moitra

arXiv:2411.07536 · cs.LG · November 13, 2024

TL;DR

This paper presents an efficient algorithm, in the conditional query model, for stealing any low-rank language model, a class that includes Hidden Markov Models. It improves on previous results by removing their fidelity restriction.

Contribution

The paper introduces an efficient algorithm, in the conditional query model, for learning any low-rank distribution, advancing the theoretical understanding of model stealing for language models.

Findings

01. Successfully learns low-rank distributions with an efficient algorithm

02. Improves upon previous results by removing fidelity constraints

03. Uses convex optimization and barycentric spanners for model representation

Abstract

Model stealing, where a learner tries to recover an unknown model via carefully chosen queries, is a critical problem in machine learning, as it threatens the security of proprietary models and the privacy of data they are trained on. In recent years, there has been particular interest in stealing large language models (LLMs). In this paper, we aim to build a theoretical understanding of stealing language models by studying a simple and mathematically tractable setting. We study model stealing for Hidden Markov Models (HMMs), and more generally low-rank language models. We assume that the learner works in the conditional query model, introduced by Kakade, Krishnamurthy, Mahajan and Zhang. Our main result is an efficient algorithm in the conditional query model, for learning any low-rank distribution. In other words, our algorithm succeeds at stealing any language model whose output…
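
To make the "low-rank" notion concrete, here is a minimal sketch that is not taken from the paper. It relies on a standard fact about HMMs: the matrix whose rows are histories, whose columns are futures, and whose entries are the conditional probabilities P(future | history) has rank at most the number of hidden states. The toy HMM parameters and all helper names (`step`, `conditional_prob`, `exact_rank`, and so on) are hypothetical and chosen only for illustration. Each matrix entry is the kind of quantity a learner in the conditional query model could estimate by repeatedly sampling continuations of a chosen history; here it is computed exactly instead.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical toy HMM: 2 hidden states, 3 output symbols (illustrative values only).
INIT = [F(1, 2), F(1, 2)]                      # INIT[i] = P(first state is i)
TRANS = [[F(3, 4), F(1, 4)],
         [F(1, 3), F(2, 3)]]                   # TRANS[i][j] = P(next state j | state i)
EMIT = [[F(1, 2), F(1, 4), F(1, 4)],
        [F(1, 5), F(1, 5), F(3, 5)]]           # EMIT[i][x] = P(symbol x | state i)


def step(weights, symbol):
    """Emit `symbol` from the current state weights, then move to the next state."""
    emitted = [w * EMIT[i][symbol] for i, w in enumerate(weights)]
    n = len(weights)
    return [sum(emitted[i] * TRANS[i][j] for i in range(n)) for j in range(n)]


def sequence_weights(weights, seq):
    """Unnormalised state weights after emitting every symbol in `seq`."""
    for symbol in seq:
        weights = step(weights, symbol)
    return weights


def conditional_prob(history, future):
    """P(the next symbols equal `future` | the first symbols equal `history`)."""
    after_history = sequence_weights(INIT, history)
    after_both = sequence_weights(after_history, future)
    return sum(after_both) / sum(after_history)


def conditional_matrix(hist_len, fut_len, num_symbols=3):
    """Rows are indexed by histories, columns by futures."""
    symbols = range(num_symbols)
    return [[conditional_prob(h, f) for f in product(symbols, repeat=fut_len)]
            for h in product(symbols, repeat=hist_len)]


def exact_rank(matrix):
    """Rank via Gaussian elimination over exact fractions (no rounding error)."""
    rows = [list(r) for r in matrix]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


M = conditional_matrix(hist_len=2, fut_len=2)  # 9 histories x 9 futures
print("shape:", len(M), "x", len(M[0]), "rank:", exact_rank(M))
```

Although the matrix has 9 rows and 9 columns, its rank cannot exceed 2, the number of hidden states, because every row is a mixture of the same two per-state future distributions. Exact fractions are used so that the rank computation does not depend on a floating-point tolerance.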

Taxonomy

Topics: Natural Language Processing Techniques