Stealing the Decoding Algorithms of Language Models

Ali Naseh; Kalpesh Krishna; Mohit Iyyer; Amir Houmansadr

arXiv:2303.04729 · cs.LG · December 5, 2023 · 1 citation

TL;DR

This paper demonstrates that an attacker with only typical API access can cheaply identify which decoding algorithm a popular language model uses and recover its hyperparameters, information that model owners consider valuable.

Contribution

It presents the first attack that recovers the type and hyperparameters of a language model's decoding algorithm through typical API access at very low monetary cost.

Findings

01

Successful theft of decoding algorithms from GPT-2, GPT-3, and GPT-Neo.

02

The attacks are cheap, costing less than a few dollars per model.

03

Effective attack across multiple popular language models.

Abstract

A key component of generating text from modern language models (LMs) is the selection and tuning of decoding algorithms. These algorithms determine how to generate text from the internal probability distribution generated by the LM. The process of choosing a decoding algorithm and tuning its hyperparameters takes significant time, manual effort, and computation, and it also requires extensive human evaluation. Therefore, the identity and hyperparameters of such decoding algorithms are considered to be extremely valuable to their owners. In this work, we show, for the first time, that an adversary with typical API access to an LM can steal the type and hyperparameters of its decoding algorithms at very low monetary costs. Our attack is effective against popular LMs used in text generation APIs, including GPT-2, GPT-3 and GPT-Neo. We demonstrate the feasibility of stealing such information…
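
To make the abstract's terms concrete, the sketch below shows three common decoding transforms (temperature scaling, top-k, and nucleus or top-p filtering) applied to a next-token distribution, followed by a toy black-box estimate of k. It is an illustration only, not the attack described in the paper: the six-token distribution, the hidden_api function, and the count-distinct-tokens heuristic are all assumptions made for this example.

```python
import random
from collections import Counter

def apply_temperature(probs, temperature):
    """Rescale a distribution as p_i^(1/T), then renormalize."""
    scaled = [p ** (1.0 / temperature) for p in probs]
    total = sum(scaled)
    return [s / total for s in scaled]

def top_k_filter(probs, k):
    """Keep the k most probable tokens, zero the rest, renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = set(order[:k])
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]

def top_p_filter(probs, p):
    """Nucleus filtering: keep the smallest top-ranked set whose mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = set(), 0.0
    for i in order:
        kept.add(i)
        mass += probs[i]
        if mass >= p:
            break
    return [probs[i] / mass if i in kept else 0.0 for i in range(len(probs))]

def estimate_k(sample_token, n_queries):
    """Toy heuristic (an assumption, not the paper's method): with top-k
    sampling, only k distinct tokens can ever appear for a fixed prompt,
    so count the distinct tokens seen over repeated queries."""
    return len(Counter(sample_token() for _ in range(n_queries)))

# Hypothetical next-token distribution over a 6-token vocabulary.
base = [0.40, 0.25, 0.15, 0.10, 0.06, 0.04]

# The "secret" configuration the toy attacker does not know: top-k with k = 3.
hidden = top_k_filter(base, 3)
rng = random.Random(0)

def hidden_api():
    """Stand-in for one API call that returns a single sampled token id."""
    return rng.choices(range(len(hidden)), weights=hidden)[0]

k_hat = estimate_k(hidden_api, 2000)
print("estimated k:", k_hat)
```

The counting heuristic only works here because the toy distribution is small and every kept token has a sizable probability. With a real vocabulary, low-probability tokens inside the top-k set may never appear within a limited query budget, so a count of this kind is best read as a lower bound on k.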

Code & Models

Repositories

spin-umass/stealing-the-decoding-algorithms-of-language-models
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning

Methods: Attention Is All You Need · GPT-Neo · Linear Layer · Discriminative Fine-Tuning · Residual Connection · GPT-2 · Multi-Head Attention · Byte Pair Encoding