Reverse-Engineering Decoding Strategies Given Blackbox Access to a Language Generation System

Daphne Ippolito; Nicholas Carlini; Katherine Lee; Milad Nasr; Yun William Yu

arXiv:2309.04858·cs.LG·September 12, 2023·1 cites

TL;DR

This paper develops methods to identify the decoding strategy used in blackbox language models, which helps in detecting generated text and understanding biases introduced by decoding choices.

Contribution

The paper introduces techniques for reverse-engineering the decoding strategy (such as top-k or nucleus sampling) behind a blackbox API, including proprietary systems like ChatGPT.

Findings

01

Decoding strategies can be accurately identified in various models.

02

The discovery process reveals biases caused by decoding settings that severely truncate a model's predicted distributions.

03

The methods apply to several families of open-source language models as well as production systems such as ChatGPT.

Abstract

Neural language models are increasingly deployed into APIs and websites that allow a user to pass in a prompt and receive generated text. Many of these systems do not reveal generation parameters. In this paper, we present methods to reverse-engineer the decoding method used to generate text (i.e., top-$k$ or nucleus sampling). Our ability to discover which decoding strategy was used has implications for detecting generated text. Additionally, the process of discovering the decoding strategy can reveal biases caused by selecting decoding settings which severely truncate a model's predicted distributions. We perform our attack on several families of open-source language models, as well as on production systems (e.g., ChatGPT).
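To make the two truncation schemes named in the abstract concrete, the following is a minimal sketch of how repeated sampling could expose them. It is not the paper's implementation. The toy 50-token distribution, the simulated blackbox `make_blackbox`, and the helpers `observed_support` and `nucleus_bounds` are all hypothetical names introduced for illustration, and the sketch assumes the attacker already knows the untruncated next-token distribution for the prompt.

```python
import random

def top_k_truncate(probs, k):
    """Keep the k most likely tokens and renormalize."""
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

def nucleus_truncate(probs, p):
    """Keep the smallest set of most likely tokens whose mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, mass = [], 0.0
    for i in order:
        keep.append(i)
        mass += probs[i]
        if mass >= p:
            break
    return {i: probs[i] / mass for i in keep}

def make_blackbox(probs, truncate, param, seed=0):
    """Hypothetical stand-in for an API that returns one sampled token per call."""
    rng = random.Random(seed)
    dist = truncate(probs, param)
    tokens = list(dist)
    weights = [dist[t] for t in tokens]
    return lambda: rng.choices(tokens, weights=weights)[0]

def observed_support(sample_fn, n_queries):
    """Set of distinct tokens seen over n_queries calls."""
    return {sample_fn() for _ in range(n_queries)}

def nucleus_bounds(probs, support):
    """Interval (lo, hi] that must contain p, given the known untruncated probs."""
    hi = sum(probs[i] for i in support)
    lo = hi - min(probs[i] for i in support)
    return lo, hi

# Assumed toy next-token distribution with geometric decay over 50 tokens.
raw = [0.9 ** i for i in range(50)]
probs = [x / sum(raw) for x in raw]

# Top-k: the number of distinct tokens observed is a lower bound on k.
topk_api = make_blackbox(probs, top_k_truncate, 10, seed=0)
k_estimate = len(observed_support(topk_api, 5000))

# Nucleus: the known mass of the observed tokens brackets p.
nucleus_api = make_blackbox(probs, nucleus_truncate, 0.5, seed=1)
p_lo, p_hi = nucleus_bounds(probs, observed_support(nucleus_api, 5000))
print(k_estimate, round(p_lo, 3), round(p_hi, 3))
```

The count of distinct tokens is only a lower bound on k, since rare tokens may never be drawn within a finite query budget. One plausible heuristic for telling the two schemes apart is to repeat the count over several prompts: a support size that stays fixed suggests top-k, while one that varies with the prompt suggests nucleus sampling.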

Code & Models

Repositories

eleutherai/pythia
pytorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques