Artificial Interrogation for Attributing Language Models

Farhan Dhanani; Muhammad Rafi

arXiv:2211.10877·cs.CL·November 22, 2022

TL;DR

This paper addresses the challenge of attributing fine-tuned language models to their base models by developing interrogation strategies and multiple response similarity measures, achieving effective model attribution under restricted API access.

Contribution

The paper introduces four novel approaches for model attribution using response similarity metrics, advancing the methodology for identifying model origins in a restricted API setting.

Findings

01. Response similarity metrics effectively distinguish models.

02. Transformer-based classifiers improve attribution accuracy.

03. Multiple approaches outperform baseline methods.

Abstract

This paper presents solutions to the Machine Learning Model Attribution Challenge (MLMAC), organized jointly by MITRE, Microsoft, Schmidt Futures, Robust Intelligence, Lincoln Network, and the Hugging Face community. The challenge provides twelve open-source base versions of popular language models developed by well-known organizations, along with twelve fine-tuned language models for text generation. The names and architecture details of the fine-tuned models were kept hidden, and participants could access these models only through REST APIs developed by the organizers. Given these constraints, the goal of the contest is to identify which base model each fine-tuned model originated from. To solve this challenge, we assume that fine-tuned models and their corresponding base versions must share a similar vocabulary set with a matching syntactical writing style that resonates in their…
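The shared-vocabulary assumption stated in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the regex tokenizer, the cosine measure over token counts, the function names, and the toy responses labeled `base_a` and `base_b` are all assumptions made for illustration. The idea is to collect responses from a hidden fine-tuned model and from each candidate base model, build a vocabulary profile for each, and attribute the fine-tuned model to the base model with the most similar profile.

```python
from collections import Counter
import math
import re


def tokenize(text):
    """Lowercase and split into word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def vocabulary_profile(responses):
    """Aggregate token counts over all responses collected from one model."""
    counts = Counter()
    for response in responses:
        counts.update(tokenize(response))
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def attribute(fine_tuned_responses, base_responses_by_model):
    """Return the best-matching base model name and all similarity scores."""
    target = vocabulary_profile(fine_tuned_responses)
    scores = {
        name: cosine_similarity(target, vocabulary_profile(responses))
        for name, responses in base_responses_by_model.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy data standing in for responses gathered through the restricted API.
base_responses = {
    "base_a": ["the cat sat on the mat", "the dog chased the cat"],
    "base_b": ["stock prices rose sharply today",
               "markets closed higher on friday"],
}
fine_tuned = ["the cat chased the dog on the mat"]

best, scores = attribute(fine_tuned, base_responses)
print(best, scores)
```

In practice the same prompts would be sent to every model so that differences in the profiles reflect the models rather than the inputs; any other response similarity measure could be swapped in for the cosine score.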

Code & Models

Repositories

farhandhanani/mlmac (official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Data Classification

Methods: Multi-Head Attention · Softmax · Layer Normalization · Adam · Linear Layer · Dense Connections · Residual Connection · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Label Smoothing