Gibberish is All You Need for Membership Inference Detection in Contrastive Language-Audio Pretraining

Ruoxi Cheng; Yizhong Ding; Shuirong Cao; Shitong Shao; Zhiqiang Wang

arXiv:2410.18371·cs.SD·November 5, 2024

Open Access

TL;DR

This paper introduces USMID, a text-only, speaker-level membership inference detector for CLAP models. It uses gibberish text and anomaly detection to determine whether a speaker's data was in the training set.

Contribution

It presents USMID, a text-only membership inference method for CLAP models that requires neither audio input nor shadow models.

Findings

01. USMID outperforms baseline methods in detection accuracy

02. Gibberish text effectively reveals training data membership

03. The approach works across various CLAP architectures and datasets

Abstract

Audio can disclose personally identifiable information (PII), particularly when combined with related text data. It is therefore essential to develop tools that detect privacy leakage in Contrastive Language-Audio Pretraining (CLAP). Existing membership inference attacks (MIAs) need audio as input, which risks exposing the voiceprint, and they require costly shadow models. We first propose PRMID, a membership inference detector based on the probability ranking given by CLAP, which does not require training shadow models but still requires both the audio and the text of the individual as input. To address these limitations, we then propose USMID, a textual unimodal speaker-level membership inference detector that queries the target model using only text data. We randomly generate textual gibberish that is clearly not in the training dataset. Then we extract feature vectors from these texts using the CLAP model and train a set of anomaly detectors on them. During inference, the feature…
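The training stage described in the abstract (generate gibberish, extract text features, fit a set of anomaly detectors) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `embed_text` is a hypothetical stand-in for the CLAP text encoder, `CentroidDetector` is a toy distance-to-centroid detector swapped in for whatever detectors the paper actually uses, and all names and parameters are invented for the example. The inference-time decision rule is truncated in the abstract above, so the sketch stops at computing a generic anomaly score for a query text.

```python
import hashlib
import math
import random
import string


def make_gibberish(rng, n_chars=12):
    """Random lowercase string, assumed not to appear in any training set."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(n_chars))


def embed_text(text, dim=16):
    """HYPOTHETICAL placeholder for the target model's text encoder.

    Hashes character trigrams into a fixed-size, L2-normalised vector.
    A real setup would query the CLAP model here instead.
    """
    vec = [0.0] * dim
    for i in range(len(text) - 2):
        digest = hashlib.md5(text[i:i + 3].encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class CentroidDetector:
    """Toy anomaly detector: score is the distance to the training centroid."""

    def fit(self, features):
        n, dim = len(features), len(features[0])
        self.centroid = [sum(f[d] for f in features) / n for d in range(dim)]
        return self

    def score(self, feature):
        return math.sqrt(
            sum((a - b) ** 2 for a, b in zip(feature, self.centroid))
        )


def train_detectors(features, n_detectors=5, rng=None):
    """Fit several detectors, each on a bootstrap resample of the features."""
    rng = rng or random.Random(0)
    detectors = []
    for _ in range(n_detectors):
        sample = [rng.choice(features) for _ in features]
        detectors.append(CentroidDetector().fit(sample))
    return detectors


def mean_anomaly_score(detectors, feature):
    """Average anomaly score of one feature vector across all detectors."""
    return sum(d.score(feature) for d in detectors) / len(detectors)


rng = random.Random(42)
gibberish = [make_gibberish(rng) for _ in range(200)]
features = [embed_text(t) for t in gibberish]
detectors = train_detectors(features, rng=rng)

# Score a query text; the string below is an invented example.
query_score = mean_anomaly_score(detectors, embed_text("a hypothetical speaker name"))
```

The detectors only ever see features of gibberish, so they model what "not in the training set" looks like from the text side alone. No audio is queried and no shadow model is trained, which matches the constraints the abstract sets for USMID.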

Taxonomy

Topics: Natural Language Processing Techniques · Speech and dialogue systems

Methods: Sparse Evolutionary Training