Black Boxes, White Noise: Similarity Detection for Neural Functions

Farima Farmahinifarahani (University of California at Irvine, USA), Cristina V. Lopes (University of California at Irvine, USA)

arXiv:2302.10005 · cs.PL · February 21, 2023

TL;DR

This paper proposes a novel method for assessing the similarity of deep neural network functions using random inputs and correlation metrics, enabling comparison without canonical inputs.

Contribution

It introduces a new approach to compare DNN functions by generating random inputs and analyzing output correlations, addressing a gap in existing similarity detection techniques.
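
The general idea of probing two black-box functions with the same random inputs and correlating their outputs can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`spearman`, `output_similarity`), the toy models, the standard-normal input distribution, and the choice to average the correlation over output dimensions are all assumptions made for the example.

```python
import numpy as np

def spearman(a, b):
    """Spearman rank correlation of two 1-D arrays (assumes no tied values)."""
    rank_a = np.argsort(np.argsort(a))  # rank of each element of a
    rank_b = np.argsort(np.argsort(b))
    return float(np.corrcoef(rank_a, rank_b)[0, 1])

def output_similarity(f, g, input_dim, n_samples=1000, seed=0):
    """Feed identical random inputs to f and g, then average the
    per-output-dimension Spearman correlation of their outputs.
    (Hypothetical helper; input distribution and averaging are assumptions.)"""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_samples, input_dim))  # random "white noise" inputs
    out_f, out_g = f(x), g(x)
    scores = [spearman(out_f[:, j], out_g[:, j]) for j in range(out_f.shape[1])]
    return float(np.mean(scores))

# Toy stand-ins for black-box models (hypothetical, for illustration only).
rng = np.random.default_rng(42)
W = 0.3 * rng.standard_normal((8, 3))
W_other = 0.3 * rng.standard_normal((8, 3))

def model_a(x):
    return np.tanh(x @ W)

def model_b(x):
    # Same underlying function as model_a, with rescaled and shifted outputs.
    return 5.0 * np.tanh(x @ W) + 1.0

def model_c(x):
    # Independently drawn weights: an unrelated function.
    return np.tanh(x @ W_other)

sim_clone = output_similarity(model_a, model_b, input_dim=8)
sim_unrelated = output_similarity(model_a, model_c, input_dim=8)
print(f"a vs b (rescaled copy): {sim_clone:.3f}")
print(f"a vs c (unrelated):     {sim_unrelated:.3f}")
```

Because rank correlation is unchanged by increasing transformations of the outputs, the rescaled copy scores as highly similar even though its raw output values differ. No canonical or training inputs are needed, only the ability to call each function.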

Findings

01. Spearman's rank correlation is most effective for similarity detection.

02. The method works well even without canonical inputs.

03. Empirical evaluation on over 56,000 classifiers demonstrates its robustness.

Abstract

Similarity, or clone, detection has important applications in copyright violation, software theft, code search, and the detection of malicious components. There is now a good number of open source and proprietary clone detectors for programs written in traditional programming languages. However, the increasing adoption of deep learning models in software poses a challenge to these tools: these models implement functions that are inscrutable black boxes. As more software includes these DNN functions, new techniques are needed in order to assess the similarity between deep learning components of software. Previous work has unveiled techniques for comparing the representations learned at various layers of deep neural network models by feeding canonical inputs to the models. Our goal is to be able to compare DNN functions when canonical inputs are not available -- because they may not be in…
