A First Step Towards Content Protecting Plagiarism Detection

Cornelius Ihle; Moritz Schubotz; Norman Meuschke; Bela Gipp

arXiv:2005.11504 · cs.CR · May 26, 2020

TL;DR

This paper introduces a privacy-preserving plagiarism detection method using Private Set Intersection, maintaining detection effectiveness while protecting sensitive content from disclosure.

Contribution

It presents the first content-protecting plagiarism detection approach employing Private Set Intersection to prevent content disclosure.

Findings

01 The content-protecting method matches the detection effectiveness of the original, non-protecting method.

02 The approach makes content disclosure attacks practically infeasible.

03 The initial results demonstrate that privacy-preserving plagiarism detection is feasible.

Abstract

Plagiarism detection systems are essential tools for safeguarding academic and educational integrity. However, today's systems require disclosing the full content of the input documents and of the document collection to which the input documents are compared. Moreover, the systems are centralized and under the control of individual, typically commercial, providers. This situation raises procedural and legal concerns regarding the confidentiality of sensitive data, which can limit or prohibit the use of plagiarism detection services. To eliminate these weaknesses of current systems, we seek to devise a plagiarism detection approach that requires neither a centralized provider nor the exposure of any content as cleartext. This paper presents the initial results of our research. Specifically, we employ Private Set Intersection to devise a content-protecting variant of the citation-based similarity…
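To make the idea of Private Set Intersection concrete, the following is a minimal sketch of one well-known construction, Diffie-Hellman-style commutative blinding, applied to two sets of reference identifiers. It is an illustration under stated assumptions, not the protocol used in the paper: the modulus `P`, the helper names (`hash_to_group`, `new_secret_key`, `blind`, `private_intersection`), and the example identifiers are all hypothetical, and both parties are simulated in a single process.

```python
import hashlib
import math
import secrets

# Toy parameter (assumption): the prime 2**255 - 19 used as a modulus.
# A real deployment would rely on a vetted group and a reviewed PSI library.
P = 2**255 - 19


def hash_to_group(item: str) -> int:
    """Map an item (e.g. a reference identifier) to an integer in [1, P-1]."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return 1 + int.from_bytes(digest, "big") % (P - 1)


def new_secret_key() -> int:
    """Pick an exponent coprime to P-1 so that blinding is a bijection."""
    while True:
        key = secrets.randbelow(P - 3) + 2  # uniform in [2, P-2]
        if math.gcd(key, P - 1) == 1:
            return key


def blind(elements, key):
    """Raise every element to the secret exponent modulo P."""
    return [pow(e, key, P) for e in elements]


def private_intersection(client_items, server_items):
    """Simulate both parties; return the client items the server also holds."""
    client_items = list(client_items)
    a = new_secret_key()  # known only to the client
    b = new_secret_key()  # known only to the server

    # Client -> server: the client's items, hashed and blinded with a.
    client_blinded = blind([hash_to_group(x) for x in client_items], a)

    # Server -> client: the client's values blinded again with b (order kept),
    # plus the server's own items blinded with b only.
    client_double = blind(client_blinded, b)
    server_blinded = blind([hash_to_group(y) for y in server_items], b)

    # Client: apply a to the server's values. Because (h^a)^b == (h^b)^a mod P,
    # equal items yield equal double-blinded values; cleartext is never sent.
    server_double = set(blind(server_blinded, a))
    return {x for x, d in zip(client_items, client_double) if d in server_double}


if __name__ == "__main__":
    # Hypothetical reference identifiers of two documents.
    suspicious = ["ref:alpha", "ref:beta", "ref:gamma"]
    collection_doc = ["ref:beta", "ref:gamma", "ref:delta"]
    print(sorted(private_intersection(suspicious, collection_doc)))
```

In this sketch, the size of the returned set could feed a citation-overlap score between two documents, while each party only ever sees blinded values of the other's references. Choosing exponents coprime to `P - 1` keeps the blinding injective, so matches correspond to genuinely shared items (up to hash collisions).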

Code & Models

Repositories

ag-gipp/20CppdData
none · Official
