ESPERANTO: Evaluating Synthesized Phrases to Enhance Robustness in AI Detection for Text Origination

Navid Ayoobi; Lily Knab; Wen Cheng; David Pantoja; Hamidreza Alikhani; Sylvain Flamant; Jin Kim; Arjun Mukherjee

arXiv:2409.14285·cs.CL·September 24, 2024·2 cites

Open Access · 1 Repo

TL;DR

This paper introduces a back-translation technique to manipulate AI-generated text, revealing vulnerabilities in detection systems and proposing a countermeasure that enhances robustness, supported by a large, diverse dataset.

Contribution

It presents a novel back-translation method to evade AI text detectors and a countermeasure to improve detection robustness, along with a large dataset for evaluation.

Findings

01. Back-translation significantly reduces detection accuracy.

02. Proposed countermeasure maintains high detection performance.

03. Large dataset enables comprehensive evaluation of detection methods.

Abstract

While large language models (LLMs) exhibit significant utility across various domains, they are also susceptible to exploitation for unethical purposes, including academic misconduct and the dissemination of misinformation. Consequently, AI-generated text detection systems have emerged as a countermeasure. However, these detection mechanisms are vulnerable to evasion techniques and lack robustness against textual manipulations. This paper introduces back-translation as a novel technique for evading detection, underscoring the need to enhance the robustness of current detection systems. The proposed method translates AI-generated text through multiple languages and then back-translates it to English. We present a model that combines these back-translated texts to produce a manipulated version of the original AI-generated text. Our findings demonstrate that the…
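The translation route described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `back_translate`, `back_translation_candidates`, `toy_translate`, and the pivot language codes are hypothetical names, and the toy translator only tags each hop instead of translating. The model that combines the back-translated texts is not specified in the text above, so it is left out.

```python
from typing import Callable, List, Sequence

# Hypothetical translator signature: (text, source_lang, target_lang) -> text.
# A real system would plug in a machine-translation model or service here.
Translator = Callable[[str, str, str], str]


def back_translate(text: str, pivots: Sequence[str], translate: Translator) -> str:
    """Send English text through each pivot language in order, then back to English."""
    if not pivots:
        raise ValueError("at least one pivot language is required")
    current, lang = text, "en"
    for pivot in pivots:
        current = translate(current, lang, pivot)
        lang = pivot
    # Final hop: from the last pivot language back to English.
    return translate(current, lang, "en")


def back_translation_candidates(
    text: str, pivot_chains: Sequence[Sequence[str]], translate: Translator
) -> List[str]:
    """Produce one back-translated variant per pivot chain.

    These variants are the inputs a combining model would consume;
    that model is outside the scope of this sketch.
    """
    return [back_translate(text, chain, translate) for chain in pivot_chains]


def toy_translate(text: str, src: str, tgt: str) -> str:
    """Stand-in translator (assumption): records the hop instead of translating."""
    return f"{text} [{src}->{tgt}]"


if __name__ == "__main__":
    chains = [["de"], ["fr", "ja"]]  # illustrative pivot languages only
    for variant in back_translation_candidates("hi", chains, toy_translate):
        print(variant)
```

Passing the translator as a function keeps the routing logic independent of any particular translation backend, so the toy stand-in can be swapped for a real one without changing `back_translate`.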

Code & Models

Repositories

navid-aub/esperanto-dataset (Official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling