MAiDE-up: Multilingual Deception Detection of GPT-generated Hotel Reviews

Oana Ignat; Xiaomeng Xu; Rada Mihalcea

arXiv:2404.12938·cs.CL·June 21, 2024

TL;DR

This paper introduces MAiDE-up, a multilingual dataset of real and AI-generated hotel reviews, and uses linguistic analyses to compare the two and to identify the factors that influence detection of AI-generated fake reviews across languages.

Contribution

It provides the first large-scale multilingual dataset for AI-generated hotel review detection and explores linguistic and contextual factors affecting model performance.

Findings

01. Language and sentiment significantly impact detection accuracy.
02. Multilingual models outperform English-only models.
03. Certain linguistic cues are strong indicators of AI-generated reviews.

Abstract

Deceptive reviews are becoming increasingly common, especially given the increase in performance and the prevalence of LLMs. While work to date has addressed the development of models to differentiate between truthful and deceptive human reviews, much less is known about the distinction between real reviews and AI-authored fake reviews. Moreover, most of the research so far has focused primarily on English, with very little work dedicated to other languages. In this paper, we compile and make publicly available the MAiDE-up dataset, consisting of 10,000 real and 10,000 AI-generated fake hotel reviews, balanced across ten languages. Using this dataset, we conduct extensive linguistic analyses to (1) compare the AI fake hotel reviews to real hotel reviews, and (2) identify the factors that influence the deception detection model performance. We explore the effectiveness of several models…
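The dataset sizes quoted in the abstract imply a simple per-language breakdown. The sketch below works through that arithmetic, assuming that "balanced across ten languages" means an even split per language and per class; the function name `per_language_counts` is hypothetical and not part of the released code.

```python
# Totals stated in the abstract.
TOTAL_REAL = 10_000
TOTAL_FAKE = 10_000
NUM_LANGUAGES = 10


def per_language_counts(total_real, total_fake, num_languages):
    """Return (real, fake) review counts per language.

    Assumption: reviews are split evenly across languages within each class.
    """
    if total_real % num_languages or total_fake % num_languages:
        raise ValueError("totals do not divide evenly across languages")
    return total_real // num_languages, total_fake // num_languages


real_per_lang, fake_per_lang = per_language_counts(
    TOTAL_REAL, TOTAL_FAKE, NUM_LANGUAGES
)
print(real_per_lang, fake_per_lang)  # 1000 1000
```

Under that assumption each language contributes 1,000 real and 1,000 AI-generated reviews, so a per-language classifier would see a 50/50 class balance.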

Code & Models

Repositories

michigannlp/multilingual_reviews_deception
Official

Datasets

MichiganNLP/MAiDE-up
dataset · 47 downloads

Taxonomy

Topics: Deception detection and forensic psychology · Sentiment Analysis and Opinion Mining · Topic Modeling