AiGen-FoodReview: A Multimodal Dataset of Machine-Generated Restaurant Reviews and Images on Social Media

Alessandro Gambetti; Qiwei Han

arXiv:2401.08825 · cs.LG · January 18, 2024 · 1 citation

TL;DR

This paper introduces AiGen-FoodReview, a large multimodal dataset of real and machine-generated restaurant reviews and images, and evaluates detection models to distinguish authentic content from AI-generated fake reviews.

Contribution

It provides the first open-source multimodal dataset of authentic and AI-generated restaurant reviews and images, along with detection models and feature analysis for fake review identification.

Findings

01. Achieved 99.80% accuracy with multimodal detection using FLAVA.

02. Showed linguistic and visual features are effective in distinguishing fake reviews.

03. Demonstrated the utility of handcrafted features based on readability and photographic theories.
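
To make the idea of handcrafted photographic features concrete, here is a minimal Python sketch. The specific attributes (brightness, contrast, saturation), the function name `photo_features`, and the pixel-list input format are assumptions chosen for illustration; this summary does not list the attributes the paper actually scores.

```python
import colorsys
from statistics import fmean, pstdev


def photo_features(pixels):
    """Compute simple image attributes from a list of (R, G, B) tuples in 0..255.

    Illustrative only: the attribute set is an assumption, not the paper's.
    """
    if not pixels:
        raise ValueError("pixels must not be empty")

    # Perceived luminance per pixel, using Rec. 601 luma weights.
    luma = [0.299 * r + 0.587 * g + 0.114 * b for r, g, b in pixels]

    # Saturation per pixel, taken from the HSV representation.
    sat = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[1] for r, g, b in pixels]

    return {
        "brightness": fmean(luma) / 255,   # mean luminance, scaled to 0..1
        "contrast": pstdev(luma) / 255,    # spread of luminance, scaled
        "saturation": fmean(sat),          # mean colour saturation, 0..1
    }
```

Scores like these could be fed, alongside text features, into a simple classifier whose decisions are easier to inspect than those of a large pretrained model.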

Abstract

Online reviews in the form of user-generated content (UGC) significantly impact consumer decision-making. However, the pervasiveness of fake content, whether written by humans or generated by machines, challenges UGC's reliability. Recent advances in Large Language Models (LLMs) may pave the way to fabricate indistinguishable fake content at a much lower cost. Leveraging OpenAI's GPT-4-Turbo and DALL-E-2 models, we craft AiGen-FoodReview, a multimodal dataset of 20,144 restaurant review-image pairs divided into authentic and machine-generated. We explore unimodal and multimodal detection models, achieving 99.80% multimodal accuracy with FLAVA. We use attributes from readability and photographic theories to score reviews and images, respectively, demonstrating their utility as hand-crafted features in scalable and interpretable detection models, with comparable performance. The…
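
As a rough illustration of scoring a review with a readability attribute, the sketch below computes the Automated Readability Index (ARI) together with a few length statistics. The choice of ARI, the tokenization rules, and the function names are assumptions made for this example; the abstract does not say which readability measures the authors use.

```python
import re


def readability_features(text: str) -> dict:
    """Return simple readability statistics for a review.

    ARI = 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43
    The tokenization below is deliberately naive and purely illustrative.
    """
    # Split on runs of sentence-ending punctuation; drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Treat runs of letters, digits, and apostrophes as words.
    words = re.findall(r"[A-Za-z0-9']+", text)

    if not sentences or not words:
        return {"n_words": 0, "n_sentences": 0, "ari": 0.0}

    # Count characters without apostrophes.
    chars = sum(len(w.replace("'", "")) for w in words)
    ari = 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43

    return {
        "n_words": len(words),
        "n_sentences": len(sentences),
        "ari": ari,
    }
```

For example, `readability_features("The food was great. We loved it!")` counts 7 words and 2 sentences. A feature vector of this kind is cheap to compute at scale, which is one reason such attributes are attractive for interpretable detectors.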

Taxonomy

Topics: Misinformation and Its Impacts · Sentiment Analysis and Opinion Mining · Multimodal Machine Learning Applications

Methods: FLAVA