MOSAIC: Multiple Observers Spotting AI Content
Matthieu Dubois, François Yvon, Pablo Piantanida

TL;DR
MOSAIC introduces an ensemble-based method combining multiple LLMs to improve the robustness and accuracy of detecting AI-generated text across various domains.
Contribution
The paper proposes a theoretically grounded ensemble approach that leverages multiple LLMs for more reliable AI content detection, surpassing fixed-model methods.
Findings
Effective detection across multiple LLMs and domains
Improved robustness over single-model approaches
Code and data publicly available
Abstract
The dissemination of Large Language Models (LLMs), trained at scale, and endowed with powerful text-generating abilities, has made it easier for all to produce harmful, toxic, faked or forged content. In response, various proposals have been made to automatically discriminate artificially generated from human-written texts, typically framing the problem as a binary classification problem. Early approaches evaluate an input document with a well-chosen detector LLM, assuming that low-perplexity scores reliably signal machine-made content. More recent systems instead consider two LLMs and compare their probability distributions over the document to further discriminate when perplexity alone cannot. However, using a fixed pair of models can induce brittleness in performance. We extend these approaches to the ensembling of several LLMs and derive a new, theoretically grounded approach to…
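The two families of detectors described in the abstract can be illustrated with a small sketch. It is not the MOSAIC method and not the code released with the paper. It assumes per-token log-probabilities and next-token distributions have already been obtained from some LLM, and the function names (`perplexity`, `cross_entropy`, `two_model_score`) and the ratio used to compare the two models are hypothetical choices made for illustration.

```python
import math

def perplexity(token_logprobs):
    """Perplexity: exp of the average negative log-probability per token."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def cross_entropy(p, q):
    """Cross-entropy -sum_x p(x) log q(x) between two next-token distributions."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def two_model_score(token_logprobs_a, dists_a, dists_b):
    """Hypothetical two-model score (an assumption, not the paper's formula):
    model A's log-perplexity on the document divided by the average
    cross-entropy between the next-token distributions of models A and B."""
    log_ppl = -sum(token_logprobs_a) / len(token_logprobs_a)
    x_ent = sum(cross_entropy(pa, pb) for pa, pb in zip(dists_a, dists_b))
    return log_ppl / (x_ent / len(dists_a))

# Toy inputs: four tokens, each assigned probability 0.5 by model A,
# over a two-word vocabulary on which both models are uniform.
logprobs = [math.log(0.5)] * 4
uniform = [[0.5, 0.5]] * 4

ppl = perplexity(logprobs)
score = two_model_score(logprobs, uniform, uniform)
```

A single-model detector would threshold `ppl` directly, flagging low values as likely machine-made. The two-model variant normalizes by how much the two models disagree, which is the extra signal the abstract refers to when perplexity alone is not enough.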
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
