EMMM, Explain Me My Model! Explainable Machine Generated Text Detection in Dialogues
Angela Yifei Yuan, Haoyi Li, Soyeon Caren Han, Christopher Leckie

TL;DR
EMMM is an explainability framework for detecting machine-generated text in dialogues, emphasizing interpretability for non-expert users while maintaining high accuracy and low latency.
Contribution
This paper introduces EMMM, a novel explanation-then-detection approach that enhances interpretability and usability of MGT detection in customer service dialogues.
Findings
70% of human evaluators preferred EMMM explanations
Achieves competitive accuracy with low latency (<1 second)
Balances interpretability, speed, and detection performance
Abstract
The rapid adoption of large language models (LLMs) in customer service introduces new risks, as malicious actors can exploit them to conduct large-scale user impersonation through machine-generated text (MGT). Current MGT detection methods often struggle in online conversational settings, reducing the reliability and interpretability essential for trustworthy AI deployment. In customer service scenarios where operators are typically non-expert users, explanations become crucial for trustworthy MGT detection. In this paper, we propose EMMM, an explanation-then-detection framework that balances latency, accuracy, and non-expert-oriented interpretability. Experimental results demonstrate that EMMM provides explanations accessible to non-expert users, with 70% of human evaluators preferring its outputs, while achieving competitive accuracy compared to state-of-the-art models and maintaining…
