Evaluating Deception Detection Model Robustness To Linguistic Variation
Maria Glenski, Ellyn Ayton, Robin Cosbey, Dustin Arendt, and Svitlana Volkova

TL;DR
This paper analyzes the robustness of deceptive news detection models to linguistic variation, finding that character or mixed ensemble models are the most effective defenses and that character perturbation-based attacks are the more successful attack tactic.
Contribution
It provides an extensive evaluation of model robustness to linguistic manipulation in deceptive news detection, covering two prediction tasks and comparing three state-of-the-art embeddings and several adversarial defense strategies.
Findings
Character or mixed ensemble models are the most effective defenses.
Character perturbation-based attacks are more successful than word perturbation-based attacks.
Models exhibit high-confidence misclassifications and high-impact failures.
Abstract
With the increasing use of machine-learning driven algorithmic judgements, it is critical to develop models that are robust to evolving or manipulated inputs. We propose an extensive analysis of model robustness against linguistic variation in the setting of deceptive news detection, an important task in the context of misinformation spread online. We consider two prediction tasks and compare three state-of-the-art embeddings to highlight consistent trends in model performance, high confidence misclassifications, and high impact failures. By measuring the effectiveness of adversarial defense strategies and evaluating model susceptibility to adversarial attacks using character- and word-perturbed text, we find that character or mixed ensemble models are the most effective defenses and that character perturbation-based attack tactics are more successful.
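To make the distinction between character- and word-perturbed text concrete, here is a minimal sketch of the two kinds of input manipulation. It is an illustration only: the abstract does not specify the perturbation procedures used in the paper, so the function names `perturb_characters` and `perturb_words`, the adjacent-letter swap, the `substitutions` dictionary, and the `rate` and `seed` parameters are all assumptions made for this example.

```python
import random


def perturb_characters(text: str, rate: float = 0.1, seed: int = 0) -> str:
    """Swap two adjacent interior letters in roughly `rate` of the words.

    Hypothetical character-level perturbation, not the paper's method.
    The first and last character of each word are left in place.
    """
    rng = random.Random(seed)
    out = []
    for word in text.split(" "):
        # Only words with at least two interior characters can be swapped.
        if len(word) > 3 and rng.random() < rate:
            i = rng.randrange(1, len(word) - 2)  # interior swap position
            word = word[:i] + word[i + 1] + word[i] + word[i + 2:]
        out.append(word)
    return " ".join(out)


def perturb_words(text: str, substitutions: dict, rate: float = 0.1,
                  seed: int = 0) -> str:
    """Replace roughly `rate` of the words that have an entry in `substitutions`.

    Hypothetical word-level perturbation; `substitutions` is a caller-supplied
    lowercase word -> replacement mapping, assumed here for illustration.
    """
    rng = random.Random(seed)
    out = []
    for word in text.split(" "):
        if word.lower() in substitutions and rng.random() < rate:
            word = substitutions[word.lower()]
        out.append(word)
    return " ".join(out)
```

A robustness check along these lines would feed both the original and the perturbed text to a trained classifier and compare the predictions. The character version preserves each word's length and letters, which is why such edits can stay readable to people while changing the tokens a model sees.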
