Training-Free and Interpretable Hateful Video Detection via Multi-stage Adversarial Reasoning

Shuonan Yang; Yuchen Zhang; Zeyu Fu

arXiv:2601.15115·cs.CV·January 22, 2026

Open Access

TL;DR

MARS is a training-free, multi-stage adversarial reasoning framework for hateful video detection that produces reliable, interpretable results and outperforms existing methods on real-world datasets.

Contribution

This paper introduces MARS, a novel training-free framework that combines evidence and counter-evidence reasoning for interpretable hateful video detection.

Findings

01. Achieves up to 10% improvement over other training-free methods.

02. Outperforms state-of-the-art training-based methods on one dataset.

03. Provides human-understandable justifications for decisions.

Abstract

Hateful videos pose serious risks by amplifying discrimination, inciting violence, and undermining online safety. Existing training-based hateful video detection methods are constrained by limited training data and a lack of interpretability, while directly prompting large vision-language models often struggles to deliver reliable hate detection. To address these challenges, this paper introduces MARS, a training-free Multi-stage Adversarial ReaSoning framework that enables reliable and interpretable hateful content detection. MARS begins with an objective description of the video content, establishing a neutral foundation for subsequent analysis. Building on this, it develops evidence-based reasoning that supports potential hateful interpretations, while in parallel incorporating counter-evidence reasoning to capture plausible non-hateful perspectives. Finally, these perspectives are…
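
The staged flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `ask` is a hypothetical callable standing in for a vision-language model, the prompt wording is invented for illustration, and because the abstract is truncated before it describes the final stage, the last step here (asking the model to weigh both analyses) is only an assumed placeholder.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical model interface (an assumption, not from the paper):
# takes a text prompt and returns the model's text reply. A real setup
# would wrap a vision-language model that also receives the video.
AskFn = Callable[[str], str]


@dataclass
class ReasoningTrace:
    description: str       # neutral description of the video content
    evidence: str          # reasoning that supports a hateful interpretation
    counter_evidence: str  # reasoning that supports a non-hateful interpretation
    verdict: str           # combined output (final stage is assumed here)


def run_pipeline(ask: AskFn, video_ref: str) -> ReasoningTrace:
    """Run the staged reasoning flow; all prompts are illustrative."""
    # Stage 1: objective description, with no judgment yet.
    description = ask(
        f"Describe objectively, without judging, what happens in {video_ref}."
    )
    # Stage 2: the two opposing analyses depend only on the description,
    # so they could run in parallel; they run sequentially here for simplicity.
    evidence = ask(
        "List evidence supporting a hateful interpretation of:\n" + description
    )
    counter_evidence = ask(
        "List evidence supporting a non-hateful interpretation of:\n" + description
    )
    # Final stage (assumed): weigh both perspectives and justify a label.
    verdict = ask(
        "Weigh both analyses and give a final label with a short justification.\n"
        f"Description: {description}\n"
        f"Evidence: {evidence}\n"
        f"Counter-evidence: {counter_evidence}"
    )
    return ReasoningTrace(description, evidence, counter_evidence, verdict)
```

Keeping every intermediate reply in `ReasoningTrace` means the description and both lines of reasoning remain available for inspection alongside the final label, which is the kind of human-readable justification the findings mention.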

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning