Reproducibility Report: Contextualizing Hate Speech Classifiers with Post-hoc Explanation

Kiran Purohit; Owais Iqbal; Ankan Mullick

arXiv:2105.11412 · cs.CL · May 25, 2021

TL;DR

This report assesses the reproducibility of a hate speech classification method that uses post-hoc explanations to understand model decisions, focusing on the method's validity and experimental results.

Contribution

It provides a detailed reproducibility analysis of the proposed method and evaluates the validity of its experimental findings.

Findings

01

Reproducibility of the method was successfully verified.

02

The original results were largely confirmed.

03

Insights into model explanations were gained.

Abstract

This report evaluates the paper "Contextualizing Hate Speech Classifiers with Post-hoc Explanation" within the scope of the ML Reproducibility Challenge 2020. Our work addresses both aspects of the paper: the method itself and the validity of the stated results. In the following sections, we describe the paper, related work, the algorithmic frameworks, and our experiments and evaluations.

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Adversarial Robustness in Machine Learning · Topic Modeling