Faster, Lighter, More Accurate: A Deep Learning Ensemble for Content Moderation
Mohammad Hosseini, Mahmudul Hasan

TL;DR
This paper introduces a lightweight deep learning ensemble for content moderation. By combining simple visual features with several small models, it achieves higher accuracy and faster inference than larger models such as ResNet-50. The approach targets violence detection and is evaluated on explosion content.
Contribution
The paper presents a novel ensemble architecture of small, simple models that operate on narrowed-down color features, improving accuracy and efficiency over traditional large models for moderating violent content.
Findings
7.64x faster inference compared to ResNet-50
Significant accuracy improvements on a large explosion and blast dataset
Applicable to image and video violence detection tasks
Abstract
To address the increasing need for efficient and accurate content moderation, we propose an efficient and lightweight deep classification ensemble structure. Our approach is based on a combination of simple visual features, designed for high-accuracy classification of violent content with low false positives. Our ensemble architecture utilizes a set of lightweight models with narrowed-down color features, and we apply it to both images and videos. We evaluated our approach using a large dataset of explosion and blast contents and compared its performance to popular deep learning models such as ResNet-50. Our evaluation results demonstrate significant improvements in prediction accuracy, while benefiting from 7.64x faster inference and lower computation cost. While our approach is tailored to explosion detection, it can be applied to other similar content moderation and violence…
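The abstract describes an ensemble of lightweight models, each working on a narrowed-down color feature. The sketch below illustrates that general idea only. It is not the authors' implementation: the per-channel features, the toy threshold "models" (`make_toy_member`), and the rule that every member must agree before an image is flagged (`ensemble_predict`) are all assumptions chosen to show how agreement between small models can keep false positives low.

```python
import numpy as np


def channel_feature(image, channel):
    """Narrow an H x W x 3 RGB image down to one color channel, scaled to [0, 1]."""
    return image[..., channel].astype(np.float32) / 255.0


def make_toy_member(channel, threshold):
    """Hypothetical stand-in for a small trained model.

    It votes positive when the mean intensity of its channel reaches the threshold.
    A real member would be a lightweight network trained on that feature.
    """
    def member(image):
        return float(channel_feature(image, channel).mean()) >= threshold
    return member


def ensemble_predict(members, image):
    """Flag the image only if every member votes positive (assumed combination rule)."""
    return all(member(image) for member in members)


# Two toy members: one looks at red, one at green (thresholds are arbitrary).
members = [make_toy_member(0, 0.6), make_toy_member(1, 0.3)]

fire_like = np.zeros((8, 8, 3), dtype=np.uint8)
fire_like[..., 0], fire_like[..., 1], fire_like[..., 2] = 230, 120, 20

red_only = np.zeros((8, 8, 3), dtype=np.uint8)
red_only[..., 0] = 230

print(ensemble_predict(members, fire_like))  # both members agree
print(ensemble_predict(members, red_only))   # green member disagrees
```

Requiring unanimous agreement trades some recall for fewer false positives; a majority vote or averaged scores would be equally plausible alternatives under the same structure.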
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Adversarial Robustness in Machine Learning
