Vilio: State-of-the-art Visio-Linguistic Models applied to Hateful Memes
Niklas Muennighoff

TL;DR
Vilio is a unified code-base of state-of-the-art visio-linguistic models, tuned for hate speech detection in memes. An ensemble of 5 of its models placed 2nd in the Hateful Memes Challenge, and the framework is meant as a user-friendly starting point for any visio-linguistic problem.
Contribution
Vilio fits multiple visio-linguistic models into one uniform code-base and alters them to yield better performance on hateful meme detection.
Findings
Achieved 2nd place in the Hateful Memes Challenge out of 3,300 participants
The 2nd-place entry is an ensemble of 5 different V+L models implemented in Vilio
Code is publicly available at https://github.com/Muennighoff/vilio
Abstract
This work presents Vilio, an implementation of state-of-the-art visio-linguistic models and their application to the Hateful Memes Dataset. The implemented models have been fitted into a uniform code-base and altered to yield better performance. The goal of Vilio is to provide a user-friendly starting point for any visio-linguistic problem. An ensemble of 5 different V+L models implemented in Vilio achieves 2nd place in the Hateful Memes Challenge out of 3,300 participants. The code is available at https://github.com/Muennighoff/vilio.
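The abstract does not say how the five models' predictions are combined. As a rough illustration of what an ensemble of classifiers can look like, the sketch below assumes plain averaging of per-model probabilities; the function name `average_ensemble`, the 0.5 threshold, and all scores are hypothetical and are not taken from Vilio.

```python
from statistics import mean


def average_ensemble(model_probs):
    """Average per-example probabilities across models.

    model_probs[m][i] is model m's probability that meme i is hateful.
    Simple averaging is an assumption here, not Vilio's documented method.
    """
    if not model_probs:
        raise ValueError("need at least one model")
    n_examples = len(model_probs[0])
    if any(len(p) != n_examples for p in model_probs):
        raise ValueError("all models must score the same examples")
    # zip(*...) regroups the scores so each tuple holds one meme's scores.
    return [mean(per_meme) for per_meme in zip(*model_probs)]


# Made-up scores from 5 hypothetical models on 3 memes.
scores = [
    [0.9, 0.2, 0.6],
    [0.8, 0.1, 0.4],
    [0.7, 0.3, 0.5],
    [0.9, 0.2, 0.7],
    [0.7, 0.2, 0.8],
]
ensemble = average_ensemble(scores)
# Threshold the averaged probability to get a binary hateful / not-hateful label.
labels = [int(p >= 0.5) for p in ensemble]
```

Averaging tends to help when the individual models make different mistakes, which is one common motivation for combining several architectures rather than relying on a single one.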
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Digital Games and Media · Humor Studies and Applications
