LUC at ComMA-2021 Shared Task: Multilingual Gender Biased and Communal Language Identification without using linguistic features
Rodrigo Cuéllar-Hidalgo, Julio de Jesús Guerrero-Zambrano, Dominic Forest, Gerardo Reyes-Salgado, Juan-Manuel Torres-Moreno

TL;DR
This paper evaluates the effectiveness of probabilistic and vector space modeling methods in classifying social media documents as aggressive, gender biased, or communal, without relying on linguistic features, demonstrating competitive performance in a multilingual shared task.
Contribution
It introduces a feature-agnostic approach using VSM and probabilistic methods for social language classification, tested in a multilingual shared task setting.
Findings
Effective classification of social network documents into targeted categories.
Competitive results achieved in the ComMA@ICON'21 shared task.
Identifies relevant configurations for language and bias detection.
Abstract
This work evaluates how well probabilistic and state-of-the-art vector space modeling (VSM) methods enable well-known machine learning algorithms to classify social network documents as aggressive, gender biased, or communally charged. An exploratory stage was performed first to find relevant settings to test: using the training and development samples, we trained multiple algorithms with multiple vector space modeling and probabilistic methods and discarded the less informative configurations. The resulting systems were submitted to the competition of the ComMA@ICON'21 Workshop on Multilingual Gender Biased and Communal Language Identification.
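The exploratory stage described in the abstract (train several configurations, score them on the development sample, keep the informative ones) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the features (character n-gram TF-IDF, one common way to avoid linguistic resources), the classifier (nearest centroid with cosine similarity), all function names, and the toy documents and labels.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Overlapping character n-grams; needs no tokenizer or language resources."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def fit_idf(docs, n=3):
    """Smoothed inverse document frequency for every n-gram seen in training."""
    df = Counter()
    for doc in docs:
        df.update(set(char_ngrams(doc, n)))
    total = len(docs)
    return {g: math.log((1 + total) / (1 + c)) + 1.0 for g, c in df.items()}


def tfidf_vector(text, idf, n=3):
    """L2-normalised sparse TF-IDF vector (dict: n-gram -> weight)."""
    counts = Counter(char_ngrams(text, n))
    vec = {g: c * idf[g] for g, c in counts.items() if g in idf}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {g: w / norm for g, w in vec.items()} if norm else {}


def cosine(a, b):
    """Dot product of two unit-length sparse vectors."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(g, 0.0) for g, w in a.items())


def train_centroids(docs, labels, idf, n=3):
    """One unit-length centroid per class: the summed document vectors."""
    sums = defaultdict(lambda: defaultdict(float))
    for doc, label in zip(docs, labels):
        for g, w in tfidf_vector(doc, idf, n).items():
            sums[label][g] += w
    centroids = {}
    for label, vec in sums.items():
        norm = math.sqrt(sum(w * w for w in vec.values()))
        centroids[label] = {g: w / norm for g, w in vec.items()}
    return centroids


def predict(text, idf, centroids, n=3):
    """Assign the class whose centroid is most similar to the document."""
    vec = tfidf_vector(text, idf, n)
    return max(sorted(centroids), key=lambda label: cosine(vec, centroids[label]))


def pick_best_n(train_docs, train_labels, dev_docs, dev_labels, sizes=(2, 3, 4)):
    """Exploratory stage: score each n-gram size on the development sample."""
    scores = {}
    for n in sizes:
        idf = fit_idf(train_docs, n)
        centroids = train_centroids(train_docs, train_labels, idf, n)
        hits = sum(predict(d, idf, centroids, n) == y
                   for d, y in zip(dev_docs, dev_labels))
        scores[n] = hits / len(dev_docs)
    best = max(scores, key=scores.get)
    return best, scores


# Toy, invented data (not from the shared task), only to exercise the functions.
train_docs = ["you are a terrible person", "shut up you fool",
              "have a lovely day", "thanks for the kind help"]
train_labels = ["aggressive", "aggressive", "neutral", "neutral"]
dev_docs = ["what a fool you are", "have a kind day"]
dev_labels = ["aggressive", "neutral"]

best_n, dev_scores = pick_best_n(train_docs, train_labels, dev_docs, dev_labels)
print("development accuracy per n-gram size:", dev_scores)
print("selected n-gram size:", best_n)
```

In this sketch a configuration is just an n-gram size; the same loop could range over other vectorizers or classifiers, with the lower-scoring settings discarded before any final submission.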
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Authorship Attribution and Profiling · Text Readability and Simplification
