Voice@SRIB at SemEval-2020 Task 9 and 12: Stacked Ensembling method for Sentiment and Offensiveness detection in Social Media

Abhishek Singh; Surya Pratap Singh Parmar

arXiv:2007.10021 · cs.CL · October 13, 2020 · 1 citation

TL;DR

This paper presents a stacked ensembling approach using code-mixed embeddings for sentiment and offensiveness detection in social media. It reports 0.886 macro-F1 on the OffensEval Greek subtask (post-evaluation) and 0.756 F1 in the Spanglish competition, and argues that hyper-parameter tuning and data pre-processing improve results.

Contribution

The paper introduces a novel ensembling method with code-mixed embeddings for social media sentiment and offensiveness detection, improving performance on SemEval tasks.

Findings

01 Achieved 0.886 macro-F1 on the OffensEval Greek subtask (post-evaluation).

02 Ranked third in the Spanglish competition with a 0.756 F1-score.

03 Showed that hyper-parameter tuning and data pre-processing significantly improve results.

Abstract

On social-media platforms such as Twitter, Facebook, and Reddit, people often use code-mixed language such as Spanish-English or Hindi-English to express their opinions. In this paper, we describe the different models we used, the use of an external dataset to train embeddings, and the ensembling methods for the SentiMix and OffensEval tasks. Pre-trained embeddings usually help in multiple tasks such as sentence classification and machine translation. In this experiment, we have applied our trained code-mixed embeddings and Twitter pre-trained embeddings to the SemEval tasks. We evaluate our models on macro F1-score, precision, accuracy, and recall on the datasets. We intend to show that hyper-parameter tuning and data pre-processing steps help considerably in improving the scores. In our experiments, we are able to achieve 0.886 F1-Macro on the OffensEval Greek language subtask post-evaluation, whereas the…
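The abstract reports results as macro F1-score. As a reminder of what that metric measures, here is a minimal sketch in plain Python: per-class F1 is computed from true positives, false positives, and false negatives, then averaged with equal weight per class. The function name `macro_f1` and the `OFF`/`NOT` labels in the usage example are illustrative assumptions, not taken from the authors' code.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    labels = set(y_true) | set(y_pred)
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels, for illustration only.
gold = ["OFF", "OFF", "NOT", "NOT"]
pred = ["OFF", "NOT", "NOT", "NOT"]
score = macro_f1(gold, pred)
```

Because every class contributes equally to the average, macro F1 penalizes poor performance on a minority class more than accuracy does, which is presumably why it is the headline metric for imbalanced tasks like these.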

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Social Media and Politics · Advanced Malware Detection Techniques