# UBC-NLP at SemEval-2019 Task 6: Ensemble Learning of Offensive Content With Enhanced Training Data

**Authors:** Arun Rajendran, Chiyu Zhang, Muhammad Abdul-Mageed

arXiv: 1906.03692 · 2019-06-11

## TL;DR

This paper presents an ensemble learning approach with enhanced training data for detecting offensive content on Twitter. Despite limited and imbalanced data, the system ranks 6th in sub-task B and 9th in sub-task C of SemEval-2019 Task 6.

## Contribution

The paper combines data enhancement methods with classical ensemble classifiers to detect offensive content on Twitter when training data is limited and imbalanced.

## Key findings

- Ranked 6th of 75 teams in sub-task B with a macro F1-score of 0.706
- Ranked 9th of 65 teams in sub-task C with a macro F1-score of 0.587
- Both results were obtained with limited, imbalanced training data
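
The rankings above are reported in macro F1-score, which averages the per-class F1 with equal weight, so a rare class counts as much as a frequent one. The following is a minimal sketch of that metric in plain Python. The function name `macro_f1` and the toy labels are assumptions made for illustration; this is not the authors' or the task organizers' evaluation code.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels: three "majority" items and one "minority" item.
y_true = ["majority", "majority", "majority", "minority"]
y_pred = ["majority", "majority", "minority", "minority"]
score = macro_f1(y_true, y_pred)
```

In this toy example the majority class has an F1 of 0.8 and the minority class an F1 of about 0.667, so `score` is their plain average, about 0.733. A single error on the small class moves the score noticeably, which is why the metric is a common choice for imbalanced data.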

## Abstract

We examine learning offensive content on Twitter with limited, imbalanced data. For the purpose, we investigate the utility of using various data enhancement methods with a host of classical ensemble classifiers. Among the 75 participating teams in SemEval-2019 sub-task B, our system ranks 6th (with 0.706 macro F1-score). For sub-task C, among the 65 participating teams, our system ranks 9th (with 0.587 macro F1-score).

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.03692/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/1906.03692/full.md

## References

21 references — full list in the complete paper: https://tomesphere.com/paper/1906.03692/full.md

---
Source: https://tomesphere.com/paper/1906.03692