A Comparison of Automatic Labelling Approaches for Sentiment Analysis

Sumana Biswas, Karen Young, and Josephine Griffith

arXiv:2211.02976 · cs.CL · November 8, 2022

TL;DR

This paper evaluates three automatic sentiment labelling methods (TextBlob, Vader, and Afinn) on Twitter data. Afinn achieved up to 80.17% accuracy on DS-1, which suggests that automatic labelling could reduce the need for costly human annotation in sentiment analysis tasks.

Contribution

The study compares three automatic sentiment labelling techniques on Twitter datasets, highlighting Afinn's superior performance and potential to replace manual labelling.

Findings

01. Afinn achieved up to 80.17% accuracy on DS-1.

02. Automatic labelling can effectively substitute for manual labelling.

03. Training on automatic labels yields results comparable to training on ground truth labels.

Abstract

Labelling a large quantity of social media data for the task of supervised machine learning is not only time-consuming but also difficult and expensive. On the other hand, the accuracy of supervised machine learning models is strongly related to the quality of the labelled data on which they train, and automatic sentiment labelling techniques could reduce the time and cost of human labelling. We compare three automatic sentiment labelling techniques (TextBlob, Vader, and Afinn) that assign sentiments to tweets without any human assistance. We compare three scenarios: the first uses training and testing datasets with existing ground truth labels; the second uses automatic labels for both the training and testing datasets; and the third uses the three automatic labelling techniques to label the training dataset and uses the ground truth labels for testing. The experiments were…
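As a rough illustration of the setup the abstract describes, the sketch below labels tweets with a lexicon score, measures agreement with ground truth, and records which label source each of the three scenarios uses for training and testing. It is a minimal sketch under stated assumptions, not the authors' pipeline: the toy lexicon, the zero threshold, the three-class output, and every function name here are hypothetical and do not come from the paper or from any of the three libraries.

```python
# HYPOTHETICAL toy valence lexicon, for illustration only.
# These entries are not taken from Afinn, Vader, or TextBlob.
TOY_LEXICON = {
    "good": 3, "great": 3, "love": 3, "happy": 3,
    "bad": -3, "terrible": -3, "hate": -3, "sad": -2,
}


def lexicon_score(text, lexicon=TOY_LEXICON):
    """Sum the valence of every known word in the text (unknown words count 0)."""
    tokens = (tok.strip(".,!?;:\"'") for tok in text.lower().split())
    return sum(lexicon.get(tok, 0) for tok in tokens)


def score_to_label(score, threshold=0):
    """Map a numeric score to a label. The threshold of 0 is an assumption."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


def auto_label(texts):
    """Assign a sentiment label to each text without human input."""
    return [score_to_label(lexicon_score(t)) for t in texts]


def accuracy(predicted, gold):
    """Fraction of positions where the predicted label equals the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("label lists must have the same length")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Label source per scenario as (training labels, testing labels),
# following the three scenarios listed in the abstract.
SCENARIOS = {
    1: ("ground_truth", "ground_truth"),
    2: ("automatic", "automatic"),
    3: ("automatic", "ground_truth"),
}


if __name__ == "__main__":
    tweets = ["I love this, great!", "just a tweet", "I hate mondays"]
    gold = ["positive", "neutral", "negative"]  # made-up labels for the demo
    labels = auto_label(tweets)
    print(labels, accuracy(labels, gold))
```

In practice `lexicon_score` would be replaced by a real scorer, for example `Afinn().score(text)` from the `afinn` package, the `compound` value from Vader's `polarity_scores`, or TextBlob's `sentiment.polarity`. The cut-offs used to turn those scores into labels are a design choice that this summary does not specify.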

Taxonomy

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Bidirectional LSTM