SentiWords: Deriving a High Precision and High Coverage Lexicon for Sentiment Analysis

Lorenzo Gatti; Marco Guerini; Marco Turchi

arXiv:1510.09079·cs.CL·November 2, 2015

TL;DR

This paper introduces SentiWords, a high-coverage, high-precision sentiment lexicon created by blending various SentiWordNet-based methods within a learning framework, significantly improving sentiment analysis performance.

Contribution

It presents a novel ensemble approach that combines multiple SentiWordNet techniques and manual lexica to produce a superior sentiment lexicon with extensive coverage and accuracy.
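To make the idea concrete, here is a minimal sketch of how several prior-polarity candidates might be derived from per-sense scores and then blended. Everything in it is an assumption made for illustration: the three formulas (first sense, mean over senses, rank-weighted mean) are commonly used heuristics rather than the paper's confirmed set, the sense scores are invented toy values, and the fixed `weights` stand in for whatever a trained model would learn from manually annotated lexica.

```python
from typing import List, Sequence, Tuple

# Each sense is a (positive, negative) score pair, ordered by sense rank.
# Toy values for illustration only, not taken from SentiWordNet.
SenseScores = List[Tuple[float, float]]


def first_sense(senses: SenseScores) -> float:
    """Polarity of the most frequent (first-ranked) sense only."""
    if not senses:
        return 0.0
    pos, neg = senses[0]
    return pos - neg


def mean_over_senses(senses: SenseScores) -> float:
    """Unweighted average of (pos - neg) across all senses."""
    if not senses:
        return 0.0
    return sum(pos - neg for pos, neg in senses) / len(senses)


def rank_weighted(senses: SenseScores) -> float:
    """Average of (pos - neg) with weight 1/rank, so early senses count more."""
    if not senses:
        return 0.0
    weights = [1.0 / rank for rank in range(1, len(senses) + 1)]
    total = sum(w * (pos - neg) for w, (pos, neg) in zip(weights, senses))
    return total / sum(weights)


def blend(features: Sequence[float], weights: Sequence[float]) -> float:
    """Linear blend of candidate scores.

    In a real ensemble the weights would be learned by regressing against
    a manually annotated lexicon; here they are hypothetical constants.
    """
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(f * w for f, w in zip(features, weights))


# Hypothetical word with three senses.
toy_senses: SenseScores = [(0.75, 0.0), (0.25, 0.125), (0.0, 0.5)]
features = [
    first_sense(toy_senses),
    mean_over_senses(toy_senses),
    rank_weighted(toy_senses),
]
prior_polarity = blend(features, weights=[0.2, 0.3, 0.5])
```

The candidates disagree noticeably on the toy word (the first sense is strongly positive while the plain mean is close to neutral), which is the kind of disagreement a learned blend is meant to arbitrate.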

Findings

1. SentiWords contains approximately 155,000 words.

2. The ensemble method outperforms individual SentiWordNet approaches.

3. Using SentiWords improves sentiment analysis accuracy over existing lexica.

Abstract

Deriving prior polarity lexica for sentiment analysis, where positive or negative scores are associated with words out of context, is a challenging task. Usually, a trade-off between precision and coverage is hard to find, and it depends on the methodology used to build the lexicon. Manually annotated lexica provide high precision but lack coverage, whereas automatic derivation from pre-existing knowledge guarantees high coverage at the cost of lower precision. Since the automatic derivation of prior polarities is less time-consuming than manual annotation, there has been a great bloom of these approaches, in particular those based on the SentiWordNet resource. In this paper, we compare the most frequently used techniques based on SentiWordNet with newer ones and blend them in a learning framework (a so-called 'ensemble method'). By taking advantage of manually built prior polarity…
