Improving Opinion-Target Extraction with Character-Level Word Embeddings

Soufian Jebbara; Philipp Cimiano

arXiv:1709.06317·cs.CL·September 20, 2017


TL;DR

This paper investigates whether character-level word embeddings improve opinion target extraction in aspect-based sentiment analysis. It reports a 3.3-point F1 improvement over the baseline and analyzes the character patterns the embeddings learn.

Contribution

It introduces character-level embeddings into opinion target extraction, showing their positive impact and providing insights into the learned character patterns.

Findings

01 An improvement of 3.3 F1 points over the baseline

02 Character embeddings encode meaningful patterns

03 Better handling of misspelled and domain-specific words

Abstract

Fine-grained sentiment analysis has received increasing attention in recent years. Extracting opinion target expressions (OTE) in reviews is often an important step in fine-grained, aspect-based sentiment analysis. Retrieving this information from user-generated text, however, can be difficult. Customer reviews, for instance, are prone to contain misspelled words and are difficult to process due to their domain-specific language. In this work, we investigate whether character-level models can improve the performance for the identification of opinion target expressions. We integrate information about the character structure of a word into a sequence labeling system using character-level word embeddings and show their positive impact on the system's performance. Specifically, we obtain an increase of 3.3 F1 points with respect to our baseline model. In further experiments, we reveal…
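
To illustrate why character-level information can help with misspelled words, here is a minimal sketch of combining a word-level vector with a character-derived vector. It is not the authors' model: the character table is random rather than learned, mean pooling stands in for whatever character encoder the paper trains, and every name (`build_char_table`, `char_level_embedding`, `word_representation`, the dimensions, and the toy vocabulary) is hypothetical.

```python
import random


def build_char_table(alphabet, dim, seed=0):
    """Hypothetical character table: one small random vector per character."""
    rng = random.Random(seed)
    return {ch: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for ch in alphabet}


def char_level_embedding(word, char_table, dim):
    """Mean-pool the character vectors of a word.

    Mean pooling is an assumption made for brevity; it replaces a
    learned character-level encoder.
    """
    vectors = [char_table[ch] for ch in word.lower() if ch in char_table]
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def word_representation(word, word_table, char_table, word_dim, char_dim):
    """Concatenate the word-level vector with the character-level vector.

    Out-of-vocabulary words get a zero word-level vector, so only the
    character-level part carries information for them.
    """
    word_vec = word_table.get(word.lower(), [0.0] * word_dim)
    return word_vec + char_level_embedding(word, char_table, char_dim)


CHAR_DIM, WORD_DIM = 4, 3
char_table = build_char_table("abcdefghijklmnopqrstuvwxyz", CHAR_DIM)
word_table = {"service": [0.2, -0.1, 0.7]}  # toy vocabulary with one known word

known = word_representation("service", word_table, char_table, WORD_DIM, CHAR_DIM)
misspelled = word_representation("servce", word_table, char_table, WORD_DIM, CHAR_DIM)
```

In this sketch the misspelled "servce" has no word-level entry, yet its character-level part is still populated, which is the kind of signal a sequence labeler could use when a review contains typos or rare domain terms.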
