Improving Sentiment Analysis in Arabic Using Word Representation

Abdulaziz M. Alayba; Vasile Palade; Matthew England; Rahat Iqbal

arXiv:1803.00124·cs.CL·October 17, 2018


TL;DR

This paper improves Arabic sentiment analysis by building Word2Vec embeddings from a large Arabic newspaper corpus and applying machine learning algorithms and convolutional neural networks, reporting 91%-95% classification accuracy on health-related tweets.

Contribution

The paper introduces Arabic Word2Vec models trained on a large corpus drawn from ten newspapers in different Arab countries, and reports improved sentiment classification accuracy using machine learning algorithms and convolutional neural networks.

Findings

01

Achieved 91%-95% sentiment classification accuracy.

02

Constructed Word2Vec models from a large Arabic corpus drawn from ten newspapers.

03

Improved results over previous methods.

Abstract

The complexities of the Arabic language in morphology, orthography and dialects make sentiment analysis for Arabic more challenging. Also, text feature extraction from short messages like tweets, in order to gauge the sentiment, makes this task even more difficult. In recent years, deep neural networks have often been employed and have shown very good results in sentiment classification and natural language processing applications. Word embedding, or the distributed word representation approach, is a current and powerful tool for grouping together the words that are closest in context. In this paper, we describe how we construct Word2Vec models from a large Arabic corpus obtained from ten newspapers in different Arab countries. By applying different machine learning algorithms and convolutional neural networks with different text feature selections, we report improved accuracy of sentiment classification (91%-95%)…
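
To make the idea of embeddings "capturing the closest words" concrete, here is a minimal sketch in plain Python. It is not the authors' pipeline: the vectors in `TOY_VECTORS` are hypothetical, hand-picked 3-dimensional values (real Word2Vec vectors are learned from a corpus and usually have hundreds of dimensions), and the helper names `cosine`, `nearest_words`, and `average_vector` are invented for illustration. Averaging word vectors is shown only as one common way to turn a short text into a fixed-length feature vector; the abstract does not say which feature selections the paper uses.

```python
import math

# Hypothetical toy vectors, chosen by hand for illustration only.
# A trained Word2Vec model would supply learned vectors instead.
TOY_VECTORS = {
    "good": [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.1],
    "bad": [-0.7, 0.3, 0.0],
    "hospital": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_words(word, vectors, k=2):
    """Return the k words whose vectors are most similar to `word`."""
    target = vectors[word]
    scored = [
        (other, cosine(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    # Highest cosine similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in scored[:k]]


def average_vector(tokens, vectors):
    """Average the vectors of known tokens; None if no token is known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


if __name__ == "__main__":
    print(nearest_words("good", TOY_VECTORS, k=1))
    print(average_vector(["good", "great", "unseen"], TOY_VECTORS))
```

With a trained model, the same nearest-neighbour lookup would run over the learned vocabulary, and the resulting text vectors could be passed to any classifier.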
