A Comparative Study of Feature Selection Methods for Dialectal Arabic Sentiment Classification Using Support Vector Machine
Omar Al-Harbi

TL;DR
This study evaluates various feature selection methods for dialectal Arabic sentiment classification using SVM, highlighting the impact of feature selection, term weighting, and preprocessing techniques on classification accuracy.
Contribution
It provides a comparative analysis of feature selection methods specifically for dialectal Arabic sentiment analysis, an area that is less explored than English sentiment analysis.
Findings
SVM and correlation feature selection, combined with a uni-gram model, yielded the best performance.
Feature selection methods significantly influence sentiment classification accuracy.
Preprocessing techniques like stemming and stop word removal affect results.
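To make the role of preprocessing and term weighting concrete, here is a minimal sketch of uni-gram extraction with stop word removal followed by TF-IDF weighting. It is an illustration only, not the study's pipeline: the stop word list, the whitespace tokenizer, the English toy sentences, and the function names `unigrams` and `tfidf` are all assumptions, and stemming is left out.

```python
import math
from collections import Counter

# Hypothetical stop word list for illustration; not the list used in the study.
STOP_WORDS = {"the", "a", "is", "was"}


def unigrams(text, stop_words=STOP_WORDS):
    """Lowercase, split on whitespace, and drop stop words."""
    return [tok for tok in text.lower().split() if tok not in stop_words]


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [unigrams(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    weights = []
    for toks in tokenized:
        tf = Counter(toks)
        weights.append(
            {t: (c / len(toks)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return weights


docs = ["the food was great", "the service was bad", "great great service"]
weights = tfidf(docs)
```

In this toy corpus, "food" occurs in only one document, so it receives a higher weight in the first document than "great", which occurs in two. Removing stop words before weighting shrinks the vocabulary, which is one way preprocessing can change the feature vector that the classifier sees.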
Abstract
Unlike other languages, Arabic has a morphological complexity that makes Arabic sentiment analysis a challenging task. Moreover, the presence of dialects in Arabic texts makes the task even more challenging, because no specific rules govern how the dialects are written or spoken. One general problem in sentiment analysis is the high dimensionality of the feature vector. To address this problem, many feature selection methods have been proposed. These methods have been investigated widely for English but much less for dialectal Arabic. This work investigated the effect of feature selection methods and their combinations on dialectal Arabic sentiment classification. The feature selection methods are Information Gain (IG), Correlation, Support Vector Machine (SVM), Gini Index (GI), and…
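As an illustration of how a filter-style feature selection method reduces dimensionality, the sketch below scores each term by Information Gain (IG) and keeps the top k. It uses the standard textbook definition of IG over a binary "term occurs in document" feature. The toy documents, labels, and function names are assumptions for illustration, not the study's data or implementation, and the other methods named in the abstract are not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(docs, labels, term):
    """IG of the binary feature 'term occurs in the document'."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    # Class entropy remaining after splitting documents on term presence.
    conditional = (len(with_term) / n) * entropy(with_term) + (
        len(without_term) / n
    ) * entropy(without_term)
    return entropy(labels) - conditional


def top_k_terms(docs, labels, k):
    """Rank the vocabulary by IG (ties keep alphabetical order) and keep k terms."""
    vocab = sorted({t for d in docs for t in d})
    ranked = sorted(vocab, key=lambda t: information_gain(docs, labels, t), reverse=True)
    return ranked[:k]


# Toy corpus: each document is a set of uni-grams (hypothetical data).
docs = [{"great", "food"}, {"great", "service"}, {"bad", "food"}, {"bad", "service"}]
labels = ["pos", "pos", "neg", "neg"]
selected = top_k_terms(docs, labels, 2)
```

Here "great" and "bad" separate the two classes perfectly, so each has an IG of 1 bit, while "food" and "service" occur equally in both classes and score 0. Only the selected terms would then be passed to the classifier, which is how feature selection shrinks the feature vector.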
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Text and Document Classification Technologies · Spam and Phishing Detection
Methods: Support Vector Machine
