Persian Slang Text Conversion to Formal and Deep Learning of Persian Short Texts on Social Media for Sentiment Classification
Mohsen Khazeni, Mohammad Heydari, Amir Albadvi

TL;DR
This paper introduces PSC, a Persian slang converter that rewrites informal social media text as formal text, and combines it with deep learning models for sentiment analysis, reaching 81.91% classification accuracy.
Contribution
The study presents a novel Persian slang conversion tool and integrates it with deep learning for enhanced sentiment classification of social media texts.
Findings
57% of conversational words were converted to formal language.
Achieved 81.91% accuracy in sentiment classification.
Utilized over 20 million texts for training and evaluation.
Abstract
The lack of a suitable tool for analyzing conversational texts in Persian has made various analyses of these texts, including sentiment analysis, difficult. In this research, we try to make these texts easier for machines to understand by providing PSC (Persian Slang Converter), a tool for converting conversational texts into formal ones. By using the most up-to-date deep learning methods together with PSC, we also aim to improve how well machines learn the sentiment of short Persian texts. More than 10 million unlabeled texts from various social networks and movie subtitles (as conversational texts) and about 10 million news texts (as formal texts) were used for training unsupervised models and for the formal implementation of the tool. 60,000 texts from the comments of Instagram social network users with positive, negative, and…
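To make the idea of slang-to-formal conversion concrete, here is a minimal sketch of token-level replacement with a lookup table. It is an illustration only: this summary does not describe how PSC works internally, and the lexicon `SLANG_TO_FORMAL` and the function `convert_to_formal` are hypothetical names with a handful of hand-picked entries.

```python
# Hypothetical toy lexicon (colloquial -> formal). The real PSC resources
# are not described in this summary; these entries are illustrative only.
SLANG_TO_FORMAL = {
    "خونه": "خانه",
    "نون": "نان",
    "اگه": "اگر",
    "دیگه": "دیگر",
}


def convert_to_formal(text, lexicon=SLANG_TO_FORMAL):
    """Replace whitespace-separated tokens that appear in the lexicon.

    Returns (converted_text, number_of_converted_tokens, total_tokens).
    """
    tokens = text.split()
    converted = 0
    out = []
    for tok in tokens:
        if tok in lexicon:
            out.append(lexicon[tok])
            converted += 1
        else:
            out.append(tok)  # unknown tokens pass through unchanged
    return " ".join(out), converted, len(tokens)
```

The returned counts allow a rough share of converted tokens to be computed for a given text. Whether the paper's reported conversion rate is measured this way is not stated here.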
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Natural Language Processing Techniques · Spam and Phishing Detection
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · fastText
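
The sigmoid, tanh, and Long Short-Term Memory entries listed under Methods are related: a standard LSTM cell uses sigmoid gates and tanh nonlinearities. The sketch below shows one step of a scalar LSTM cell in plain Python. It is a textbook formulation, not the authors' model; the function name `lstm_step` and the layout of `params` are assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Logistic function, used for the LSTM gates."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell (textbook form, illustrative only).

    params maps each of "input", "forget", "output", "candidate"
    to a tuple (w_x, w_h, b) of scalar weights and bias.
    Returns the new hidden state h and cell state c.
    """
    def pre(gate):
        w_x, w_h, b = params[gate]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre("input"))         # how much new information to write
    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    g = math.tanh(pre("candidate"))   # candidate cell content

    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c
```

In practice the inputs and states are vectors and the weights are matrices. The scalar form is used here only to keep the gate arithmetic visible.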
