Investigating the Effect of Segmentation Methods on Neural Model based Sentiment Analysis on Informal Short Texts in Turkish
Fatih Kurt, Dilek Kisa, Pinar Karagoz

TL;DR
This study examines how different segmentation techniques impact the performance of neural network models in sentiment analysis of informal Turkish short texts, comparing morphological, sub-word, tokenization, and hybrid methods.
Contribution
It introduces a comprehensive analysis of various segmentation approaches and their effects on CNN and RNN models for Turkish sentiment analysis.
Findings
Segmentation method significantly affects model accuracy
Hybrid segmentation approaches yield better results
CNN and RNN performance varies with segmentation type
Abstract
This work investigates segmentation approaches for sentiment analysis on informal short texts in Turkish. The proposed work has two building blocks: segmentation and a deep neural network model. The segmentation stage preprocesses the text with different methods, which fall into four groups: morphological, sub-word, tokenization, and hybrid approaches. We analyzed several variants of each of these four groups. The second stage evaluates the neural model for sentiment analysis: the performance of each segmentation method is measured under Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) models proposed in the literature for sentiment classification.
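To make the four segmentation groups concrete, the following is a minimal sketch rather than the authors' implementation. The suffix list `TOY_SUFFIXES`, the use of character trigrams as the sub-word method, and the particular hybrid combination (suffix split followed by trigrams of the stem) are all assumptions chosen for illustration. A real pipeline would likely rely on a Turkish morphological analyzer and a learned sub-word vocabulary.

```python
import re

# Hypothetical toy suffix list, for illustration only (not from the paper).
TOY_SUFFIXES = ["lar", "ler"]


def tokenize(text):
    """Word-level segmentation: extract runs of word characters."""
    return re.findall(r"\w+", text)


def char_ngrams(word, n=3):
    """Sub-word segmentation as overlapping character n-grams (assumed method)."""
    if len(word) <= n:
        return [word]
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def toy_morph_split(word):
    """Toy morphological split: strip at most one suffix from TOY_SUFFIXES."""
    for suffix in TOY_SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            return [word[:-len(suffix)], "+" + suffix]
    return [word]


def segment(text, method):
    """Segment text with one of: 'token', 'subword', 'morph', 'hybrid'."""
    words = tokenize(text)
    if method == "token":
        return words
    if method == "subword":
        return [g for w in words for g in char_ngrams(w)]
    if method == "morph":
        return [m for w in words for m in toy_morph_split(w)]
    if method == "hybrid":
        # Assumed hybrid: split off the suffix, then n-gram only the stem.
        pieces = []
        for w in words:
            parts = toy_morph_split(w)
            pieces.extend(char_ngrams(parts[0]))
            pieces.extend(parts[1:])
        return pieces
    raise ValueError(f"unknown segmentation method: {method}")


if __name__ == "__main__":
    for name in ("token", "subword", "morph", "hybrid"):
        print(name, segment("filmler güzel", name))
```

Each method yields a different unit sequence for the same input, and it is this sequence that a downstream CNN or RNN classifier would receive as its input vocabulary.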
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Topic Modeling · Natural Language Processing Techniques
