Generating Word and Document Embeddings for Sentiment Analysis
Cem Rıfkı Aydın, Tunga Güngör, Ali Erkan

TL;DR
This paper introduces a method for generating domain-specific and language-agnostic word and document embeddings by combining contextual, supervised, and dictionary-based features, improving sentiment analysis performance.
Contribution
It presents an approach that integrates contextual, supervised, and dictionary-based features to produce more effective sentiment embeddings across domains and languages.
Findings
Improved sentiment classification accuracy on Turkish datasets.
Outperformed baseline word2vec models on English corpora.
Demonstrated cross-domain and multilingual applicability.
Abstract
Sentiments of words differ from one corpus to another. Inducing general sentiment lexicons for languages and using them cannot, in general, produce meaningful results for different domains. In this paper, we combine contextual and supervised information with the general semantic representations of words occurring in the dictionary. Contexts of words help us capture the domain-specific information and supervised scores of words are indicative of the polarities of those words. When we combine supervised features of words with the features extracted from their dictionary definitions, we observe an increase in the success rates. We try out the combinations of contextual, supervised, and dictionary-based approaches, and generate original vectors. We also combine the word2vec approach with hand-crafted features. We induce domain-specific sentimental vectors for two corpora, which are the…
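To make the idea of "supervised scores of words" concrete, here is a minimal sketch of one way such scores could be computed and pooled into a document vector. It is an illustration under stated assumptions, not the authors' formulation: the smoothed log-ratio score, the [mean, min, max] pooling, and the names `polarity_scores` and `document_vector` are all hypothetical choices made for this example, and the toy corpus is invented.

```python
import math
from collections import Counter


def polarity_scores(docs, labels, smoothing=1.0):
    """Assumed scoring: log-ratio of smoothed relative frequencies of each
    word in positive (label 1) versus negative (label 0) documents."""
    pos, neg = Counter(), Counter()
    for text, label in zip(docs, labels):
        target = pos if label == 1 else neg
        target.update(text.lower().split())
    vocab = set(pos) | set(neg)
    pos_total = sum(pos.values()) + smoothing * len(vocab)
    neg_total = sum(neg.values()) + smoothing * len(vocab)
    return {
        w: math.log(((pos[w] + smoothing) / pos_total)
                    / ((neg[w] + smoothing) / neg_total))
        for w in vocab
    }


def document_vector(text, scores):
    """Assumed pooling: [mean, min, max] of the scores of known words.
    Words without a score are skipped; an all-unknown text maps to zeros."""
    vals = [scores[w] for w in text.lower().split() if w in scores]
    if not vals:
        return [0.0, 0.0, 0.0]
    return [sum(vals) / len(vals), min(vals), max(vals)]


# Toy labelled corpus (invented for illustration only).
docs = ["great movie loved it", "terrible movie hated it",
        "great acting", "terrible plot"]
labels = [1, 0, 1, 0]

scores = polarity_scores(docs, labels)
vector = document_vector("great plot", scores)
```

Because the scores are estimated from the labelled corpus itself, the same word can receive a different polarity in a different domain, which is the motivation the abstract gives for domain-specific vectors. In a fuller pipeline these features would presumably be concatenated with contextual or dictionary-based vectors before classification.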
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Topic Modeling · Advanced Text Analysis Techniques
