Will sentiment analysis need subculture? A new data augmentation approach
Zhenhua Wang, Simin He, Guang Xu, Ming Ren

TL;DR
This paper introduces a subculture-based data augmentation method for sentiment analysis. It generates enhanced variants of each training text using subcultural expressions, which enlarges the training set and improves model performance, and it shows that subcultural nuances affect sentiment detection.
Contribution
The paper proposes subculture-based data augmentation (SCDA), which builds subcultural expression generators and uses them to produce enhanced texts for each training text, addressing the shortage of training data and capturing subcultural sentiment nuances.
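As a rough illustration of the idea of passing each training text through a set of expression generators, here is a minimal Python sketch. It is not the authors' implementation: the two generators (`add_emphasis`, `to_slang`), the slang table, and the `augment` helper are hypothetical stand-ins chosen only to show the data flow, and the sketch assumes that every generated text inherits the label of its source text.

```python
from typing import Callable, List, Tuple

# A generator maps one text to one rewritten text (assumed interface).
Generator = Callable[[str], str]
Example = Tuple[str, int]  # (text, sentiment label)

# Hypothetical slang table, for illustration only.
SLANG = {"great": "lit", "bad": "trash", "very": "hella"}


def add_emphasis(text: str) -> str:
    """Hypothetical generator: replace trailing '.'/'!' with '!!!'."""
    return text.rstrip(".!") + "!!!"


def to_slang(text: str) -> str:
    """Hypothetical generator: swap known words for slang equivalents."""
    return " ".join(SLANG.get(word.lower(), word) for word in text.split())


def augment(dataset: List[Example], generators: List[Generator]) -> List[Example]:
    """Keep each original example and add one variant per generator.

    Assumption: a rewritten text keeps the label of its source text.
    Variants identical to the source are skipped to avoid duplicates.
    """
    augmented: List[Example] = []
    for text, label in dataset:
        augmented.append((text, label))
        for generate in generators:
            variant = generate(text)
            if variant != text:
                augmented.append((variant, label))
    return augmented


train = [("This movie is very great", 1)]
augmented_train = augment(train, [add_emphasis, to_slang])
```

With the single example above, `augmented_train` holds the original text plus one variant per generator, all with the same label. Real generators would need to be far richer than a lookup table, and whether labels can safely be copied to every variant is itself an assumption worth checking.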
Findings
SCDA improves sentiment analysis accuracy
Subcultural expressions influence sentiment intensity
Potential linear reversibility of subcultural expressions
Abstract
Nowadays, the omnipresence of the Internet has fostered a subculture that congregates around the contemporary milieu. The subculture artfully articulates the intricacies of human feelings by ardently pursuing the allure of novelty, a fact that cannot be disregarded in sentiment analysis. This paper aims to enrich data through the lens of subculture, to address the insufficient training data faced by sentiment analysis. To this end, a new approach, subculture-based data augmentation (SCDA), is proposed, which engenders enhanced texts for each training text by leveraging the creation of specific subcultural expression generators. Extensive experiments attest to the effectiveness and potential of SCDA. The results also shed light on the phenomenon that disparate subcultural expressions elicit varying degrees of sentiment stimulation. Moreover, an intriguing conjecture arises,…
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Human Mobility and Location-Based Analysis · Digital Communication and Language
