Exploring The Contribution of Unlabeled Data in Financial Sentiment Analysis
Jimmy SJ. Ren, Wei Wang, Jiawei Wang, Stephen Shaoyi Liao

TL;DR
This paper investigates how unlabeled data affects financial sentiment analysis and shows that effective feature selection lets unlabeled data improve classification performance despite the well-known performance degradation issue.
Contribution
It introduces a feature selection framework that considers both labeled and unlabeled data, addressing bias-variance trade-offs in semi-supervised learning for sentiment analysis.
Findings
Unlabeled data can degrade performance due to bias-variance issues.
Effective feature selection can mitigate performance degradation.
Financial sentiment analysis benefits from improved semi-supervised methods.
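To make the idea concrete, here is a minimal sketch of feature selection that uses both labeled and unlabeled text before a semi-supervised step. It is not the authors' framework: the document-frequency criterion, the Naive Bayes classifier, the self-training loop, the toy sentences, and every function name and parameter (`select_features`, `self_train`, `k`, `threshold`, `rounds`) are assumptions chosen only for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def select_features(labeled, unlabeled, k):
    """Keep the k terms with the highest document frequency,
    counted over labeled AND unlabeled documents (illustrative criterion)."""
    df = Counter()
    for text in [t for t, _ in labeled] + list(unlabeled):
        df.update(set(tokenize(text)))
    ranked = sorted(df, key=lambda w: (-df[w], w))  # deterministic tie-break
    return set(ranked[:k])


def train_nb(labeled, vocab):
    """Multinomial Naive Bayes counts, restricted to the selected vocabulary."""
    class_docs, word_counts = Counter(), {}
    for text, label in labeled:
        class_docs[label] += 1
        counts = word_counts.setdefault(label, Counter())
        counts.update(w for w in tokenize(text) if w in vocab)
    return class_docs, word_counts, vocab


def predict_proba(model, text):
    """Posterior class probabilities with add-one smoothing."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    log_scores = {}
    for label in class_docs:
        total_words = sum(word_counts[label].values())
        score = math.log(class_docs[label] / total_docs)
        for w in tokenize(text):
            if w in vocab:
                score += math.log(
                    (word_counts[label][w] + 1) / (total_words + len(vocab))
                )
        log_scores[label] = score
    top = max(log_scores.values())
    exp = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(exp.values())
    return {c: v / z for c, v in exp.items()}


def self_train(labeled, unlabeled, k=50, threshold=0.75, rounds=3):
    """Self-training: pseudo-label confident unlabeled documents, then retrain."""
    vocab = select_features(labeled, unlabeled, k)
    labeled, pool = list(labeled), list(unlabeled)
    model = train_nb(labeled, vocab)
    for _ in range(rounds):
        remaining, added = [], False
        for text in pool:
            probs = predict_proba(model, text)
            label = max(probs, key=probs.get)
            if probs[label] >= threshold:
                labeled.append((text, label))  # accept the pseudo-label
                added = True
            else:
                remaining.append(text)
        pool = remaining
        if not added:
            break
        model = train_nb(labeled, vocab)
    return model


# Toy data, invented for this sketch only.
labeled = [
    ("shares surge on strong profit", "pos"),
    ("stock plunges after weak earnings", "neg"),
]
unlabeled = [
    "profit growth lifts shares",
    "weak earnings drag stock lower",
    "strong profit and strong growth",
]

model = self_train(labeled, unlabeled)
print(predict_proba(model, "profit growth"))
```

In this sketch, lowering `k` shrinks the vocabulary and so limits how much the pseudo-labeled documents can reshape the model, which is one simple way to experiment with the bias-variance trade-off that the summary describes.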
Abstract
With the proliferation of its applications in various industries, sentiment analysis using publicly available web data has become an active research area in text classification in recent years. Researchers argue that semi-supervised learning is an effective approach to this problem because it can reduce the manual labeling effort, which is usually expensive and time-consuming. However, there has been a long-running debate on the effectiveness of unlabeled data in text classification. This is partly because many assumptions made in theoretical analysis often do not hold in practice. We argue that this problem may be better understood by adding a further dimension to the experiment. This allows us to address the problem from the perspective of bias and variance in a broader view. We show that the well-known performance degradation issue caused by…
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Text and Document Classification Technologies · Stock Market Forecasting Methods
