Constructing Financial Sentimental Factors in Chinese Market Using Natural Language Processing
Junfeng Jiang, Jiahao Li

TL;DR
This paper develops an NLP-based method to construct financial sentiment indicators from Chinese news and comments, demonstrating their strong correlation with market movements, especially during turbulent periods.
Contribution
It introduces a novel integrated algorithm combining web crawling, Chinese NLP techniques, and a finance-specific sentiment lexicon to evaluate market sentiment.
Findings
The sentiment factors correlate significantly with Chinese market movements.
The adjusted sentiment factor correlates more strongly than the standard factor.
The model can help guide investment decisions during market turbulence.
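As a rough illustration of what correlating a sentiment factor with market movements involves, the sketch below computes a Pearson correlation coefficient in plain Python. The function name `pearson` and both series are assumptions made for this example. The numbers are toy values built so that the returns are exactly proportional to the factor; they are not data or results from the paper, and the paper's own correlation measure may differ.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-deviations and of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)

# Toy series (hypothetical): daily sentiment factor and same-day index return.
factor = [0.2, -0.1, 0.4, -0.3, 0.1]
returns = [0.010, -0.005, 0.020, -0.015, 0.005]

r = pearson(factor, returns)
```

With real data, the two series would first need to be aligned on trading dates, and a significance test would be required before calling a correlation significant.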
Abstract
In this paper, we design an integrated algorithm to evaluate the sentiment of the Chinese market. First, using web browser automation, we automatically crawl a large volume of news and comments from several influential financial websites. Second, we apply Natural Language Processing (NLP) techniques in the Chinese context, including tokenization, Word2vec word embedding, and the semantic database WordNet, to compute Senti-scores for these news items and comments, and then construct the sentimental factor. We build a finance-specific sentimental lexicon so that the factor reflects the sentiment of the financial market rather than general sentiments such as happiness or sadness. Third, we implement an adjustment of the standard sentimental factor. Our experiments show that there is a significant correlation between our standard sentimental factor and the…
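The scoring and aggregation steps described in the abstract can be sketched as follows. This is a simplified stand-in, not the authors' method: it replaces the Word2vec and WordNet machinery with a direct lexicon lookup, assumes the text has already been tokenized by a Chinese tokenizer, and uses a tiny hypothetical lexicon with illustrative weights. The names `LEXICON`, `senti_score`, and `daily_factor`, along with the sample dates and documents, are made up for this example.

```python
from collections import defaultdict

# Hypothetical finance-specific lexicon: token -> polarity weight (illustrative).
LEXICON = {"上涨": 1.0, "利好": 1.0, "下跌": -1.0, "亏损": -1.0}

def senti_score(tokens, lexicon=LEXICON):
    """Mean polarity of the tokens found in the lexicon; 0.0 if none match."""
    hits = [lexicon[t] for t in tokens if t in lexicon]
    return sum(hits) / len(hits) if hits else 0.0

def daily_factor(docs):
    """Average Senti-score per date; docs is an iterable of (date, tokens)."""
    by_day = defaultdict(list)
    for date, tokens in docs:
        by_day[date].append(senti_score(tokens))
    return {day: sum(s) / len(s) for day, s in sorted(by_day.items())}

# Toy, pre-tokenized documents (hypothetical, not from the paper's corpus).
docs = [
    ("2018-01-02", ["股市", "上涨", "利好"]),
    ("2018-01-02", ["公司", "亏损"]),
    ("2018-01-03", ["指数", "下跌"]),
]

factor = daily_factor(docs)
```

Averaging per document and then per day keeps one long article from dominating a day's value. Whether the paper weights documents this way is not stated in the truncated abstract.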
Taxonomy
Topics: Stock Market Forecasting Methods · Sentiment Analysis and Opinion Mining · Topic Modeling
Methods: Stochastic Steady-state Embedding
