Financial Sentiment Analysis: Leveraging Actual and Synthetic Data for Supervised Fine-tuning

Abraham Atsiwo

arXiv:2412.09859·cs.LG·December 16, 2024

TL;DR

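This paper fine-tunes language models for financial sentiment analysis on a combination of actual and synthetic data. It introduces BertNSP-finance, which concatenates shorter financial sentences into longer ones, and finbert-lc, which determines sentiment from text, and reports improved accuracy and F1 score on financial datasets.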

Contribution

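It introduces two models: BertNSP-finance, which generates longer financial sentences by concatenating shorter ones, and finbert-lc, which determines sentiment from the resulting text. Together they are reported to improve performance over existing models.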

Findings

01

Improved accuracy and F1 scores on financial sentiment datasets.

02

Synthetic data augmentation enhances model performance.

03

Longer sentence context improves sentiment analysis results.

Abstract

The Efficient Market Hypothesis (EMH) highlights the essence of financial news in stock price movement. Financial news comes in the form of corporate announcements, news titles, and other forms of digital text. The generation of insights from financial news can be done with sentiment analysis. General-purpose language models are too general for sentiment analysis in finance. Curated labeled data for fine-tuning general-purpose language models are scarce, and existing fine-tuned models for sentiment analysis in finance do not capture the maximum context width. We hypothesize that using actual and synthetic data can improve performance. We introduce BertNSP-finance to concatenate shorter financial sentences into longer financial sentences, and finbert-lc to determine sentiment from digital text. The results show improved performance on the accuracy and the F1 score for the financial…
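
The sentence-concatenation idea mentioned in the abstract can be pictured with a minimal Python sketch. This is not the paper's implementation: the function names, the greedy merging rule, the threshold, and the word budget are all assumptions made for illustration, and the toy scorer is a simple word-overlap stand-in for whatever next-sentence score a model such as BertNSP-finance would produce.

```python
from typing import Callable, List


def concatenate_sentences(
    sentences: List[str],
    nsp_score: Callable[[str, str], float],
    threshold: float = 0.5,
    max_words: int = 64,
) -> List[str]:
    """Greedily merge each sentence into the previous one when the scorer
    rates the pair at or above `threshold` and the result fits `max_words`.

    `nsp_score` is a hypothetical plug-in: any function returning a score
    in [0, 1] for "second follows first".
    """
    merged: List[str] = []
    for sentence in sentences:
        if merged:
            candidate = merged[-1] + " " + sentence
            fits = len(candidate.split()) <= max_words
            if fits and nsp_score(merged[-1], sentence) >= threshold:
                merged[-1] = candidate  # extend the current long sentence
                continue
        merged.append(sentence)  # start a new long sentence
    return merged


def toy_overlap_score(first: str, second: str) -> float:
    """Stand-in scorer (assumption): Jaccard overlap of lowercase word sets.
    A real system would use a next-sentence-prediction model instead."""
    a, b = set(first.lower().split()), set(second.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


if __name__ == "__main__":
    headlines = [
        "Acme profit rose in Q3.",
        "Acme profit beat forecasts.",
        "Oil prices fell.",
    ]
    for text in concatenate_sentences(headlines, toy_overlap_score, threshold=0.2):
        print(text)
```

Passing the scorer in as a parameter keeps the merging logic independent of any particular model, so the stand-in can be swapped for a learned next-sentence scorer without changing the loop.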

Code & Models

Repositories

abraham-atsiwo/filbert-lc (PyTorch, official)
