LLM-Based Financial Sentiment Analysis in Arabic: Evidence from Saudi Markets
Mona H. Albaqawi, Eman M. Albalkhi, Joud A. Albaiti, Enrico Lopedoto

TL;DR
This paper develops a large-scale Arabic NLP framework for financial sentiment analysis in Saudi markets, integrating news and social media to analyze investor sentiment and its relation to stock market behavior.
Contribution
It introduces a novel multi-stage pipeline for constructing an Arabic financial sentiment dataset and demonstrates its effectiveness for market analysis.
Findings
Built a dataset of 84K Arabic financial text samples annotated with five-class sentiment labels and linked to canonical company identifiers.
Achieved reliable sentiment analysis aligned with stock market movements.
Demonstrated scalability and accuracy of the proposed framework.
Abstract
Investor sentiment shapes financial markets, yet modeling sentiment in Arabic financial contexts remains challenging due to linguistic complexity and limited resources. We present an Arabic NLP framework for large-scale financial sentiment analysis tailored to the Saudi market, integrating official financial news and social media to capture institutional and public investor sentiment. The framework constructs a large Arabic financial corpus through a multi-stage pipeline encompassing data collection, cleaning, deduplication, entity linking, and sentiment annotation. Transformer-based NER combined with a curated company lexicon links textual mentions to canonical company identifiers, with sentiment labels assigned using a five-class scheme. The resulting dataset of 84K samples supports company-level sentiment aggregation and analysis of sentiment dynamics relative to stock market…
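To make the deduplication, entity-linking, and company-level aggregation steps concrete, here is a minimal sketch in Python. It is not the authors' implementation. The abstract describes Transformer-based NER combined with a curated lexicon, whereas this sketch substitutes plain lexicon lookup on normalized text. The company identifiers (`C001`, `C002`), the label names, the mapping of the five classes to scores from -2 to 2, and all function names are assumptions made for illustration.

```python
import re
from collections import defaultdict

# Assumed normalization: strip Arabic diacritics (U+064B-U+0652) and tatweel
# (U+0640), and map hamza-carrying alef variants to bare alef (U+0627).
_DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")
_ALEF_VARIANTS = re.compile(r"[\u0622\u0623\u0625]")

# Hypothetical five-class scheme; the source does not name its labels.
LABEL_SCORE = {
    "very_negative": -2,
    "negative": -1,
    "neutral": 0,
    "positive": 1,
    "very_positive": 2,
}


def normalize(text):
    """Return text with diacritics removed, alef unified, whitespace collapsed."""
    text = _DIACRITICS.sub("", text)
    text = _ALEF_VARIANTS.sub("\u0627", text)
    return " ".join(text.split())


def deduplicate(texts):
    """Keep the first occurrence of each text, comparing normalized forms."""
    seen = set()
    unique = []
    for text in texts:
        key = normalize(text)
        if key not in seen:
            seen.add(key)
            unique.append(text)
    return unique


def link_companies(text, lexicon):
    """Return the set of company ids whose surface form occurs in the text.

    Substring lookup stands in for the NER-plus-lexicon linker described
    in the abstract.
    """
    norm = normalize(text)
    return {cid for surface, cid in lexicon.items() if normalize(surface) in norm}


def aggregate_sentiment(samples, lexicon):
    """Average the numeric sentiment score per linked company id."""
    scores = defaultdict(list)
    for text, label in samples:
        for cid in link_companies(text, lexicon):
            scores[cid].append(LABEL_SCORE[label])
    return {cid: sum(vals) / len(vals) for cid, vals in scores.items()}


# Hypothetical lexicon: surface form -> placeholder company identifier.
lexicon = {"أرامكو": "C001", "الراجحي": "C002"}
samples = [
    ("ارتفاع أرباح ارامكو", "positive"),
    ("أرامكو تعلن نتائج قوية", "very_positive"),
    ("تراجع سهم الراجحي", "negative"),
]
company_sentiment = aggregate_sentiment(samples, lexicon)
```

Normalizing both the lexicon entries and the input text lets spelling variants such as أرامكو and ارامكو resolve to the same identifier. A production linker would also need to handle ambiguous names and token boundaries, which substring matching ignores.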
