FinMarBa: A Market-Informed Dataset for Financial Sentiment Classification

Baptiste Lefort; Eric Benhamou; Beatrice Guez; Jean-Jacques Ohana; Ethan Setrouk; Alban Etienne

arXiv:2507.22932 · cs.CL · August 1, 2025

Open Access

TL;DR

This paper introduces a hierarchical framework that combines lightweight Large Language Models with Deep Reinforcement Learning for portfolio optimization. It reports a 26% annualized return and a Sharpe ratio of 1.2 on 2018-2024 test data, ahead of equal-weighted and S&P 500 benchmarks.

Contribution

It presents a novel three-tier hierarchical RL architecture (base agents, meta-agents, and a super-agent) that integrates news sentiment with market data, offering scalable cross-modal processing and an open-source implementation.

Findings

01. 26% annualized return on the 2018-2024 test data

02. Sharpe ratio of 1.2, outperforming the equal-weighted and S&P 500 benchmarks

03. Effective integration of news sentiment signals with traditional market indicators
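
For readers unfamiliar with the two headline metrics, the sketch below shows one common way to compute an annualized return and a Sharpe ratio from a series of daily returns. It is illustrative only: the 252-trading-day convention, the zero default risk-free rate, and the function names `annualized_return` and `sharpe_ratio` are assumptions, not details taken from the paper, which may use different conventions.

```python
import math
import statistics

# Assumption: 252 trading days per year, a common but not universal convention.
TRADING_DAYS = 252


def annualized_return(daily_returns, periods_per_year=TRADING_DAYS):
    """Geometric annualized return from a sequence of simple periodic returns."""
    if not daily_returns:
        raise ValueError("need at least one return")
    growth = 1.0
    for r in daily_returns:
        growth *= 1.0 + r  # compound each period's simple return
    years = len(daily_returns) / periods_per_year
    return growth ** (1.0 / years) - 1.0


def sharpe_ratio(daily_returns, risk_free_rate=0.0, periods_per_year=TRADING_DAYS):
    """Annualized Sharpe ratio: mean excess return over its sample volatility."""
    per_period_rf = risk_free_rate / periods_per_year
    excess = [r - per_period_rf for r in daily_returns]
    volatility = statistics.stdev(excess)  # sample standard deviation
    if volatility == 0:
        raise ValueError("Sharpe ratio is undefined for zero volatility")
    return statistics.mean(excess) / volatility * math.sqrt(periods_per_year)
```

Under these conventions, a strategy's daily return series over the test window would be passed to both functions, and the same functions applied to the benchmark series for comparison.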

Abstract

This paper presents a novel hierarchical framework for portfolio optimization, integrating lightweight Large Language Models (LLMs) with Deep Reinforcement Learning (DRL) to combine sentiment signals from financial news with traditional market indicators. Our three-tier architecture employs base RL agents to process hybrid data, meta-agents to aggregate their decisions, and a super-agent to merge decisions based on market data and sentiment analysis. Evaluated on data from 2018 to 2024, after training on 2000-2017, the framework achieves a 26% annualized return and a Sharpe ratio of 1.2, outperforming equal-weighted and S&P 500 benchmarks. Key contributions include scalable cross-modal integration, a hierarchical RL structure for enhanced stability, and open-source reproducibility.
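
The three-tier flow described in the abstract can be pictured with a minimal sketch. Everything here is a stand-in: the paper's agents are learned RL policies, whereas this example uses fixed toy rules, simple averaging for the meta-agents, and a convex blend for the super-agent. The names `normalize`, `meta_agent`, `super_agent`, and the `sentiment_trust` parameter are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Sequence

Weights = List[float]
Observation = Dict[str, Sequence[float]]
Policy = Callable[[Observation], Weights]


def normalize(weights: Sequence[float]) -> Weights:
    """Clip negatives to zero and rescale to a long-only allocation summing to 1."""
    clipped = [max(w, 0.0) for w in weights]
    total = sum(clipped)
    if total == 0:
        return [1.0 / len(clipped)] * len(clipped)  # fall back to equal weights
    return [w / total for w in clipped]


def meta_agent(base_policies: Sequence[Policy], obs: Observation) -> Weights:
    """Tier 2 (assumed rule): average the allocations proposed by base agents."""
    proposals = [normalize(policy(obs)) for policy in base_policies]
    n_assets = len(proposals[0])
    averaged = [sum(p[i] for p in proposals) / len(proposals) for i in range(n_assets)]
    return normalize(averaged)


def super_agent(market_w: Weights, sentiment_w: Weights,
                sentiment_trust: float = 0.5) -> Weights:
    """Tier 3 (assumed rule): convex blend of the market and sentiment views."""
    blended = [(1.0 - sentiment_trust) * m + sentiment_trust * s
               for m, s in zip(market_w, sentiment_w)]
    return normalize(blended)


# Tier 1: toy base policies standing in for trained RL agents.
def momentum_policy(obs: Observation) -> Weights:
    return list(obs["returns"])


def equal_policy(obs: Observation) -> Weights:
    return [1.0] * len(obs["returns"])


def sentiment_policy(obs: Observation) -> Weights:
    return list(obs["sentiment"])


# Made-up observation for three assets.
obs = {"returns": [0.02, -0.01, 0.01], "sentiment": [0.1, 0.6, 0.3]}
market_view = meta_agent([momentum_policy, equal_policy], obs)
sentiment_view = meta_agent([sentiment_policy], obs)
final_weights = super_agent(market_view, sentiment_view, sentiment_trust=0.5)
```

The point of the layering is that each tier only sees the outputs of the tier below, so base agents can be added or swapped without changing how the final allocation is merged.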

Taxonomy

Topics: Stock Market Forecasting Methods · FinTech, Crowdfunding, Digital Finance · Financial Distress and Bankruptcy Prediction