Comparing Normalization Methods for Portfolio Optimization with Reinforcement Learning
Caio de Souza Barbosa Costa, Anna Helena Reali Costa

TL;DR
This paper investigates how different normalization methods impact reinforcement learning agents in portfolio optimization across various markets, revealing that normalization can sometimes hinder performance.
Contribution
It provides a comparative analysis of normalization techniques in reinforcement learning for finance, highlighting the potential drawbacks of standard normalization practices.
Findings
Normalization can degrade RL agent performance in portfolio optimization tasks.
Performance varies across markets such as IBOVESPA, NYSE, and cryptocurrencies.
Certain normalization methods outperform standard practices in specific contexts.
Abstract
Recently, reinforcement learning has achieved remarkable results in various domains, including robotics, games, natural language processing, and finance. In the financial domain, this approach has been applied to tasks such as portfolio optimization, where an agent continuously adjusts the allocation of assets within a financial portfolio to maximize profit. Numerous studies have introduced new simulation environments, neural network architectures, and training algorithms for this purpose. Among these, a domain-specific policy gradient algorithm has gained significant attention in the research community for being lightweight and fast and for outperforming other approaches. However, recent studies have shown that this algorithm can yield inconsistent results and underperform, especially when the portfolio does not consist of cryptocurrencies. One possible explanation for this issue is that…
