Beyond ESG Scores: Learning Dynamic Constraints for Sequential Portfolio Optimization
Xin Li, Yan Ke, Longbing Cao

TL;DR
The paper integrates ESG constraints into sequential portfolio optimization by learning dynamic ESG costs from point-in-time multimodal evidence, rather than appending static ESG scores to the policy's inputs or reward.
Contribution
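It proposes MACF (Multimodal Action-Conditioned Constraint Field) and MACF-X, a family of optimizer-specific adapters. Together they impose ESG constraints without modifying the financial policy's observation or reward, improving ESG compliance while maintaining financial performance.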
Findings
MACF-X reduces tail ESG budget pressure.
Dynamic evidence inputs improve ESG constraint effectiveness.
Static ESG scores perform no better than noise baselines.
Abstract
ESG-aware portfolio optimization is increasingly important for sustainable capital allocation, yet most learning-based methods still operationalize ESG by appending static scores to the policy observation or reward. This creates a mismatch for sequential control: ESG scores are noisy, provider-dependent, low-frequency, and temporally misaligned with sequential portfolio decisions, while financial evidence suggests that ESG is better treated as a portfolio preference, risk-exposure, or hedge dimension than as a robust alpha factor. We propose to impose ESG constraints without modifying the financial policy's observation or reward, using a Multimodal Action-Conditioned Constraint Field (MACF) that learns mechanism-specific ESG costs from point-in-time multimodal evidence and contemplated portfolio transitions. We then introduce MACF-X, a family of optimizer-specific adapters that converts…
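As a rough illustration of the idea in the abstract (constraining a contemplated portfolio transition from outside, without touching the financial policy's observation or reward), the following is a minimal sketch and not the paper's MACF or MACF-X. Everything in it is an assumption made for illustration: the names `esg_cost` and `constrain`, the hand-written linear cost that stands in for a learned one, the per-asset `evidence_risk` vector, and the blending rule that pulls the proposed weights back toward the current ones.

```python
import numpy as np


def esg_cost(w_prev, w_new, evidence_risk):
    """Hypothetical action-conditioned ESG cost (a stand-in, not a learned model).

    Charges the exposure of the new weights to a per-asset risk signal,
    plus an extra charge for *increasing* positions in risky assets,
    so the cost depends on the transition and not only on the end state.
    """
    increase = np.maximum(w_new - w_prev, 0.0)
    return float(evidence_risk @ w_new + evidence_risk @ increase)


def constrain(w_prev, w_prop, evidence_risk, budget, steps=50):
    """Shrink a proposed transition until its cost fits the budget.

    The financial policy proposes w_prop; this layer only post-processes it.
    Bisection finds the largest blend factor a in [0, 1] such that
    (1 - a) * w_prev + a * w_prop stays within the budget. This assumes
    the cost is non-decreasing in a, which holds for the linear cost above
    whenever the full move is over budget and w_prev is within it.
    If w_prev already exceeds the budget, w_prev is returned unchanged.
    """
    if esg_cost(w_prev, w_prop, evidence_risk) <= budget:
        return w_prop  # proposal already feasible: leave it untouched
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        w_mid = (1.0 - mid) * w_prev + mid * w_prop
        if esg_cost(w_prev, w_mid, evidence_risk) <= budget:
            lo = mid
        else:
            hi = mid
    return (1.0 - lo) * w_prev + lo * w_prop


# Toy usage with made-up numbers: two assets, the second one is "risky".
w_prev = np.array([0.5, 0.5])          # current portfolio weights
w_prop = np.array([0.1, 0.9])          # weights proposed by the financial policy
evidence_risk = np.array([0.1, 0.9])   # hypothetical point-in-time risk signal
w_safe = constrain(w_prev, w_prop, evidence_risk, budget=0.7)
print(w_safe, esg_cost(w_prev, w_safe, evidence_risk))
```

Because the blend is a convex combination of two weight vectors that each sum to one, the constrained weights also sum to one. In this sketch the policy is never retrained or shown the risk signal; only its proposed action is adjusted.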
