Autoregressive Policy Optimization for Constrained Allocation Tasks

David Winkel; Niklas Strauß; Maximilian Bernhard; Zongyue Li; Thomas Seidl; Matthias Schubert

arXiv:2409.18735·cs.AI·September 30, 2024

TL;DR

This paper introduces an autoregressive policy optimization method for constrained allocation tasks, effectively handling linear constraints and outperforming existing constrained reinforcement learning approaches across multiple domains.

Contribution

The paper presents an autoregressive sampling approach that draws the allocation for each entity in sequence, paired with a de-biasing mechanism, for allocation tasks under linear constraints.

Findings

01 Outperforms existing CRL methods on portfolio optimization

02 Effective in workload distribution tasks

03 Demonstrates robustness on synthetic benchmarks

Abstract

Allocation tasks represent a class of problems where a limited amount of resources must be allocated to a set of entities at each time step. Prominent examples of this task include portfolio optimization or distributing computational workloads across servers. Allocation tasks are typically bound by linear constraints describing practical requirements that have to be strictly fulfilled at all times. In portfolio optimization, for example, investors may be obligated to allocate less than 30% of the funds into a certain industrial sector in any investment period. Such constraints restrict the action space of allowed allocations in intricate ways, which makes learning a policy that avoids constraint violations difficult. In this paper, we propose a new method for constrained allocation tasks based on an autoregressive process to sequentially sample allocations for each entity. In addition,…
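To make the idea of sequentially sampling allocations concrete, the following is a minimal sketch and not the authors' implementation. It assumes a deliberately simple constraint set: allocations are non-negative, sum to 1, and a single group of entities (the "sector") may receive at most `cap` in total, with at least one entity outside the group. The names `feasible_interval` and `sample_allocation` are hypothetical, and uniform sampling stands in for whatever learned conditional distribution a policy would provide.

```python
import random


def feasible_interval(i, partial, n, group, cap):
    """Range of allocations for entity i that still admits a valid completion.

    Assumed constraints: every allocation is >= 0, all allocations sum to 1,
    and the entities in `group` together receive at most `cap`.
    """
    remaining = 1.0 - sum(partial)  # budget not yet assigned
    group_left = cap - sum(partial[j] for j in group if j < i)
    later = range(i + 1, n)

    # Upper bound: never exceed the remaining budget or the group cap.
    hi = min(remaining, group_left) if i in group else remaining

    # Lower bound: entity i must take whatever later entities cannot absorb.
    if not later:
        lo = remaining  # the last entity takes the rest
    elif i in group or any(j not in group for j in later):
        lo = 0.0
    else:
        lo = max(0.0, remaining - group_left)  # all later entities are capped

    return lo, max(lo, hi)  # max() guards against floating-point rounding


def sample_allocation(n, group, cap, rng=random):
    """Sample one allocation entity by entity, each within its feasible range."""
    partial = []
    for i in range(n):
        lo, hi = feasible_interval(i, partial, n, group, cap)
        # Placeholder for a learned conditional policy (assumption).
        partial.append(rng.uniform(lo, hi))
    return partial


# Five entities; entities 0 and 1 form a sector capped at 30%.
allocation = sample_allocation(n=5, group={0, 1}, cap=0.3, rng=random.Random(0))
```

Because each entity is sampled only from the interval that leaves a valid completion, every sampled allocation in this sketch satisfies the constraints by construction. General linear constraints would need a more general way to compute these intervals than the closed-form bounds used here.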

Code & Models

Repositories

niklasdbs/paspo
PyTorch · Official

Videos

Autoregressive Policy Optimization for Constrained Allocation Tasks · SlidesLive

Taxonomy

Topics: Auction Theory and Applications

Methods: Sparse Evolutionary Training