Covariance-Aware Transformers for Quadratic Programming and Decision Making
Kutay Tire, Yufan Zhang, Ege Onur Taga, Samet Oymak

TL;DR
This paper shows that transformer models given explicit covariance information can directly solve quadratic programs and improve decision-making tasks such as portfolio optimization, outperforming classical predict-then-optimize approaches.
Contribution
It introduces a covariance-aware transformer framework that can solve various quadratic programs and improves decision-making models by explicitly incorporating second-order statistics.
Findings
Transformers can solve unconstrained quadratic programs via linear attention.
The proposed method outperforms classical predict-then-optimize approaches in portfolio optimization.
Explicit covariance integration enhances transformer performance in decision-making tasks.
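To make the first finding concrete, here is a minimal sketch of the iteration that, per the abstract, linear attention emulates for unconstrained QPs: plain gradient descent. It is not the paper's attention construction. The objective convention 0.5 * x^T A x - b^T x, the function name gd_unconstrained_qp, the 1/L step size, and the random test problem are all assumptions made for illustration. The gradient is assembled one row of A at a time, loosely mirroring the row-by-row tokenization the abstract describes.

```python
import numpy as np

def gd_unconstrained_qp(A, b, steps=500):
    """Gradient descent on 0.5 * x^T A x - b^T x (A symmetric positive definite).

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Step size 1/L, where L is the largest eigenvalue of A
    # (eigvalsh returns eigenvalues in ascending order).
    L = np.linalg.eigvalsh(A)[-1]
    x = np.zeros_like(b, dtype=float)
    for _ in range(steps):
        # Row i of A together with b_i yields the i-th gradient entry a_i . x - b_i.
        grad = np.array([row @ x - b_i for row, b_i in zip(A, b)])
        x = x - grad / L
    return x

# Small well-conditioned test problem (assumed data).
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5 * np.eye(5)
b = rng.standard_normal(5)

x_gd = gd_unconstrained_qp(A, b)
x_star = np.linalg.solve(A, b)  # closed-form minimizer: A x = b
```

Comparing x_gd with x_star shows the iterates approaching the closed-form solution of A x = b.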
Abstract
We explore the use of transformers for solving quadratic programs and how this capability benefits decision-making problems that involve covariance matrices. We first show that the linear attention mechanism can provably solve unconstrained QPs by tokenizing the matrix variables (e.g., the matrix of the quadratic objective) row-by-row and emulating gradient descent iterations. Furthermore, by incorporating MLPs, a transformer block can solve (i) $\ell_1$-penalized QPs by emulating iterative soft-thresholding and (ii) $\ell_1$-constrained QPs when equipped with an additional feedback loop. Our theory motivates us to introduce Time2Decide: a generic method that enhances a time series foundation model (TSFM) by explicitly feeding the covariance matrix between the variates. We empirically find that Time2Decide uniformly outperforms the base TSFM model for the classical…
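The iterative soft-thresholding mentioned in the abstract can be sketched with the standard ISTA update for a penalized QP. This is a generic textbook version, not the paper's transformer construction. The objective 0.5 * x^T A x - b^T x + lam * ||x||_1, the names soft_threshold and ista_l1_qp, and the 1/L step size are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(z, tau):
    """Elementwise soft-thresholding: shrink each entry toward zero by tau."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def ista_l1_qp(A, b, lam, steps=2000):
    """ISTA for 0.5 * x^T A x - b^T x + lam * ||x||_1 (A symmetric PSD).

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Step size 1/L, with L the largest eigenvalue of A.
    L = np.linalg.eigvalsh(A)[-1]
    x = np.zeros_like(b, dtype=float)
    for _ in range(steps):
        # Gradient step on the smooth quadratic part, then shrinkage for the penalty.
        x = soft_threshold(x - (A @ x - b) / L, lam / L)
    return x

# Sanity check with A = I, where the minimizer is soft_threshold(b, lam).
b_demo = np.array([2.0, -0.5, 1.0])
x_demo = ista_l1_qp(np.eye(3), b_demo, lam=1.0)
```

With an identity matrix the problem separates by coordinate, so the result can be checked against a single soft-thresholding of b_demo. Entries of b_demo with magnitude at most lam are set to zero, which is the sparsity effect of the penalty.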
Taxonomy
Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Risk and Portfolio Optimization
