Imitation Learning for Combinatorial Optimisation under Uncertainty
Prakash Gawas, Antoine Legrain, Louis-Martin Rousseau

TL;DR
This paper develops a taxonomy of experts in imitation learning for combinatorial optimisation under uncertainty and proposes a flexible Dataset Aggregation (DAgger) framework, evaluated on a dynamic assignment problem.
Contribution
The paper introduces a systematic taxonomy of expert types and a generalised DAgger framework for improved imitation learning in combinatorial problems under uncertainty.
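For readers unfamiliar with DAgger, the loop it builds on can be sketched as follows. This is a minimal illustration of the standard Dataset Aggregation idea on a toy problem, not the generalised framework the paper proposes. The `expert`, `step`, and `fit` functions, the tabular learner, and the `beta` schedule are all hypothetical stand-ins chosen for brevity.

```python
import random
from collections import Counter, defaultdict


def expert(state):
    # Hypothetical expert: returns the action to imitate in a given state.
    return state % 2


def step(state, action, rng):
    # Toy dynamics (assumed): the action plus a random shock gives the next state.
    return (state + action + rng.randint(0, 2)) % 10


def fit(dataset):
    # "Train" a tabular policy: majority expert label per visited state.
    votes = defaultdict(Counter)
    for s, a in dataset:
        votes[s][a] += 1
    table = {s: c.most_common(1)[0][0] for s, c in votes.items()}
    return lambda s: table.get(s, 0)  # default action 0 for unseen states


def dagger(iterations=5, horizon=20, seed=0):
    rng = random.Random(seed)
    dataset = []
    policy = lambda s: 0  # initial learner policy
    for i in range(iterations):
        # Probability of executing the expert's action during the rollout.
        beta = 1.0 if i == 0 else 0.5 ** i
        state = 0
        for _ in range(horizon):
            label = expert(state)  # the expert labels every visited state
            dataset.append((state, label))
            action = label if rng.random() < beta else policy(state)
            state = step(state, action, rng)
        policy = fit(dataset)  # retrain on the aggregated dataset
    return policy, dataset


policy, dataset = dagger()
```

The key design point is that states are collected under a mixture of the expert's and the learner's actions, while labels always come from the expert, so the learner is trained on the states its own policy tends to reach.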
Findings
Stochastic experts lead to better policies than deterministic or full-information experts.
Interactive learning enhances solution quality with fewer demonstrations.
Aggregated deterministic experts are effective when stochastic optimization is computationally hard.
Abstract
Imitation learning (IL) provides a data-driven framework for approximating policies for large-scale combinatorial optimisation problems formulated as sequential decision problems (SDPs), where exact solution methods are computationally intractable. A central but underexplored aspect of IL in this context is the role of the expert that generates training demonstrations. Existing studies employ a wide range of expert constructions, yet lack a unifying framework to characterise their modelling assumptions, computational properties, and impact on learning performance. This paper introduces a systematic taxonomy of experts for imitation learning in combinatorial optimisation under uncertainty. The literature is classified along three principal dimensions: (i) treatment of uncertainty; (ii) level of optimality, distinguishing task-optimal and approximate experts; and (iii) interaction…
