TL;DR
This paper introduces deep learning methods, including neural networks and sampling techniques, to efficiently solve bilevel programs with binary decision variables, a class traditionally difficult due to computational complexity.
Contribution
The paper develops a novel neural network-based approach and an enhanced sampling method for solving bilevel programs with binary variables, extending the applicability of deep learning to discrete bilevel optimization.
Findings
Neural networks can approximate the lower-level value function effectively.
Input supermodular neural networks show superior representational capacity.
The proposed methods outperform traditional approaches in numerical experiments.
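For readers unfamiliar with the property named in the second finding: a function f on binary vectors is supermodular when f(x ∨ y) + f(x ∧ y) ≥ f(x) + f(y) for all x, y in {0,1}^n, where ∨ and ∧ are the elementwise maximum and minimum. The sketch below is a brute-force check of that inequality on small inputs. It is only an illustration of the definition; the function names are hypothetical and it does not reproduce the paper's input supermodular neural network.

```python
from itertools import product


def is_supermodular(f, n, tol=1e-9):
    """Brute-force check of f(x|y) + f(x&y) >= f(x) + f(y) over {0,1}^n.

    Enumerates all pairs, so it is only practical for small n.
    """
    points = list(product((0, 1), repeat=n))
    for x in points:
        for y in points:
            join = tuple(a | b for a, b in zip(x, y))  # elementwise max
            meet = tuple(a & b for a, b in zip(x, y))  # elementwise min
            if f(join) + f(meet) < f(x) + f(y) - tol:
                return False
    return True


def pairwise_product(x):
    # Illustrative example: sum of x_i * x_j over i < j, which is supermodular.
    return sum(x[i] * x[j] for i in range(len(x)) for j in range(i + 1, len(x)))


def coverage(x):
    # Illustrative counterexample: min(1, sum(x)) is submodular, not supermodular.
    return min(1, sum(x))
```

The check costs 4^n function evaluations, so it serves as a sanity test on toy dimensions rather than a verification tool for trained networks.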
Abstract
Bilevel programs (BPs) find a wide range of applications in fields such as energy, transportation, and machine learning. As compared to BPs with continuous (linear/convex) optimization problems in both levels, the BPs with discrete decision variables have received much less attention, largely due to the ensuing computational intractability and the incapability of gradient-based algorithms for handling discrete optimization formulations. In this paper, we develop deep learning techniques to address this challenge. Specifically, we consider a BP with binary tender, wherein the upper and lower levels are linked via binary variables. We train a neural network to approximate the optimal value of the lower-level problem, as a function of the binary tender. Then, we obtain a single-level reformulation of the BP through a mixed-integer representation of the value function. Furthermore, we…
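The abstract excerpt does not say how the trained network is turned into a mixed-integer representation. One standard way to do this, assuming ReLU activations and known bounds lower ≤ z ≤ upper on each pre-activation (both assumptions, not details taken from the paper), is the big-M encoding: y ≥ 0, y ≥ z, y ≤ z − lower·(1 − s), y ≤ upper·s, with s binary. The sketch below checks on scalar inputs that these constraints force y = max(z, 0). All names are hypothetical.

```python
def relu_bigm_feasible(z, y, s, lower, upper, tol=1e-9):
    """Return True if (y, s) satisfies the big-M constraints for y = max(z, 0).

    Assumes lower <= z <= upper with lower <= 0 <= upper.
    """
    return (
        s in (0, 1)
        and y >= -tol                       # y >= 0
        and y >= z - tol                    # y >= z
        and y <= z - lower * (1 - s) + tol  # binds to y <= z when s = 1
        and y <= upper * s + tol            # binds to y <= 0 when s = 0
    )


def relu_via_bigm(z, lower, upper):
    """Try both values of the binary s and return the y the constraints allow.

    With s = 1 the constraints pin y to z; with s = 0 they pin y to 0.
    """
    for s in (0, 1):
        candidate = z if s == 1 else 0.0
        if relu_bigm_feasible(z, candidate, s, lower, upper):
            return candidate
    raise ValueError("z lies outside [lower, upper]")
```

In a full reformulation, constraints of this kind would be written once per hidden unit and handed to a mixed-integer solver together with the upper-level problem; the quality of the bounds lower and upper affects how tight the resulting relaxation is.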
Peer Reviews
Decision: ICLR 2024 poster
This paper makes a good attempt at incorporating ML (especially neural network) methods to facilitate the solution of traditional mathematical optimization problems, which in my opinion is an area that deserves more attention from the community. Overall, this paper is clearly written and easy to understand, and the theoretical results in Section 3 are very neat. I also find it very impressive that I cannot find any typo throughout this paper and the propositions also appear to be of their ow
Main concern:
1. The lack of an ablation study, especially on the enhanced sampling part. For instance, why do you want to solve the quadratic programming problem (5) to get the samples? I understand that the matrix Q is chosen to be PSD for polynomial-time solvability, but what is the main reason for solving the quadratic program in the first place? If the enhanced sampling were replaced with a more naive sampling method, how would it affect the experimental results?
2. Limitation of the exp
1. This work proposes an approximation-based method for bilevel programs (BPs) with discrete decision variables, which is interesting.
2. An input supermodular neural network (ISNN) is proposed, which ensures a supermodular mapping from input to output.
3. An enhanced sampling method is proposed for solving high-dimensional BPs.
1. The authors should conduct some complexity analysis, such as time complexity [1], to show the effectiveness of the proposed method.
2. This work employs neural networks to learn and approximate the value function $\phi(x)$. However, training the neural networks is more computationally complex than directly approximating the lower-level optimization problem [2]. What is the advantage of the proposed method over the polyhedral approximation in [2]?
3. Since one of the key contributions in this
- The authors propose a machine learning-based algorithm for solving bilevel programs with binary variables in the outer-level problem.
- The authors are able to provide some theoretical analysis for the proposed neural networks with binary inputs.
- The paper is easy to follow.
1) The numerical experiments lack sufficient evidence to demonstrate the advantages of the proposed algorithm. It appears that the algorithm can only achieve optimality for test instances with n = 10. The authors compare with the baseline method MiBS only on computational time, without assessing solution quality, such as the optimality gap.
2) The test instances are only the authors’ synthetically generated ones. Some public datasets like MIBLP-XU and IBLP-FIS in Taherneja