Differentiable Knapsack and Top-k Operators via Dynamic Programming
Germain Vivier-Ardisson, Michaël E. Sander, Axel Parmentier, Mathieu Blondel

TL;DR
The paper casts knapsack and top-k operators as dynamic programs and smooths the underlying recursions, yielding differentiable relaxations that can be integrated into neural networks.
Contribution
Knapsack and top-k operators are made differentiable by smoothing their dynamic programming recursions. The paper provides efficient parallel algorithms for the forward and backward passes and a theoretical characterization of the regularizers.
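As a rough illustration of what smoothing a recursion can look like, consider the textbook 0/1 knapsack value recursion and its log-sum-exp relaxation. The notation here (values \theta_i, integer weights w_i, capacity c, temperature \gamma > 0) is assumed for this sketch and is not taken from the paper; log-sum-exp is the smoothed max associated with Shannon entropy regularization.

```latex
% Hard recursion: best value using the first i items within capacity c
V_i(c) = \max\bigl( V_{i-1}(c),\; \theta_i + V_{i-1}(c - w_i) \bigr), \qquad w_i \le c

% Smoothed max (entropy-regularized), recovering \max as \gamma \to 0
\max\nolimits_\gamma(a, b) = \gamma \log\bigl( e^{a/\gamma} + e^{b/\gamma} \bigr)

% Smoothed recursion: differentiable in \theta
V_i^\gamma(c) = \max\nolimits_\gamma\bigl( V_{i-1}^\gamma(c),\; \theta_i + V_{i-1}^\gamma(c - w_i) \bigr)
```

Because the smoothed value is differentiable in \theta, its gradient can serve as a relaxed selection vector in place of the piecewise-constant hard solution.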
Findings
Differentiable relaxations of knapsack and top-k operators, obtained by smoothing dynamic programming recursions.
Proof that Shannon entropy is the unique regularizer yielding permutation-equivariant operators.
Applications to decision-focused learning, RL, and discrete VAEs.
Abstract
Knapsack and Top-k operators are useful for selecting discrete subsets of variables. However, their integration into neural networks is challenging as they are piecewise constant, yielding gradients that are zero almost everywhere. In this paper, we propose a unified framework casting these operators as dynamic programs, and derive differentiable relaxations by smoothing the underlying recursions. On the algorithmic side, we develop efficient parallel algorithms supporting both deterministic and stochastic forward passes, and vector-Jacobian products for the backward pass. On the theoretical side, we prove that Shannon entropy is the unique regularization choice yielding permutation-equivariant operators, and characterize regularizers inducing sparse selections. Finally, on the experimental side, we demonstrate our framework on a decision-focused learning benchmark, a constrained…
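The idea of smoothing a top-k recursion can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' algorithm: it uses a sequential (not parallel) forward-backward pass, replaces max with log-sum-exp (the Shannon entropy case), and the names `soft_topk`, `logaddexp`, and `temperature` are hypothetical.

```python
import math

NEG_INF = -math.inf


def logaddexp(a, b):
    """Numerically stable log(exp(a) + exp(b)), allowing -inf inputs."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def soft_topk(theta, k, temperature=1.0):
    """Entropy-smoothed top-k (illustrative sketch, assumes 0 <= k <= len(theta)).

    Returns (smoothed value, relaxed selection vector). The selection
    vector holds the marginal probability that each item belongs to a
    size-k subset drawn with probability proportional to
    exp(sum of selected scores / temperature).
    """
    n = len(theta)
    s = [t / temperature for t in theta]

    # fwd[i][j]: log-sum-exp over subsets of size j among the first i items.
    fwd = [[NEG_INF] * (k + 1) for _ in range(n + 1)]
    fwd[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(min(i, k) + 1):
            skip = fwd[i - 1][j]
            take = fwd[i - 1][j - 1] + s[i - 1] if j > 0 else NEG_INF
            fwd[i][j] = logaddexp(skip, take)  # smoothed max(skip, take)

    # bwd[i][j]: same quantity over items i, ..., n-1.
    bwd = [[NEG_INF] * (k + 1) for _ in range(n + 1)]
    bwd[n][0] = 0.0
    for i in range(n - 1, -1, -1):
        for j in range(min(n - i, k) + 1):
            skip = bwd[i + 1][j]
            take = bwd[i + 1][j - 1] + s[i] if j > 0 else NEG_INF
            bwd[i][j] = logaddexp(skip, take)

    log_z = fwd[n][k]
    probs = []
    for i in range(n):
        # Item i is selected: j items come before it, k - 1 - j after it.
        acc = NEG_INF
        for j in range(k):
            acc = logaddexp(acc, fwd[i][j] + bwd[i + 1][k - 1 - j])
        probs.append(math.exp(acc + s[i] - log_z))
    return temperature * log_z, probs


value, selection = soft_topk([2.0, 1.0, 0.5, -1.0], k=2)
```

The entries of `selection` lie in [0, 1] and sum to k. As `temperature` approaches zero they approach the hard top-k indicator vector, and permuting the input scores permutes the output in the same way.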
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Advanced Graph Neural Networks · Complexity and Algorithms in Graphs
