Differentiable Combinatorial Losses through Generalized Gradients of Linear Programs

Xi Gao; Han Zhang; Aliakbar Panahi; Tom Arodz

arXiv:1910.08211 · cs.LG · October 5, 2020

TL;DR

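This paper introduces a method for differentiating through combinatorial optimization problems that can be expressed as linear programs, such as dynamic programming-based sequence alignment. This enables end-to-end training with structured objectives, with applications to sequence-to-sequence modeling and weakly supervised image classification.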

Contribution

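The paper shows how to perform gradient descent over combinatorial optimization algorithms that involve continuous parameters (for example, edge weights) and can be efficiently expressed as linear programs, narrowing the gap between the objective optimized during training and the model's goal during inference.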

Findings

01. Effective sequence-to-sequence training with differentiable alignment.

02. Improved weakly supervised image classification results.

03. Demonstrated efficiency of gradient-based optimization over combinatorial problems.

Abstract

When samples have internal structure, we often see a mismatch between the objective optimized during training and the model's goal during inference. For example, in sequence-to-sequence modeling we are interested in high-quality translated sentences, but training typically uses maximum likelihood at the word level. The natural training-time loss would involve a combinatorial problem -- dynamic programming-based global sequence alignment -- but solutions to combinatorial problems are not differentiable with respect to their input parameters, so surrogate, differentiable losses are used instead. Here, we show how to perform gradient descent over combinatorial optimization algorithms that involve continuous parameters, for example edge weights, and can be efficiently expressed as linear programs. We demonstrate usefulness of gradient descent over combinatorial optimization in…
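To make the idea of gradient descent over a combinatorial algorithm with continuous edge weights concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation. It uses a toy shortest-path problem on a small DAG, which can be written as a linear program min w^T x over path-indicator vectors x. The optimal value is a piecewise-linear function of the weights w, and the indicator vector of an optimal path serves as a generalized gradient of that value with respect to w. The function name `shortest_path_value_and_grad`, the graph, and the weights are all hypothetical.

```python
# Toy illustration (hypothetical names and data, not the paper's code):
# the optimal value of a shortest-path problem and a generalized gradient
# of that value with respect to the edge weights.

def shortest_path_value_and_grad(n_nodes, edges, weights):
    """Shortest path from node 0 to node n_nodes - 1 in a DAG.

    Assumes every edge (u, v) satisfies u < v (nodes are topologically
    ordered) and that the last node is reachable from node 0.
    Returns (optimal value, gradient), where gradient[i] is 1.0 if edge i
    lies on the optimal path found and 0.0 otherwise.
    """
    inf = float("inf")
    dist = [inf] * n_nodes
    back = [None] * n_nodes  # index of the edge used to reach each node
    dist[0] = 0.0

    # Relax edges in order of their source node, so dist[u] is final
    # before any edge leaving u is processed.
    for idx in sorted(range(len(edges)), key=lambda i: edges[i][0]):
        u, v = edges[idx]
        if dist[u] + weights[idx] < dist[v]:
            dist[v] = dist[u] + weights[idx]
            back[v] = idx

    # Backtrack from the target to mark the edges on the optimal path.
    grad = [0.0] * len(edges)
    node = n_nodes - 1
    while node != 0:
        idx = back[node]
        grad[idx] = 1.0
        node = edges[idx][0]
    return dist[n_nodes - 1], grad


edges = [(0, 1), (0, 2), (1, 3), (2, 3), (1, 2)]
weights = [1.0, 4.0, 5.0, 1.0, 1.0]
value, grad = shortest_path_value_and_grad(4, edges, weights)

# Finite-difference check on one edge of the optimal path.
eps = 1e-3
bumped = list(weights)
bumped[3] += eps
value_bumped, _ = shortest_path_value_and_grad(4, edges, bumped)
fd_estimate = (value_bumped - value) / eps
```

In this toy case the optimal path is 0 → 1 → 2 → 3, so only the three edges on that path receive a nonzero gradient entry. When the weights are themselves outputs of a differentiable model, such a vector could be passed backward through the chain rule. At weight settings where several paths tie, the gradient is not unique, which is why a generalized gradient is the appropriate notion.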

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning

Methods: Softmax