Efficient Learning of Discrete-Continuous Computation Graphs

David Friede; Mathias Niepert

arXiv:2307.14193 · cs.LG · July 27, 2023 · 1 citation

TL;DR

This paper analyzes complex stochastic computation graphs with multiple discrete components, identifies training challenges, and proposes strategies like Gumbel noise scaling and dropout residuals to enable effective training and better generalization.

Contribution

It introduces novel training strategies for complex discrete-continuous models, overcoming gradient issues and enabling training of more intricate stochastic computation graphs.

Findings

01 Proposed Gumbel noise scaling improves training stability.

02 Dropout residual connections enhance optimization of discrete-continuous models.

03 Complex models outperform simpler counterparts on benchmark datasets.
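
This page does not describe how a dropout residual connection works, so the following is only one plausible reading, written as a minimal sketch: a skip connection that bypasses a discrete layer and is randomly dropped as a whole during training. The function name `dropout_residual`, its parameters, and the choice to always drop the skip path at evaluation time are assumptions made for illustration, not the authors' implementation.

```python
import random


def dropout_residual(x, discrete_out, drop_prob, training=True, rng=None):
    """Hypothetical skip connection around a discrete layer.

    x            -- input to the discrete layer (list of floats)
    discrete_out -- output of the discrete layer (same length as x)
    drop_prob    -- probability of dropping the whole skip path in training
    """
    rng = rng or random.Random()
    # Assumption: outside training the skip path is always removed, so the
    # downstream layers only ever see the discrete output.
    if not training or rng.random() < drop_prob:
        return list(discrete_out)
    # Skip path kept: add the continuous input to the discrete output.
    return [d + xi for d, xi in zip(discrete_out, x)]
```

With `drop_prob=0.0` the layer behaves like an ordinary residual connection during training, and with `drop_prob=1.0` it reduces to the purely discrete path. Whether and how the drop probability changes over training is not stated here.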

Abstract

Numerous models for supervised and reinforcement learning benefit from combinations of discrete and continuous model components. End-to-end learnable discrete-continuous models are compositional, tend to generalize better, and are more interpretable. A popular approach to building discrete-continuous computation graphs is that of integrating discrete probability distributions into neural networks using stochastic softmax tricks. Prior work has mainly focused on computation graphs with a single discrete component on each of the graph's execution paths. We analyze the behavior of more complex stochastic computation graphs with multiple sequential discrete components. We show that it is challenging to optimize the parameters of these models, mainly due to small gradients and local minima. We then propose two new strategies to overcome these challenges. First, we show that increasing the…
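
The Gumbel-softmax relaxation is a widely used example of the stochastic softmax tricks the abstract mentions. The sketch below, in plain Python, perturbs the logits with Gumbel noise and applies a temperature-controlled softmax. The `noise_scale` parameter is a hypothetical knob added here to illustrate what scaling the Gumbel noise could look like; how the paper sets or schedules such a scale is not stated on this page.

```python
import math
import random


def sample_gumbel(rng):
    # Standard Gumbel(0, 1) sample via inverse transform: -log(-log(U)).
    u = rng.random()
    # Clamp away from 0 and 1 so both logarithms stay finite.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def softmax(xs):
    # Subtract the maximum for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def gumbel_softmax(logits, tau=1.0, noise_scale=1.0, rng=None):
    """Relaxed categorical sample.

    tau         -- softmax temperature (lower values give sharper outputs)
    noise_scale -- hypothetical multiplier on the Gumbel noise; 1.0 is the
                   standard Gumbel-softmax, 0.0 removes the noise entirely
    """
    rng = rng or random.Random()
    perturbed = [(l + noise_scale * sample_gumbel(rng)) / tau for l in logits]
    return softmax(perturbed)
```

Calling `gumbel_softmax([1.0, 2.0, 3.0], tau=0.5, noise_scale=2.0)` returns a probability vector that sums to one. Setting `noise_scale=0.0` makes the output deterministic and equal to `softmax` of the logits divided by `tau`.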

Code & Models

Repositories

nec-research/dccg (PyTorch, official)

Videos

Efficient Learning of Discrete-Continuous Computation Graphs · SlidesLive

Taxonomy

Topics: Machine Learning and Algorithms · Advanced Graph Neural Networks · Face and Expression Recognition

Methods: Dropout · Softmax