How Could Polyhedral Theory Harness Deep Learning?

Thiago Serra; Christian Tjandraatmadja; Srikumar Ramalingam

arXiv:1806.06365·math.OC·June 19, 2018

Open Access

TL;DR

This paper explores how polyhedral theory and mixed-integer representability could provide an analytical framework for designing optimal deep learning architectures, moving beyond empirical methods.

Contribution

It proposes using polyhedral theory and mixed-integer representability to guide neural network architecture design analytically, as an alternative to empirical tuning.

Findings

01. Identifies the potential of polyhedral theory in neural architecture design
02. Suggests analytical methods as alternatives to empirical tuning
03. Highlights promising research directions in deep learning theory

Abstract

The holy grail of deep learning is to come up with an automatic method to design optimal architectures for different applications. In other words, how can we effectively dimension and organize neurons along the network layers based on the computational resources, input size, and amount of training data? We outline promising research directions based on polyhedral theory and mixed-integer representability that may offer an analytical approach to this question, in contrast to the empirical techniques often employed.

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and Data Classification · Advanced Neural Network Applications