Conditional Information Gain Networks

Ufuk Can Biçici; Cem Keskin; Lale Akarun

arXiv:1807.09534·cs.CV·July 27, 2018

TL;DR

Conditional Information Gain Networks let a feed-forward deep neural network execute conditionally: decision mechanisms inserted in the architecture route each sample so that parts of the model are skipped, which reduces computation. The decision mechanisms are trained with a differentiable information gain cost.

Contribution

The paper introduces a conditional computation framework in which the routing decisions are trained end-to-end using cost functions based on differentiable information gain, inspired by decision tree training.

Findings

01. Achieves comparable or better accuracy than standard CNNs on MNIST and Fashion MNIST.

02. Uses significantly fewer parameters than those standard CNNs while maintaining performance.

03. Demonstrates effective conditional execution in deep neural networks.

Abstract

Deep neural network models owe their representational power to their large number of learnable parameters. It is often infeasible to run these heavily parameterized deep models in resource-limited environments such as mobile phones. Network models employing conditional computing are able to reduce computational requirements while achieving high representational power, with their ability to model hierarchies. We propose Conditional Information Gain Networks, which allow feed-forward deep neural networks to execute conditionally, skipping parts of the model based on the sample and the decision mechanisms inserted in the architecture. These decision mechanisms are trained using cost functions based on differentiable Information Gain, inspired by the training procedures of decision trees. These information-gain-based decision mechanisms are differentiable and can be trained end-to-end using a…
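
To make the idea of an information gain cost over routing decisions concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes that information gain is taken as the mutual information between the class label and the selected branch, and that the joint distribution is estimated over a batch from soft (softmax) routing probabilities; the paper's exact cost function may differ. The names `entropy`, `softmax`, and `information_gain` are hypothetical helpers introduced only for this illustration.

```python
import math


def entropy(probs):
    """Shannon entropy in nats of a discrete distribution; zero entries are skipped."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def information_gain(routing_scores, labels, num_classes):
    """Estimate I(class; branch) for one batch (assumed definition, see lead-in).

    routing_scores: one list of raw branch scores per sample
    labels: one integer class label per sample
    """
    n = len(labels)
    num_branches = len(routing_scores[0])
    # Soft joint distribution p(class, branch), averaged over the batch.
    joint = [[0.0] * num_branches for _ in range(num_classes)]
    for scores, y in zip(routing_scores, labels):
        for b, p in enumerate(softmax(scores)):
            joint[y][b] += p / n
    p_class = [sum(row) for row in joint]
    p_branch = [sum(joint[c][b] for c in range(num_classes))
                for b in range(num_branches)]
    flat = [p for row in joint for p in row]
    # I(C; B) = H(C) + H(B) - H(C, B)
    return entropy(p_class) + entropy(p_branch) - entropy(flat)


# Toy batch: two classes, two branches (made-up scores for illustration).
toy_labels = [0, 0, 1, 1]
informative = information_gain(
    [[10.0, -10.0], [10.0, -10.0], [-10.0, 10.0], [-10.0, 10.0]], toy_labels, 2)
uninformative = information_gain([[0.0, 0.0]] * 4, toy_labels, 2)
```

In this toy batch, routing that sends each class to its own branch yields a gain close to ln 2, while uniform routing yields a gain of essentially zero. Because the estimate is built only from smooth operations on the routing scores, the same expression written in an automatic differentiation framework could be negated and minimized as a loss, which is one plausible reading of "differentiable information gain".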
