Making Sense of Hidden Layer Information in Deep Networks by Learning Hierarchical Targets

Abhinav Tushar

arXiv:1505.00384·cs.NE·September 27, 2016

TL;DR

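This paper introduces a deep network architecture in which branches off the hidden layers learn targets lower in the hierarchy than the final layer's targets. The branches enforce useful information in the hidden layers, which improves accuracy and allows inference at several levels of detail. The approach is demonstrated on a text classification task.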

Contribution

It presents a novel deep network design with hidden layer branches for hierarchical target learning, enhancing accuracy and modularity over traditional models.

Findings

1. Improved accuracy on the 20 Newsgroups dataset.

2. Effective enforcement of hierarchical information in hidden layers.

3. Flexible inference with multi-level targets.

Abstract

This paper proposes an architecture for deep neural networks with hidden layer branches that learn targets at a lower level of the hierarchy than the final layer targets. The branches provide a channel for enforcing useful information in the hidden layers, which helps in attaining better accuracy for both the final layer and the hidden layers. The shared layers update their weights using the gradients of all cost functions above the branching layer. The resulting model is a flexible, modular inference system with several levels of targets, and it can be used efficiently in situations that require different levels of results depending on complexity. The paper applies the idea to a text classification task on the 20 Newsgroups data set with two levels of hierarchical targets, and compares the results with training without hidden layer branches.
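The gradient flow described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the layer sizes, the tanh activations, the softmax cross-entropy costs, the parameter names (`shared`, `branch`, `upper`, `final`), and the use of finite differences in place of backpropagation are all assumptions chosen to keep the example dependency-free. It shows that the shared layer receives gradient from both the coarse (branch) cost and the fine (final layer) cost, while the branch weights and the layers above the branch each see only their own cost.

```python
import math
import random

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def cross_entropy(logits, target):
    return -math.log(softmax(logits)[target])

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def losses(params, x, coarse_target, fine_target):
    """Return (coarse_loss, fine_loss) for a single example."""
    h1 = [math.tanh(v) for v in matvec(params["shared"], x)]  # shared hidden layer
    coarse_logits = matvec(params["branch"], h1)              # hidden layer branch
    h2 = [math.tanh(v) for v in matvec(params["upper"], h1)]  # layer above the branch
    fine_logits = matvec(params["final"], h2)                 # final layer
    return (cross_entropy(coarse_logits, coarse_target),
            cross_entropy(fine_logits, fine_target))

def numeric_grad(params, name, loss_index, x, ct, ft, eps=1e-5):
    """Central-difference gradient of one loss w.r.t. one weight matrix."""
    W = params[name]
    grad = [[0.0] * len(W[0]) for _ in W]
    for i in range(len(W)):
        for j in range(len(W[0])):
            old = W[i][j]
            W[i][j] = old + eps
            up = losses(params, x, ct, ft)[loss_index]
            W[i][j] = old - eps
            down = losses(params, x, ct, ft)[loss_index]
            W[i][j] = old  # restore the weight
            grad[i][j] = (up - down) / (2 * eps)
    return grad

def norm(G):
    return math.sqrt(sum(g * g for row in G for g in row))

rng = random.Random(0)
params = {
    "shared": rand_matrix(3, 4, rng),  # input (4) -> shared hidden (3)
    "branch": rand_matrix(2, 3, rng),  # shared hidden -> 2 coarse classes
    "upper": rand_matrix(3, 3, rng),   # shared hidden -> upper hidden (3)
    "final": rand_matrix(4, 3, rng),   # upper hidden -> 4 fine classes
}
x = [0.2, -0.1, 0.4, 0.7]              # made-up input features
coarse_target, fine_target = 1, 2      # made-up two-level labels

# For each weight matrix: (gradient norm from coarse cost, from fine cost).
grad_norms = {
    name: (norm(numeric_grad(params, name, 0, x, coarse_target, fine_target)),
           norm(numeric_grad(params, name, 1, x, coarse_target, fine_target)))
    for name in params
}
for name, (g_coarse, g_fine) in grad_norms.items():
    print(f"{name:7s} coarse-cost grad = {g_coarse:.4f}  fine-cost grad = {g_fine:.4f}")
```

In a training loop under these assumptions, the update for `shared` would use the sum of both gradients. At inference time the forward pass can stop at `coarse_logits` when only the lower-level target is needed, which is one way to read the flexibility the abstract describes.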

Taxonomy

Topics: Topic Modeling · Generative Adversarial Networks and Image Synthesis · Natural Language Processing Techniques