# Preserving Intermediate Objectives: One Simple Trick to Improve Learning   for Hierarchical Models

**Authors:** Abhilasha Ravichander, Shruti Rijhwani, Rajat Kulshreshtha, Chirag, Nagpal, Tadas Baltru\v{s}aitis, Louis-Philippe Morency

arXiv: 1706.07867 · 2017-06-27

## TL;DR

This paper introduces a simple yet effective method for improving hierarchical neural models by preserving intermediate objectives, demonstrated on speaker trait prediction with a 25% accuracy improvement.

## Contribution

The authors propose a novel algorithm that enables backpropagation through hierarchical structures, enhancing learning in models with task hierarchies.

## Key findings

- Achieved 25% relative error reduction on POM dataset.
- Demonstrated improved performance over state-of-the-art methods.
- Validated the effectiveness of preserving intermediate objectives.

## Abstract

Hierarchical models are utilized in a wide variety of problems which are characterized by task hierarchies, where predictions on smaller subtasks are useful for trying to predict a final task. Typically, neural networks are first trained for the subtasks, and the predictions of these networks are subsequently used as additional features when training a model and doing inference for a final task. In this work, we focus on improving learning for such hierarchical models and demonstrate our method on the task of speaker trait prediction. Speaker trait prediction aims to computationally identify which personality traits a speaker might be perceived to have, and has been of great interest to both the Artificial Intelligence and Social Science communities. Persuasiveness prediction in particular has been of interest, as persuasive speakers have a large amount of influence on our thoughts, opinions and beliefs. In this work, we examine how leveraging the relationship between related speaker traits in a hierarchical structure can help improve our ability to predict how persuasive a speaker is. We present a novel algorithm that allows us to backpropagate through this hierarchy. This hierarchical model achieves a 25% relative error reduction in classification accuracy over current state-of-the art methods on the publicly available POM dataset.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1706.07867/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1706.07867/full.md

## References

33 references — full list in the complete paper: https://tomesphere.com/paper/1706.07867/full.md

---
Source: https://tomesphere.com/paper/1706.07867