# Effectiveness of Hierarchical Softmax in Large Scale Classification Tasks

**Authors:** Abdul Arfat Mohammed, Venkatesh Umaashankar

arXiv: 1812.05737 · 2018-12-17

## TL;DR

This paper evaluates Hierarchical Softmax on large-scale classification tasks, comparing it with standard Softmax on the LSHTC datasets using macro F1, and finds that the performance of Hierarchical Softmax declines as the number of classes grows.

## Contribution

The study provides an empirical comparison of Softmax and Hierarchical Softmax on the LSHTC datasets, highlighting the limitations of Hierarchical Softmax when the number of classes is large.

## Key findings

- Hierarchical Softmax is cheaper to compute than Softmax when the number of possible outputs is large.
- The performance of Hierarchical Softmax, measured by macro F1, degrades as the number of classes increases.
- Standard Softmax holds up better on large class sets.
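
To make the efficiency point concrete, here is a minimal sketch of how a hierarchical softmax can score one class by walking a tree instead of normalising over every class. It assumes a complete binary tree over a power-of-two number of classes, stored in heap layout; the paper's actual tree construction is not described in this summary, so the layout, the names `hs_probability` and `node_weights`, and the random weights are all illustrative assumptions.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def flat_softmax(scores):
    """Standard softmax: normalising touches every one of the K scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hs_probability(class_id, hidden, node_weights, num_classes):
    """P(class_id | hidden) under a complete binary tree in heap layout.

    Assumed layout: internal nodes are 1..num_classes-1 and leaves are
    num_classes..2*num_classes-1. Returns (probability, nodes visited).
    """
    node = num_classes + class_id  # leaf index for this class
    prob = 1.0
    visited = 0
    while node > 1:
        parent = node // 2
        score = sum(w * h for w, h in zip(node_weights[parent], hidden))
        # A left child (even index) takes sigmoid(score); a right child
        # takes sigmoid(-score), so the two branches sum to 1.
        prob *= sigmoid(score) if node % 2 == 0 else sigmoid(-score)
        node = parent
        visited += 1
    return prob, visited


random.seed(0)
NUM_CLASSES = 8  # must be a power of two in this sketch
DIM = 4
hidden = [random.uniform(-1, 1) for _ in range(DIM)]
node_weights = {
    n: [random.uniform(-1, 1) for _ in range(DIM)] for n in range(1, NUM_CLASSES)
}

hs_probs = [
    hs_probability(c, hidden, node_weights, NUM_CLASSES)[0]
    for c in range(NUM_CLASSES)
]
path_length = hs_probability(0, hidden, node_weights, NUM_CLASSES)[1]
```

In this sketch the probability of one class needs `log2(NUM_CLASSES)` sigmoid evaluations (3 for 8 classes), whereas `flat_softmax` must exponentiate and sum all `NUM_CLASSES` scores. The per-class probabilities still sum to 1 because each internal node splits its probability mass between its two children.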

## Abstract

Typically, Softmax is used in the final layer of a neural network to get a probability distribution over output classes. The main problem with Softmax is that it is computationally expensive for large-scale datasets with a large number of possible outputs. To approximate class probabilities efficiently on such datasets, we can use Hierarchical Softmax. The LSHTC datasets, which have a large number of categories, were used to study the performance of Hierarchical Softmax. In this paper we evaluate and report the performance of normal Softmax vs. Hierarchical Softmax on the LSHTC datasets. The evaluation used macro F1 score as the performance measure. The observation was that the performance of Hierarchical Softmax degrades as the number of classes increases.
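
Since the evaluation rests on macro F1, a small sketch of that metric may help. It assumes the common convention of computing F1 per class and taking the unweighted mean (the same convention as scikit-learn's `average="macro"`); whether the paper uses exactly this variant is an assumption, and the function name `macro_f1` and the toy labels are illustrative only.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class is
        # never predicted and never present.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy example: a model that always predicts the majority class.
y_true = [0, 0, 0, 0, 1, 2]
y_pred = [0, 0, 0, 0, 0, 0]
score = macro_f1(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

In the toy example accuracy is 4/6 while macro F1 is 0.8/3, because the two rare classes each contribute an F1 of 0. This is why macro F1 is sensitive to how well a model handles the many small categories typical of datasets with very large label sets.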

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1812.05737/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1812.05737/full.md

## References

9 references — full list in the complete paper: https://tomesphere.com/paper/1812.05737/full.md

---
Source: https://tomesphere.com/paper/1812.05737