# The Effect of Downstream Classification Tasks for Evaluating Sentence Embeddings

**Authors:** Peter Potash

arXiv: 1904.02228 · 2019-05-28

## TL;DR

This paper argues that evaluating sentence embeddings on downstream classification tasks amounts to learning a low-dimensional factorization of a sentence-task label matrix, and it shows how the label distribution in that matrix affects how well such a factorization can perform.

## Contribution

The paper analyzes how properties of the sentence-task label matrix affect the evaluation of sentence embeddings on classification tasks, exposing a limitation of classification-based evaluation.

## Key findings

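- Sentences that have more labels across all possible classification tasks have a higher reconstruction loss under a low-dimensional factorization.
- Characteristics of the sentence-task label matrix affect how well a low-dimensional factorization can serve as sentence representations across a suite of classification tasks.
- The general nature of the label-count effect depends on the overall distribution of labels across all possible sentences.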

## Abstract

One popular method for quantitatively evaluating the utility of sentence embeddings involves using them in downstream language processing tasks that require sentence representations as input. One simple such task is classification, where the sentence representations are used to train and test models on several classification datasets. We argue that by evaluating sentence representations in such a manner, the goal of the representations becomes learning a low-dimensional factorization of a sentence-task label matrix. We show how characteristics of this matrix can affect the ability of a low-dimensional factorization to perform as sentence representations in a suite of classification tasks. Primarily, sentences that have more labels across all possible classification tasks have a higher reconstruction loss; however, the general nature of this effect is ultimately dependent on the overall distribution of labels across all possible sentences.
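The factorization view described in the abstract can be made concrete with a small sketch. The code below is illustrative only and is not taken from the paper: it assumes a toy binary sentence-task label matrix drawn at random, and it swaps in truncated SVD as one standard way to obtain a low-rank factorization. The function name `per_sentence_reconstruction_loss`, the matrix size, the label density, and the rank are all hypothetical choices.

```python
import numpy as np


def per_sentence_reconstruction_loss(labels: np.ndarray, rank: int) -> np.ndarray:
    """Squared reconstruction error of each row under a rank-`rank` truncated SVD."""
    u, s, vt = np.linalg.svd(labels, full_matrices=False)
    # Each row of `embeddings` plays the role of a low-dimensional sentence vector.
    embeddings = u[:, :rank] * s[:rank]
    # Multiplying by the truncated right factor gives the rank-`rank` approximation.
    reconstruction = embeddings @ vt[:rank]
    return ((labels - reconstruction) ** 2).sum(axis=1)


rng = np.random.default_rng(0)
# Assumed toy data: rows are sentences, columns are (task, label) pairs,
# and an entry of 1 means the sentence carries that label.
labels = (rng.random((200, 30)) < 0.2).astype(float)

loss = per_sentence_reconstruction_loss(labels, rank=5)
labels_per_sentence = labels.sum(axis=1)

# Relate the number of labels a sentence has to its reconstruction loss.
correlation = np.corrcoef(labels_per_sentence, loss)[0, 1]
print(f"correlation between label count and loss: {correlation:.2f}")
```

Changing the label density or replacing the uniform random draw with a skewed one is a simple way to probe how the relationship between label count and loss depends on the overall label distribution in this toy setting.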

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.02228/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1904.02228/full.md

## References

18 references — full list in the complete paper: https://tomesphere.com/paper/1904.02228/full.md

---
Source: https://tomesphere.com/paper/1904.02228