Fine-grained Analysis of Sentence Embeddings Using Auxiliary Prediction Tasks
Yossi Adi, Einat Kermany, Yonatan Belinkov, Ofer Lavi, Yoav Goldberg

TL;DR
This paper introduces a framework for analyzing sentence embeddings: classifiers are trained to predict isolated structural properties of a sentence (its length, word content, and word order) from its embedding, which shows what information each representation encodes and how different methods compare.
Contribution
The paper proposes a set of auxiliary prediction tasks for evaluating and comparing sentence embedding techniques by the structural information they capture.
Findings
Different embedding methods vary in their ability to encode sentence length, word content, and order.
Higher-dimensional vectors generally improve prediction accuracy.
The analysis exposes relative strengths and limitations of two common sentence encoding approaches: averaged word vectors and LSTM-based encoders.
Abstract
There is a lot of research interest in encoding variable length sentences into fixed length vectors, in a way that preserves the sentence meanings. Two common methods include representations based on averaging word vectors, and representations based on the hidden states of recurrent neural networks such as LSTMs. The sentence vectors are used as features for subsequent machine learning tasks or for pre-training in the context of deep learning. However, not much is known about the properties that are encoded in these sentence representations and about the language information they capture. We propose a framework that facilitates better understanding of the encoded representations. We define prediction tasks around isolated aspects of sentence structure (namely sentence length, word content, and word order), and score representations by the ability to train a classifier to solve each…
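The three prediction tasks named in the abstract can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration rather than the authors' implementation: the function names are hypothetical, the word vectors are random, the length bins are arbitrary, the feature layout (sentence vector concatenated with the candidate word vectors) is one plausible formulation, and a small logistic-regression probe stands in for whatever classifier is actually trained.

```python
import math
import random

def random_word_vectors(vocab, dim, seed=0):
    """Hypothetical stand-in for pretrained word vectors."""
    rng = random.Random(seed)
    return {w: [rng.gauss(0.0, 1.0) for _ in range(dim)] for w in vocab}

def average_embedding(sentence, vectors):
    """Sentence embedding as the mean of its word vectors."""
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    for word in sentence:
        for i, x in enumerate(vectors[word]):
            total[i] += x
    return [t / len(sentence) for t in total]

def length_example(sentence, vectors, bin_size=4):
    """Length task: predict a length bin from the sentence vector alone."""
    return average_embedding(sentence, vectors), (len(sentence) - 1) // bin_size

def content_example(sentence, word, vectors):
    """Content task: does `word` occur in the sentence? (binary label)"""
    features = average_embedding(sentence, vectors) + vectors[word]
    return features, int(word in sentence)

def order_example(sentence, i, j, vectors):
    """Order task: does the word at position i precede the word at position j?"""
    features = (average_embedding(sentence, vectors)
                + vectors[sentence[i]] + vectors[sentence[j]])
    return features, int(i < j)

def train_logistic_probe(examples, epochs=20, lr=0.1):
    """Binary logistic-regression probe trained with plain SGD (assumed stand-in)."""
    dim = len(examples[0][0])
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        for features, label in examples:
            z = bias + sum(w * x for w, x in zip(weights, features))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            error = 1.0 / (1.0 + math.exp(-z)) - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias

def accuracy(model, examples):
    """Fraction of examples whose binary label the probe predicts correctly."""
    weights, bias = model
    correct = 0
    for features, label in examples:
        z = bias + sum(w * x for w, x in zip(weights, features))
        correct += int((z > 0) == bool(label))
    return correct / len(examples)

if __name__ == "__main__":
    vocab = ["the", "cat", "sat", "on", "mat", "dog", "ran"]
    vectors = random_word_vectors(vocab, dim=8)
    sentence = ["the", "cat", "sat", "on", "mat"]
    examples = [content_example(sentence, w, vectors) for w in vocab]
    probe = train_logistic_probe(examples)
    print("content-task training accuracy:", accuracy(probe, examples))
```

In this setup, the score assigned to a representation is the held-out accuracy of the probe on each task. The toy run above reports training accuracy on a single sentence only to show the mechanics, and a linear probe on random vectors is not expected to be informative.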
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Hate Speech and Cyberbullying Detection
