Deep Clustering of Text Representations for Supervision-free Probing of Syntax
Vikram Gupta, Haoyue Shi, Kevin Gimpel, Mrinmaya Sachan

TL;DR
This paper introduces a novel unsupervised deep clustering method for interpreting text representations and inducing syntax without supervision, demonstrating its effectiveness across multiple languages and syntactic tasks.
Contribution
The work presents a joint transformation and clustering approach for high-dimensional text representations, enabling supervision-free syntax probing and induction with state-of-the-art results.
Findings
Effective unsupervised syntax induction across multiple languages
Higher layers improve unsupervised probing performance
Multilingual BERT contains significant syntactic knowledge
Abstract
We explore deep clustering of text representations for unsupervised model interpretation and induction of syntax. As these representations are high-dimensional, out-of-the-box methods like KMeans do not work well. Thus, our approach jointly transforms the representations into a lower-dimensional cluster-friendly space and clusters them. We consider two notions of syntax in this work: part-of-speech induction (POSI) and constituency labelling (CoLab). Interestingly, we find that Multilingual BERT (mBERT) contains a surprising amount of syntactic knowledge of English, possibly even as much as English BERT (EBERT). Our model can be used as a supervision-free probe, which is arguably a less-biased way of probing. We find that unsupervised probes show benefits from higher layers as compared to supervised probes. We further note that our unsupervised probe utilizes EBERT and mBERT…
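The abstract does not spell out the training objective, so the following is only a minimal sketch of one common way to "jointly transform and cluster": a DEC-style objective with a learned projection, Student's t soft assignments, and a sharpened target distribution. Everything here is an assumption for illustration, not the authors' implementation: the random matrix `x` stands in for contextual embeddings, the linear map `W` stands in for the encoder, and the sizes (768 dimensions, 16-dimensional latent space, 12 clusters) are hypothetical.

```python
import numpy as np

def soft_assign(z, centroids, alpha=1.0):
    # Student's t kernel between each latent point and each centroid,
    # normalized so every row is a distribution over clusters.
    d2 = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    # Sharpen the soft assignments: square them and divide by the
    # per-cluster mass, then renormalize each row.
    w = q ** 2 / q.sum(axis=0, keepdims=True)
    return w / w.sum(axis=1, keepdims=True)

def kl_loss(p, q):
    # KL(P || Q) summed over all points; both arguments are strictly positive.
    return float(np.sum(p * np.log(p / q)))

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 768))             # stand-in for token representations (assumption)
W = rng.normal(scale=0.01, size=(768, 16))  # hypothetical linear encoder
z = x @ W                                   # lower-dimensional latent space
centroids = z[rng.choice(200, size=12, replace=False)]  # initialize centroids from data points

q = soft_assign(z, centroids)
p = target_distribution(q)
loss = kl_loss(p, q)          # the quantity a trainer would minimize w.r.t. W and centroids
labels = q.argmax(axis=1)     # hard cluster label per token
```

In a full training loop, `W` and `centroids` would be updated together by gradient descent on `loss`, with `p` recomputed periodically. That joint update is what distinguishes this family of methods from running KMeans on fixed, high-dimensional vectors.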
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
Methods: mBERT
