No Training Required: Exploring Random Encoders for Sentence Classification
John Wieting, Douwe Kiela

TL;DR
This paper investigates how well random, untrained encoders built on pre-trained word embeddings work for sentence classification. It finds that they are strong baselines and argues for more careful evaluation protocols.
Contribution
It demonstrates that random encoders perform competitively with trained models and provides guidelines for more accurate baseline comparisons in sentence classification.
Findings
Random encoders are surprisingly effective baselines.
Modern sentence embeddings gain little over random methods.
The paper recommends improved experimental evaluation protocols for sentence classification.
Abstract
We explore various methods for computing sentence representations from pre-trained word embeddings without any training, i.e., using nothing but random parameterizations. Our aim is to put sentence embeddings on more solid footing by 1) looking at how much modern sentence embeddings gain over random methods---as it turns out, surprisingly little; and by 2) providing the field with more appropriate baselines going forward---which are, as it turns out, quite strong. We also make important observations about proper experimental protocol for sentence classification evaluation, together with recommendations for future research.
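To make "nothing but random parameterizations" concrete, here is a minimal sketch of one such encoder: each word embedding is multiplied by a fixed random matrix, and the results are pooled into a sentence vector. The abstract does not specify an architecture, so the function name `random_projection_encoder`, the uniform initialization range, the output dimension, and the pooling options are all assumptions made for illustration, not the authors' exact setup.

```python
import numpy as np

def random_projection_encoder(word_vectors, out_dim=4096, seed=0, pooling="max"):
    """Encode a sentence as a pooled random projection of its word embeddings.

    word_vectors: array of shape (num_words, emb_dim), e.g. rows looked up
    from pre-trained word embeddings. The projection matrix is sampled from
    a seeded generator and is never trained.
    """
    word_vectors = np.asarray(word_vectors, dtype=np.float64)
    emb_dim = word_vectors.shape[1]
    rng = np.random.default_rng(seed)
    # Assumed initialization: uniform in [-1/sqrt(d), 1/sqrt(d)].
    bound = 1.0 / np.sqrt(emb_dim)
    W = rng.uniform(-bound, bound, size=(emb_dim, out_dim))
    projected = word_vectors @ W  # shape: (num_words, out_dim)
    if pooling == "max":
        return projected.max(axis=0)
    if pooling == "mean":
        return projected.mean(axis=0)
    raise ValueError(f"unknown pooling: {pooling}")

# Usage with stand-in embeddings (random here; real use would load
# pre-trained vectors for each token in the sentence).
sentence = np.random.default_rng(1).normal(size=(7, 300))
features = random_projection_encoder(sentence, out_dim=512)
print(features.shape)
```

In a setup like this, only a classifier trained on top of the fixed sentence vectors would have learned parameters. The same seed must be reused for every sentence so that all inputs share one projection.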
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
