Efficient Purely Convolutional Text Encoding

Szymon Malik; Adrian Lancucki; Jan Chorowski

arXiv:1808.01160·cs.CL·August 6, 2018

TL;DR

This paper introduces a lightweight convolutional model that produces fixed-size sentence embeddings. Compared with the recursive convolutional auto-encoder it derives from, it trains faster, uses fewer parameters, and auto-encodes text more accurately, and its embeddings serve as a better alternative to bag-of-words embeddings on SentEval tasks.

Contribution

The paper presents an optimized convolutional architecture for fixed-size sentence embeddings, derived from a recursive convolutional byte-level auto-encoder, that reduces training time and parameter count while improving auto-encoding accuracy.

Findings

01

Reduced training time and parameters compared to prior models

02

Improved auto-encoding accuracy on byte-level text

03

Outperforms bag-of-words embeddings on SentEval tasks

Abstract

In this work, we focus on a lightweight convolutional architecture that creates fixed-size vector embeddings of sentences. Such representations are useful for building NLP systems, including conversational agents. Our work derives from a recently proposed recursive convolutional architecture for auto-encoding text paragraphs at the byte level. We propose alterations that significantly reduce training time and the number of parameters, and improve auto-encoding accuracy. Finally, we evaluate the representations created by our model on tasks from the SentEval benchmark suite, and show that it can serve as a better, yet fairly low-resource, alternative to popular bag-of-words embeddings.
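
To make the idea of a fixed-size, byte-level convolutional embedding concrete, here is a minimal sketch in plain Python. It is not the authors' architecture: the function names, the padded length of 64, the embedding width of 8, the random untrained weights, and the choice of a width-2, stride-2 convolution with ReLU are all assumptions made for illustration. It only shows how a layer with shared weights can be applied repeatedly, halving the sequence each time, until a sentence of any length collapses to one vector.

```python
import random


def text_to_bytes(text, length):
    """Encode text as UTF-8 byte values, truncated or zero-padded to `length`."""
    data = list(text.encode("utf-8"))[:length]
    return data + [0] * (length - len(data))


def halve(seq, weights):
    """Width-2, stride-2 convolution with ReLU: merge each adjacent pair of vectors."""
    out = []
    for i in range(0, len(seq), 2):
        pair = seq[i] + seq[i + 1]  # concatenate two vectors (length 2 * dim)
        out.append([max(0.0, sum(w * x for w, x in zip(row, pair)))
                    for row in weights])
    return out


def encode(text, length=64, dim=8, seed=0):
    """Map a sentence to a single `dim`-sized vector (hypothetical toy encoder)."""
    if length < 1 or length & (length - 1):
        raise ValueError("length must be a power of two so it can be halved to 1")
    rng = random.Random(seed)  # fixed seed: random, untrained weights
    # One embedding vector per possible byte value.
    embed = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(256)]
    scale = 1.0 / (2 * dim) ** 0.5
    weights = [[rng.uniform(-scale, scale) for _ in range(2 * dim)]
               for _ in range(dim)]
    seq = [embed[b] for b in text_to_bytes(text, length)]
    # Assumption: the same weights are reused at every level of the recursion.
    while len(seq) > 1:
        seq = halve(seq, weights)
    return seq[0]


short = encode("hi")
longer = encode("a considerably longer sentence, in bytes")
print(len(short), len(longer))  # both embeddings have the same size
```

In a trained model the weights would be learned, for example by pairing the encoder with a decoder and minimizing byte-level reconstruction error; the official PyTorch repository listed under Code & Models is the place to look for the actual layers and hyperparameters.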

Code & Models

Repositories

smalik169/recursive-convolutional-autoencoder
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Sentiment Analysis and Opinion Mining