Byte-Level Recursive Convolutional Auto-Encoder for Text

Xiang Zhang; Yann LeCun

arXiv:1802.01817 · cs.CL · February 7, 2018 · 1 citation

TL;DR

This paper introduces a byte-level recursive convolutional auto-encoder for text. The model generates text non-sequentially from a fixed-length representation and achieves better auto-encoding results than recurrent networks on datasets in three languages.

Contribution

The paper presents a deep convolutional encoder-decoder with a recursive architecture and residual connections for byte-level text auto-encoding and generation.

Findings

01. Outperforms recurrent networks in auto-encoding accuracy.

02. Works across three languages: Arabic, Chinese, and English.

03. Uses a deep multi-stage convolutional architecture with up to 160 parameterized layers.

Abstract

This article proposes to auto-encode text at byte-level using convolutional networks with a recursive architecture. The motivation is to explore whether it is possible to have scalable and homogeneous text generation at byte-level in a non-sequential fashion through the simple task of auto-encoding. We show that non-sequential text generation from a fixed-length representation is not only possible, but also achieves much better auto-encoding results than recurrent networks. The proposed model is a multi-stage deep convolutional encoder-decoder framework using residual connections, containing up to 160 parameterized layers. Each encoder or decoder contains a shared group of modules that consists of either pooling or upsampling layers, making the network recursive in terms of abstraction levels in representation. Results for 6 large-scale paragraph datasets are reported, in 3 languages…
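
The abstract's two central ideas, byte-level input and recursion over abstraction levels, can be illustrated with a minimal Python sketch. It assumes, hypothetically, that each pass through the shared group of modules halves the sequence length and that inputs are zero-padded to a power-of-two length. Neither detail is stated in the abstract, and the function names (`to_bytes`, `pad_to_power_of_two`, `recursion_levels`) are illustrative only; they are not taken from the paper or its repository.

```python
def to_bytes(text):
    """Encode text as a list of UTF-8 byte values (integers in 0..255).

    Working at byte level means the same 256-symbol input space covers
    Arabic, Chinese, and English alike.
    """
    return list(text.encode("utf-8"))


def pad_to_power_of_two(seq, fixed_len=4, pad_value=0):
    """Right-pad seq so its length is fixed_len * 2**k for some k >= 0.

    Assumption (not from the abstract): fixed_len is itself a power of
    two and padding uses the byte value 0.
    """
    target = fixed_len
    while target < len(seq):
        target *= 2
    return list(seq) + [pad_value] * (target - len(seq))


def recursion_levels(length, fixed_len=4):
    """Count how many halving steps reduce `length` to `fixed_len`.

    Assumption: each application of the shared pooling group halves
    the sequence length, so the group is reused this many times.
    """
    levels = 0
    while length > fixed_len:
        length //= 2
        levels += 1
    return levels


# Hypothetical usage: a short Chinese string, padded and measured.
sample = pad_to_power_of_two(to_bytes("中文"), fixed_len=4)
depth = recursion_levels(len(sample), fixed_len=4)
```

Under these assumptions, a padded input of 1024 bytes and a fixed length of 4 would reuse the shared group 8 times, and a decoder built the same way would apply its upsampling group the same number of times in reverse. Both numbers are placeholders chosen for the illustration.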

Code & Models

Repositories

smalik169/recursive-convolutional-autoencoder (PyTorch)

Taxonomy

Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Topic Modeling