Unsupervised Learning of Sequence Representations by Autoencoders

Wenjie Pei; David M.J. Tax

arXiv:1804.00946 · cs.CV · April 30, 2018 · 1 citation

TL;DR

This paper introduces the Integrated Sequence Autoencoder (ISA), an unsupervised model that learns fixed-length representations of variable-length sequence data by combining global and local reconstruction mechanisms, improving feature extraction for downstream tasks.

Contribution

The paper proposes a novel autoencoder architecture that integrates global and local sequence reconstruction and adds a stop feature, a temporal stamp that guides the reconstruction process, to improve representation quality.

Findings

1. The learned representation captures both apparent features and high-level features.
2. Using unlabeled data improves semi-supervised learning performance.
3. The representation can discriminate high-level style information, such as speaker identity.

Abstract

Sequence data is challenging for machine learning approaches because sequence lengths may vary between samples. In this paper, we present an unsupervised learning model for sequence data, called the Integrated Sequence Autoencoder (ISA), which learns a fixed-length vectorial representation by minimizing the reconstruction error. Specifically, we propose to integrate two classical mechanisms for sequence reconstruction, which together take into account both the global silhouette information and the local temporal dependencies. Furthermore, we propose a stop feature that serves as a temporal stamp to guide the reconstruction process, which results in a higher-quality representation. The learned representation is able to effectively summarize not only the apparent features, but also the underlying and high-level style information. Take for example a speech sequence sample: our ISA model…
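
The abstract names two ingredients that are easy to show in miniature: a stop feature acting as a temporal stamp, and a reconstruction error that training minimizes. The sketch below is illustrative only and is not the authors' implementation. The abstract does not give the exact form of the stop feature, so the sketch assumes a scalar that rises linearly to 1.0 at the last frame. The names `add_stop_feature` and `reconstruction_error` are hypothetical, and mean squared error is an assumed choice of error measure.

```python
from typing import List

# A sequence is T frames, each frame a list of D feature values.
Sequence = List[List[float]]


def add_stop_feature(seq: Sequence) -> Sequence:
    """Append a temporal stamp to every frame.

    ASSUMPTION: the stamp is (t + 1) / T, rising linearly to 1.0 at the
    final frame. The abstract only says the stop feature serves as a
    temporal stamp; this particular form is chosen for illustration.
    """
    length = len(seq)
    return [frame + [(t + 1) / length] for t, frame in enumerate(seq)]


def reconstruction_error(target: Sequence, output: Sequence) -> float:
    """Mean squared error between a sequence and its reconstruction.

    ASSUMPTION: squared error averaged over all frames and features.
    """
    if not target or len(target) != len(output):
        raise ValueError("sequences must be non-empty and of equal length")
    total = 0.0
    count = 0
    for target_frame, output_frame in zip(target, output):
        for a, b in zip(target_frame, output_frame):
            total += (a - b) ** 2
            count += 1
    return total / count


if __name__ == "__main__":
    short = add_stop_feature([[0.5], [0.2]])
    long = add_stop_feature([[0.5], [0.2], [0.1], [0.9]])
    # Both sequences end with a stamp of 1.0 despite different lengths.
    print(short[-1][-1], long[-1][-1])
    print(reconstruction_error([[1.0, 2.0]], [[0.0, 0.0]]))
```

Under this assumed form, sequences of any length end with the same stamp value, so a decoder that reconstructs the stamp alongside the features has a consistent signal for where a sequence ends.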

Taxonomy

Topics: Speech Recognition and Synthesis · Music and Audio Processing · Topic Modeling