Unsupervised Learning of Sequence Representations by Autoencoders
Wenjie Pei, David M.J. Tax

TL;DR
This paper introduces the Integrated Sequence Autoencoder (ISA), an unsupervised model that learns fixed-length representations of variable-length sequence data by combining global and local reconstruction mechanisms, improving feature extraction for downstream tasks.
Contribution
The paper proposes a novel autoencoder architecture that integrates global and local sequence reconstruction and adds a stop feature, a temporal stamp that guides the reconstruction process, to improve representation quality.
Findings
The learned representation captures both apparent features and underlying, high-level style information.
Using unlabeled data improves performance in semi-supervised learning.
The representation can discriminate high-level style information such as speaker identity.
Abstract
Sequence data is challenging for machine learning approaches, because the lengths of the sequences may vary between samples. In this paper, we present an unsupervised learning model for sequence data, called the Integrated Sequence Autoencoder (ISA), to learn a fixed-length vectorial representation by minimizing the reconstruction error. Specifically, we propose to integrate two classical mechanisms for sequence reconstruction, which together take into account both the global silhouette information and the local temporal dependencies. Furthermore, we propose a stop feature that serves as a temporal stamp to guide the reconstruction process, which results in a higher-quality representation. The learned representation is able to effectively summarize not only the apparent features, but also the underlying and high-level style information. Take for example a speech sequence sample: our ISA model…
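The ingredients named in the abstract (a fixed-length code, global and local reconstruction, a stop feature, and a reconstruction error) can be illustrated with a small sketch. This is not the authors' implementation. It assumes the stop feature is a scalar that rises linearly from 0 at the first frame to 1 at the last, that "global" reconstruction means feeding the fixed-length code to the decoder at every step, and that "local" reconstruction means feeding the previous frame at every step. The names `add_stop_feature`, `decoder_inputs`, and `reconstruction_error` are hypothetical, and the encoder and decoder networks themselves are left out.

```python
from typing import List

Frame = List[float]


def add_stop_feature(seq: List[Frame]) -> List[Frame]:
    """Append a scalar temporal stamp to every frame.

    Assumed form (not confirmed by the source): a linear ramp from 0.0
    at the first frame to 1.0 at the last frame.
    """
    n = len(seq)
    if n == 0:
        return []
    if n == 1:
        return [seq[0] + [1.0]]
    return [frame + [t / (n - 1)] for t, frame in enumerate(seq)]


def decoder_inputs(code: Frame, targets: List[Frame]) -> List[Frame]:
    """Build one decoder input per time step.

    Global part (assumption): the fixed-length code, repeated at every step.
    Local part (assumption): the previous target frame, zeros at step 0.
    """
    dim = len(targets[0])
    previous = [[0.0] * dim] + targets[:-1]
    return [code + prev for prev in previous]


def reconstruction_error(targets: List[Frame], outputs: List[Frame]) -> float:
    """Mean squared error over all frames and all features."""
    total = sum(
        (a - b) ** 2
        for target, output in zip(targets, outputs)
        for a, b in zip(target, output)
    )
    count = sum(len(target) for target in targets)
    return total / count


if __name__ == "__main__":
    sequence = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
    stamped = add_stop_feature(sequence)
    code = [0.0, 0.0, 0.0, 0.0]  # placeholder for an encoder's output
    for step_input in decoder_inputs(code, stamped):
        print(step_input)
```

In this sketch the stop feature is part of the reconstruction target, so a decoder trained against `stamped` would also have to predict how far through the sequence it is; a stamp reaching 1 could then serve as a signal to stop generating frames.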
Taxonomy
Topics: Speech Recognition and Synthesis · Music and Audio Processing · Topic Modeling
