Unsupervised Learning of Word-Sequence Representations from Scratch via Convolutional Tensor Decomposition
Furong Huang, Animashree Anandkumar

TL;DR
This paper introduces a novel unsupervised framework combining convolutional tensor decomposition and deconvolution to learn universal word-sequence embeddings from scratch, addressing challenges of context-awareness and efficiency.
Contribution
It proposes a new two-phased ConvDic+DeconvDec framework that effectively learns word-sequence dictionaries and embeddings without pre-training or external data.
Findings
More accurate and efficient than alternating minimization methods.
Embeddings are effective across multiple downstream tasks.
Framework handles varying sentence lengths without additional complexity.
Abstract
Unsupervised text embeddings extraction is crucial for text understanding in machine learning. Word2Vec and its variants have received substantial success in mapping words with similar syntactic or semantic meaning to vectors close to each other. However, extracting context-aware word-sequence embedding remains a challenging task. Training over large corpus is difficult as labels are difficult to get. More importantly, it is challenging for pre-trained models to obtain word-sequence embeddings that are universally good for all downstream tasks or for any new datasets. We propose a two-phased ConvDic+DeconvDec framework to solve the problem by combining a word-sequence dictionary learning model with a word-sequence embedding decode model. We propose a convolutional tensor decomposition mechanism to learn good word-sequence phrase dictionary in the learning phase. It is proved to be more…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTopic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
