Novel Word Embedding and Translation-based Language Modeling for Extractive Speech Summarization
Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang, Hsin-Hsi Chen

TL;DR
This paper introduces a word embedding method that combines the strengths of prediction-based and count-based approaches while making the learned representations easier to interpret, and builds a translation-based language model on it for extractive speech summarization.
Contribution
The paper proposes a new word embedding technique that merges the advantages of prediction-based and count-based methods and offers a clearer interpretation of the learned representations. It then applies the technique to develop a translation-based language model for extractive speech summarization.
Findings
Enhanced word representations improve summarization quality
The proposed methods outperform existing approaches in evaluations
Clearer interpretation of embeddings aids understanding and further development
Abstract
Word embedding methods revolve around learning continuous distributed vector representations of words with neural networks, which can capture semantic and/or syntactic cues, and in turn be used to induce similarity measures among words, sentences and documents in context. Celebrated methods can be categorized as prediction-based and count-based methods according to the training objectives and model architectures. Their pros and cons have been extensively analyzed and evaluated in recent studies, but there is relatively less work continuing the line of research to develop an enhanced learning method that brings together the advantages of the two model families. In addition, the interpretation of the learned word representations still remains somewhat opaque. Motivated by the observations and considering the pressing need, this paper presents a novel method for learning the word…
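The abstract notes that learned word vectors can be used to induce similarity measures among words and sentences. A minimal sketch of that general idea follows, assuming cosine similarity and simple vector averaging for sentences. The three-dimensional vectors in `TOY_EMBEDDINGS` are made up for illustration and the helper names are hypothetical; none of this is the paper's own embedding or summarization method.

```python
import math

# Hypothetical toy embeddings, invented for illustration only.
TOY_EMBEDDINGS = {
    "speech":  [0.9, 0.1, 0.0],
    "audio":   [0.8, 0.2, 0.1],
    "summary": [0.1, 0.9, 0.2],
    "banana":  [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention: a zero vector is similar to nothing
    return dot / (norm_u * norm_v)


def mean_vector(words, embeddings):
    """Average the vectors of the known words; zeros if none are known."""
    dim = len(next(iter(embeddings.values())))
    known = [embeddings[w] for w in words if w in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def sentence_similarity(words_a, words_b, embeddings):
    """Cosine similarity between the averaged word vectors of two sentences."""
    return cosine(mean_vector(words_a, embeddings),
                  mean_vector(words_b, embeddings))


if __name__ == "__main__":
    e = TOY_EMBEDDINGS
    print(cosine(e["speech"], e["audio"]))
    print(cosine(e["speech"], e["banana"]))
    print(sentence_similarity(["speech", "summary"], ["audio", "summary"], e))
```

In an extractive setting, a score of this kind could be used to compare each candidate sentence with the whole document, though the paper's translation-based language model is a different and more involved formulation.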
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
