Patterns versus Characters in Subword-aware Neural Language Modeling

Rustem Takhanov; Zhenisbek Assylbekov

arXiv:1709.00541 · cs.CL · September 5, 2017

TL;DR

This paper introduces pattern-based subword representations for neural language models. Because patterns capture internal word structure more effectively than single characters, the resulting models outperform character-based ones.

Contribution

The paper proposes a pattern extraction method based on a pattern-based Conditional Random Field (CRF) with $l_{1}$ regularization, and uses the selected patterns to improve word representations for language modeling.

Findings

1. Pattern-based models outperform character-based models by 2-20 perplexity points.

2. Pattern embeddings match the performance of complex character-based architectures.

3. Using patterns enhances the representation of internal word structure.

Abstract

Words in some natural languages can have a composite structure. Elements of this structure include the root (which may itself be composite) and the prefixes and suffixes that express various nuances and relations to other words. Thus, in order to build a proper word representation, one must take into account its internal structure. From a corpus of texts we extract a set of frequent subwords, and from this set we select patterns, i.e. subwords which encapsulate information on character $n$-gram regularities. The selection is made using the pattern-based Conditional Random Field model with $l_{1}$ regularization. Further, for every word we construct a new sequence over an alphabet of patterns. The new alphabet's symbols capture local statistical context more strongly than characters do, therefore they allow better representations in $\mathbb{R}^{n}$ and are better building blocks…
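To make the last two steps of the abstract concrete, here is a minimal Python sketch of collecting frequent subwords and re-encoding a word over a pattern alphabet. It is an illustration under stated assumptions, not the authors' method: a plain frequency threshold stands in for the CRF-based selection with $l_{1}$ regularization, and the greedy longest-match segmentation is assumed because the abstract does not specify how words are segmented. The function names, `max_len`, and `min_count` are hypothetical.

```python
from collections import Counter


def frequent_subwords(words, max_len=3, min_count=2):
    """Count every character n-gram (1 <= n <= max_len) and keep the frequent ones.

    Assumption: a frequency cutoff replaces the paper's CRF-based pattern selection.
    """
    counts = Counter()
    for word in words:
        for n in range(1, max_len + 1):
            for i in range(len(word) - n + 1):
                counts[word[i:i + n]] += 1
    return {s for s, c in counts.items() if c >= min_count}


def to_pattern_sequence(word, patterns):
    """Rewrite a word as a sequence of patterns.

    Assumption: greedy longest match, falling back to single characters
    when no longer pattern applies at the current position.
    """
    max_len = max((len(p) for p in patterns), default=1)
    out, i = [], 0
    while i < len(word):
        for n in range(min(max_len, len(word) - i), 0, -1):
            piece = word[i:i + n]
            if n == 1 or piece in patterns:
                out.append(piece)
                i += n
                break
    return out


corpus = ["walked", "talked", "walking", "talking"]
patterns = frequent_subwords(corpus)
segmented = to_pattern_sequence("walked", patterns)
print(segmented)  # ['wal', 'ked']
```

In a language model, each element of such a sequence would be looked up in a pattern embedding table in place of a character embedding table. How the pattern embeddings are then combined into a word vector is not stated in the abstract.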

Code & Models

Repositories

zh3nis/pat-sum (TensorFlow, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis