Introducing the Hidden Neural Markov Chain framework

Elie Azeraf; Emmanuel Monfrini; Emmanuel Vignon; Wojciech Pieczynski

arXiv:2102.11038 · cs.CL · February 23, 2021

Open Access

TL;DR

This paper introduces the Hidden Neural Markov Chain (HNMC) framework, a family of neural sequence models built on the Hidden Markov Model rather than on the RNN, and reports that these models outperform the RNN and BiRNN on sequence labeling tasks.

Contribution

The paper presents a new family of neural models based on the HMM, offering an alternative to RNNs for sequential data processing, and evaluates them on NLP sequence labeling tasks.

Findings

01. Proposed models outperform RNN and BiRNN in sequence labeling tasks.

02. Models achieve best results regardless of architecture or embedding method.

03. Potential to compete with BiLSTM and BiGRU in NLP applications.

Abstract

Neural network models now achieve state-of-the-art results in many areas, such as computer vision and speech processing. For sequential data, especially in Natural Language Processing (NLP) tasks, Recurrent Neural Networks (RNNs) and their extensions, the Long Short-Term Memory (LSTM) network and the Gated Recurrent Unit (GRU), are among the most widely used models; they process sequences "term-to-term". However, while many works propose extensions and improvements of the RNN, few have focused on developing other ways to process sequential data "term-to-term" with neural networks. This paper proposes the original Hidden Neural Markov Chain (HNMC) framework, a new family of sequential neural models. They are not based on the RNN but on the Hidden Markov Model (HMM), a probabilistic graphical model. This neural extension is possible thanks to the recent Entropic Forward-Backward…
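
For readers unfamiliar with the HMM that the abstract names as the basis of the framework, the sketch below shows the classical forward-backward computation of posterior state marginals for a discrete HMM. It is background only: it is not the Entropic Forward-Backward algorithm mentioned in the abstract, and it is not the HNMC model itself. The function name `forward_backward` and the parameter names `pi`, `A`, `B`, and `obs` are assumptions chosen for illustration.

```python
from typing import List


def forward_backward(
    pi: List[float],        # initial state distribution (assumed name)
    A: List[List[float]],   # A[i][j] = p(next state j | current state i)
    B: List[List[float]],   # B[i][k] = p(observation k | state i)
    obs: List[int],         # observed symbol indices y_1..y_T
) -> List[List[float]]:
    """Return p(x_t = i | y_1..y_T) for each t, for a classical discrete HMM."""
    n, T = len(pi), len(obs)

    # Forward pass; each step is normalized to avoid numerical underflow.
    alpha = [[0.0] * n for _ in range(T)]
    for i in range(n):
        alpha[0][i] = pi[i] * B[i][obs[0]]
    s = sum(alpha[0])
    alpha[0] = [a / s for a in alpha[0]]
    for t in range(1, T):
        for j in range(n):
            alpha[t][j] = B[j][obs[t]] * sum(
                alpha[t - 1][i] * A[i][j] for i in range(n)
            )
        s = sum(alpha[t])
        alpha[t] = [a / s for a in alpha[t]]

    # Backward pass, also normalized at each step.
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(
                A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n)
            )
        s = sum(beta[t])
        beta[t] = [b / s for b in beta[t]]

    # Posterior marginals: proportional to alpha * beta, renormalized per step.
    posteriors = []
    for t in range(T):
        g = [alpha[t][i] * beta[t][i] for i in range(n)]
        s = sum(g)
        posteriors.append([x / s for x in g])
    return posteriors
```

In a sequence labeling setting, the label predicted at position t would be the state with the highest posterior marginal at t; the toy parameters one passes in here are placeholders, not values from the paper.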

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis

Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory · Bidirectional LSTM · Bidirectional GRU