A Tree Search Algorithm for Sequence Labeling
Yadi Lao, Jun Xu, Yanyan Lan, Jiafeng Guo, Sheng Gao, Xueqi Cheng

TL;DR
This paper introduces MM-Tag, a reinforcement learning model for sequence tagging that uses Monte Carlo tree search (MCTS). It is reported to outperform existing methods such as CRF and LSTM-based models on chunking tasks.
Contribution
The paper is the first to apply an MCTS-enhanced MDP to sequence tagging, combining reinforcement learning with LSTMs to improve accuracy.
Findings
MM-Tag outperforms baselines such as CRF and LSTM-based models on chunking tasks.
MCTS improves exploration in sequence tagging.
Reinforcement learning effectively trains the model.
Abstract
In this paper we propose a novel reinforcement learning based model for sequence tagging, referred to as MM-Tag. Inspired by the success and methodology of AlphaGo Zero, MM-Tag formalizes the problem of sequence tagging with a Monte Carlo tree search (MCTS) enhanced Markov decision process (MDP) model, in which the time steps correspond to the positions of words in a sentence from left to right, and each action corresponds to assigning a tag to a word. Two long short-term memory networks (LSTMs) are used to summarize the past tag assignments and the words in the sentence. Based on the outputs of the LSTMs, the policy for guiding the tag assignment and the value for predicting the tagging accuracy of the whole sentence are produced. The policy and value are then strengthened with MCTS, which takes the produced raw policy and value as inputs, simulates and evaluates the possible tag…
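The MDP framing in the abstract (one time step per word position, one action per tag assignment) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `TaggingState`, `step`, and `tagging_accuracy` are hypothetical, the LSTM policy/value networks and the MCTS procedure are omitted, and using per-token accuracy as the end-of-sentence signal is an assumption based on the abstract's statement that the value predicts the tagging accuracy of the whole sentence.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class TaggingState:
    """Hypothetical MDP state: the sentence plus the tags assigned so far."""
    words: Tuple[str, ...]
    tags: Tuple[str, ...] = ()

    @property
    def position(self) -> int:
        # The time step equals the index of the next word to tag (left to right).
        return len(self.tags)

    def is_terminal(self) -> bool:
        # The episode ends once every word has received a tag.
        return len(self.tags) == len(self.words)

    def step(self, tag: str) -> "TaggingState":
        # An action assigns one tag to the current word and moves one position right.
        if self.is_terminal():
            raise ValueError("sentence is already fully tagged")
        return TaggingState(self.words, self.tags + (tag,))


def tagging_accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of positions whose predicted tag matches the gold tag (assumed signal)."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Usage: tag a two-word sentence left to right, then score it against gold tags.
state = TaggingState(("He", "runs"))
state = state.step("B-NP").step("B-VP")
score = tagging_accuracy(state.tags, ("B-NP", "B-VP"))
```

In a full system along the lines the abstract describes, a policy would choose the argument to `step` at each position and a search procedure such as MCTS would look ahead over future `step` calls; both are left out here to keep the sketch focused on the state and action definitions.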
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Conditional Random Field
