Improving Joint Layer RNN based Keyphrase Extraction by Using Syntactical Features
Miftahul Mahfuzh, Sidik Soleman, Ayu Purwarianti

TL;DR
This paper enhances the joint layer RNN (JRNN) for keyphrase extraction from Indonesian Twitter text by adding syntactical features and data augmentation, which improves accuracy and F1 score.
Contribution
It introduces a modified JRNN model that uses syntactical features and data augmentation for better keyphrase extraction from social media texts.
Findings
Achieved an accuracy of 0.9597 and an F1 score of 0.7691.
Outperformed baseline keyphrase extraction methods.
Effective use of syntactical features and data augmentation.
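The gap between the reported accuracy and F1 score is typical of tagging tasks where most tokens are not part of a keyphrase. The sketch below shows how the two metrics can diverge. It assumes token-level binary labels (1 = keyphrase token, 0 = other) and uses made-up labels. The paper's actual evaluation protocol is not described on this page, so `accuracy_and_f1` and the data are purely illustrative.

```python
def accuracy_and_f1(gold, pred, positive=1):
    """Token-level accuracy and F1 for the positive (keyphrase) class."""
    pairs = list(zip(gold, pred))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    correct = sum(g == p for g, p in pairs)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, f1


# Hypothetical labels: 16 non-keyphrase tokens and 4 keyphrase tokens.
gold = [0] * 16 + [1] * 4
# One false positive and one false negative.
pred = [0] * 15 + [1] + [1] * 3 + [0]

acc, f1 = accuracy_and_f1(gold, pred)
print(f"accuracy={acc:.2f}, f1={f1:.2f}")
```

Because the many correctly rejected non-keyphrase tokens count toward accuracy but not toward F1, accuracy comes out higher than F1 in this toy example.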
Abstract
Keyphrase extraction, the task of identifying important words or phrases in a text, is a crucial step in identifying main topics when analyzing text from social media platforms. In our study, we focus on text written in Indonesian and taken from Twitter. The original joint layer recurrent neural network (JRNN) outputs one sequence of keywords and uses only word embeddings. We propose modifying the input layer of the JRNN so that it extracts more than one sequence of keywords, using additional syntactical features: part of speech, named entity types, and dependency structures. Since a JRNN generally requires a large number of training examples, and creating those examples is expensive, we use a data augmentation method to increase the number of training examples. Our experiments show that our method outperforms the baseline methods.…
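One way to picture the modified input layer is as a per-token concatenation of the word embedding with encoded syntactical features. The following is a minimal sketch under several assumptions: the tag inventories (`POS_TAGS`, `NE_TYPES`, `DEP_RELS`), the one-hot encoding, the embedding size, and the example tweet are all hypothetical, and the paper may encode these features differently.

```python
import numpy as np

# Hypothetical tag inventories; the paper's actual tag sets are not given here.
POS_TAGS = ["NOUN", "VERB", "ADJ", "PROPN", "OTHER"]
NE_TYPES = ["O", "PERSON", "LOCATION", "ORGANIZATION"]
DEP_RELS = ["root", "nsubj", "obj", "compound", "other"]


def one_hot(label, inventory):
    """Encode a categorical label as a one-hot float vector."""
    vec = np.zeros(len(inventory), dtype=np.float32)
    vec[inventory.index(label)] = 1.0
    return vec


def build_input(word_vecs, pos, ne, dep):
    """Concatenate each token's embedding with its syntactical features."""
    rows = [
        np.concatenate([w, one_hot(p, POS_TAGS), one_hot(n, NE_TYPES),
                        one_hot(d, DEP_RELS)])
        for w, p, n, d in zip(word_vecs, pos, ne, dep)
    ]
    return np.stack(rows)


EMB_DIM = 8  # assumed embedding size, for illustration only
rng = np.random.default_rng(0)
tokens = ["jokowi", "kunjungi", "bandung"]  # made-up example tweet
# Random vectors stand in for pretrained word embeddings.
word_vecs = rng.normal(size=(len(tokens), EMB_DIM)).astype(np.float32)

features = build_input(
    word_vecs,
    pos=["PROPN", "VERB", "PROPN"],
    ne=["PERSON", "O", "LOCATION"],
    dep=["nsubj", "root", "obj"],
)
print(features.shape)  # (3, 22): 8 embedding dims + 5 POS + 4 NE + 5 dependency
```

The resulting matrix has one row per token and would take the place of the embedding-only input to the recurrent layers.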
