Keyphrase Extraction from Scholarly Articles as Sequence Labeling using Contextualized Embeddings
Dhruva Sahrawat, Debanjan Mahata, Mayank Kulkarni, Haimin Zhang, Rakesh Gosangi, Amanda Stent, Agniv Sharma, Yaman Kumar, Rajiv Ratn Shah, Roger Zimmermann

TL;DR
This paper frames keyphrase extraction from scholarly articles as a sequence labeling task, solved with a BiLSTM-CRF over contextualized embeddings such as BERT, and reports improved performance over existing unsupervised and supervised methods.
Contribution
It presents an architecture combining a BiLSTM-CRF with contextualized embeddings, and evaluates and analyzes it on three benchmark datasets (Inspec, SemEval 2010, SemEval 2017).
Findings
Contextualized embeddings (e.g. BERT) outperform fixed word embeddings (e.g. GloVe).
A BiLSTM-CRF over contextualized embeddings outperforms fine-tuning the contextualized embedding model directly.
Genre-specific contextualized embeddings (SciBERT) improve results.
Abstract
In this paper, we formulate keyphrase extraction from scholarly articles as a sequence labeling task solved using a BiLSTM-CRF, where the words in the input text are represented using deep contextualized embeddings. We evaluate the proposed architecture using both contextualized and fixed word embedding models on three different benchmark datasets (Inspec, SemEval 2010, SemEval 2017) and compare with existing popular unsupervised and supervised techniques. Our results quantify the benefits of (a) using contextualized embeddings (e.g. BERT) over fixed word embeddings (e.g. GloVe); (b) using a BiLSTM-CRF architecture with contextualized word embeddings over fine-tuning the contextualized word embedding model directly, and (c) using genre-specific contextualized embeddings (SciBERT). Through error analysis, we also provide some insights into why particular models work better than others.…
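To make the sequence labeling formulation concrete, the sketch below shows one way gold keyphrases could be turned into per-token labels, and how predicted labels could be turned back into phrases. It is an illustration only: the B/I/O tag names, the case-insensitive exact matching, and the helper names `bio_labels` and `decode_keyphrases` are assumptions made here, not details stated in the abstract. In the paper's setup a BiLSTM-CRF over contextualized embeddings would predict such a label sequence; no model is included in this sketch.

```python
from typing import List


def bio_labels(tokens: List[str], keyphrases: List[List[str]]) -> List[str]:
    """Label each token B (starts a keyphrase), I (inside one), or O (outside).

    Hypothetical helper: the tagging scheme and matching rule are assumptions.
    """
    lowered = [t.lower() for t in tokens]
    labels = ["O"] * len(tokens)
    # Match longer phrases first so a shorter phrase never overwrites
    # part of a longer one that contains it.
    for phrase in sorted(keyphrases, key=len, reverse=True):
        target = [w.lower() for w in phrase]
        n = len(target)
        if n == 0:
            continue
        for start in range(len(tokens) - n + 1):
            span_free = all(lab == "O" for lab in labels[start:start + n])
            if span_free and lowered[start:start + n] == target:
                labels[start] = "B"
                labels[start + 1:start + n] = ["I"] * (n - 1)
    return labels


def decode_keyphrases(tokens: List[str], labels: List[str]) -> List[str]:
    """Recover keyphrase strings from a B/I/O label sequence."""
    phrases: List[str] = []
    current: List[str] = []
    for token, label in zip(tokens, labels):
        if label == "B":
            # A new phrase starts; close any phrase still open.
            if current:
                phrases.append(" ".join(current))
            current = [token]
        elif label == "I" and current:
            current.append(token)
        else:
            # "O", or a stray "I" with no preceding "B": close the open phrase.
            if current:
                phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases


if __name__ == "__main__":
    tokens = "We use a conditional random field for keyphrase extraction .".split()
    gold = [["conditional", "random", "field"], ["keyphrase", "extraction"]]
    labels = bio_labels(tokens, gold)
    print(list(zip(tokens, labels)))
    print(decode_keyphrases(tokens, labels))
```

Framed this way, extraction quality can be scored by comparing decoded phrases against the gold keyphrases, and the CRF layer's role is to discourage invalid transitions such as an I tag directly after an O tag.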
Taxonomy
Topics: Advanced Text Analysis Techniques
