Prioritization of COVID-19-related literature via unsupervised keyphrase extraction and document representation learning

Blaž Škrlj; Marko Jukić; Nika Eržen; Senja Pollak; Nada Lavrač

arXiv:2110.08874·cs.IR·October 19, 2021

TL;DR

This paper presents an unsupervised keyphrase extraction and document embedding approach to prioritize and explore COVID-19-related scientific literature efficiently, enabling interactive search and analysis without manual annotation.

Contribution

The authors introduce a novel unsupervised method for annotating COVID-19 literature to facilitate document retrieval and exploration in a learned embedding space.

Findings

01. Effective unsupervised keyphrase extraction for COVID-19 literature

02. Web-based interactive search system demonstrated in case studies

03. Improved exploration of scientific papers in medicinal chemistry

Abstract

The COVID-19 pandemic triggered a wave of novel scientific literature that cannot be manually inspected and studied in a reasonable time frame. Current machine learning methods can project such a body of literature into a vector space in which similar documents are located close to each other, enabling insightful exploration of scientific papers and other knowledge sources associated with COVID-19. However, before searching can start, such texts need to be appropriately annotated, which is seldom the case due to the lack of human resources. In our system, the current body of COVID-19-related literature is annotated using unsupervised keyphrase extraction, facilitating the initial queries to the latent space containing the learned document embeddings (low-dimensional representations). The solution is accessible through a web server capable of interactive search, term ranking, and…
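The two ideas in the abstract, annotating documents without labels and placing similar documents close together in a vector space, can be illustrated with a minimal sketch. It is not the authors' pipeline: plain TF-IDF weighting stands in for both the keyphrase extractor and the learned document embeddings, the three-document corpus is invented, and every name (`DOCS`, `tfidf_vectors`, `top_keywords`, `nearest`) is hypothetical.

```python
import math
import re
from collections import Counter

# Toy corpus, invented for illustration only (not data from the paper).
DOCS = {
    "d1": "protease inhibitors block viral replication of the coronavirus",
    "d2": "protease inhibitors are screened with molecular docking",
    "d3": "hospital staffing shortages during the pandemic lockdown",
}

# Tiny hand-picked stopword list; an assumption made for this sketch.
STOPWORDS = {"the", "of", "are", "with", "during", "a", "an", "and", "in"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def tfidf_vectors(docs):
    """Return {doc_id: {term: weight}} using smoothed TF-IDF."""
    tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for toks in tokens.values() for term in set(toks))
    n = len(docs)
    vectors = {}
    for doc_id, toks in tokens.items():
        tf = Counter(toks)
        vectors[doc_id] = {
            term: (count / len(toks)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        }
    return vectors


def top_keywords(vector, k=3):
    """Highest-weighted terms: a crude stand-in for extracted keyphrases."""
    ranked = sorted(vector.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest(doc_id, vectors):
    """Rank all other documents by similarity to doc_id, most similar first."""
    scored = [
        (other, cosine(vectors[doc_id], vec))
        for other, vec in vectors.items()
        if other != doc_id
    ]
    return sorted(scored, key=lambda pair: -pair[1])


VECTORS = tfidf_vectors(DOCS)
KEYWORDS = {doc_id: top_keywords(vec) for doc_id, vec in VECTORS.items()}
NEIGHBOURS = nearest("d1", VECTORS)
```

Here `KEYWORDS` plays the role of the automatic annotations that seed a query, and `NEIGHBOURS` shows the retrieval step: `d2` ranks above `d3` for `d1` because the two share terms about protease inhibitors, while `d3` has no vocabulary in common with `d1`. A learned embedding would additionally capture similarity between documents that use different words for related concepts, which this term-overlap sketch cannot do.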
