COVID-19 Multidimensional Kaggle Literature Organization
Maksim E. Eren, Nick Solovyev, Chris Hamer, Renee McDonald, Boian S. Alexandrov, Charles Nicholas

TL;DR
This paper introduces a multidimensional tensor factorization approach to organize COVID-19 research literature, enabling simultaneous grouping of articles, journals, authors, and keywords for better information retrieval.
Contribution
It extends the authors' previous clustering of the CORD-19 dataset by applying tensor decomposition, which analyzes several dimensions of the COVID-19 literature (articles, journals, authors, and keywords) at once and reveals hidden patterns across them.
Findings
Effective grouping of articles, journals, authors, and keywords
Enhanced visualization of research topics and relationships
Demonstrated method on CORD-19 dataset with interactive tools
Abstract
The unprecedented outbreak of Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2), the virus that causes COVID-19, continues to be a significant worldwide problem. As a result, a surge of new COVID-19-related research has followed. The growing number of publications requires document organization methods to identify relevant information. In this paper, we expand upon our previous work on clustering the CORD-19 dataset by applying multi-dimensional analysis methods. Tensor factorization is a powerful unsupervised learning method capable of discovering hidden patterns in a document corpus. We show that a higher-order representation of the corpus allows for the simultaneous grouping of similar articles, relevant journals, authors with similar research interests, and topic keywords. These groupings are identified within and among the latent components extracted via tensor decomposition. We…
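To make the idea of a higher-order corpus representation concrete, the following is a minimal sketch, not the authors' implementation. It assumes a 4-way count tensor (article x journal x author x keyword) and a plain CP decomposition fitted by alternating least squares; the abstract does not say which factorization algorithm the paper uses, so that choice is an assumption. The toy corpus, all names in it, and the helper functions `khatri_rao`, `unfold`, and `cp_als` are hypothetical and exist only for illustration.

```python
import itertools
import numpy as np

def khatri_rao(a, b):
    """Column-wise Kronecker product: (I x R), (J x R) -> (I*J x R)."""
    return np.einsum("ir,jr->ijr", a, b).reshape(-1, a.shape[1])

def unfold(x, mode):
    """Mode-n unfolding; columns run over the other modes in C order."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)

def cp_als(x, rank, n_iter=200, seed=0):
    """Plain CP decomposition by alternating least squares (assumed method)."""
    rng = np.random.default_rng(seed)
    factors = [rng.random((dim, rank)) for dim in x.shape]
    for _ in range(n_iter):
        for mode in range(x.ndim):
            others = [f for m, f in enumerate(factors) if m != mode]
            # Khatri-Rao product of the other factors, ordered to match unfold().
            kr = others[0]
            for f in others[1:]:
                kr = khatri_rao(kr, f)
            # Gram matrix of kr equals the Hadamard product of the factor Grams.
            gram = np.ones((rank, rank))
            for f in others:
                gram *= f.T @ f
            factors[mode] = unfold(x, mode) @ kr @ np.linalg.pinv(gram)
    return factors

# Hypothetical toy corpus with two planted topics (not CORD-19 data).
block_1 = itertools.product(["art0", "art1"], ["journal_virology"],
                            ["author_A", "author_B"], ["spike", "protein"])
block_2 = itertools.product(["art2", "art3"], ["journal_policy"],
                            ["author_C", "author_D"], ["lockdown", "policy"])
records = list(block_1) + list(block_2)

# One axis per dimension: article, journal, author, keyword.
modes = [sorted({rec[m] for rec in records}) for m in range(4)]
index = [{name: i for i, name in enumerate(names)} for names in modes]
tensor = np.zeros([len(names) for names in modes])
for rec in records:
    tensor[tuple(index[m][rec[m]] for m in range(4))] += 1.0

factors = cp_als(tensor, rank=2)

# Relative reconstruction error of the rank-2 model.
approx = np.einsum("ir,jr,kr,lr->ijkl", *factors)
rel_err = np.linalg.norm(approx - tensor) / np.linalg.norm(tensor)

# Assign each article to the latent component with the largest loading.
labels = {name: int(np.argmax(np.abs(factors[0][i])))
          for i, name in enumerate(modes[0])}
print(labels, rel_err)
```

Each latent component has one loading vector per dimension, so the same argmax step applied to `factors[1]`, `factors[2]`, and `factors[3]` groups journals, authors, and keywords under the same components. That shared set of components is what lets a single decomposition organize all four dimensions at once.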
