Analyzing the State of Computer Science Research with the DBLP Discovery Dataset
Lennart Küll

TL;DR
This paper presents CS-Insights, an open-access web system analyzing 5 million CS publications from the DBLP dataset, revealing trends, impact, and topic shifts in computer science research over two decades.
Contribution
It introduces the CS-Insights system for flexible, visual scientometric analysis of CS publications using the open DBLP Discovery Dataset.
Findings
Publications, authors, and venues have all grown rapidly over the last 20 years.
Many authors have joined the field only recently, indicating ongoing expansion.
Conference publications are declining relative to journals, with journals receiving more citations.
Abstract
The number of scientific publications continues to rise exponentially, especially in Computer Science (CS). However, current solutions to analyze those publications restrict access behind a paywall, offer no features for visual analysis, limit access to their data, only focus on niches or sub-fields, and/or are not flexible and modular enough to be transferred to other datasets. In this thesis, we conduct a scientometric analysis to uncover the implicit patterns hidden in CS metadata and to determine the state of CS research. Specifically, we investigate trends of the quantity, impact, and topics for authors, venues, document types (conferences vs. journals), and fields of study (compared to, e.g., medicine). To achieve this, we introduce the CS-Insights system, an interactive web application to analyze CS publications with various dashboards, filters, and visualizations. The data…
Taxonomy
Topics: Scientific Computing and Data Management · Data Visualization and Analytics · Research Data Management Practices
