Domain-topic models with chained dimensions: charting an emergent domain of a major oncology conference
Alexandre Hannud Abdo, Jean-Philippe Cointet, Pascale Bourret, Alberto Cambrosio

TL;DR
This paper introduces two graph-based models for analyzing bibliographic corpora, domain-topic models and domain-chained models, which support detailed mapping of research domains and their metadata over time.
Contribution
It develops domain-topic and domain-chained models, built on stochastic block models, that cluster documents and their metadata for science mapping.
Findings
Identified growing research domains in a major oncology conference over decades.
Linked domain evolution to the emergence of 'oncopolicy' as a major concern.
Provided interactive tools for exploring document and metadata clusters.
Abstract
This paper presents a contribution to the study of bibliographic corpora in the context of science mapping. Starting from a graph representation of documents and their textual dimension, we observe that stochastic block models (SBMs) can provide a simultaneous clustering of documents and words that we call a domain-topic model. Previous work by Gerlach et al. (2018) investigated the resulting topics, or word clusters, while ours focuses on the study of the document clusters, which we call domains. To enable the synthetic description and interactive navigation of domains, we introduce measures and interfaces relating both types of clusters, which reflect the structure of the graph and the model. We then present a procedure that, starting from the document clusters, extends the block model to also cluster arbitrary metadata attributes of the documents. We call this procedure a…
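As an illustration of the "graph representation of documents and their textual dimension" mentioned above, here is a minimal sketch. It assumes the graph is a bipartite document-word graph weighted by word counts; the abstract does not spell out this construction. The toy corpus, the function name `build_bipartite_edges`, and the whitespace tokenization are all hypothetical. The block-model inference that would group documents into domains and words into topics is not shown.

```python
from collections import Counter


def build_bipartite_edges(docs):
    """Return (doc_id, word, count) edges of a document-word bipartite graph.

    Assumption: one weighted edge per (document, word) pair, with the weight
    equal to the number of occurrences of the word in the document.
    """
    edges = []
    for doc_id, text in docs.items():
        # Naive tokenization: lowercase and split on whitespace (illustrative only).
        counts = Counter(text.lower().split())
        for word, n in sorted(counts.items()):
            edges.append((doc_id, word, n))
    return edges


# Hypothetical toy corpus, not taken from the paper.
docs = {
    "d1": "tumor immunotherapy trial",
    "d2": "immunotherapy trial outcome",
    "d3": "health policy cost",
}

edges = build_bipartite_edges(docs)

# The two node sets of the bipartite graph: documents and words.
doc_nodes = sorted({d for d, _, _ in edges})
word_nodes = sorted({w for _, w, _ in edges})
```

A stochastic block model fitted to such a graph would assign every document node and every word node to a block; in the abstract's terminology, document blocks are domains and word blocks are topics.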
