Bibliographic Analysis on Research Publications using Authors, Categorical Labels and the Citation Network
Kar Wai Lim, Wray Buntine

TL;DR
This paper introduces the Citation Network Topic Model (CNTM), a novel nonparametric approach that integrates authors, topics, and citation networks to analyze research publications, demonstrating improved clustering and visualization capabilities.
Contribution
The paper presents CNTM, a nonparametric topic model that jointly models authors, topics, and the citation network, together with an efficient inference algorithm for it.
Findings
Improved model fitting and document clustering over baselines.
Effective visualization of author-topics network.
Successful incorporation of supervision for better clustering.
Abstract
Bibliographic analysis considers the authors' research areas, the citation network and the paper content, among other things. In this paper, we combine these three in a topic model that produces a bibliographic model of authors, topics and documents, using a nonparametric extension of a combination of the Poisson mixed-topic link model and the author-topic model. This gives rise to the Citation Network Topic Model (CNTM). We propose a novel and efficient inference algorithm for the CNTM to explore subsets of research publications from CiteSeerX. The publication datasets are organised into three corpora, totalling about 168k publications with about 62k authors. The queried datasets are made available online. In three publicly available corpora in addition to the queried datasets, our proposed model demonstrates an improved performance in both model fitting and document clustering,…
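To make the two ingredients named in the abstract more concrete, here is a minimal Python sketch of an author-topic style word generator and a Poisson citation link whose rate grows with topic overlap. It is not the CNTM itself: it uses a fixed number of topics rather than a nonparametric prior, and the function names, the `topic_weight` parameter, and the exact form of the rate are assumptions made for illustration only.

```python
import math
import random


def sample_document(authors, author_topic, topic_word, n_words, rng):
    """Author-topic style generation (simplified, finite number of topics).

    For each word: pick one of the document's authors uniformly, draw a topic
    from that author's topic distribution, then draw a word from the topic.
    """
    n_topics = len(topic_word)
    vocab_size = len(topic_word[0])
    words, topics = [], []
    for _ in range(n_words):
        a = rng.choice(authors)
        z = rng.choices(range(n_topics), weights=author_topic[a])[0]
        w = rng.choices(range(vocab_size), weights=topic_word[z])[0]
        topics.append(z)
        words.append(w)
    return words, topics


def topic_proportions(topics, n_topics):
    """Empirical share of each topic among a document's word assignments."""
    return [topics.count(k) / len(topics) for k in range(n_topics)]


def citation_rate(theta_i, theta_j, topic_weight):
    """Assumed Poisson rate for a link between documents i and j.

    The rate is a weighted sum of per-topic overlaps, so documents that
    share topics are more likely to cite each other.
    """
    return sum(w * a * b for w, a, b in zip(topic_weight, theta_i, theta_j))


def sample_poisson(rate, rng):
    """Draw from Poisson(rate) with Knuth's multiplication method."""
    limit = math.exp(-rate)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


rng = random.Random(0)
author_topic = {"alice": [0.9, 0.1], "bob": [0.2, 0.8]}  # hypothetical authors
topic_word = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]          # 2 topics, 3 word types

_, z1 = sample_document(["alice"], author_topic, topic_word, 50, rng)
_, z2 = sample_document(["alice", "bob"], author_topic, topic_word, 50, rng)
rate = citation_rate(topic_proportions(z1, 2), topic_proportions(z2, 2), [2.0, 2.0])
n_citations = sample_poisson(rate, rng)
```

In this sketch the authors drive the words through their topic distributions, and the resulting topic proportions drive the citation counts, which is the sense in which text, authorship, and links are tied together in one model.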
