Co-factor analysis of citation networks

Alex Hayes; Karl Rohe

arXiv:2408.14604·stat.ME·October 7, 2025·J. Comput. Graph. Stat.

TL;DR

This paper introduces a co-factor embedding method for citation networks that characterizes papers both by how they cite and by how they are cited, while accounting for the asymmetric and incomplete nature of citation data.

Contribution

We develop a co-factor model for asymmetric citation matrices with missing data, framing estimation as a matrix completion problem, and apply it to analyze a comprehensive statistics literature dataset.

Findings

01 Identified interpretable co-factors corresponding to statistical subfields

02 Demonstrated the effectiveness of the estimator through simulations

03 Produced the most comprehensive topic model of the statistics literature to date

Abstract

One compelling use of citation networks is to characterize papers by their relationships to the surrounding literature. We propose a method to characterize papers by embedding them into two distinct "co-factor" spaces: one describing how papers send citations, and the other describing how papers receive citations. This approach presents several challenges. First, older documents cannot cite newer documents, and thus it is not clear that co-factors are even identifiable. We resolve this challenge by developing a co-factor model for asymmetric adjacency matrices with missing lower triangles and showing that identification is possible. We then frame estimation as a matrix completion problem and develop a specialized implementation of matrix completion because prior implementations are memory bound in our setting. Simulations show that our estimator has promising finite sample properties,…
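To make the matrix completion framing concrete, here is a minimal sketch in Python. It is not the authors' estimator or implementation; it is a generic iterative SVD imputation ("hard-impute" style) applied to a toy matrix. The assumptions are ours: papers are indexed from newest to oldest so that the strictly upper triangle is observed, the diagonal is treated as missing, the entries are real-valued rather than binary citations, and the names `impute_svd`, `observed`, `X`, and `Y` are hypothetical.

```python
import numpy as np


def impute_svd(A, observed, rank, n_iter=200):
    """Generic rank-constrained SVD imputation (a sketch, not the paper's method).

    A        : (n, n) array; only entries where `observed` is True are used.
    observed : (n, n) boolean mask of observed entries.
    rank     : number of co-factors to keep.
    Returns (U, s, V) with the completed matrix approximated by (U * s) @ V.T.
    """
    # Start by filling the unobserved entries with zeros.
    Z = np.where(observed, A, 0.0)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Keep observed entries fixed; refill missing ones from the low-rank fit.
        Z = np.where(observed, A, low_rank)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return U[:, :rank], s[:rank], Vt[:rank].T


# Toy data: hypothetical "sending" factors X and "receiving" factors Y.
rng = np.random.default_rng(0)
n, k = 60, 2
X = rng.normal(size=(n, k))  # how each paper sends citations (assumed)
Y = rng.normal(size=(n, k))  # how each paper receives citations (assumed)

# Assumption: row i can only cite column j > i, so the lower triangle
# and the diagonal are unobserved.
observed = np.triu(np.ones((n, n), dtype=bool), k=1)
A = np.where(observed, X @ Y.T, 0.0)

U, s, V = impute_svd(A, observed, rank=k)
estimate = (U * s) @ V.T  # rows of U: sending side; rows of V: receiving side
```

This sketch stores and decomposes a dense n-by-n matrix at every iteration, which would not scale to a large citation network. The abstract notes that prior implementations are memory bound in the authors' setting, which is why they develop a specialized implementation.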
