Extracting Information About Publication Venues Using Citation-Informed Transformers

Brian D. Zimmerman; Joshua Folkins; Olga Vechtomova

arXiv:2506.08199·cs.DL·June 11, 2025

Open Access

TL;DR

This paper investigates how scientific document embeddings, generated by the SPECTER model, can reveal trends and similarities among computer science publication venues over time.

Contribution

The paper introduces a method for analyzing venue similarity and research trends using citation-informed document embeddings, and it highlights convergence among venues.

Findings

01

Some computer science venues are indistinguishable when only the distributions of their document embeddings are considered.

02

Certain venues show increasing similarity over time.

03

Embeddings reveal research trend convergence.

Abstract

Scientific document embeddings contain a variety of rich features which can be harnessed for downstream tasks such as recommendation, ranking, and clustering. We explore which tangible insights can be drawn from scientific document embeddings to understand trends in computer science research featured across nine well-known venues. We collect approximately 60,000 scientific documents published between 2015 and 2023 and analyze their embeddings, which we produce with the SPECTER pre-trained language model. In particular, we examine whether similarity between two venues can be measured using the embeddings of the scientific documents they admit for publication. Our findings indicate that some venues within computer science are indistinguishable when only considering the distributions of their document embeddings. We additionally examine whether any two venues are becoming increasingly…
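
To make the idea of embedding-based venue similarity concrete, here is a minimal sketch in Python. It is not the authors' method: the abstract speaks of comparing distributions of embeddings, whereas this example uses a simpler stand-in, the cosine similarity between venue centroids. The function names (`centroid`, `venue_similarity`, `synthetic_venue`), the toy dimension, and the synthetic data are all assumptions made for illustration; real inputs would be SPECTER embeddings (768-dimensional) of each venue's papers.

```python
import math
import random


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def venue_similarity(embeddings_a, embeddings_b):
    """Hypothetical similarity score: cosine similarity of venue centroids."""
    return cosine_similarity(centroid(embeddings_a), centroid(embeddings_b))


def synthetic_venue(rng, center, n_docs, noise=0.1):
    """Stand-in for real document embeddings: Gaussian noise around a center."""
    return [[c + rng.gauss(0.0, noise) for c in center] for _ in range(n_docs)]


rng = random.Random(0)
dim = 8  # toy dimension chosen for readability, not the real embedding size

# Three made-up venues: A and B share nearly the same center, C does not.
center_a = [1.0] + [0.0] * (dim - 1)
center_b = [0.9, 0.1] + [0.0] * (dim - 2)
center_c = [0.0, 1.0] + [0.0] * (dim - 2)

venue_a = synthetic_venue(rng, center_a, n_docs=200)
venue_b = synthetic_venue(rng, center_b, n_docs=200)
venue_c = synthetic_venue(rng, center_c, n_docs=200)

sim_ab = venue_similarity(venue_a, venue_b)
sim_ac = venue_similarity(venue_a, venue_c)
print(f"A vs B: {sim_ab:.3f}")
print(f"A vs C: {sim_ac:.3f}")
```

Centroid similarity discards the spread of each venue's embeddings, so it cannot by itself show that two venues are indistinguishable as distributions; a distribution-level comparison would be needed for that. Computing the same score separately for each publication year is one plausible way to look for venues that grow more similar over time.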

Taxonomy

Topics: Computational and Text Analysis Methods · Biomedical Text Mining and Ontologies · Topic Modeling