Extracting Information About Publication Venues Using Citation-Informed Transformers

Published: June 9, 2025 | arXiv ID: 2506.08199v1

By: Brian D. Zimmerman, Joshua Folkins, Olga Vechtomova

Potential Business Impact:

Shows how research topics at major computer science venues are shifting, and which venues are becoming more alike over time.

Scientific document embeddings contain a variety of rich features which can be harnessed for downstream tasks such as recommendation, ranking, and clustering. We explore which tangible insights can be drawn from scientific document embeddings to understand trends in computer science research featured across nine well-known venues. We collect approximately 60,000 scientific documents published between 2015 and 2023 and analyze their embeddings, which we produce with the SPECTER pre-trained language model. In particular, we examine whether similarity between two venues can be measured using the embeddings of the scientific documents they admit for publication. Our findings indicate that some venues within computer science are indistinguishable when only considering the distributions of their document embeddings. We additionally examine whether any two venues are becoming increasingly similar over time and identify a trend of convergence within some venues in our analysis. We discuss the implications of these results and the potential impact on new scientific contributions.
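
To make the idea of venue similarity concrete, here is a minimal sketch that compares venues by the cosine similarity of their mean document embeddings. This is an illustrative assumption, not the authors' method: the abstract says only that the distributions of document embeddings are compared, and does not name the measure. The function names, the venue labels "A", "B", "C", and the synthetic 16-dimensional vectors are all hypothetical stand-ins for real SPECTER embeddings.

```python
import math
import random
from collections import defaultdict


def venue_centroids(embeddings, venues):
    """Average the document vectors of each venue into one centroid."""
    grouped = defaultdict(list)
    for vector, venue in zip(embeddings, venues):
        grouped[venue].append(vector)
    return {
        venue: [sum(column) / len(vectors) for column in zip(*vectors)]
        for venue, vectors in grouped.items()
    }


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Synthetic data (assumption): venues A and B draw from the same topic,
# venue C draws from a different one.
rng = random.Random(0)
DIM = 16


def noisy_copies(center, n, scale=0.1):
    """Generate n vectors scattered around a topic center."""
    return [[c + rng.gauss(0, scale) for c in center] for _ in range(n)]


topic_1 = [rng.gauss(0, 1) for _ in range(DIM)]
topic_2 = [rng.gauss(0, 1) for _ in range(DIM)]

embeddings = (
    noisy_copies(topic_1, 50)    # venue A
    + noisy_copies(topic_1, 50)  # venue B, same topic as A
    + noisy_copies(topic_2, 50)  # venue C, different topic
)
venues = ["A"] * 50 + ["B"] * 50 + ["C"] * 50

centroids = venue_centroids(embeddings, venues)
sim_ab = cosine_similarity(centroids["A"], centroids["B"])
sim_ac = cosine_similarity(centroids["A"], centroids["C"])
print(f"A vs B: {sim_ab:.3f}, A vs C: {sim_ac:.3f}")
```

A centroid discards the shape of each venue's distribution, so two venues with equal means but different spreads would look identical here. Repeating the comparison separately for each publication year would give a rough view of whether a pair of venues is converging.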

Country of Origin
🇨🇦 Canada

Page Count
5 pages

Category
Computer Science: Digital Libraries