Towards Finding Non-obvious Papers: An Analysis of Citation Recommender Systems
Haofeng Jia, Erik Saule

TL;DR
This paper analyzes citation recommendation systems, revealing their limitations in finding loosely connected papers, and proposes combining multiple data sources and algorithms to improve recommendation diversity and efficiency.
Contribution
It introduces a novel analysis of citation graph properties and proposes combining author, venue, and keyword data to enhance citation recommendation methods.
Findings
Existing methods mainly find highly connected papers
Loosely connected papers are underrepresented in recommendations
Combining multiple data sources improves recommendation diversity
Abstract
As science advances, the academic community has published millions of research papers. Researchers devote time and effort to searching for relevant manuscripts when writing a paper or simply to keep up with current research. In this paper, we consider the problem of citation recommendation by extending a set of known-to-be-relevant references. Our analysis shows that the degrees of cited papers in the subgraph induced by the citations of a paper, called the projection graph, follow a power law distribution. Existing popular methods are only good at finding the papers in the long tail of this degree distribution, that is, the ones that are highly connected to others. In other words, the majority of cited papers are loosely connected in the projection graph, but they are not going to be found by existing methods. To address this problem, we propose to combine author, venue and keyword information to interpret the citation behavior behind those…
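To make the projection graph concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the citation data is available as a dictionary mapping each paper to the set of papers it cites, and it counts edges between references as undirected, which is a simplifying assumption. The function name `projection_degrees`, the variable `toy`, and the paper identifiers are hypothetical and exist only for illustration.

```python
def projection_degrees(citations, paper):
    """Return the degree of each reference of `paper` inside its projection graph.

    `citations` maps a paper id to the set of paper ids it cites (assumed format).
    The projection graph is the subgraph induced by the references of `paper`.
    Edges are treated as undirected here (an assumption of this sketch).
    """
    refs = set(citations.get(paper, ()))

    # Collect each citation link between two references exactly once.
    edges = set()
    for u in refs:
        for v in citations.get(u, ()):
            if v in refs and v != u:
                edges.add(frozenset((u, v)))

    # Every reference starts at degree 0 so isolated ones are still reported.
    degree = {r: 0 for r in refs}
    for edge in edges:
        for node in edge:
            degree[node] += 1
    return degree


# Invented toy data: paper Q cites A, B, C and D; A, B and C cite each other,
# while D is not linked to any other reference of Q.
toy = {
    "Q": {"A", "B", "C", "D"},
    "A": {"B", "C"},
    "B": {"C"},
    "C": set(),
    "D": set(),
}

degrees = projection_degrees(toy, "Q")
for ref in sorted(degrees):
    print(ref, degrees[ref])
```

In this toy example, D has degree 0 in the projection graph, so it stands in for the loosely connected references that the abstract says existing methods tend to miss.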
Taxonomy
Topics: Advanced Text Analysis Techniques · Topic Modeling · Information Retrieval and Search Behavior
