Detection of metadata manipulations: Finding sneaked references in the scholarly literature
Lonni Besançon, Guillaume Cabanac, Cyril Labbé, Alexander Magazinov, Jules di Scala, Dominika Tkaczyk, Kathryn Weber-Boer

TL;DR
This paper uncovers a new form of metadata manipulation, sneaked references in scholarly publications, evaluates methods to detect them automatically, and discusses scaling detection to the wider scholarly literature.
Contribution
It introduces the concept of sneaked references, provides a dataset of such references found in the metadata of one journal (IJISRT), and assesses three methods for detecting these manipulations automatically.
Findings
Identified 80,205 sneaked references in IJISRT metadata.
Compared three methods for automatic detection of sneaked references.
Explored scalability of detection techniques across scholarly literature.
Abstract
We report evidence of a new set of sneaked references discovered in the scientific literature. Sneaked references are references registered in the metadata of publications without being listed in the reference section or in the full text of the actual publications where they ought to be found. We document here 80,205 references sneaked into the metadata of the International Journal of Innovative Science and Research Technology (IJISRT). These sneaked references are registered with Crossref and all cite, and thus benefit, this same journal. Using this dataset, we evaluate three different methods to automatically identify sneaked references. These methods compare reference lists registered with Crossref against the full text or the reference lists extracted from PDF files. In addition, we report attempts to scale the search for sneaked references to the scholarly literature.
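The comparison described in the abstract, registered references checked against what actually appears in the PDF, can be illustrated with a minimal sketch. This is not one of the paper's three methods; it is a simplified stand-in that assumes every reference carries a DOI and that the PDF text has already been extracted to a string. The function names (`normalize_doi`, `dois_in_text`, `find_sneaked_candidates`) and the DOI regular expression are hypothetical choices made for illustration.

```python
import re

# Loose DOI pattern (an assumption for this sketch, not an official grammar).
DOI_PATTERN = re.compile(r'10\.\d{4,9}/[^\s"<>]+', re.IGNORECASE)


def normalize_doi(doi):
    """Lowercase a DOI and strip common prefixes and trailing punctuation."""
    doi = doi.strip().lower()
    doi = re.sub(r'^(https?://(dx\.)?doi\.org/|doi:)', '', doi)
    return doi.rstrip('.,;)')


def dois_in_text(text):
    """Return the set of normalized DOIs that appear anywhere in the text."""
    return {normalize_doi(match) for match in DOI_PATTERN.findall(text)}


def find_sneaked_candidates(registered_dois, pdf_text):
    """Return registered DOIs that never appear in the extracted PDF text.

    A hit is only a candidate: references printed without a DOI, or text
    damaged during PDF extraction, would also end up in this list.
    """
    found = dois_in_text(pdf_text)
    registered = {normalize_doi(doi) for doi in registered_dois}
    return sorted(registered - found)


if __name__ == "__main__":
    # Toy data: two references registered in metadata, one printed in the PDF.
    registered = ["10.1000/abc", "https://doi.org/10.1000/XYZ"]
    pdf_text = "References\n[1] Smith, J. Example title. doi:10.1000/abc.\n"
    print(find_sneaked_candidates(registered, pdf_text))
```

In practice the registered list would come from the publication's Crossref metadata record, and matching would need to tolerate references that have no DOI, which is why anything this sketch flags should be treated as a candidate for manual review rather than as a confirmed sneaked reference.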
Taxonomy
Topics: Library Science and Information Systems · Semantic Web and Ontologies
