Methods for estimating the size of Google Scholar
Enrique Orduna-Malea, Juan M. Ayllon, Alberto Martin-Martin, Emilio Delgado Lopez-Cozar

TL;DR
This paper evaluates several methods for estimating the size of Google Scholar as of May 2014 and places it at roughly 160 to 165 million documents, while noting that every method is limited by methodological constraints and search inconsistencies.
Contribution
It introduces and compares three empirical methods for estimating Google Scholar's size, one external (based on coverage studies) and two internal (based on queries to the search engine itself), and highlights their limitations and uncertainties.
Findings
The estimated size of Google Scholar is around 160 to 165 million documents (May 2014).
All methods face significant limitations due to search functionality inconsistencies.
Disparate values obtained from different estimation methods highlight uncertainties.
Abstract
The emergence of academic search engines (mainly Google Scholar and Microsoft Academic Search) that aspire to index the entirety of current academic knowledge has revived and increased interest in the size of the academic web. The main objective of this paper is to propose various methods to estimate the current size (number of indexed documents) of Google Scholar (May 2014) and to determine its validity, precision and reliability. To do this, we present, apply and discuss three empirical methods: an external estimate based on empirical studies of Google Scholar coverage, and two internal estimate methods based on direct, empty and absurd queries, respectively. The results, despite providing disparate values, place the estimated size of Google Scholar at around 160 to 165 million documents. However, all the methods show considerable limitations and uncertainties due to inconsistencies…
