Exploring the academic invisible web
Dirk Lewandowski, Philipp Mayr

TL;DR
This paper critically reviews Bergman's 2001 size estimate of the Deep Web, introduces the concept of the Academic Invisible Web (AIW), and provides a new size estimate that highlights the need for collaborative indexing efforts.
Contribution
It challenges Bergman's size estimate of the Deep Web, introduces the Academic Invisible Web concept, and offers a revised size estimate based on informetric laws, alongside a literature review of approaches for uncovering information from the Invisible Web.
Findings
Bergman's size estimate of the Invisible Web is highly questionable
A new size estimate for the Academic Invisible Web is provided
No single library can index the entire Academic Invisible Web
Abstract
Purpose: To provide a critical review of Bergman's 2001 study on the Deep Web. In addition, we bring a new concept into the discussion, the Academic Invisible Web (AIW). We define the Academic Invisible Web as consisting of all databases and collections relevant to academia but not searchable by the general-purpose internet search engines. Indexing this part of the Invisible Web is central to scientific search engines. We provide an overview of approaches followed thus far.
Design/methodology/approach: Discussion of measures and calculations, estimation based on informetric laws. Literature review on approaches for uncovering information from the Invisible Web.
Findings: Bergman's size estimate of the Invisible Web is highly questionable. We demonstrate some major errors in the conceptual design of the Bergman paper. A new (raw) size estimate is given.
Research limitations/implications:…
