SkyServer Traffic Report - The First Five Years
Vik Singh, Jim Gray, Ani Thakar, Alexander S. Szalay, Jordan Raddick, Bill Boroski, Svetlana Lebedeva, Brian Yanny

TL;DR
The SkyServer traffic report analyzes five years of web and SQL usage data to understand user engagement, data access patterns, and the effectiveness of educational and scientific data services.
Contribution
It provides a comprehensive analysis of SkyServer's traffic and usage patterns, and introduces a novel method for correcting incorrect SQL queries using a corpus of correct statements.
Findings
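From 2001 to 2006, SkyServer drew a million visitors in 3 million sessions, generating 170 million Web hits and 16 million ad-hoc SQL queries.
The educational website delivered nearly fifty thousand hours of interactive instruction, and scientists made use of ad-hoc SQL, personal database, and batch job access.
A new approach was developed to suggest a correct SQL query for a user's incorrect one, drawing on a corpus of correct statements.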
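The idea of suggesting a correction from a corpus of correct statements can be illustrated with a minimal sketch. This is not the authors' method, which this summary does not describe; it swaps in plain string similarity from Python's standard `difflib` module. The corpus contents, the `normalize` and `suggest` helpers, and the 0.6 similarity cutoff are all assumptions made for illustration.

```python
import difflib

# Hypothetical corpus of known-good statements (illustrative only,
# not taken from the SkyServer logs).
CORRECT_QUERIES = [
    "SELECT TOP 10 objID, ra, dec FROM PhotoObj",
    "SELECT COUNT(*) FROM SpecObj",
    "SELECT ra, dec FROM Star WHERE ra BETWEEN 180 AND 181",
]


def normalize(sql):
    """Lowercase and collapse whitespace so formatting differences are ignored."""
    return " ".join(sql.lower().split())


def suggest(query, corpus=CORRECT_QUERIES, n=1, cutoff=0.6):
    """Return up to n corpus statements most similar to the (possibly wrong) query.

    Assumption: character-level similarity stands in for whatever
    matching the paper actually uses.
    """
    by_normalized = {normalize(q): q for q in corpus}
    matches = difflib.get_close_matches(
        normalize(query), list(by_normalized), n=n, cutoff=cutoff
    )
    return [by_normalized[m] for m in matches]
```

Under these assumptions, a query with misspelled keywords such as `suggest("SELCT COUNT(*) FORM SpecObj")` would return the closest correct statement in the corpus, and an input that resembles nothing in the corpus returns an empty list.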
Abstract
The SkyServer is an Internet portal to the Sloan Digital Sky Survey Catalog Archive Server. From 2001 to 2006, there were a million visitors in 3 million sessions generating 170 million Web hits, 16 million ad-hoc SQL queries, and 62 million page views. The site currently averages 35 thousand visitors and 400 thousand sessions per month. The Web and SQL logs are public. We analyzed traffic and sessions by duration, usage pattern, data product, and client type (mortal or bot) over time. The analysis shows (1) the site's popularity, (2) the educational website that delivered nearly fifty thousand hours of interactive instruction, (3) the relative use of interactive, programmatic, and batch-local access, (4) the success of offering ad-hoc SQL, personal database, and batch job access to scientists as part of the data publication, (5) the continuing interest in "old" datasets, (6) the usage…
Taxonomy
Topics: Data Management and Algorithms · Complex Network Analysis Techniques · Data Visualization and Analytics
