Paper to Screen: Processing Historical Scans in the ADS
Donna M. Thompson, Alberto Accomazzi, Guenther Eichhorn, Carolyn Grant, Edwin Henneken, Michael J. Kurtz, Elizabeth Bohlen, Stephen S. Murray

TL;DR
This paper discusses the challenges and solutions involved in processing and digitizing historical observatory publications for inclusion in the NASA Astrophysics Data System, enhancing searchability of scanned historical literature.
Contribution
It introduces methods to handle unpaginated and bibliographically incomplete scanned pages for improved digital archiving and search functions.
Findings
Developed techniques for processing unpaginated scans
Enhanced metadata extraction for historical documents
Improved searchability of scanned observatory reports
Abstract
The NASA Astrophysics Data System in conjunction with the Wolbach Library at the Harvard-Smithsonian Center for Astrophysics is working on a project to microfilm historical observatory publications. The microfilm is then scanned for inclusion in the ADS. The ADS currently contains over 700,000 scanned pages of volumes of historical literature. Many of these volumes lack clear pagination or other bibliographic data that are necessary to take advantage of the searching capabilities of the ADS. This paper will address some of the interesting challenges that needed to be resolved during the processing of the Observatory Reports included in the ADS.
Taxonomy
Topics: History and Developments in Astronomy · Space Technology and Applications · Diverse Historical and Scientific Studies
