Towards Machine Learning-Based Meta-Studies: Applications to Cosmological Parameters
Tom Crossland, Pontus Stenetorp, Daisuke Kawata, Sebastian Riedel, Thomas D. Kitching, Anurag Deshpande, Tom Kimpson, Choong Ling Liew-Cain, Christian Pedersen, Davide Piras, Monu Sharma

TL;DR
This paper introduces a machine learning model that automatically extracts astrophysical measurements from the literature, uses it to build a large database of reported values, and provides an online tool for researchers to analyze how cosmological parameter estimates have changed over time.
Contribution
The paper presents a novel NLP-based system for extracting astrophysical measurements from literature and an interactive database with analysis capabilities for cosmological research.
Findings
Extracted over 231,000 numerical measurements from the abstracts of approximately 248,000 arXiv astrophysics articles.
Enabled analysis of historical trends in cosmological parameters.
Demonstrated the impact of landmark publications on parameter estimates.
Abstract
We develop a new model for automatic extraction of reported measurement values from the astrophysical literature, utilising modern Natural Language Processing techniques. We use this model to extract measurements present in the abstracts of the approximately 248,000 astrophysics articles from the arXiv repository, yielding a database containing over 231,000 astrophysical numerical measurements. Furthermore, we present an online interface (Numerical Atlas) to allow users to query and explore this database, based on parameter names and symbolic representations, and download the resulting datasets for their own research uses. To illustrate potential use cases we then collect values for nine different cosmological parameters using this tool. From these results we can clearly observe the historical trends in the reported values of these quantities over the past two decades, and see the…
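To make the extraction task concrete, here is a minimal sketch of pulling "symbol = value ± uncertainty" patterns out of abstract-like text. It is a deliberately simple regular-expression baseline, not the NLP model described in the paper; the `Measurement` type, the `PATTERN` regex, and the `extract_measurements` function are all hypothetical names introduced for illustration, and the sample sentence is made up.

```python
import re
from typing import List, NamedTuple, Optional


class Measurement(NamedTuple):
    """Hypothetical record for one extracted value (not the paper's schema)."""
    symbol: str
    value: float
    uncertainty: Optional[float]


# Assumed surface form: "<symbol> = <value>" optionally followed by
# "± <uncertainty>", written as the plus-minus sign (\u00b1), "+/-", or "\pm".
PATTERN = re.compile(
    r"(?P<symbol>[A-Za-z\\][A-Za-z0-9_\\{}]*)\s*=\s*"
    r"(?P<value>-?\d+(?:\.\d+)?)"
    r"(?:\s*(?:\u00b1|\+/-|\\pm)\s*(?P<unc>\d+(?:\.\d+)?))?"
)


def extract_measurements(text: str) -> List[Measurement]:
    """Return every 'symbol = value [± uncertainty]' match found in text."""
    results = []
    for match in PATTERN.finditer(text):
        unc = match.group("unc")
        results.append(
            Measurement(
                symbol=match.group("symbol"),
                value=float(match.group("value")),
                uncertainty=float(unc) if unc is not None else None,
            )
        )
    return results


if __name__ == "__main__":
    sample = "We find H_0 = 67.4 +/- 0.5 km/s/Mpc and Omega_m = 0.315."
    for measurement in extract_measurements(sample):
        print(measurement)
```

A pattern like this misses values expressed in prose, asymmetric error bars, and symbols defined far from their values, which is the kind of gap a learned extraction model is meant to close.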
Taxonomy
Topics: Computational Physics and Python Applications · Time Series Analysis and Forecasting · Big Data Technologies and Applications
