pathfinder: A Semantic Framework for Literature Review and Knowledge Discovery in Astronomy
Kartheik G. Iyer, Mikaeel Yunus, Charles O'Neill, Christine Ye, Alina Hyk, Kiera McCormick, Ioana Ciuca, John F. Wu, Alberto Accomazzi, Simone Astarita, Rishabh Chakrabarty, Jesse Cranney, Anjalie Field, Tirthankar Ghosal, Michele Ginolfi, Marc Huertas-Company, Maja Jablonska

TL;DR
Pathfinder is a machine learning framework that leverages large language models to enable semantic search and knowledge discovery in astronomical literature, improving navigation and synthesis of vast research corpora.
Contribution
It introduces a novel LLM-based semantic search approach for astronomy literature review that addresses the complexities of jargon, temporal context, and citations, with applications beyond simple retrieval.
Findings
Effective semantic search demonstrated through case studies
Performance evaluated with custom benchmarks
Enhanced literature exploration and knowledge synthesis
Abstract
The exponential growth of astronomical literature poses significant challenges for researchers navigating and synthesizing general insights or even domain-specific knowledge. We present Pathfinder, a machine learning framework designed to enable literature review and knowledge discovery in astronomy, focusing on semantic searching with natural language instead of syntactic searches with keywords. Utilizing state-of-the-art large language models (LLMs) and a corpus of 350,000 peer-reviewed papers from the Astrophysics Data System (ADS), Pathfinder offers an innovative approach to scientific inquiry and literature exploration. Our framework couples advanced retrieval techniques with LLM-based synthesis to search astronomical literature by semantic context as a complement to currently existing methods that use keywords or citation graphs. It addresses complexities of jargon, named…
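The idea of searching by semantic context rather than by exact keywords can be illustrated with a minimal sketch: embed the query and every document as vectors, then rank documents by cosine similarity to the query. This is not Pathfinder's implementation. The `embed` function below is a toy bag-of-words stand-in for an LLM embedding model, and the names `embed`, `cosine`, `search`, and the three-entry `corpus` are all hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an LLM embedding (assumption): a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, corpus, top_k=2):
    """Return the top_k (score, document) pairs, most similar first."""
    q = embed(query)
    scored = [(cosine(q, embed(doc)), doc) for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical miniature corpus standing in for paper abstracts.
corpus = [
    "star formation histories of massive galaxies",
    "dark matter halo mass functions from simulations",
    "spectral energy distribution fitting of galaxies",
]

results = search("star formation in massive galaxies", corpus)
for score, doc in results:
    print(f"{score:.3f}  {doc}")
```

With a real embedding model in place of `embed`, the same ranking step would also match documents that share meaning with the query but few or no literal words, which is what distinguishes semantic search from keyword search.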
Taxonomy
Topics: Big Data and Business Intelligence · Time Series Analysis and Forecasting · Astronomical Observations and Instrumentation
