Large Language Models and Knowledge Graphs for Astronomical Entity Disambiguation
Golnaz Shapurian

TL;DR
This paper explores using GPT-4 and knowledge graph clustering to extract and disambiguate astronomical entities from text, demonstrating an effective method for organizing complex astronomical information.
Contribution
It introduces a novel approach combining large language models and clustering algorithms for entity disambiguation in astronomical texts.
Findings
Effective entity disambiguation achieved
Knowledge graph clustering reveals meaningful astronomical groupings
Demonstrates potential for automated astronomical data analysis
Abstract
This paper presents an experiment conducted during a hackathon, focusing on using large language models (LLMs) and knowledge graph clustering to extract entities and relationships from astronomical text. The study demonstrates an approach to disambiguating entities that can appear in various contexts within the astronomical domain. Excerpts are collected around specific entities, and the GPT-4 language model is used to extract the relevant entities and relationships from them. The extracted information is then used to construct a knowledge graph, which is clustered using the Leiden algorithm. The resulting Leiden communities are used to compute, for each unknown excerpt, its percentage of association with each community, thereby enabling disambiguation. The experiment showcases the potential of combining LLMs and knowledge graph clustering techniques for information extraction in astronomical…
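The final step described in the abstract, scoring an unknown excerpt against each Leiden community, can be illustrated with a minimal sketch. The abstract does not give the exact scoring formula, so this assumes the simplest reading: the share of the excerpt's extracted entities that fall into each community. The function name `community_association`, the entity names, and the community assignments below are all hypothetical stand-ins; in the experiment the assignments would come from running the Leiden algorithm on the GPT-4-derived knowledge graph.

```python
from collections import Counter


def community_association(excerpt_entities, entity_to_community):
    """Return {community_id: percent} for the excerpt's entities.

    Assumption: association is the share of the excerpt's entities that
    already belong to each community. Entities absent from the graph
    are ignored.
    """
    hits = Counter(
        entity_to_community[entity]
        for entity in excerpt_entities
        if entity in entity_to_community
    )
    total = sum(hits.values())
    if total == 0:
        return {}  # nothing in the excerpt matches the graph
    return {community: 100.0 * count / total for community, count in hits.items()}


# Hypothetical community assignments (stand-in for real Leiden output).
entity_to_community = {
    "Apollo 11": 0,
    "NASA": 0,
    "lunar module": 0,
    "near-Earth asteroid": 1,
    "1862 Apollo": 1,
    "Earth-crossing orbit": 1,
}

# Hypothetical entities extracted from an excerpt mentioning "Apollo".
unknown_excerpt_entities = ["NASA", "lunar module", "Earth-crossing orbit", "Saturn V"]

scores = community_association(unknown_excerpt_entities, entity_to_community)
best = max(scores, key=scores.get)  # community with the highest association
print(scores, best)
```

Under these assumptions, the community with the highest percentage is taken as the most likely sense of the ambiguous entity. A weighted variant (for example, weighting by edge counts in the graph) would be an equally plausible reading of the abstract.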
Taxonomy
Topics: SAS software applications and methods · Molecular spectroscopy and chirality · Web Data Mining and Analysis
Methods: Residual Connection · Softmax · Layer Normalization · Byte Pair Encoding · Label Smoothing · Adam · Attention Is All You Need · Linear Layer · Multi-Head Attention · Position-Wise Feed-Forward Layer
