Generating Knowledge Graphs from Large Language Models: A Comparative Study of GPT-4, LLaMA 2, and BERT
Ahan Bhatt, Nandan Vaghela, Kush Dudhia

TL;DR
This study compares GPT-4, LLaMA 2, and BERT in generating knowledge graphs from unstructured data, highlighting GPT-4's superior semantic and structural performance and LLaMA 2's efficiency in domain-specific contexts.
Contribution
It introduces a novel LLM-based method for direct KG generation from unstructured data, bypassing traditional pipelines, and provides a comprehensive evaluation of different models' capabilities.
Findings
GPT-4 achieves the highest semantic fidelity and structural accuracy.
LLaMA 2 performs well in lightweight, domain-specific graphs.
BERT reveals challenges in entity-relationship modeling.
Abstract
Knowledge Graphs (KGs) are essential for the functionality of GraphRAGs, a form of Retrieval-Augmented Generative Systems (RAGs) that excel in tasks requiring structured reasoning and semantic understanding. However, creating KGs for GraphRAGs remains a significant challenge due to accuracy and scalability limitations of traditional methods. This paper introduces a novel approach leveraging large language models (LLMs) like GPT-4, LLaMA 2 (13B), and BERT to generate KGs directly from unstructured data, bypassing traditional pipelines. Using metrics such as Precision, Recall, F1-Score, Graph Edit Distance, and Semantic Similarity, we evaluate the models' ability to generate high-quality KGs. Results demonstrate that GPT-4 achieves superior semantic fidelity and structural accuracy, LLaMA 2 excels in lightweight, domain-specific graphs, and BERT provides insights into challenges in…
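The abstract names Precision, Recall, and F1-Score among its metrics but does not say here how they are computed. The sketch below shows one common way to score a generated KG against a reference: exact matching on normalized (subject, relation, object) triples. The matching rule, the function names, and the toy triples are all assumptions for illustration, not the authors' evaluation code.

```python
from typing import Dict, Iterable, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def normalize(triple: Triple) -> Triple:
    """Lowercase and trim each element so trivial formatting differences do not count as errors."""
    subject, relation, obj = triple
    return (subject.strip().lower(), relation.strip().lower(), obj.strip().lower())


def triple_prf(predicted: Iterable[Triple], reference: Iterable[Triple]) -> Dict[str, float]:
    """Precision, recall, and F1 over exact-match triples (assumed matching rule)."""
    pred = {normalize(t) for t in predicted}
    gold = {normalize(t) for t in reference}
    true_positives = len(pred & gold)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy data: 3 reference triples, 4 predicted triples, 2 of them correct.
reference_kg = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Marie Curie", "won", "Nobel Prize in Physics"),
    ("Marie Curie", "field", "Radioactivity"),
]
predicted_kg = [
    ("marie curie", "born_in", "warsaw "),            # matches after normalization
    ("Marie Curie", "won", "Nobel Prize in Physics"),  # exact match
    ("Marie Curie", "born_in", "Paris"),               # wrong object
    ("Pierre Curie", "field", "Radioactivity"),        # wrong subject
]

scores = triple_prf(predicted_kg, reference_kg)
print(scores)
```

Exact matching is strict: a paraphrased relation such as "was born in" versus "born_in" counts as a miss, which is one reason an evaluation might pair these scores with Semantic Similarity and Graph Edit Distance.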
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Data Quality and Management
Methods: Attention Is All You Need · Position-Wise Feed-Forward Layer · Softmax · Byte Pair Encoding · Linear Layer · Linear Warmup With Linear Decay · Multi-Head Attention · Weight Decay · WordPiece
