Generating Knowledge Graphs from Large Language Models: A Comparative Study of GPT-4, LLaMA 2, and BERT

Ahan Bhatt; Nandan Vaghela; Kush Dudhia

arXiv:2412.07412·cs.CL·December 11, 2024

Open Access

TL;DR

This study compares GPT-4, LLaMA 2, and BERT in generating knowledge graphs from unstructured data, highlighting GPT-4's superior semantic and structural performance and LLaMA 2's efficiency in domain-specific contexts.

Contribution

It introduces a novel LLM-based method for direct KG generation from unstructured data, bypassing traditional pipelines, and provides a comprehensive evaluation of different models' capabilities.

Findings

01 GPT-4 achieves the highest semantic fidelity and structural accuracy.

02 LLaMA 2 performs well on lightweight, domain-specific graphs.

03 BERT reveals challenges in entity-relationship modeling.

Abstract

Knowledge Graphs (KGs) are essential for the functionality of GraphRAGs, a form of Retrieval-Augmented Generative Systems (RAGs) that excel in tasks requiring structured reasoning and semantic understanding. However, creating KGs for GraphRAGs remains a significant challenge due to accuracy and scalability limitations of traditional methods. This paper introduces a novel approach leveraging large language models (LLMs) like GPT-4, LLaMA 2 (13B), and BERT to generate KGs directly from unstructured data, bypassing traditional pipelines. Using metrics such as Precision, Recall, F1-Score, Graph Edit Distance, and Semantic Similarity, we evaluate the models' ability to generate high-quality KGs. Results demonstrate that GPT-4 achieves superior semantic fidelity and structural accuracy, LLaMA 2 excels in lightweight, domain-specific graphs, and BERT provides insights into challenges in…
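
To make the triple-level metrics named in the abstract concrete, here is a minimal sketch of Precision, Recall, and F1-Score computed over knowledge-graph triples. It assumes exact matching of (head, relation, tail) triples after lowercasing and trimming whitespace; the abstract does not state the paper's actual matching protocol, so this is an illustration rather than the authors' evaluation code. The function names `normalize` and `triple_prf` and the example triples are hypothetical. Graph Edit Distance and Semantic Similarity are not covered here.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]


def normalize(triples: Iterable[Triple]) -> Set[Triple]:
    """Lowercase and strip every element so formatting differences are ignored."""
    return {tuple(part.strip().lower() for part in t) for t in triples}


def triple_prf(predicted: Iterable[Triple],
               reference: Iterable[Triple]) -> Tuple[float, float, float]:
    """Return (precision, recall, f1) under exact triple matching (an assumption)."""
    pred = normalize(predicted)
    ref = normalize(reference)
    true_pos = len(pred & ref)  # triples present in both graphs
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical data: a hand-built reference graph and a model-generated graph.
reference = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Marie Curie", "won", "Nobel Prize"),
    ("Warsaw", "capital_of", "Poland"),
]
predicted = [
    ("marie curie", "born_in", "Warsaw"),   # matches after normalization
    ("Marie Curie", "won", "Nobel Prize"),  # matches
    ("Marie Curie", "lived_in", "Paris"),   # not in the reference
    ("Paris", "capital_of", "France"),      # not in the reference
]

precision, recall, f1 = triple_prf(predicted, reference)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# prints "precision=0.50 recall=0.67 f1=0.57"
```

Exact matching is the strictest choice: a triple with a paraphrased relation (for example `birthplace` instead of `born_in`) counts as wrong, which is one reason an evaluation might also report an embedding-based semantic similarity score.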

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Data Quality and Management

Methods: Attention Is All You Need · Position-Wise Feed-Forward Layer · Softmax · Byte Pair Encoding · Linear Layer · Linear Warmup With Linear Decay · Multi-Head Attention · Weight Decay · WordPiece