Prompting or Fine-tuning? Exploring Large Language Models for Causal Graph Validation

Yuni Susanti; Nina Holsmoelle

arXiv:2406.16899·cs.CL·April 16, 2025

Open Access

TL;DR

This paper investigates using Large Language Models (LLMs) to evaluate causal graphs, comparing prompting with fine-tuning, and finds that fine-tuned models outperform prompting approaches by up to 20.5 F1 points.

Contribution

It introduces a systematic comparison of prompting versus fine-tuning LLMs for causal relation evaluation, highlighting the superior performance of fine-tuned models.

Findings

01 Fine-tuned models outperform prompting methods by up to 20.5 F1 points.

02 Fine-tuning yields better causal inference accuracy even with smaller models.

03 LLMs can effectively evaluate causality in biomedical and general domains.
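The F1 gap in the first finding can be made concrete with a minimal sketch of how binary F1 is computed for a causal / non-causal prediction task. The labels and predictions below are invented for illustration only and are not results from the paper; the paper's exact averaging scheme is not stated on this page, so plain positive-class F1 is an assumption.

```python
def f1_score(gold, pred):
    """Binary F1 for the positive ("causal") class; 1 = causal, 0 = not causal."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up gold labels and two made-up prediction sets (illustration only).
gold = [1, 1, 1, 0, 0, 1]
pred_a = [1, 0, 0, 0, 1, 1]  # hypothetical weaker system
pred_b = [1, 1, 1, 0, 1, 1]  # hypothetical stronger system

# A gap "in F1 points" is the difference between the two scores times 100.
gap_in_points = 100 * (f1_score(gold, pred_b) - f1_score(gold, pred_a))
```

Reporting the difference on a 0 to 100 scale is what "F1 points" refers to in the findings.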

Abstract

This study explores the capability of Large Language Models (LLMs) to evaluate causality in causal graphs generated by conventional statistical causal discovery methods, a task traditionally reliant on manual assessment by human subject matter experts. To bridge this gap in causality assessment, LLMs are employed to evaluate the causal relationships by determining whether a causal connection between variable pairs can be inferred from textual context. Our study compares two approaches: (1) a prompting-based method for zero-shot and few-shot causal inference, and (2) fine-tuning language models for the causal relation prediction task. While prompt-based LLMs have demonstrated versatility across various NLP tasks, our experiments on biomedical and general-domain datasets show that fine-tuned models consistently outperform them, achieving up to a 20.5-point improvement in F1 score, even when…
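The prompting approach in the abstract asks a model whether a causal link between a variable pair can be inferred from a textual context. A minimal sketch of how such zero-shot and few-shot prompts might be assembled is given below. The instruction wording, the `build_prompt` and `parse_answer` helpers, and the yes/no answer format are all assumptions made for illustration; they are not the authors' actual prompts, and no model call is shown.

```python
# Hypothetical instruction text; the paper's real prompt wording is not given here.
INSTRUCTION = (
    "Based only on the context, answer 'yes' if the first variable "
    "causes the second, otherwise answer 'no'."
)


def format_case(var_a, var_b, context, answer=None):
    """Render one variable pair; leave the answer slot open when answer is None."""
    lines = [
        f"Context: {context}",
        f"Question: Does {var_a} cause {var_b}?",
        f"Answer: {answer}" if answer is not None else "Answer:",
    ]
    return "\n".join(lines)


def build_prompt(var_a, var_b, context, examples=()):
    """Zero-shot when `examples` is empty, few-shot otherwise.

    Each example is a (var_a, var_b, context, answer) tuple.
    """
    parts = [INSTRUCTION]
    parts += [format_case(*ex) for ex in examples]
    parts.append(format_case(var_a, var_b, context))
    return "\n\n".join(parts)


def parse_answer(completion):
    """Map a free-text completion to a binary label (1 = causal, 0 = not)."""
    return 1 if completion.strip().lower().startswith("yes") else 0
```

Each edge of a candidate causal graph would be rendered with `build_prompt`, and the model's reply mapped to a binary label with `parse_answer`, so that predictions can be scored against gold labels. A fine-tuned classifier would skip the prompt and predict the label directly from the variable pair and context.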

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Data Quality and Management · Semantic Web and Ontologies