Causality for Large Language Models

Anpeng Wu; Kun Kuang; Minqin Zhu; Yingrong Wang; Yujia Zheng; Kairong Han; Baohong Li; Guangyi Chen; Fei Wu; Kun Zhang

arXiv:2410.15319 · cs.CL · October 22, 2024 · 5 cites

TL;DR

This paper explores integrating causality into large language models to improve their reliability, reduce biases, and enable more genuine understanding, covering training, fine-tuning, and evaluation stages.

Contribution

It provides a comprehensive survey of methods and future directions for embedding causality into LLMs beyond prompt engineering and superficial causal knowledge activation.

Findings

01. Identifies limitations of current prompt-based causal methods
02. Proposes integrating causality into LLM training and fine-tuning
03. Outlines six future research directions for causally-informed LLMs

Abstract

Recent breakthroughs in artificial intelligence have driven a paradigm shift, where large language models (LLMs) with billions or trillions of parameters are trained on vast datasets, achieving unprecedented success across a series of language tasks. However, despite these successes, LLMs still rely on probabilistic modeling, which often captures spurious correlations rooted in linguistic patterns and social stereotypes, rather than the true causal relationships between entities and events. This limitation renders LLMs vulnerable to issues such as demographic biases, social stereotypes, and LLM hallucinations. These challenges highlight the urgent need to integrate causality into LLMs, moving beyond correlation-driven paradigms to build more reliable and ethically aligned AI systems. While many existing surveys and studies focus on utilizing prompt engineering to activate LLMs for…

Code & Models

Repositories

causal-machine-learning-lab/awesome-causal-llm
none · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Focus