TL;DR
This paper presents a causal attribution model to improve interpretability and causal reasoning in large language models through precise fine-tuning, utilizing do-operators for interventional analysis.
Contribution
It introduces a novel causal attribution approach using do-operators and demonstrates its effectiveness in enhancing LLMs' causal reasoning capabilities.
Findings
LLMs' causal discovery effectiveness depends on context and domain knowledge.
The proposed fine-tuned LLM correctly leverages knowledge and numerical data.
Causal attribution scores improve interpretability and reasoning accuracy.
Abstract
This paper introduces a causal attribution model to enhance the interpretability of large language models (LLMs) and improve their causal reasoning abilities via precise fine-tuning. Despite LLMs' proficiency in diverse tasks, their reasoning processes often remain a black box, which restricts targeted enhancement. We propose a novel causal attribution model that utilizes "do-operators" to construct interventional scenarios, allowing us to systematically quantify the contribution of different components in LLMs' causal reasoning process. By assessing the proposed attribution scores through causal discovery tasks across various domains, we demonstrate that LLMs' effectiveness in causal discovery relies heavily on the provided context and domain-specific knowledge, but that LLMs can also utilize numerical data, albeit through limited calculations that reflect correlation rather than causation. This motivates the proposed…
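To make the idea of intervention-based attribution concrete, here is a minimal sketch. It is not the paper's model: `toy_model`, the component names `knowledge` and `data`, and all numeric weights are hypothetical stand-ins chosen for illustration. The sketch scores a component by how much the output changes when that component is forced to "absent", which is one simple way to read a do-operator intervention.

```python
from typing import Callable, Dict

Inputs = Dict[str, bool]


def toy_model(inputs: Inputs) -> float:
    """Hypothetical stand-in for an LLM.

    Returns a confidence that "X causes Y" given which input
    components are present. The weights are invented for illustration.
    """
    score = 0.1                  # baseline with no useful input
    if inputs["knowledge"]:
        score += 0.6             # variable names / domain context
    if inputs["data"]:
        score += 0.2             # numerical observations
    return score


def attribution(model: Callable[[Inputs], float],
                inputs: Inputs,
                component: str) -> float:
    """Output change under the intervention do(component := absent)."""
    intervened = dict(inputs)
    intervened[component] = False   # force the component off
    return model(inputs) - model(intervened)


full_input: Inputs = {"knowledge": True, "data": True}
scores = {name: attribution(toy_model, full_input, name)
          for name in full_input}

for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

In this toy setup the `knowledge` component receives the larger score, mirroring the qualitative finding that context and domain knowledge dominate. With a real LLM, the model call would be replaced by a prompt-and-score step, and the attribution definition would follow the paper's own formulation.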
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)
