Evaluating Large Language Models for Causal Modeling
Houssam Razouk, Leonie Benischke, Georg Niess, Roman Kern

TL;DR
This paper explores how large language models can assist in causal modeling tasks, showing their strengths in knowledge distillation and the importance of domain context for model performance.
Contribution
It introduces two novel tasks for causal knowledge extraction using LLMs and compares their effectiveness with sparse expert models.
Findings
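LLMs such as GPT-4-turbo and Llama3-70b outperform sparse expert models such as Mixtral-8x22b at distilling causal domain knowledge into causal variables.
Sparse expert models such as Mixtral-8x22b are the most effective at identifying interaction entities.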
Performance depends on the domain where entities are generated.
Abstract
In this paper, we consider the process of transforming causal domain knowledge into a representation that aligns more closely with guidelines from causal data science. To this end, we introduce two novel tasks related to distilling causal domain knowledge into causal variables and detecting interaction entities using LLMs. We have determined that contemporary LLMs are helpful tools for conducting causal modeling tasks in collaboration with human experts, as they can provide a wider perspective. Specifically, LLMs such as GPT-4-turbo and Llama3-70b perform better at distilling causal domain knowledge into causal variables than sparse expert models such as Mixtral-8x22b. In contrast, sparse expert models such as Mixtral-8x22b stand out as the most effective at identifying interaction entities. Finally, we highlight the dependency between the domain where the entities are…
Taxonomy
Topics: Topic Modeling
