CausalTAD: Injecting Causal Knowledge into Large Language Models for Tabular Anomaly Detection
Ruiqi Wang, Ruikang Liu, Runyu Chen, Haoxiang Suo, Zhiyi Peng, Zhuo Tang, Changjian Chen

TL;DR
CausalTAD improves tabular anomaly detection by injecting causal knowledge into large language models: it reorders columns to follow their causal relationships and reweights them by their contribution, and it outperforms existing methods on more than 30 datasets.
Contribution
The paper introduces an approach that injects causal knowledge into LLMs for anomaly detection in tabular data, addressing a limitation of previous text-conversion methods, which order columns randomly and ignore the causal relationships between them.
Findings
Outperforms state-of-the-art methods on over 30 datasets
Effective causal reordering improves anomaly detection accuracy
Reweighting strategy enhances the contribution of relevant columns
Abstract
Detecting anomalies in tabular data is critical for many real-world applications, such as credit card fraud detection. With the rapid advancements in large language models (LLMs), state-of-the-art performance in tabular anomaly detection has been achieved by converting tabular data into text and fine-tuning LLMs. However, these methods randomly order columns during conversion, without considering the causal relationships between them, which are crucial for accurately detecting anomalies. In this paper, we present CausalTAD, a method that injects causal knowledge into LLMs for tabular anomaly detection. We first identify the causal relationships between columns and reorder them to align with these causal relationships. This reordering can be modeled as a linear ordering problem. Since each column contributes differently to the causal relationships, we further propose a reweighting…
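To make the reordering step concrete, the following is a minimal sketch of a linear ordering problem applied to column order before text conversion. It is an illustration under stated assumptions, not the authors' implementation: the matrix `weights`, the column names, the brute-force search, and the "column is value" text template are all hypothetical. The abstract does not say how causal strengths are estimated or how the ordering is solved, and the truncated reweighting step is not modeled here.

```python
from itertools import permutations


def linear_ordering(weights):
    """Brute-force linear ordering: find the permutation of columns that
    maximizes the total weight of pairs placed in cause-before-effect order.

    weights[i][j] is an ASSUMED score for "column i causes column j".
    Exhaustive search is only feasible for a handful of columns.
    """
    n = len(weights)
    best_order, best_score = None, float("-inf")
    for perm in permutations(range(n)):
        # Sum the weights of every ordered pair (earlier column, later column).
        score = sum(
            weights[perm[a]][perm[b]]
            for a in range(n)
            for b in range(a + 1, n)
        )
        if score > best_score:
            best_order, best_score = list(perm), score
    return best_order, best_score


def serialize_row(row, columns, order):
    """Convert one table row to text, emitting columns in the given order.
    The "<column> is <value>" template is an assumption for illustration."""
    return ", ".join(f"{columns[i]} is {row[columns[i]]}" for i in order)


# Hypothetical columns and causal-strength scores (not taken from the paper).
columns = ["amount", "country", "merchant"]
weights = [
    [0.0, 0.1, 0.2],  # amount   -> (amount, country, merchant)
    [0.7, 0.0, 0.6],  # country  -> ...
    [0.5, 0.1, 0.0],  # merchant -> ...
]

order, score = linear_ordering(weights)
row = {"amount": 120.5, "country": "DE", "merchant": "grocery"}
text = serialize_row(row, columns, order)
print(order, text)
```

With these made-up scores, the columns that act mostly as causes are placed before the columns they influence, so a left-to-right language model reads likely causes first. For realistic column counts the exhaustive search would have to be replaced by an exact integer-programming solver or a heuristic, since the linear ordering problem is NP-hard.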
Taxonomy
Topics: Imbalanced Data Classification Techniques · Anomaly Detection Techniques and Applications · Topic Modeling
