Towards Trustworthy and Aligned Machine Learning: A Data-centric Survey with Causality Perspectives
Haoyang Liu, Maheep Chaudhary, Haohan Wang

TL;DR
This survey reviews recent advances in trustworthy machine learning from a data-centric perspective, unifying various methods through causal frameworks and exploring their application to large pretrained models.
Contribution
It introduces a unified language based on Pearl's causal hierarchy to connect diverse trustworthy ML methods and extends this understanding to large pretrained models.
Findings
Pearl's causal hierarchy provides a unifying framework for trustworthy ML methods.
Connections between ERM and techniques like fine-tuning and prompting are established.
A cohesive understanding of robustness, interpretability, and fairness methods is developed.
Abstract
The trustworthiness of machine learning has emerged as a critical topic in the field, encompassing various applications and research areas such as robustness, security, interpretability, and fairness. The last decade saw the development of numerous methods addressing these challenges. In this survey, we systematically review these advancements from a data-centric perspective, highlighting the shortcomings of traditional empirical risk minimization (ERM) training in handling challenges posed by the data. Interestingly, we observe a convergence of these methods, even though they were developed independently across trustworthy machine learning subfields. Pearl's hierarchy of causality offers a unifying framework for these techniques. Accordingly, this survey presents the background of trustworthy machine learning development using a unified set of concepts, connects this language to Pearl's…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI
