Towards Modelling and Verification of Social Explainable AI
Damian Kurpiewski, Wojciech Jamroga, Teofil Sidoruk

TL;DR
This paper introduces a formal approach to model and verify Social Explainable AI (SAI), focusing on its resilience against malicious activities using strategic ability logics and the STV model checker.
Contribution
It pioneers the application of multi-agent strategic ability models and formal verification techniques to assess SAI's robustness against attacks.
Findings
First formal modeling of SAI environments
Verification of resistance to compromised AI modules
Use of STV model checker for strategic ability analysis
Abstract
Social Explainable AI (SAI) is a new direction in artificial intelligence that emphasises decentralisation, transparency, social context, and focus on the human users. SAI research is still at an early stage. Consequently, it concentrates on delivering the intended functionalities, but largely ignores the possibility of unwelcome behaviours due to malicious or erroneous activity. We propose that, in order to capture the breadth of relevant aspects, one can use models and logics of strategic ability that have been developed in multi-agent systems. Using the STV model checker, we take the first step towards the formal modelling and verification of SAI environments, in particular of their resistance to various types of attacks by compromised AI modules.
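To make the idea of verifying strategic ability concrete, the following is a minimal sketch, not the paper's model and not STV's algorithm. It assumes a toy two-agent game with perfect information, whereas strategic ability logics are often studied under imperfect information. It checks a property of the form "the honest agent has a strategy to keep the system safe forever, whatever the compromised module does" by a standard greatest-fixpoint computation. The state names, action names, and the `step` function are all hypothetical and chosen only for illustration.

```python
from itertools import product

def controllable_pre(states, actions, step, coalition, target):
    """States where the coalition has a joint action that forces the
    next state into `target`, whatever the other agents choose."""
    others = [a for a in actions if a not in coalition]
    result = set()
    for s in states:
        for mine in product(*(actions[a] for a in coalition)):
            joint_mine = dict(zip(coalition, mine))
            if all(
                step(s, {**joint_mine, **dict(zip(others, theirs))}) in target
                for theirs in product(*(actions[a] for a in others))
            ):
                result.add(s)
                break
    return result

def can_enforce_always(states, actions, step, coalition, safe):
    """Greatest fixpoint: states from which the coalition can keep the
    system inside `safe` forever (perfect-information assumption)."""
    current = set(safe)
    while True:
        nxt = current & controllable_pre(states, actions, step, coalition, current)
        if nxt == current:
            return current
        current = nxt

# Hypothetical toy model: an honest node and a compromised AI module.
STATES = {"ok", "suspect", "corrupted"}
SAFE = {"ok", "suspect"}

def step(state, joint):
    if state == "corrupted":
        return "corrupted"          # corruption is permanent in this toy model
    if joint["honest"] == "accept":
        # Accepting a poisoned update corrupts the system.
        return "corrupted" if joint["attacker"] == "poison" else state
    # The honest node audits: poisoning is detected, not absorbed.
    return "suspect" if joint["attacker"] == "poison" else "ok"

ACTIONS = {"honest": ["accept", "audit"], "attacker": ["behave", "poison"]}
ACTIONS_NO_AUDIT = {"honest": ["accept"], "attacker": ["behave", "poison"]}

with_audit = can_enforce_always(STATES, ACTIONS, step, ["honest"], SAFE)
without_audit = can_enforce_always(STATES, ACTIONS_NO_AUDIT, step, ["honest"], SAFE)
```

In this sketch the honest agent can enforce safety from every safe state when auditing is available, and from no state once the audit action is removed. This kind of contrast is what a resistance-to-attack property is meant to expose.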
Taxonomy
Topics: Multi-Agent Systems and Negotiation · Ethics and Social Impacts of AI · Access Control and Trust
