Does Symbolic Knowledge Prevent Adversarial Fooling?
Stefano Teso

TL;DR
This paper examines how injecting symbolic knowledge into neural models, while often improving performance, can also propagate the negative effects of adversarial examples through the constraints it imposes.
Contribution
It highlights an unintended consequence of symbolic constraints in neural models: they can propagate the negative effects of adversarial examples, which challenges common assumptions about the robustness of constrained models.
Findings
Symbolic constraints can propagate adversarial effects.
Injecting symbolic knowledge does not always prevent adversarial fooling.
Constraints may have unintended negative consequences on model robustness.
Abstract
Arguments in favor of injecting symbolic knowledge into neural architectures abound. When done right, constraining a sub-symbolic model can substantially improve its performance and sample complexity and prevent it from predicting invalid configurations. Focusing on deep probabilistic (logical) graphical models -- i.e., constrained joint distributions whose parameters are determined (in part) by neural nets based on low-level inputs -- we draw attention to an elementary but unintended consequence of symbolic knowledge: that the resulting constraints can propagate the negative effects of adversarial examples.
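The propagation effect described in the abstract can be illustrated with a toy example. The sketch below is not taken from the paper: it assumes two binary labels, hand-picked logits standing in for neural network outputs, and a hypothetical equality constraint (y1 = y2) playing the role of the symbolic knowledge. All names (`joint`, `map_state`, `no_constraint`, `equal`) are illustrative. An "attack" changes only the first logit; without the constraint only the first label flips, whereas with the constraint the second label flips as well, even though its own logit is untouched.

```python
import itertools
import math


def joint(logits, constraint):
    """Constrained joint over binary labels:
    p(y) is proportional to exp(sum_i logits[i] * y_i) if constraint(y) holds, else 0."""
    weights = {}
    for y in itertools.product([0, 1], repeat=len(logits)):
        if constraint(y):
            weights[y] = math.exp(sum(l * yi for l, yi in zip(logits, y)))
    z = sum(weights.values())
    return {y: w / z for y, w in weights.items()}


def map_state(dist):
    """Most probable joint configuration."""
    return max(dist, key=dist.get)


def no_constraint(y):
    # Unconstrained baseline: every configuration is allowed.
    return True


def equal(y):
    # Hypothetical symbolic knowledge: both labels must agree.
    return y[0] == y[1]


# Assumed logits; in a real model these would come from neural nets.
clean_logits = [2.0, 0.5]
# Assumed adversarial effect: only the first logit is perturbed.
attacked_logits = [-3.0, 0.5]

results = {}
for name, constraint in [("unconstrained", no_constraint), ("constrained", equal)]:
    before = map_state(joint(clean_logits, constraint))
    after = map_state(joint(attacked_logits, constraint))
    results[name] = (before, after)
    print(f"{name}: clean MAP = {before}, attacked MAP = {after}")
```

In this toy setting the constraint is what carries the perturbation from the first label to the second. Whether and how strongly this happens in a real model depends on the constraints and the learned parameters.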
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Bayesian Modeling and Causal Inference
