Topological Signatures of Adversaries in Multimodal Alignments
Minh Vu, Geigh Zollicoffer, Huy Mai, Ben Nebgen, Boian Alexandrov, Manish Bhattarai

TL;DR
This paper explores how adversarial attacks disrupt the topological alignment between image and text embeddings in multimodal models, proposing novel topological losses and detection methods to improve robustness.
Contribution
It introduces new topological-contrastive losses and a detection algorithm leveraging persistent homology to identify adversarial attacks in multimodal systems.
Findings
Across a wide range of attacks on image-text alignments, the proposed topological losses change monotonically.
Proposed methods improve detection of adversarial samples.
Topological analysis enhances robustness in multimodal alignment.
Abstract
Multimodal Machine Learning systems, particularly those aligning text and image data like CLIP/BLIP models, have become increasingly prevalent, yet remain susceptible to adversarial attacks. While substantial research has addressed adversarial robustness in unimodal contexts, defense strategies for multimodal systems are underexplored. This work investigates the topological signatures that arise between image and text embeddings and shows how adversarial attacks disrupt their alignment, introducing distinctive signatures. We specifically leverage persistent homology and introduce two novel Topological-Contrastive losses based on Total Persistence and Multi-scale kernel methods to analyze the topological signatures introduced by adversarial perturbations. We observe a pattern of monotonic changes in the proposed topological losses emerging in a wide range of attacks on image-text…
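To make the notion of Total Persistence concrete, here is a minimal sketch restricted to dimension-0 persistent homology (connected components) of a Vietoris-Rips filtration. It is an illustration of the general concept, not the paper's loss. It assumes Euclidean distances and the convention that an edge enters the filtration at a scale equal to the pairwise distance. Under those assumptions every finite H0 bar is born at 0 and dies at the length of a minimum-spanning-tree edge, so the sum can be computed with Prim's algorithm. The function name `h0_total_persistence` and the exponent parameter `p` are hypothetical.

```python
import numpy as np


def h0_total_persistence(points, p=1.0):
    """Sum of (death - birth)**p over the finite H0 bars of a Rips filtration.

    Assumption: edges enter the filtration at the pairwise Euclidean distance,
    so finite H0 bars are born at 0 and die at minimum-spanning-tree edge lengths.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    if n < 2:
        return 0.0

    # Dense pairwise Euclidean distance matrix.
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)

    # Prim's algorithm: grow the tree from point 0, one cheapest edge at a time.
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = dist[0].copy()  # cheapest known connection of each point to the tree
    total = 0.0
    for _ in range(n - 1):
        candidates = np.where(in_tree, np.inf, best)
        j = int(np.argmin(candidates))
        total += candidates[j] ** p  # one H0 bar dies at this edge length
        in_tree[j] = True
        best = np.minimum(best, dist[j])
    return float(total)


if __name__ == "__main__":
    cloud = [[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]]
    print(h0_total_persistence(cloud))
    print(h0_total_persistence(cloud, p=2.0))
```

One could apply such a function to a point cloud built from image and text embeddings and compare the value for clean and perturbed inputs. How the paper constructs its point clouds, and which homology dimensions its losses use, is not stated in this summary.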
Taxonomy
Topics: Literature, Language, and Rhetoric Studies
