SLANT: Spurious Logo ANalysis Toolkit

Maan Qraitem; Piotr Teterwak; Kate Saenko; Bryan A. Plummer

arXiv:2406.01449·cs.CV·June 4, 2024

TL;DR

SLANT is a toolkit that identifies and analyzes spurious correlations between logos and model predictions in vision-language models, revealing vulnerabilities and suggesting mitigation strategies.

Contribution

We introduce SLANT, a semi-automatic toolkit for mining logos that cause spurious model correlations, highlighting new risks and defenses in vision-language models.

Findings

01

Logos can cause models to misclassify content as harmless or harmful.

02

Certain logos are correlated with negative adjectives and concepts.

03

Logos can be exploited as simple attacks against foundation models.

Abstract

Online content is filled with logos, from ads and social media posts to website branding and product placements. Consequently, these logos are prevalent in the extensive web-scraped datasets used to pretrain Vision-Language Models, which are used for a wide array of tasks (content moderation, object classification). While these models have been shown to learn harmful correlations in various tasks, whether these correlations include logos remains understudied. Understanding this is especially important because logos are often used by public-facing entities like brands and government agencies. To that end, we develop SLANT: A Spurious Logo ANalysis Toolkit. Our key finding is that some logos indeed lead to spurious incorrect predictions. For example, adding the Adidas logo to a photo of a person causes a model to classify the person as greedy. SLANT contains a semi-automatic mechanism for…
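The kind of probe the abstract describes (add a logo to an image and check whether the prediction changes) can be illustrated with a toy sketch. This is not SLANT's implementation: the names `paste_logo`, `flip_rate`, and `toy_predict` are hypothetical, images are plain nested lists of grayscale values, and the classifier is a stand-in with a deliberately built-in shortcut rather than a real Vision-Language Model.

```python
from typing import Callable, List, Sequence

# Toy stand-in for an image: rows of grayscale pixel intensities in [0, 1].
Image = List[List[float]]


def paste_logo(image: Image, logo: Image, top: int, left: int) -> Image:
    """Return a copy of `image` with `logo` pasted at (top, left)."""
    out = [row[:] for row in image]  # copy rows so the input is not mutated
    for r, logo_row in enumerate(logo):
        for c, value in enumerate(logo_row):
            out[top + r][left + c] = value
    return out


def flip_rate(
    predict: Callable[[Image], str],
    images: Sequence[Image],
    logo: Image,
    target_label: str,
    top: int = 0,
    left: int = 0,
) -> float:
    """Fraction of images that receive `target_label` only after the logo is added."""
    # Only images not already predicted as the target label can "flip".
    eligible = [img for img in images if predict(img) != target_label]
    if not eligible:
        return 0.0
    flipped = sum(
        predict(paste_logo(img, logo, top, left)) == target_label
        for img in eligible
    )
    return flipped / len(eligible)


def toy_predict(image: Image) -> str:
    """Hypothetical classifier with a shortcut: a bright top-left corner means 'harmless'."""
    corner = [image[r][c] for r in range(2) for c in range(2)]
    return "harmless" if sum(corner) / len(corner) > 0.5 else "harmful"


# Three dark 4x4 images and a bright 2x2 "logo" (assumed data, for illustration only).
images = [[[0.0] * 4 for _ in range(4)] for _ in range(3)]
logo = [[1.0, 1.0], [1.0, 1.0]]

rate_on_shortcut = flip_rate(toy_predict, images, logo, "harmless")
rate_off_shortcut = flip_rate(toy_predict, images, logo, "harmless", top=2, left=2)
```

With a real model, `predict` would wrap the model's classification call. A high flip rate for a given logo would suggest, though not prove, that the model relies on the logo rather than on the image content.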

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Adversarial Robustness in Machine Learning · Misinformation and Its Impacts