On Evaluating the Adversarial Robustness of Foundation Models for Multimodal Entity Linking

Fang Wang; Yongjie Wang; Zonghao Yang; Minghao Hu; Xiaoying Bai

arXiv:2508.15481 · cs.IR · August 22, 2025

TL;DR

This paper evaluates the robustness of multimodal entity linking (MEL) models against visual adversarial attacks, shows that current models are vulnerable to them, and proposes LLM-RetLink, a method that improves linking accuracy, particularly under attack.

Contribution

The paper provides the first systematic assessment of MEL models' robustness to visual adversarial attacks and introduces LLM-RetLink, a new approach that makes MEL more resistant to adversarial perturbations.

Findings

01. Current MEL models lack robustness against visual perturbations.

02. Contextual semantic information can mitigate adversarial effects.

03. LLM-RetLink improves MEL accuracy by up to 35.7%, especially under attack.

Abstract

The explosive growth of multimodal data has driven the rapid development of multimodal entity linking (MEL) models. However, existing studies have not systematically investigated the impact of visual adversarial attacks on MEL models. We conduct the first comprehensive evaluation of the robustness of mainstream MEL models under different adversarial attack scenarios, covering two core tasks: Image-to-Text (I2T) and Image+Text-to-Text (IT2T). Experimental results show that current MEL models generally lack sufficient robustness against visual perturbations. Interestingly, contextual semantic information in the input can partially mitigate the impact of adversarial perturbations. Based on this insight, we propose LLM and Retrieval-Augmented Entity Linking (LLM-RetLink), which significantly improves the model's anti-interference ability through a two-stage process: first, extracting initial…
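To make "visual adversarial attack" concrete, the following is a minimal sketch of a sign-gradient (FGSM-style) perturbation against a toy image-to-text linker. It is an illustration only: this excerpt does not say which attacks the paper uses, and the linear scorer, the hand-picked vectors, and the names `link` and `fgsm_perturb` are all assumptions introduced here, not the authors' models or code.

```python
# Toy I2T setup (hypothetical): an "image" is a feature vector in [0, 1],
# each candidate entity is an embedding, and the linker picks the entity
# whose embedding has the largest dot product with the image.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def link(image, entities):
    """Return the index of the highest-scoring candidate entity."""
    scores = [dot(e, image) for e in entities]
    return max(range(len(scores)), key=scores.__getitem__)

def sign(x):
    return (x > 0) - (x < 0)

def fgsm_perturb(image, entities, true_idx, eps):
    """One sign-gradient step that favors the strongest rival entity.

    For a linear scorer, the gradient of (rival score - true score)
    with respect to the image is simply rival_embedding - true_embedding.
    Each feature moves by at most eps (an L-infinity bound) and is
    clipped back to the valid range [0, 1].
    """
    scores = [dot(e, image) for e in entities]
    rivals = [k for k in range(len(entities)) if k != true_idx]
    rival = max(rivals, key=scores.__getitem__)
    grad = [r - t for r, t in zip(entities[rival], entities[true_idx])]
    return [min(1.0, max(0.0, x + eps * sign(g))) for x, g in zip(image, grad)]

# Hand-picked illustrative values, not data from the paper.
entities = [[1.0, 0.2, 0.1, 0.0], [0.8, 0.4, 0.3, 0.2]]
image = [0.9, 0.3, 0.2, 0.1]
eps = 0.1

clean_pred = link(image, entities)
adv_image = fgsm_perturb(image, entities, true_idx=0, eps=eps)
adv_pred = link(adv_image, entities)
print("clean prediction:", clean_pred)
print("adversarial prediction:", adv_pred)
```

In this toy setup a change of at most 0.1 per feature is enough to move the prediction from entity 0 to entity 1, which is the kind of fragility the evaluation above measures on real MEL models. A real attack would compute gradients through a deep vision encoder rather than a hand-written linear scorer.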

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Advanced Graph Neural Networks