AUV-Fusion: Cross-Modal Adversarial Fusion of User Interactions and Visual Perturbations Against VARS

Hai Ling; Tianchi Wang; Xiaohao Liu; Zhulin Tao; Lifang Yang; Xianglin Huang

arXiv:2507.22880 · cs.IR · July 31, 2025

TL;DR

AUV-Fusion introduces a novel cross-modal adversarial attack framework that combines user interactions and visual perturbations to effectively compromise visual-aware recommender systems while maintaining stealth.

Contribution

The paper presents an attack method that combines high-order user preference modeling with cross-modal perturbation generation, improving both attack effectiveness and stealth without relying on fake user profiles.

Findings

01. Enhances exposure of target cold-start items in VARS.
02. Maintains high stealth under rigorous detection.
03. Effective across diverse VARS architectures.

Abstract

Modern Visual-Aware Recommender Systems (VARS) integrate user interaction data with visual features to deliver personalized recommendations with high precision. However, their robustness against adversarial attacks remains largely underexplored, posing significant risks to system reliability and security. Existing attack strategies suffer from notable limitations: shilling attacks are costly and detectable, and visual-only perturbations often fail to align with user preferences. To address these challenges, we propose AUV-Fusion, a cross-modal adversarial attack framework that adopts high-order user preference modeling and cross-modal adversary generation. Specifically, we obtain robust user embeddings through multi-hop user-item interactions and transform them via an MLP into semantically aligned perturbations. These perturbations are injected into the latent space of a…
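The two steps the abstract names, multi-hop aggregation of user-item interactions followed by an MLP that maps user embeddings to perturbations, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the symmetric degree normalization, the layer averaging, the ReLU/tanh MLP, the `epsilon` bound, and every function and variable name below are assumptions chosen for illustration, and the weights are random rather than trained. The sketch stops at producing the perturbation.

```python
import numpy as np


def multi_hop_user_embeddings(interactions, user_emb, item_emb, num_hops=2):
    """Average user embeddings over 0..num_hops rounds of graph propagation.

    Assumption: degree-normalized message passing on the bipartite
    user-item graph; the paper's exact aggregation may differ.
    """
    # Degree-normalize the (users x items) interaction matrix.
    deg_u = np.maximum(interactions.sum(axis=1, keepdims=True), 1.0)
    deg_i = np.maximum(interactions.sum(axis=0, keepdims=True), 1.0)
    norm = interactions / np.sqrt(deg_u) / np.sqrt(deg_i)

    layers = [user_emb]
    u, i = user_emb, item_emb
    for _ in range(num_hops):
        # Users aggregate from items and items from users, simultaneously.
        u, i = norm @ i, norm.T @ u
        layers.append(u)
    return np.mean(layers, axis=0)


def mlp_perturbation(user_vecs, w1, b1, w2, b2, epsilon=0.05):
    """Map user embeddings to perturbations bounded in [-epsilon, epsilon].

    Assumption: a two-layer MLP with ReLU, then tanh scaling to keep the
    perturbation small; the bound is a stand-in for a stealth constraint.
    """
    hidden = np.maximum(user_vecs @ w1 + b1, 0.0)
    return epsilon * np.tanh(hidden @ w2 + b2)


# Toy data: all sizes and values are hypothetical.
rng = np.random.default_rng(0)
n_users, n_items, dim, hidden_dim, latent_dim = 4, 5, 8, 32, 16
interactions = (rng.random((n_users, n_items)) < 0.5).astype(float)
user_emb = rng.normal(size=(n_users, dim))
item_emb = rng.normal(size=(n_items, dim))

# Untrained MLP weights, for shape illustration only.
w1 = rng.normal(size=(dim, hidden_dim))
b1 = np.zeros(hidden_dim)
w2 = rng.normal(size=(hidden_dim, latent_dim))
b2 = np.zeros(latent_dim)

users = multi_hop_user_embeddings(interactions, user_emb, item_emb)
delta = mlp_perturbation(users, w1, b1, w2, b2, epsilon=0.05)
```

Here `users` has one row per user and `delta` has one bounded perturbation vector per user. In a real attack the MLP weights would be learned against a surrogate objective rather than sampled at random.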
