Exposing and Defending the Achilles' Heel of Video Mixture-of-Experts

Songping Wang; Qinglong Liu; Yueming Lyu; Ning Li; Ziwen He; Caifeng Shan

arXiv:2602.01369·cs.CV·February 3, 2026

Open Access · 3 Reviews

TL;DR

This paper investigates the adversarial vulnerabilities of video Mixture-of-Experts models at the component level, proposing targeted attacks and defenses that improve robustness and reduce inference costs.

Contribution

It introduces Temporal Lipschitz-Guided Attacks and joint adversarial training to expose and defend against component-wise and collaborative weaknesses in video MoE models.

Findings

01. Joint attacks significantly increase adversarial effects.

02. Proposed defenses improve robustness across datasets.

03. Inference cost is reduced by over 60%.

Abstract

Mixture-of-Experts (MoE) has demonstrated strong performance in video understanding tasks, yet its adversarial robustness remains underexplored. Existing attack methods often treat MoE as a unified architecture, overlooking the independent and collaborative weaknesses of key components such as routers and expert modules. To fill this gap, we propose Temporal Lipschitz-Guided Attacks (TLGA) to thoroughly investigate component-level vulnerabilities in video MoE models. We first design attacks on the router, revealing its independent weaknesses. Building on this, we introduce Joint Temporal Lipschitz-Guided Attacks (J-TLGA), which collaboratively perturb both routers and experts. This joint attack significantly amplifies adversarial effects and exposes the Achilles' Heel (collaborative weaknesses) of the MoE architecture. Based on these insights, we further propose Joint Temporal Lipschitz…
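To make the router weakness described in the abstract concrete, here is a minimal sketch of a top-1 router and a single sign-style perturbation step that flips its expert choice. It is a toy illustration, not the paper's TLGA or J-TLGA: it has no temporal component and no Lipschitz term, and the router weights, token values, and function names (`route_top_k`, `perturb_toward`) are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(token, router_weights, k=1):
    """Return (expert indices, gate values) for the k highest-scoring experts.

    The router here is a plain linear layer followed by softmax, which is an
    assumption made for illustration.
    """
    logits = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    probs = softmax(logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return top, [probs[i] for i in top]


def sign(x):
    return (x > 0) - (x < 0)


def perturb_toward(token, router_weights, current, rival, epsilon):
    """One sign-gradient step that raises the rival expert's logit margin.

    For a linear router, the gradient of (logit_rival - logit_current) with
    respect to the token is simply the difference of the two weight rows.
    """
    grad = [wr - wc for wr, wc in zip(router_weights[rival], router_weights[current])]
    return [x + epsilon * sign(g) for x, g in zip(token, grad)]


# Hypothetical router with 3 experts over a 2-feature token.
ROUTER_WEIGHTS = [
    [1.0, 0.0],
    [0.0, 1.0],
    [-1.0, -1.0],
]
clean_token = [0.52, 0.48]

clean_experts, _ = route_top_k(clean_token, ROUTER_WEIGHTS)
adv_token = perturb_toward(clean_token, ROUTER_WEIGHTS, current=0, rival=1, epsilon=0.05)
adv_experts, _ = route_top_k(adv_token, ROUTER_WEIGHTS)

print("clean routing:", clean_experts)
print("perturbed routing:", adv_experts)
```

In this toy setup the clean token sits close to the decision boundary between two experts, so a perturbation of 0.05 per feature is enough to send it to a different expert. This illustrates why routing decisions can be a distinct attack surface from the experts themselves.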

Peer Reviews

Decision · ICLR 2026 Poster

Reviewer 01 · Rating 4 · Confidence 4

Strengths

1. Exploring the adversarial robustness of the video MoE structure is beneficial for the secure deployment of the model.
2. This paper proposes an adversarial training method to improve adversarial robustness.

Weaknesses

1. My biggest concern is the generalization ability of the experimental results. The number of models is limited, and results for multiple models are not presented in the main body. Most importantly, the clean accuracy in Table 2 is extremely low, lower than the performance of common models on UCF-101, meaning the models were not well trained. Attacking a model that is prone to misclassification is easier and will lead to problematic conclusions.
2. This paper is unclear in its expressio

Reviewer 02 · Rating 6 · Confidence 3

Strengths

- The structure of this paper is clear. The paper is well-written and easy to follow.
- The work analyzes component-level vulnerabilities in video MoE architectures. The idea of differentiating between router and expert robustness is insightful and practically relevant.
- The analysis from attack to defense provides a unified view of robustness analysis. The proposed method provides a comprehensive understanding of the studied problem.
- The experiments are comprehensive, covering multiple backb

Weaknesses

- The new attack (TLGA) is basically a standard PGD-style adversarial attack with two small changes: a time-based adjustment of the step size, and an extra term related to the Lipschitz constant. What seems novel is the focus on attacking the router and experts separately in MoE models.
- It is not clear how the loss, e.g., equation (7), actually constrains the Lipschitz constant or enforces smoothness. There’s no clear derivation or estimation of the Lipschitz bound 𝐾.
- The three-stage adver

Reviewer 03 · Rating 4 · Confidence 3

Strengths

* Addresses an important and underexplored problem of adversarial robustness in video MoE models.
* Provides novel component-level analysis (router and expert) that exposes previously uncharacterized weaknesses.
* Presents strong empirical results across datasets and models with significant robustness gains and reduced inference cost.

Weaknesses

* Unclear threat model and attacker assumptions: The paper does not explicitly define the threat model used in TLGA and J-TLGA. It is unclear whether the attacks assume a white-box, gray-box, or black-box setting, nor what information the adversary possesses (e.g., access to router parameters, gradients, or architecture details). Since the practicality and interpretability of robustness claims depend heavily on these assumptions, the authors should clearly specify the attacker’s knowledge scope

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis · Human Pose and Action Recognition