Elytra: A Flexible Framework for Securing Large Vision Systems

Richard E. Neddo; Emmanuel Atindama; Zander W. Blasingame; Chen Liu

arXiv:2506.00661 · cs.CV · March 10, 2026

TL;DR

Elytra is a flexible, efficient framework that uses low-rank adaptation to dynamically patch large vision models, improving their classification accuracy on adversarial examples by up to 24.09% in autonomous-system settings.

Contribution

The paper presents Elytra, a framework that uses low-rank adaptation to train lightweight security patches, which can be applied dynamically to improve the adversarial robustness of large vision models.

Findings

01. Improves classification accuracy by up to 24.09% against adversarial examples.

02. Enables dynamic patching of large pre-trained vision models.

03. Offers a computationally efficient alternative to existing hardening methods.

Abstract

Adversarial attacks have emerged as a critical threat to autonomous driving systems. These attacks exploit the underlying neural network, allowing small, nearly imperceptible perturbations to alter the behavior of such systems in potentially malicious ways, e.g., causing a traffic sign classification network to misclassify a stop sign as a speed limit sign. Prior work on hardening such systems against adversarial attacks has focused on fine-tuning the system or adding pre-processing steps to the input pipeline. Such solutions generalize poorly, require knowledge of the adversarial attacks during training, or are computationally expensive. Instead, we propose a framework called ELYTRA that draws on insights from parameter-efficient fine-tuning and uses low-rank adaptation (LoRA) to train a lightweight security patch (or patches), enabling us to dynamically patch large…
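To make the idea of a LoRA-based security patch concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the names `LoRAPatch` and `PatchableLinear`, the toy dimensions, and the hand-picked weights are all assumptions made for illustration. It shows only the general LoRA mechanism, in which a frozen weight matrix W is augmented with a low-rank update (alpha / r) * B A that can be attached or detached without modifying W.

```python
# Illustrative sketch only. LoRAPatch and PatchableLinear are hypothetical
# names, not identifiers from the ELYTRA paper or any library.
from dataclasses import dataclass
from typing import List, Optional

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m (rows x cols) by vector v (cols)."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


@dataclass
class LoRAPatch:
    """Low-rank update: delta_W = (alpha / r) * B @ A."""
    A: Matrix      # shape r x d_in
    B: Matrix      # shape d_out x r
    alpha: float   # scaling hyperparameter

    @property
    def rank(self) -> int:
        return len(self.A)

    def delta(self) -> Matrix:
        """Materialize the full d_out x d_in update (only for inspection)."""
        scale = self.alpha / self.rank
        cols = list(zip(*self.A))  # columns of A
        return [[scale * sum(b * a for b, a in zip(row, col)) for col in cols]
                for row in self.B]


class PatchableLinear:
    """A frozen linear layer that accepts an optional, removable patch."""

    def __init__(self, weight: Matrix) -> None:
        self.weight = weight               # pre-trained weights, never edited
        self.patch: Optional[LoRAPatch] = None

    def attach(self, patch: LoRAPatch) -> None:
        self.patch = patch

    def detach(self) -> None:
        self.patch = None

    def forward(self, x: Vector) -> Vector:
        y = matvec(self.weight, x)
        if self.patch is not None:
            # Low-rank path: compute B (A x) without forming the full matrix.
            hidden = matvec(self.patch.A, x)
            update = matvec(self.patch.B, hidden)
            scale = self.patch.alpha / self.patch.rank
            y = [base + scale * u for base, u in zip(y, update)]
        return y


# Toy usage: a 2 x 3 layer with a rank-1 patch (all values are made up).
layer = PatchableLinear([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
patch = LoRAPatch(A=[[1.0, 1.0, 1.0]], B=[[0.5], [-0.5]], alpha=2.0)
x = [1.0, 2.0, 3.0]

base_out = layer.forward(x)       # unpatched output
layer.attach(patch)
patched_out = layer.forward(x)    # output with the patch applied
layer.detach()
restored_out = layer.forward(x)   # identical to base_out again
```

Because the base weights are never modified, detaching the patch restores the original behavior exactly. A rank-r patch stores r * (d_in + d_out) values instead of d_in * d_out, which is the usual reason LoRA updates are described as lightweight.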

Taxonomy

Topics: CCD and CMOS Imaging Sensors

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings