Towards Resilient and Efficient LLMs: A Comparative Study of Efficiency, Performance, and Adversarial Robustness

Xiaojing Fan; Chunliang Tao

arXiv:2408.04585·cs.CL·September 17, 2024·3 cites

Open Access

TL;DR

This study compares three large language models of varying complexity to characterize the trade-offs among efficiency, accuracy, and robustness to adversarial attacks, finding that simplified architectures can offer a good balance for resource-constrained applications that need resilience.

Contribution

The paper introduces a framework for evaluating the trade-offs among efficiency, performance, and adversarial robustness in LLMs, and reports extensive experiments on three models: Transformer++, the Gated Linear Attention (GLA) Transformer, and MatMul-Free LM.

Findings

01. Simplified models achieve higher efficiency and comparable robustness.

02. The Gated Linear Attention (GLA) Transformer and MatMul-Free LM outperform Transformer++ in robustness on the adversarial AdvGLUE dataset.

03. Trade-offs exist between model complexity, accuracy, and adversarial resilience.

Abstract

With the increasing demand for practical applications of Large Language Models (LLMs), many attention-efficient models have been developed to balance performance and computational cost. However, the adversarial robustness of these models remains under-explored. In this work, we design a framework to investigate the trade-off between efficiency, performance, and adversarial robustness of LLMs and conduct extensive experiments on three prominent models with varying levels of complexity and efficiency -- Transformer++, Gated Linear Attention (GLA) Transformer, and MatMul-Free LM -- utilizing the GLUE and AdvGLUE datasets. The AdvGLUE dataset extends the GLUE dataset with adversarial samples designed to challenge model robustness. Our results show that while the GLA Transformer and MatMul-Free LM achieve slightly lower accuracy on GLUE tasks, they demonstrate higher efficiency and either…
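The clean-versus-adversarial comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' framework: the helper names (`accuracy`, `robustness_report`), the toy classifier, and the hand-written examples are all hypothetical, and reporting the accuracy drop between a clean set and an adversarial set is an assumed summary metric chosen for illustration. In practice the two lists would be replaced by matching GLUE and AdvGLUE task splits, and `toy_predict` by a real model.

```python
from typing import Callable, Dict, Sequence, Tuple

Example = Tuple[str, int]  # (input text, gold label)


def accuracy(predict: Callable[[str], int], data: Sequence[Example]) -> float:
    """Fraction of examples whose predicted label matches the gold label."""
    if not data:
        raise ValueError("data must be non-empty")
    correct = sum(1 for text, label in data if predict(text) == label)
    return correct / len(data)


def robustness_report(
    predict: Callable[[str], int],
    clean: Sequence[Example],
    adversarial: Sequence[Example],
) -> Dict[str, float]:
    """Compare accuracy on a clean set and on an adversarial set.

    The accuracy drop is an assumed, illustrative robustness summary.
    """
    clean_acc = accuracy(predict, clean)
    adv_acc = accuracy(predict, adversarial)
    return {
        "clean_accuracy": clean_acc,
        "adversarial_accuracy": adv_acc,
        "accuracy_drop": clean_acc - adv_acc,
    }


def toy_predict(text: str) -> int:
    """Hypothetical stand-in for a model: positive iff 'good' appears."""
    return 1 if "good" in text.lower() else 0


# Hand-written toy data standing in for a clean task and its perturbed version.
clean = [("a good film", 1), ("a dull film", 0), ("good acting", 1), ("bad plot", 0)]
adversarial = [("a g00d film", 1), ("a dull film", 0), ("goood acting", 1), ("bad plot", 0)]

report = robustness_report(toy_predict, clean, adversarial)
print(report)
```

Running the same report for each model, alongside a measure of compute or memory cost, would give one row per model for the kind of three-way comparison the abstract describes.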

Taxonomy

Topics: Security and Verification in Computing · Fault Detection and Control Systems · Adversarial Robustness in Machine Learning

Methods: Attention Is All You Need · Linear Layer · Residual Connection · Multi-Head Attention · Position-Wise Feed-Forward Layer · Adam · Byte Pair Encoding · Softmax · Absolute Position Encodings · Dense Connections