Securing Transformer-based AI Execution via Unified TEEs and Crypto-protected Accelerators

Jiaqi Xue; Yifei Zhao; Mengxin Zheng; Fan Yao; Yan Solihin; Qian Lou

arXiv:2507.03278 · cs.CR · July 15, 2025

Open Access

TL;DR

TwinShield is a novel framework that enhances the security and efficiency of Transformer model inference by leveraging heterogeneous TEEs and crypto-protected accelerators, significantly reducing inference time while safeguarding data and models.

Contribution

The paper introduces TwinShield, a new system that securely offloads most Transformer inference computations to GPUs with dual protection, overcoming limitations of previous schemes.

Findings

1. Achieves a 4.0x to 6.1x speedup over prior methods.
2. Offloads approximately 87% of computation to GPUs.
3. Provides dual protection for data and model during inference.

Abstract

Recent advances in Transformer models, e.g., large language models (LLMs), have brought tremendous breakthroughs in various artificial intelligence (AI) tasks, leading to their wide applications in many security-critical domains. Due to their unprecedented scale and prohibitively high development cost, these models have become highly valuable intellectual property for AI stakeholders and are increasingly deployed via machine learning as a service (MLaaS). However, MLaaS often runs on untrusted cloud infrastructure, exposing data and models to potential breaches. Mainstream protection mechanisms leverage trusted execution environments (TEEs) where confidentiality and integrity for secretive data are shielded using hardware-based encryption and integrity checking. Unfortunately, running model inference entirely within TEEs is subject to non-trivial slowdown, which is further exacerbated…

Taxonomy

Topics: Physical Unclonable Functions (PUFs) and Hardware Security · Radiation Effects in Electronics · Advanced Malware Detection Techniques