TZ-LLM: Protecting On-Device Large Language Models with Arm TrustZone

Xunjie Wang; Jiacheng Shi; Zihan Zhao; Yang Yu; Zhichao Hua; Jinyu Gu

arXiv:2511.13717·cs.CR·November 18, 2025

TL;DR

This paper presents TZ-LLM, a system that protects on-device large language models with Arm TrustZone. Through pipelined parameter restoration and secure NPU time-sharing, it reduces time to first token (TTFT) by up to 90.9% and increases decoding speed by up to 23.2%.

Contribution

The paper introduces pipelined restoration and a co-driver design to protect LLMs inside TrustZone efficiently, addressing the memory-efficiency and NPU time-sharing challenges between the REE and the TEE.

Findings

01. TTFT reduced by up to 90.9%

02. Decoding speed increased by up to 23.2%

03. System successfully implemented on Arm devices

Abstract

Large Language Models (LLMs) deployed on mobile devices offer benefits like user privacy and reduced network latency, but introduce a significant security risk: the leakage of proprietary models to end users. To mitigate this risk, we propose a system design for protecting on-device LLMs using the Arm Trusted Execution Environment (TEE), TrustZone. Our system addresses two primary challenges: (1) the dilemma between memory efficiency and fast inference (caching model parameters within TEE memory), and (2) the lack of efficient and secure Neural Processing Unit (NPU) time-sharing between the Rich Execution Environment (REE) and the TEE. Our approach incorporates two key innovations. First, we employ pipelined restoration, leveraging the deterministic memory access patterns of LLM inference to prefetch parameters on demand, hiding memory allocation, I/O, and decryption latency under computation time.…
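
The latency-hiding idea behind pipelined restoration can be illustrated with a toy timing model. This is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the per-layer granularity, and the millisecond figures are all hypothetical, and the model assumes a single restorer (allocation, I/O, and decryption lumped into one cost per layer) that runs ahead of a single compute unit with no memory cap on prefetching.

```python
from typing import Sequence


def sequential_latency(restore: Sequence[float], compute: Sequence[float]) -> float:
    """Total time when each layer is fully restored and then computed, in turn."""
    return sum(restore) + sum(compute)


def pipelined_latency(restore: Sequence[float], compute: Sequence[float]) -> float:
    """Total time when restoring later layers overlaps with computing earlier ones.

    Assumption: the restorer never stalls (no bound on prefetched layers).
    """
    restore_done = 0.0  # when the restorer finishes the current layer
    compute_done = 0.0  # when the compute unit finishes the current layer
    for r, c in zip(restore, compute):
        restore_done += r
        # A layer can start computing only once it is restored
        # and the previous layer has finished computing.
        compute_done = max(compute_done, restore_done) + c
    return compute_done


# Hypothetical per-layer costs in milliseconds.
restore_ms = [2.0, 2.0, 2.0]
compute_ms = [3.0, 3.0, 3.0]
print(sequential_latency(restore_ms, compute_ms))  # 15.0
print(pipelined_latency(restore_ms, compute_ms))   # 11.0
```

In this model, when compute time per layer is at least the restoration time, only the first layer's restoration remains on the critical path; the rest is hidden under computation. A real system would also have to bound how far ahead the restorer may run, which this sketch deliberately ignores.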

Taxonomy

Topics: Security and Verification in Computing · Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques