Safe and Practical GPU Acceleration in TrustZone
Heejin Park, Felix Xiaozhu Lin

TL;DR
This paper introduces CODY, a system for secure and efficient GPU acceleration in TrustZone. GPU workloads are recorded ahead of time and replayed inside the TEE, so the complex GPU software stack never has to be pulled into the TEE.
Contribution
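It proposes a novel distributed architecture for the recording environment: a mobile device, which holds the GPU hardware, and a trustworthy cloud service, which runs the GPU software, jointly exercise the GPU so that workloads can be recorded securely and later replayed in the TrustZone TEE.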
Findings
Workload recording time reduced by up to 95%
Replay delay 25% lower than native execution
Supports secure GPU acceleration without bringing the GPU software stack into the TEE
Abstract
We present a holistic design for GPU-accelerated computation in TrustZone TEE. Without pulling the complex GPU software stack into the TEE, we follow a simple approach: record the CPU/GPU interactions ahead of time, and replay the interactions in the TEE at run time. This paper addresses the approach's key missing piece -- the recording environment, which needs both strong security and access to diverse mobile GPUs. To this end, we present a novel architecture called CODY, in which a mobile device (which possesses the GPU hardware) and a trustworthy cloud service (which runs the GPU software) exercise the GPU hardware/software in a collaborative, distributed fashion. To overcome numerous network round trips and long delays, CODY contributes optimizations specific to mobile GPUs: register access deferral, speculation, and metastate-only synchronization. With these optimizations,…
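The record-and-replay idea from the abstract can be illustrated with a toy sketch. It is not CODY's implementation: `ToyGPU`, `Recorder`, `replay`, `round_trips`, `toy_driver`, and the register addresses are all hypothetical names chosen for illustration, and the deferral rule used here (queue writes and send them together with the next read) is a simplifying assumption about what register access deferral might look like.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGPU:
    """Hypothetical stand-in for GPU hardware: just a register file."""
    regs: dict = field(default_factory=dict)

    def write(self, addr, value):
        self.regs[addr] = value

    def read(self, addr):
        return self.regs.get(addr, 0)


class Recorder:
    """Forwards register accesses to the GPU and logs each one."""

    def __init__(self, gpu):
        self.gpu = gpu
        self.log = []  # entries: (op, address, value)

    def write(self, addr, value):
        self.gpu.write(addr, value)
        self.log.append(("w", addr, value))

    def read(self, addr):
        value = self.gpu.read(addr)
        self.log.append(("r", addr, value))
        return value


def replay(log, gpu):
    """Re-issue recorded accesses; raise if a read returns a different value."""
    for op, addr, value in log:
        if op == "w":
            gpu.write(addr, value)
        elif gpu.read(addr) != value:
            raise RuntimeError(f"replay diverged at register {addr:#x}")


def round_trips(log, defer):
    """Count device/cloud round trips for a log.

    Without deferral, every access costs one round trip. With deferral
    (assumed rule), writes are queued and sent with the next read, and any
    trailing writes are flushed in one final trip.
    """
    if not defer:
        return len(log)
    trips, pending = 0, 0
    for op, _, _ in log:
        if op == "w":
            pending += 1
        else:
            trips += 1  # the read carries all queued writes
            pending = 0
    return trips + (1 if pending else 0)


def toy_driver(dev):
    """Hypothetical driver fragment: three writes, then a status read."""
    dev.write(0x10, 1)       # power on (made-up register)
    dev.write(0x14, 0xABCD)  # job descriptor address (made-up register)
    dev.write(0x18, 1)       # kick off the job (made-up register)
    return dev.read(0x18)


# Record once against one device, then replay against a fresh one.
recorder = Recorder(ToyGPU())
toy_driver(recorder)
replay(recorder.log, ToyGPU())
print(round_trips(recorder.log, defer=False), round_trips(recorder.log, defer=True))
```

In this toy model the replayer needs only the log and raw register access, not the driver, which mirrors why replay avoids pulling the GPU stack into the TEE. The round-trip count shows why batching accesses matters when the driver and the hardware sit on opposite ends of a network link.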
Taxonomy
Topics: Cloud Data Security Solutions · Security and Verification in Computing · IoT and Edge/Fog Computing
