Gaia: Hybrid Hardware Acceleration for Serverless AI in the 3D Compute Continuum

Maximilian Reisecker; Cynthia Marcelino; Thomas Pusztai; Stefan Nastic

arXiv:2511.13728 · cs.DC · November 19, 2025

Open Access

TL;DR

Gaia introduces a dynamic GPU management system for serverless AI workloads in heterogeneous environments. By adaptively selecting hardware acceleration modes, it reduces end-to-end latency by up to 95% and improves cost efficiency.

Contribution

Gaia presents a novel platform-level GPU-as-a-service architecture with adaptive runtime management for serverless AI in the 3D compute continuum.

Findings

01. Reduces end-to-end latency by up to 95%.

02. Effectively selects optimal hardware acceleration modes.

03. Enables SLO-aware, cost-efficient serverless AI execution.

Abstract

Serverless computing offers elastic scaling and pay-per-use execution, making it well-suited for AI workloads. As these workloads run in heterogeneous environments such as the Edge-Cloud-Space 3D Continuum, they often require intensive parallel computation, which GPUs can perform far more efficiently than CPUs. However, current platforms struggle to manage hardware acceleration effectively: static user-device assignments fail to ensure SLO compliance under varying loads or placements, and one-time dynamic selections often lead to suboptimal or cost-inefficient configurations. To address these issues, we present Gaia, a GPU-as-a-service model and architecture that makes hardware acceleration a platform concern. Gaia combines (i) a lightweight Execution Mode Identifier that inspects function code at deploy time to emit one of four execution modes, and (ii) a Dynamic Function Runtime that…
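To make the deploy-time inspection step more concrete, the following is a minimal sketch of how a static classifier might map function source code to one of four execution modes. It is not Gaia's implementation: the mode names (`CPU`, `CPU_PREFERRED`, `GPU_PREFERRED`, `GPU`), the library lists, and the function `identify_execution_mode` are all hypothetical, since the abstract does not name the modes or describe the heuristics. The sketch only parses the code with Python's standard `ast` module and never imports or runs it.

```python
import ast
from enum import Enum


class ExecutionMode(Enum):
    # Hypothetical mode names; the abstract only says there are four modes.
    CPU = "cpu"
    CPU_PREFERRED = "cpu_preferred"
    GPU_PREFERRED = "gpu_preferred"
    GPU = "gpu"


# Assumed, illustrative library lists (not taken from the paper).
GPU_CAPABLE_LIBS = {"torch", "tensorflow", "jax", "cupy"}
NUMERIC_CPU_LIBS = {"numpy", "scipy"}


def identify_execution_mode(source: str) -> ExecutionMode:
    """Classify function source by static inspection only (toy heuristic)."""
    tree = ast.parse(source)
    imported = set()
    mentions_cuda = False

    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # "import torch.nn" -> top-level package "torch"
            imported.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            imported.add(node.module.split(".")[0])
        elif isinstance(node, ast.Constant) and isinstance(node.value, str):
            # An explicit device string such as "cuda" or "cuda:0"
            if "cuda" in node.value.lower():
                mentions_cuda = True

    uses_gpu_lib = bool(imported & GPU_CAPABLE_LIBS)
    if uses_gpu_lib and mentions_cuda:
        return ExecutionMode.GPU
    if uses_gpu_lib:
        return ExecutionMode.GPU_PREFERRED
    if imported & NUMERIC_CPU_LIBS:
        return ExecutionMode.CPU_PREFERRED
    return ExecutionMode.CPU


if __name__ == "__main__":
    handler = "import torch\n\ndef handle(x):\n    return torch.tensor(x).to('cuda')\n"
    print(identify_execution_mode(handler))
```

A real identifier would presumably look at much more than imports and string constants. The point of the sketch is only that such a check can run once at deploy time, without executing user code, and hand a coarse label to the runtime.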

Taxonomy

Topics: Cloud Computing and Resource Management · IoT and Edge/Fog Computing · Security and Verification in Computing