AI Sessions for Network-Exposed AI-as-a-Service
Mohaned Chraiti, Merve Saimler

TL;DR
This paper introduces Network-Exposed AI-as-a-Service (NE-AIaaS), a framework that enables enforceable latency, mobility, and resource guarantees for cloud-based AI inference through new session primitives and protocols.
Contribution
It proposes AI Sessions (AIS) and AI Service Profiles (ASP) to provide explicit lifecycle management, resource reservation, and mobility support for AI inference services.
Findings
Design of AI Session primitives with explicit failure semantics
Protocols for resource reservation and session migration
Compatibility with existing network and edge computing standards
Abstract
Cloud-based Artificial Intelligence (AI) inference is increasingly latency- and context-sensitive, yet today's AI-as-a-Service is typically consumed as an application-chosen endpoint, leaving the network to provide only best-effort transport. This decoupling prevents enforceable tail-latency guarantees, compute-aware admission control, and continuity under mobility. This paper proposes Network-Exposed AI-as-a-Service (NE-AIaaS) built around a new service primitive: the AI Session (AIS), a contractual object that binds model identity, execution placement, transport Quality-of-Service (QoS), and consent/charging scope into a single lifecycle with explicit failure semantics. We introduce the AI Service Profile (ASP), a compact contract that expresses task modality and measurable service objectives (e.g., time-to-first-response/token, p99 latency, success probability) alongside privacy and…
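To make the two objects in the abstract concrete, here is a minimal Python sketch of an ASP and an AIS. It is an illustration under stated assumptions, not the paper's specification: every class, field, state name, and function below (`AIServiceProfile`, `AISession`, `SessionState`, `admit`, `meets_profile`, the `"insufficient_compute"` reason string) is hypothetical, and the nearest-rank percentile is just one common way to measure p99 latency.

```python
import math
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence


class Modality(Enum):
    TEXT = "text"
    VISION = "vision"
    SPEECH = "speech"


@dataclass(frozen=True)
class AIServiceProfile:
    """Hypothetical ASP: task modality plus measurable objectives."""
    modality: Modality
    ttft_ms: float            # time-to-first-response/token target
    p99_latency_ms: float     # tail-latency target
    min_success_prob: float   # required fraction of successful requests


class SessionState(Enum):
    REQUESTED = "requested"
    ACTIVE = "active"
    FAILED = "failed"         # explicit failure state, never silent


@dataclass
class AISession:
    """Hypothetical AIS: binds the four elements named in the abstract."""
    model_id: str
    placement: str            # where inference executes
    qos_class: str            # transport QoS label
    consent_scope: str        # consent/charging scope
    profile: AIServiceProfile
    state: SessionState = SessionState.REQUESTED
    failure_reason: Optional[str] = None


def admit(session: AISession, available_units: int, required_units: int) -> None:
    """Toy compute-aware admission: activate, or fail with a stated reason."""
    if required_units <= available_units:
        session.state = SessionState.ACTIVE
    else:
        session.state = SessionState.FAILED
        session.failure_reason = "insufficient_compute"


def percentile(samples: Sequence[float], q: float) -> float:
    """Nearest-rank percentile for q in (0, 100]."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def meets_profile(profile: AIServiceProfile, latencies_ms: Sequence[float],
                  successes: int, total: int) -> bool:
    """Check the p99 and success objectives (TTFT is not checked here)."""
    p99_ok = percentile(latencies_ms, 99) <= profile.p99_latency_ms
    success_ok = successes / total >= profile.min_success_prob
    return p99_ok and success_ok
```

In this sketch a session that cannot be admitted moves to `FAILED` with a recorded reason rather than silently degrading to best-effort, which is one reading of "explicit failure semantics"; the paper's actual lifecycle may define different states and transitions.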
Taxonomy
Topics: IoT and Edge/Fog Computing · Software-Defined Networks and 5G · Software System Performance and Reliability
