AI Sessions for Network-Exposed AI-as-a-Service

Mohaned Chraiti; Merve Saimler

arXiv:2602.15288·cs.NI·February 19, 2026


Open Access

TL;DR

This paper introduces Network-Exposed AI-as-a-Service (NE-AIaaS), a framework that enables enforceable latency, mobility, and resource guarantees for cloud-based AI inference through new session primitives and protocols.

Contribution

It proposes AI Sessions (AIS) and AI Service Profiles (ASP) to provide explicit lifecycle management, resource reservation, and mobility support for AI inference services.

Findings

01. Design of AI Session primitives with explicit failure semantics

02. Protocols for resource reservation and session migration

03. Compatibility with existing network and edge computing standards

Abstract

Cloud-based Artificial Intelligence (AI) inference is increasingly latency- and context-sensitive, yet today's AI-as-a-Service is typically consumed as an application-chosen endpoint, leaving the network to provide only best-effort transport. This decoupling prevents enforceable tail-latency guarantees, compute-aware admission control, and continuity under mobility. This paper proposes Network-Exposed AI-as-a-Service (NE-AIaaS) built around a new service primitive: the AI Session (AIS), a contractual object that binds model identity, execution placement, transport Quality-of-Service (QoS), and consent/charging scope into a single lifecycle with explicit failure semantics. We introduce the AI Service Profile (ASP), a compact contract that expresses task modality and measurable service objectives (e.g., time-to-first-response/token, p99 latency, success probability) alongside privacy and…
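
To make the two abstractions concrete, the following is a minimal illustrative sketch of how an ASP and an AIS might be modeled as data structures. It is not the paper's specification: every class name, field name, state, and the `admit` check are hypothetical and chosen only to mirror the elements the abstract lists (task modality, measurable objectives, model identity, placement, transport QoS, consent/charging scope, and an explicit failure outcome).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SessionState(Enum):
    # Hypothetical lifecycle states; the paper's actual state machine may differ.
    REQUESTED = "requested"
    ACTIVE = "active"
    FAILED = "failed"
    CLOSED = "closed"


@dataclass(frozen=True)
class AIServiceProfile:
    """Assumed shape of an ASP: task modality plus measurable objectives."""
    modality: str                    # e.g. "text", "vision" (illustrative values)
    max_ttfr_ms: float               # time-to-first-response/token bound
    max_p99_latency_ms: float        # tail-latency bound
    min_success_probability: float   # required success probability


@dataclass
class AISession:
    """Assumed shape of an AIS: one object binding identity, placement, QoS, scope."""
    model_id: str
    placement: str        # where inference executes (hypothetical label)
    qos_class: str        # transport QoS handle (hypothetical label)
    consent_scope: str
    charging_scope: str
    profile: AIServiceProfile
    state: SessionState = SessionState.REQUESTED
    failure_reason: Optional[str] = None

    def admit(self, predicted_p99_ms: float) -> bool:
        """Toy admission check: activate only if the predicted p99 fits the ASP."""
        if self.state is not SessionState.REQUESTED:
            raise ValueError("admit() is only valid for a requested session")
        if predicted_p99_ms <= self.profile.max_p99_latency_ms:
            self.state = SessionState.ACTIVE
            return True
        # Explicit failure outcome instead of silent best-effort degradation.
        self.state = SessionState.FAILED
        self.failure_reason = "p99 latency objective cannot be met"
        return False
```

In this sketch a rejected request ends in a named `FAILED` state with a recorded reason, which is one simple way to read "explicit failure semantics"; the real protocol presumably covers reservation and migration as well, which are omitted here.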

Taxonomy

Topics: IoT and Edge/Fog Computing · Software-Defined Networks and 5G · Software System Performance and Reliability