From Intents to Actions: Agentic AI in Autonomous Networks

Burak Demirel; Pablo Soldati; Yu Wang

arXiv:2602.01271 · cs.LG · February 3, 2026

Open Access · 3 Reviews

TL;DR

This paper presents an agentic AI framework for autonomous networks that interprets high-level intents, reasons over trade-offs, and autonomously executes control actions to meet diverse service requirements.

Contribution

It introduces a novel multi-agent AI system combining language models, optimization, and reinforcement learning to translate network intents into actionable control strategies.

Findings

01. Enables autonomous interpretation of diverse network intents.

02. Balances multiple objectives near Pareto optimality.

03. Adapts to changing network conditions dynamically.
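To make "near Pareto optimality" in the second finding concrete, here is a minimal sketch of Pareto dominance over two objectives that are both minimized. The candidate operating points (latency in ms, energy in W) and the function names are hypothetical illustrations and do not come from the paper.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b on every objective and
    strictly better on at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (latency_ms, energy_W) operating points, for illustration only.
candidates = [(5.0, 30.0), (8.0, 20.0), (9.0, 25.0), (12.0, 18.0), (6.0, 35.0)]
front = pareto_front(candidates)
print(front)  # [(5.0, 30.0), (8.0, 20.0), (12.0, 18.0)]
```

A point on the front cannot be improved in one objective without getting worse in another, which is the sense in which a controller can only trade objectives off against each other rather than optimize all of them at once.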

Abstract

Telecommunication networks are increasingly expected to operate autonomously while supporting heterogeneous services with diverse and often conflicting intents -- that is, performance objectives, constraints, and requirements specific to each service. However, transforming high-level intents -- such as ultra-low latency, high throughput, or energy efficiency -- into concrete control actions (i.e., low-level actuator commands) remains beyond the capability of existing heuristic approaches. This work introduces an Agentic AI system for intent-driven autonomous networks, structured around three specialized agents. A supervisory interpreter agent, powered by language models, performs both lexical parsing of intents into executable optimization templates and cognitive refinement based on feedback, constraint feasibility, and evolving network conditions. An optimizer agent converts these…
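As a rough illustration of what parsing an intent into an "executable optimization template" could look like, the sketch below maps intent text to a small structured object. Everything here is an assumption made for illustration: the `OptimizationTemplate` class, the `parse_intent` function, and the keyword table are hypothetical, and simple keyword matching is swapped in for the language-model interpreter described in the abstract.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class OptimizationTemplate:
    """Hypothetical structured form of an intent (not the paper's schema)."""
    objectives: List[str] = field(default_factory=list)
    constraints: Dict[str, float] = field(default_factory=dict)


# Hypothetical keyword table; a stand-in for the language-model interpreter.
KEYWORD_TO_OBJECTIVE = {
    "latency": "minimize:latency",
    "throughput": "maximize:throughput",
    "energy": "minimize:energy",
}


def parse_intent(text: str) -> OptimizationTemplate:
    """Map each recognized keyword in the intent text to an objective."""
    lowered = text.lower()
    objectives = [obj for kw, obj in KEYWORD_TO_OBJECTIVE.items() if kw in lowered]
    return OptimizationTemplate(objectives=objectives)


template = parse_intent("Ultra-low latency with good energy efficiency")
print(template.objectives)  # ['minimize:latency', 'minimize:energy']
```

A structured template of this kind is something a downstream optimizer can consume directly, which is one plausible reason to separate interpretation from optimization and control.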

Peer Reviews

Decision · Submitted to ICLR 2026

Reviewer 01 · Rating 2 · Confidence 4

Strengths

The idea is interesting, especially given the current momentum behind agentic systems.

Weaknesses

The use of agentic systems—and, in particular, intent-based expressions of service requirements—has been discussed in 3GPP SA2. However, there is no consensus to adopt this approach in 6G systems. Even at this early stage, the indications suggest that intent-based service provisioning is not a requirement for 6G. It is therefore necessary to provide stronger justification for why future telecom networks should be agentic and intent-based. What are the pros and cons of such a design? Is flexibi

Reviewer 02 · Rating 4 · Confidence 3

Strengths

I particularly enjoy the organization of this paper. It comes with 1. a clear organization of motivations, problem settings, related concepts, and experiments; and 2. a detailed supplementary material which itself can be treated as a distinct technical report.

Weaknesses

From my perspective, this is a typical **application-oriented work** that presents details about incorporating LLMs into well-established large-scale RANs for human intent translation. This raises several concerns: 1. **Marginal novelty**. As stated earlier, the concept of using LLMs as a human-machine interface has been broadly explored and well implemented in many other areas, which overshadows the novelty of this particular paper. 2. **Experiment Design**. From my understanding, the

Reviewer 03 · Rating 4 · Confidence 4

Strengths

1. The triadic agent design—with explicit separation of interpretation, preference planning, and control—fits the realities of RAN timescales. 2. Key technical elements include (a) a dual‑LLM interpreter for schema‑compliant intent parsing and constraint reasoning under tight hardware budgets; (b) PAX‑BO, a preference‑aligned constrained BO routine for steering the controller; and (c) D‑EQL, a distributed envelope Q‑learning algorithm that combines actor–learner decoupling, sharded prioritized r

Weaknesses

1. D‑EQL essentially marries EQL with APE‑X/distributed PER and a cosine‑stability term; PAX‑BO employs standard GP‑BO with a trust region. These are well‑engineered combinations, but theoretical or algorithmic novelty appears limited. A stronger case would include ablations showing which D‑EQL elements (preference partitioning, hindsight priority refresh, cosine loss) are necessary for the RAN workload and why. 2. All results are from a single high‑fidelity simulator; there is little coverage o

Taxonomy

Topics: Software-Defined Networks and 5G · Reinforcement Learning in Robotics · Age of Information Optimization