STDPG: A Spatio-Temporal Deterministic Policy Gradient Agent for Dynamic Routing in SDN
Juan Chen, Zhiwen Xiao, Huanlai Xing, Penglin Dai, Shouxi Luo, Muhammad Azhar Iqbal

TL;DR
This paper introduces STDPG, a novel spatio-temporal DRL agent for SDN routing that leverages CNN-LSTM-TAM and prioritized experience replay to improve learning efficiency and routing performance.
Contribution
It proposes a new spatio-temporal neural network architecture combined with prioritized experience replay for dynamic SDN routing. The design targets two weaknesses of existing FFNN-based DRL agents: high computational complexity and neglect of the spatial-temporal features of traffic.
Findings
STDPG outperforms existing DRL agents in reducing end-to-end delay.
The CNN-LSTM-TAM architecture effectively captures spatial-temporal traffic features.
Prioritized experience replay accelerates training convergence.
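As a rough illustration of the prioritized experience replay idea mentioned above, here is a minimal sketch of the commonly used proportional variant, in which transitions with larger TD error are sampled more often and importance weights correct for the resulting bias. The class name, the list-based storage, and the default values of `alpha`, `beta`, and `eps` are assumptions for illustration; the summary does not say which variant or hyperparameters STDPG uses.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay (hypothetical, list-based, O(N) sampling)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha      # how strongly priorities skew sampling (0 = uniform)
        self.eps = eps          # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0            # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions take the current maximum priority so they are
        # likely to be sampled soon after insertion.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4, rng=random):
        # Sampling probability P(i) = p_i^alpha / sum_k p_k^alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idxs = rng.choices(range(len(self.data)), weights=probs, k=batch_size)
        # Importance weights w_i = (N * P(i))^(-beta), normalized by the batch maximum.
        n = len(self.data)
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idxs, [self.data[i] for i in idxs], weights

    def update_priorities(self, idxs, td_errors):
        # After a learning step, priorities follow the magnitude of the TD error.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a training loop the agent would call `add` after each environment step, `sample` before each gradient update, and `update_priorities` with the critic's new TD errors. A production buffer would typically replace the lists with a sum tree so that sampling costs O(log N).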
Abstract
Dynamic routing in software-defined networking (SDN) can be viewed as a centralized decision-making problem. Most of the existing deep reinforcement learning (DRL) agents can address it, thanks to the deep neural network (DNN) incorporated. However, a fully-connected feed-forward neural network (FFNN) is usually adopted, where spatial correlation and temporal variation of traffic flows are ignored. This drawback usually leads to significantly high computational complexity due to the large number of training parameters. To overcome this problem, we propose a novel model-free framework for dynamic routing in SDN, which is referred to as the spatio-temporal deterministic policy gradient (STDPG) agent. Both the actor and critic networks are based on an identical DNN structure, where a combination of convolutional neural network (CNN) and long short-term memory network (LSTM) with temporal attention…
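To make the temporal attention idea concrete, the sketch below shows one generic way to weight a sequence of per-time-step hidden vectors (such as LSTM outputs) and collapse them into a single context vector. The dot-product scoring, the `query` vector, and the function name are assumptions chosen for illustration; the abstract as shown here is truncated and does not specify how STDPG's attention mechanism computes its scores.

```python
import math


def temporal_attention(hidden_states, query):
    """Softmax-weighted sum of per-time-step hidden vectors.

    hidden_states: list of T vectors (lists of floats), one per time step.
    query: hypothetical learned vector of the same dimension (an assumption).
    """
    # Score each time step by its dot product with the query.
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    # Numerically stable softmax over the time axis.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    # Context vector: attention-weighted average of the hidden states.
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return weights, context


# Example: three time steps, two hidden units.
weights, context = temporal_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]], [1.0, 0.0])
```

Time steps whose hidden state aligns with the query receive larger weights, so the context vector emphasizes the parts of the traffic history judged most relevant to the current decision.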
Taxonomy
Topics: Software-Defined Networks and 5G · Advanced Memory and Neural Computing · Advanced Computing and Algorithms
Methods: Prioritized Experience Replay · Experience Replay · Memory Network
