Optimising Call Centre Operations using Reinforcement Learning: Value Iteration versus Proximal Policy Optimisation
Kwong Ho Li, Wathsala Karunarathne

TL;DR
This paper compares model-based value iteration and model-free proximal policy optimisation for call centre routing. In simulation experiments, PPO performs better at reducing client waiting time and staff idle time.
Contribution
It introduces a simulation environment that combines Discrete Event Simulation (DES) with OpenAI Gym for RL-based call routing, and compares two RL methods within a Skills-Based Routing (SBR) framework.
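To illustrate the kind of step a discrete-event simulator takes under the assumptions stated in the abstract (Poisson arrivals, exponential service and abandonment times), here is a minimal sketch in Python. It is not the authors' environment: the function name `next_event`, its parameters, and the event labels are hypothetical, and it assumes a single pool of staff and a single queue rather than skills-based routing. With exponential clocks, the time to the next event is exponential with the summed rate, and the event type is chosen in proportion to each rate.

```python
import random


def next_event(rng, arrival_rate, service_rate, abandon_rate,
               busy_staff, waiting_clients):
    """Sample (time_until_event, event_type) for a Markovian queue.

    Hypothetical helper for illustration; assumes arrival_rate > 0.
    """
    # Each competing exponential clock contributes its own rate.
    candidates = [
        ("arrival", arrival_rate),
        ("service_completion", service_rate * busy_staff),
        ("abandonment", abandon_rate * waiting_clients),
    ]
    # Drop clocks that cannot fire in the current state.
    events = [(name, rate) for name, rate in candidates if rate > 0]
    total = sum(rate for _, rate in events)

    # The minimum of independent exponentials is exponential(total).
    dt = rng.expovariate(total)

    # Pick which clock fired, with probability rate / total.
    u = rng.random() * total
    cumulative = 0.0
    for name, rate in events:
        cumulative += rate
        if u < cumulative:
            return dt, name
    return dt, events[-1][0]  # guard against floating-point round-off


rng = random.Random(0)
dt, event = next_event(rng, arrival_rate=2.0, service_rate=1.0,
                       abandon_rate=0.5, busy_staff=2, waiting_clients=4)
```

A Gym-style `step` method could call a helper like this repeatedly until the next routing decision is required, accumulating waiting and idle time along the way.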
Findings
After 1,000 test episodes, PPO outperforms VI in reward and efficiency.
PPO achieves the lowest client waiting and staff idle times.
The model-free approach (PPO) requires longer training than the model-based approach (VI) but yields better results.
Abstract
This paper investigates the application of Reinforcement Learning (RL) to optimise call routing in call centres to minimise client waiting time and staff idle time. Two methods are compared: a model-based approach using Value Iteration (VI) under known system dynamics, and a model-free approach using Proximal Policy Optimisation (PPO) that learns from experience. For the model-based approach, a theoretical model is used, while a simulation model combining Discrete Event Simulation (DES) with the OpenAI Gym environment is developed for model-free learning. Both models frame the problem as a Markov Decision Process (MDP) within a Skills-Based Routing (SBR) framework, with Poisson client arrivals and exponentially distributed service and abandonment times. For policy evaluation, random, VI, and PPO policies are evaluated using the simulation model. After 1,000 test episodes, PPO…
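For readers unfamiliar with the model-based side of the comparison, the following is a minimal sketch of value iteration on a tabular MDP with known dynamics. It is not the paper's model: the two-state toy problem, the rewards, the discount factor, and all names (`value_iteration`, `q_values`, `P`, `R`) are assumptions made up for illustration. State 0 means "queue empty" and state 1 means "client waiting"; action 0 holds the client and action 1 routes the client to a staff member.

```python
def q_values(P, R, V, gamma):
    """One-step lookahead: Q[s][a] = R[s][a] + gamma * E[V(next state)]."""
    return [
        [R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
         for a in range(len(P[s]))]
        for s in range(len(P))
    ]


def value_iteration(P, R, gamma=0.95, tol=1e-8, max_iters=10_000):
    """Repeat Bellman optimality backups until the values stop changing.

    P[s][a] is a list of (probability, next_state) pairs.
    R[s][a] is the immediate reward for taking action a in state s.
    """
    V = [0.0] * len(P)
    for _ in range(max_iters):
        new_V = [max(q) for q in q_values(P, R, V, gamma)]
        delta = max(abs(x - y) for x, y in zip(new_V, V))
        V = new_V
        if delta < tol:
            break
    # Greedy policy with respect to the final value estimate.
    Q = q_values(P, R, V, gamma)
    policy = [max(range(len(q)), key=q.__getitem__) for q in Q]
    return V, policy


# Hypothetical toy dynamics (illustrative numbers only).
P = [
    # State 0 (empty): a client arrives with probability 0.5 either way.
    [[(0.5, 0), (0.5, 1)], [(0.5, 0), (0.5, 1)]],
    # State 1 (waiting): hold keeps the client waiting, route clears the queue.
    [[(1.0, 1)], [(1.0, 0)]],
]
R = [
    [0.0, 0.0],    # nothing to penalise when the queue is empty
    [-1.0, -0.1],  # waiting is costly; routing has a small handling cost
]

V, policy = value_iteration(P, R, gamma=0.9)
```

In this toy problem the greedy policy routes the waiting client, since holding incurs the waiting penalty at every step. A model-free method such as PPO would instead have to learn this from sampled transitions, without access to `P` and `R`.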
Taxonomy
Topics: Supply Chain and Inventory Management · Advanced Queuing Theory Analysis · Scheduling and Optimization Algorithms
