MIRACL: A Diverse Meta-Reinforcement Learning for Multi-Objective Multi-Echelon Combinatorial Supply Chain Optimisation

Rifny Rachman; Josh Tingey; Richard Allmendinger; Wei Pan; Pradyumn Shukla; Bahrul Ilmi Nasution

arXiv:2603.05760 · cs.LG · March 9, 2026

TL;DR

MIRACL introduces a hierarchical meta-reinforcement learning framework that enables efficient, few-shot adaptation for multi-objective, multi-echelon supply chain optimisation, outperforming traditional methods in dynamic environments.

Contribution

To the authors' knowledge, MIRACL is the first framework to integrate Meta-MORL with structured subproblem decomposition and Pareto-based meta-learning for combinatorial optimisation tasks.
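
Pareto-based methods in general rest on the notion of dominance between objective vectors. The summary here does not describe how MIRACL applies this, so the following is only a minimal sketch of generic non-dominated filtering, assuming every objective is maximised. The names `dominates` and `pareto_front` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` Pareto-dominates `b` (all objectives maximised).

    `a` must be at least as good as `b` in every objective and strictly
    better in at least one.
    """
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(points: Sequence[Vector]) -> List[Vector]:
    """Keep only the points that no other point dominates (O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Illustrative (made-up) two-objective returns for five candidate policies.
candidates = [(3.0, 1.0), (2.0, 2.0), (1.0, 3.0), (1.0, 1.0), (2.0, 1.0)]
front = pareto_front(candidates)
```

In this made-up example the last two candidates are dominated, so only the first three remain as trade-off solutions.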

Findings

01. MIRACL achieves up to 10% higher hypervolume than baselines.

02. MIRACL attains 5% better expected utility in experiments.

03. The framework demonstrates robust adaptation in diverse supply chain scenarios.
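
Hypervolume and expected utility are standard quality measures for a set of multi-objective solutions. The sketch below shows one common way to compute them for two maximised objectives. It is an illustration under stated assumptions, not the paper's evaluation code: the reference point, the linear utility functions, the weight vectors, and the function names `hypervolume_2d` and `expected_utility` are all hypothetical choices, since this summary does not give the exact definitions used.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def hypervolume_2d(front: Sequence[Point], ref: Point) -> float:
    """Area dominated by `front` and bounded below by `ref`.

    Assumes two objectives, both maximised.
    """
    # Ignore points that do not strictly improve on the reference point.
    pts = [p for p in front if p[0] > ref[0] and p[1] > ref[1]]
    # Sweep from the largest first objective to the smallest.
    pts.sort(key=lambda p: p[0], reverse=True)
    area = 0.0
    best_y = ref[1]
    for x, y in pts:
        if y > best_y:
            # Add the horizontal strip this point contributes.
            area += (x - ref[0]) * (y - best_y)
            best_y = y
    return area


def expected_utility(front: Sequence[Point], weights: Sequence[Point]) -> float:
    """Mean, over weight vectors, of the best linearly scalarised value.

    Assumes linear utilities u(p) = w[0] * p[0] + w[1] * p[1].
    """
    total = 0.0
    for w in weights:
        total += max(w[0] * p[0] + w[1] * p[1] for p in front)
    return total / len(weights)


# Made-up front and preference weights, for illustration only.
front = [(3.0, 1.0), (2.0, 2.0), (1.0, 3.0)]
hv = hypervolume_2d(front, ref=(0.0, 0.0))
eu = expected_utility(front, weights=[(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)])
```

Under these definitions a larger hypervolume indicates a front that covers more of the objective space, while a larger expected utility indicates that, averaged over the sampled preferences, the front contains a better-matching solution.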

Abstract

Multi-objective reinforcement learning (MORL) is effective for multi-echelon combinatorial supply chain optimisation, where tasks involve high dimensionality, uncertainty, and competing objectives. However, its deployment in dynamic environments is hindered by the need for task-specific retraining and substantial computational cost. We introduce MIRACL (Meta multI-objective Reinforcement leArning with Composite Learning), a hierarchical Meta-MORL framework that enables few-shot generalisation across diverse tasks. MIRACL decomposes each task into structured subproblems for efficient policy adaptation and meta-learns a global policy across tasks using a Pareto-based adaptation strategy to encourage diversity in meta-training and fine-tuning. To our knowledge, this is the first integration of Meta-MORL with such mechanisms in combinatorial optimisation. Although validated in the…

Taxonomy

Topics: Advanced Multi-Objective Optimization Algorithms · Reinforcement Learning in Robotics · Vehicle Routing Optimization Methods