Beyond Augmented-Action Surrogates for Multi-Expert Learning-to-Defer
Yannis Montreuil, Axel Carlier, Lai Xing Ng, Wei Tsang Ooi

TL;DR
This paper introduces a novel decoupled surrogate approach for multi-expert learning-to-defer systems, addressing optimization issues in existing methods and providing theoretical guarantees of stability and improved performance.
Contribution
It proposes a decoupled surrogate with a theoretical excess-risk bound, which is more stable and more effective than existing methods as the expert pool expands.
Findings
The new method remains stable as the expert pool grows.
It preserves rare specialists effectively.
It outperforms standalone classifiers on multiple benchmarks.
Abstract
A learning-to-defer (L2D) system decides, for each input, whether to predict on its own or to hand it to one of several available experts. The well-established recipe trains the classifier and router jointly by treating the classes and experts as competing actions in one shared augmented-action geometry. Subsequent work has proposed a series of incremental fixes within this geometry; we show that each still suffers, to varying severity, from an optimization-level pathology (target distortion, gradient amplification, winner-take-all starvation, set-mass collapse, or class-expert coupling) even under statistical consistency. We step outside the augmented-action family entirely and propose a decoupled surrogate: a softmax classifier head and an independent sigmoid head per expert, mirroring the two natural objects of the problem. We show that per-sample updates are then…
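The two-head structure described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the loss form (cross-entropy on the classifier head plus one binary cross-entropy per expert, with target 1 when that expert's prediction matches the label), the routing rule, and the names `decoupled_loss` and `route` are all assumptions made for illustration, since the abstract does not spell out the exact surrogate.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def decoupled_loss(class_logits, expert_logits, label, expert_preds):
    """Hypothetical per-sample decoupled loss (an assumption, not the paper's formula).

    Classifier head: softmax cross-entropy against the true label.
    Expert heads: one independent sigmoid per expert, trained with binary
    cross-entropy to predict whether that expert is correct on this sample.
    """
    probs = softmax(class_logits)
    loss = -math.log(probs[label])
    for z, pred in zip(expert_logits, expert_preds):
        target = 1.0 if pred == label else 0.0
        q = min(max(sigmoid(z), 1e-12), 1.0 - 1e-12)  # clamp to avoid log(0)
        loss -= target * math.log(q) + (1.0 - target) * math.log(1.0 - q)
    return loss


def route(class_logits, expert_logits):
    """Hypothetical routing rule (an assumption): defer to the expert with the
    highest estimated correctness if it beats the classifier's top probability."""
    probs = softmax(class_logits)
    scores = [sigmoid(z) for z in expert_logits]
    best_class = max(range(len(probs)), key=probs.__getitem__)
    if scores and max(scores) > probs[best_class]:
        best_expert = max(range(len(scores)), key=scores.__getitem__)
        return ("expert", best_expert)
    return ("classifier", best_class)
```

In this sketch each expert head receives its own target and its own gradient, so adding an expert adds one term to the loss without changing the normalization of the class probabilities. That is the sense in which the heads are decoupled here; whether it matches the paper's construction in detail is not confirmed by the abstract.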
