Contextual Bandits in Payment Processing: Non-uniform Exploration and Supervised Learning

Akhila Vangara; Alex Egg

arXiv:2412.00569 · cs.LG · July 11, 2025

TL;DR

This paper investigates regression oracles for non-uniform exploration in payment processing, finding performance improvements alongside challenges such as policy oscillation and data shift.

Contribution

It provides a detailed analysis of regression oracle-based approaches in real-world payment systems, highlighting their benefits and limitations within the ERM framework.

Findings

01. Regression oracles improve initial policy performance.

02. Policy performance can degrade over iterations due to data shifts.

03. Oscillation effects can cause fluctuations in policy effectiveness.

Abstract

Uniform random exploration in decision-making systems supports off-policy learning via supervision but incurs high regret, making it impractical for many applications. Conversely, non-uniform exploration offers better immediate performance but lacks support for off-policy learning. Recent research suggests that regression oracles can bridge this gap by combining non-uniform exploration with supervised learning. In this paper, we analyze these approaches within a real-world industrial context at Adyen, a large global payments processor characterized by batch logged delayed feedback, short-term memory, and dynamic action spaces under the Empirical Risk Minimization (ERM) framework. Our analysis reveals that while regression oracles significantly improve performance, they introduce challenges due to rigid algorithmic assumptions. Specifically, we observe that as a policy improves,…
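
To make the contrast between uniform and non-uniform exploration concrete, here is a minimal sketch of one well-known way to turn regression-oracle predictions into an exploration distribution: inverse gap weighting. It is an assumption that this resembles the variant studied in the paper; the function names, the `gamma` parameter, and the example reward predictions are all hypothetical and chosen only for illustration.

```python
import numpy as np


def uniform_exploration(num_actions):
    """Uniform random exploration: every action gets the same probability."""
    return np.full(num_actions, 1.0 / num_actions)


def inverse_gap_weighting(predicted_rewards, gamma):
    """Non-uniform exploration driven by a regression oracle's predictions.

    Each non-greedy action is chosen with probability
    1 / (K + gamma * gap), where gap is how far its predicted reward
    falls below the best prediction. The greedy action receives the
    remaining probability mass. gamma is a hypothetical tuning knob:
    gamma = 0 recovers uniform exploration, larger values explore less.
    """
    preds = np.asarray(predicted_rewards, dtype=float)
    k = preds.size
    best = int(np.argmax(preds))
    gaps = preds[best] - preds           # zero for the greedy action
    probs = 1.0 / (k + gamma * gaps)     # provisional weight for every action
    probs[best] = 0.0                    # reset the greedy slot ...
    probs[best] = 1.0 - probs.sum()      # ... and give it the leftover mass
    return probs


# Hypothetical oracle predictions (e.g. success probabilities) for 3 actions.
preds = [0.90, 0.80, 0.50]
igw = inverse_gap_weighting(preds, gamma=10.0)
uni = uniform_exploration(len(preds))
```

In this sketch every action keeps a known, non-zero probability of being chosen, which is what allows the logged data to be reused for supervised training, while most of the probability mass goes to the actions the oracle currently rates highest.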

Taxonomy

Topics: Customer churn and segmentation · Organizational and Employee Performance · Technology Adoption and User Behaviour