Sequential Resource Trading Using Comparison-Based Gradient Estimation
Surya Murthy, Mustafa O. Karabag, Ufuk Topcu

TL;DR
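This paper introduces a comparison-based gradient estimation algorithm for sequential resource trading between two agents, in which the offering agent does not know the responding agent's utility function and sees only accept or reject responses. Every accepted trade strictly improves both agents' utilities, and the allocation converges asymptotically to the Pareto front.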
Contribution
It presents a comparison-based method for resource trading that guarantees utility improvements for both agents and convergence to the Pareto front, even though the only feedback is whether an offer is accepted or rejected.
Findings
The algorithm guarantees each accepted trade strictly improves both agents' utilities.
It converges asymptotically to the Pareto front under mild assumptions.
It achieves higher societal benefit than standard baselines while using fewer offers.
Abstract
We study sequential multi-issue trading between two greedily rational agents who exchange resources from a finite set of categories. Each agent's utility depends on its allocation, but the offering agent does not know the responding agent's utility function and receives only accept or reject feedback. We propose a comparison-based algorithm that interprets acceptance and rejection responses as pairwise state comparisons, allowing the offering agent to iteratively estimate the responding agent's gradient. Rejected offers prune the space of feasible gradient directions, enabling systematic refinement of possibly mutually beneficial trades. The algorithm guarantees that each accepted trade strictly improves both agents' utilities and, after finitely many rejected offers, either identifies a mutually beneficial trade or certifies that the current allocation is weakly Pareto optimal. We…
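The pruning idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it is a sampling-based simplification that assumes utilities are locally linear, so a trade helps an agent exactly when it has a positive inner product with that agent's gradient. The function name `find_trade`, the candidate-sampling scheme, and all parameters are hypothetical. A trade `t` is the change in the offering agent's allocation, and the responding agent's allocation changes by `-t`.

```python
import math
import random


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]


def find_trade(offerer_grad, responder_grad, n_candidates=500,
               max_offers=1000, seed=0):
    """Hypothetical sketch: look for a trade that helps both agents.

    The offerer never reads responder_grad directly; it is used only to
    simulate the responder's accept/reject answer.
    Returns the accepted trade, or None if no trade was accepted.
    """
    rng = random.Random(seed)
    dim = len(offerer_grad)
    a_hat = normalize(offerer_grad)

    # Assumption: the set of possible responder gradient directions is
    # represented by random unit vectors.
    candidates = [normalize([rng.gauss(0, 1) for _ in range(dim)])
                  for _ in range(n_candidates)]

    for _ in range(max_offers):
        if not candidates:
            return None
        mean = [sum(c[i] for c in candidates) / len(candidates)
                for i in range(dim)]
        if math.sqrt(dot(mean, mean)) < 1e-6:
            return None
        est = normalize(mean)  # current estimate of the responder's gradient

        # Offer that gains along the offerer's gradient and gives up
        # resources along the estimated responder gradient.
        t = [a - e for a, e in zip(a_hat, est)]
        if math.sqrt(dot(t, t)) < 1e-6:
            return None  # estimate coincides with the offerer's own gradient

        # Simulated responder: accepts iff its own change -t improves it.
        if dot(responder_grad, t) < -1e-9:
            return t

        # Rejection implies responder_grad . t >= 0, so discard every
        # candidate direction that would have accepted this offer.
        candidates = [c for c in candidates if dot(c, t) >= 0]
    return None
```

Each rejection removes at least one candidate, because the mean of the remaining candidates always has a negative inner product with the offer it generated. When both agents share the same gradient direction, no offer that helps the offerer can help the responder, so the sketch returns `None`, loosely mirroring the case where the current allocation admits no mutually beneficial trade.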
