A Match in Time Saves Nine: Deterministic Online Matching With Delays
Marcin Bienkowski, Artur Kraska, Paweł Schmidt

TL;DR
This paper introduces the first deterministic online algorithm for Min-cost Perfect Matching with Delays, achieving a polynomial competitive ratio independent of metric parameters, advancing online matching theory.
Contribution
It presents the first deterministic algorithm for MPMD with a polynomial competitive ratio, not requiring prior knowledge of the metric space.
Findings
Deterministic algorithm with $O(m^{2.46})$ competitive ratio.
Algorithm does not depend on metric space parameters.
First such deterministic solution for MPMD.
A Match in Time Saves Nine: Deterministic Online Matching With Delays
Partially supported by Polish National Science Centre grant 2016/22/E/ST6/00499.
Marcin Bienkowski
Institute of Computer Science, University of Wrocław, Poland
Artur Kraska
Institute of Computer Science, University of Wrocław, Poland
Paweł Schmidt
Institute of Computer Science, University of Wrocław, Poland
Abstract.
We consider the problem of online Min-cost Perfect Matching with Delays (MPMD) introduced by Emek et al. (STOC 2016). In this problem, an even number of requests appear in a metric space at different times and the goal of an online algorithm is to match them in pairs. In contrast to traditional online matching problems, in MPMD all requests appear online and an algorithm can match any pair of requests, but such decision may be delayed (e.g., to find a better match). The cost is the sum of matching distances and the introduced delays.
We present the first deterministic online algorithm for this problem. Its competitive ratio is $O(m^{\log_2 5.5}) = O(m^{2.46})$, where $m$ is the number of requests. This is polynomial in the number of metric space points if all requests are given at different points. In particular, the bound does not depend on other parameters of the metric, such as its aspect ratio. Unlike previous (randomized) solutions for the MPMD problem, our algorithm does not need to know the metric space in advance.
Key words and phrases:
online matching, delays, rent-or-buy, competitive analysis
1991 Mathematics Subject Classification:
F.1.2 Modes of Computation: Online computation, F.2.2 Nonnumerical Algorithms and Problems
1. Introduction
In this paper, we give a deterministic online algorithm for the problem of Min-cost Perfect Matching with Delays (MPMD) [22, 5]. For an informal description, imagine that there are human players who are logging in real time into a gaming website, each wanting to play chess against another human player. The system pairs the players according to their known capabilities, such as playing strength. A decision with whom to match a given player can be delayed until a reasonable match is found. That is, the website tries to simultaneously minimize two objectives: the waiting times of players and their dissimilarity, i.e., each player would like to play with another one with similar capabilities. An algorithm running the website has to work online, without the knowledge about future player arrivals and make its decision irrevocably: once two players are paired, they remain paired forever.
1.1. Problem definition
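More formally, in the MPMD problem there is a metric space $\mathcal{X}$ with a distance function $\mathrm{dist}: \mathcal{X} \times \mathcal{X} \to \mathbb{R}_{\geq 0}$, both known from the beginning to an online algorithm. The online part of the input is a sequence of $m$ requests $(p_1, t_1), (p_2, t_2), \ldots, (p_m, t_m)$, where point $p_i \in \mathcal{X}$ corresponds to a player in our informal description above and $t_i$ is the time of its arrival. Clearly, $t_1 \leq t_2 \leq \ldots \leq t_m$. The integer $m$ is not known a priori to an online algorithm. At any time $\tau$, an online algorithm may decide to match any pair of requests $(p_i, t_i)$ and $(p_j, t_j)$ that have already arrived ($\tau \geq t_i$ and $\tau \geq t_j$) and have not been matched yet. The cost incurred by such a matching edge is $\mathrm{dist}(p_i, p_j) + (\tau - t_i) + (\tau - t_j)$, i.e., it is the sum of the connection cost and the waiting costs of these two requests.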
The goal is to eventually match all requests and minimize the total cost. We use a typical yardstick to measure the performance: a competitive ratio [13], defined as the maximum, over all inputs, of the ratios between the cost of an online algorithm and the cost of an optimal offline solution Opt that knows the entire input sequence in advance.
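As a small worked illustration of this cost definition (the numbers are invented for the example; $d$ stands for the distance between the two requested points, $t_1 \leq t_2$ for their arrival times and $\tau$ for the matching time), take $d = 3$, $t_1 = 0$ and $t_2 = 2$:

$$
\begin{aligned}
\tau = 5:&\quad d + (\tau - t_1) + (\tau - t_2) = 3 + 5 + 3 = 11,\\
\tau = 2:&\quad d + (\tau - t_1) + (\tau - t_2) = 3 + 2 + 0 = 5 = d + (t_2 - t_1).
\end{aligned}
$$

Since the pair cannot be matched before $t_2$, the second line is the cheapest way to match these two requests with each other: the remaining waiting cost is forced by the gap between their arrivals.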
1.2. Previous work
The MPMD problem was introduced by Emek et al. [22], who presented a randomized $O(\log^2 n + \log \Delta)$-competitive algorithm. There, $n$ is the number of points in the metric space $\mathcal{X}$ and $\Delta$ is its aspect ratio (the ratio between the largest and the smallest distance in $\mathcal{X}$). The competitive ratio was subsequently improved by Azar et al. [5] to $O(\log n)$. They also showed that the competitive ratio of any randomized algorithm is at least $\Omega(\sqrt{\log n})$. The currently best lower bound of $\Omega(\log n / \log \log n)$ for randomized solutions was given by Ashlagi et al. [3].
So far, the construction of a competitive deterministic algorithm for general metric spaces remained an open problem. It was hypothesized that competitive ratios achievable by deterministic algorithms might be superpolynomial in $n$ (cf. Section 5 of [5]). Deterministic algorithms were known only for simple spaces: Azar et al. [5] gave an $O(\mathrm{height})$-competitive algorithm for trees and Emek et al. [23] constructed a $3$-competitive deterministic solution for two-point metrics (the competitive ratio of $3$ is best possible for such metrics).
1.3. Our contribution
In this paper, we give the first deterministic algorithm for any metric space, whose competitive ratio is $O(m^{\log_2 5.5}) = O(m^{2.46})$, where $m$ is the number of requests. Typically, for our gaming application, $m$ is smaller than $n$ (although in full generality it can also be larger if multiple requests arrive at the same point of the metric space $\mathcal{X}$). While previous solutions to the MPMD problem [22, 5] required $\mathcal{X}$ to be finite and known a priori (to approximate it first by a random HST tree [24] or a random HST tree with reduced height [8]), our solution works even when $\mathcal{X}$ is revealed in an online manner. That is, we require only that, together with any request $r$, an online algorithm learns the distances from $r$ to all previous, not yet matched requests.
Our online algorithm Alg uses a simple, local, semi-greedy scheme to find a suitable matching pair. In the analysis, we fix a final perfect matching of Opt and observe what happens when we gradually add matching edges that Alg creates during its execution. That is, we trace the evolution of alternating paths and cycles in time. To bound the cost of Alg, we charge the cost of an edge that Alg is adding against the cost of already existing matching edges from the same alternating path. Interestingly, our charging argument on alternating cycles bears some resemblance to the analyses of algorithms for the problems that are not directly related to MPMD: online metric (bipartite) matching on line metrics [2] and offline greedy matching [40].
1.4. Related work
Originally, matching problems were studied in variants where delaying decisions was not permitted. The setting most similar to the MPMD problem is called online metric bipartite matching. It involves $k$ offline points given to an algorithm at the beginning and $k$ requests presented in an online manner that need to be matched (immediately after their arrival) to offline points. Both points and requests lie in a common metric space and the goal is to minimize the weight of a perfect matching created by an algorithm. For general metric spaces, the best randomized solution is $O(\log^2 k)$-competitive [7, 26, 37], and the deterministic algorithms achieve the optimal competitive ratio of $2k - 1$ [27, 32]. Interestingly, even for line metrics [2, 25, 33], the best known deterministic algorithm attains a competitive ratio that is polynomial in $k$ [2].
In comparison, in the MPMD problem considered in this paper, all requests appear in an online manner, $m$ is not known to an algorithm, and any pair of requests may be matched. That said, there is also a bipartite variant of the MPMD problem, in which all requests appear online, but $m/2$ of them are negative and $m/2$ are positive. An algorithm may then only match pairs of requests of different polarities [4, 3].
The MPMD problem can be cast as augmenting min-cost perfect matching with a time axis, allowing the algorithm to delay its decisions, but penalizing the delays. There are many other problems that use this paradigm: most notably the ski-rental problem and its continuous counterpart, the spin-block problem [29], where a purchase decision can be delayed until renting cost becomes sufficiently large. Such rent-or-buy (wait-or-act) trade-offs are also found in other areas, for example in aggregating messages in computer networks [1, 11, 21, 28, 31, 39], in aggregating orders in supply-chain management [9, 10, 14, 15, 17, 18] or in some scheduling variants [6].
Finally, there is a vast amount of work devoted to other online matching variants, where offline points and online requests are connected by graph edges and the goal is to maximize the weight or the cardinality of the produced matching. These types of matching problems have been studied since the seminal work of Karp et al. [30] and are motivated by applications to online auctions [12, 16, 19, 20, 30, 34, 36, 38]. They were also studied under stochastic assumptions on the input, see, e.g., a survey by Mehta [35].
2. Algorithm
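We will identify requests with the points at which they arrive. To this end, we assume that all requested points are different, but we allow distances between different metric points to be zero. For any request $p$, we denote the time of its arrival by $\mathrm{atime}(p)$.
Our algorithm is parameterized with real numbers $\alpha > 0$ and $\beta > 1$, whose exact values will be optimized later. For any request $p$, we define its waiting time at time $\tau \geq \mathrm{atime}(p)$ as
$$\mathrm{wait}_\tau(p) = \tau - \mathrm{atime}(p)$$
and its budget at time $\tau$ as
$$\mathrm{budget}_\tau(p) = \alpha \cdot \mathrm{wait}_\tau(p).$$
Our online algorithm Alg matches two requests $p$ and $q$ at time $\tau$ as soon as the following two conditions are satisfied.
- Budget sufficiency: $\mathrm{budget}_\tau(p) + \mathrm{budget}_\tau(q) \geq \mathrm{dist}(p, q)$.
- Budget balance: $\mathrm{budget}_\tau(p) \leq \beta \cdot \mathrm{budget}_\tau(q)$ and $\mathrm{budget}_\tau(q) \leq \beta \cdot \mathrm{budget}_\tau(p)$.
Note that the budget balance condition is equivalent to relations on waiting times, i.e., $\mathrm{wait}_\tau(p) \leq \beta \cdot \mathrm{wait}_\tau(q)$ and $\mathrm{wait}_\tau(q) \leq \beta \cdot \mathrm{wait}_\tau(p)$.
If the conditions above are met simultaneously for many point pairs, we break ties arbitrarily, and process them in any order. Note that at the time when $p$ and $q$ become matched, the sum of their budgets may exceed $\mathrm{dist}(p, q)$. For example, this occurs when $q$ appears at time strictly larger than $\mathrm{atime}(p) + \mathrm{dist}(p, q) / \alpha$: they are then matched by Alg as soon as the budget balance condition becomes true.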
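A minimal sketch of how the matching rule above could be replayed offline, assuming that a request's budget is `alpha` times its waiting time and that a pair is matched once the sum of the two budgets covers their distance and neither waiting time exceeds `beta` times the other. Under these assumptions both conditions stay true once they become true, so every pair has a single earliest ready time, and processing pairs in order of ready time reproduces the "as soon as" rule (ties are broken by index here, which is one arbitrary choice). The names `ready_time` and `simulate` and the default parameter values are illustrative, not taken from the paper.

```python
from itertools import combinations

def ready_time(d, a_p, a_q, alpha, beta):
    """Earliest time at which both matching conditions hold for one pair."""
    early, late = min(a_p, a_q), max(a_p, a_q)
    # Budget sufficiency: alpha * ((t - a_p) + (t - a_q)) >= d.
    t_sufficiency = (d / alpha + a_p + a_q) / 2
    # Budget balance: only the older request can have waited too long;
    # (t - early) <= beta * (t - late) rearranges to the bound below.
    t_balance = (beta * late - early) / (beta - 1)
    return max(t_sufficiency, t_balance)

def simulate(requests, dist, alpha=0.5, beta=2.0):
    """requests: list of (point, arrival_time) with an even number of entries.

    Returns a list of (i, j, match_time) in the order the pairs are matched.
    """
    events = sorted(
        (ready_time(dist(requests[i][0], requests[j][0]),
                    requests[i][1], requests[j][1], alpha, beta), i, j)
        for i, j in combinations(range(len(requests)), 2)
    )
    matched, result = set(), []
    for t, i, j in events:
        # A pair is matched at its ready time only if both ends are still free.
        if i not in matched and j not in matched:
            matched.update((i, j))
            result.append((i, j, t))
    return result

# Four requests on a line, all arriving at time 0.
pairs = simulate([(0, 0.0), (1, 0.0), (10, 0.0), (11, 0.0)],
                 lambda x, y: abs(x - y))
```

For a late arrival, `ready_time(1.0, 0.0, 10.0, 0.5, 2.0)` is determined by the balance condition rather than by the distance, which is the situation in which the sum of budgets exceeds the distance at matching time.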
The observation below follows immediately by the definition of Alg.
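Observation 2.
Fix time $\tau$ and two requests $p$ and $q$, such that $\tau \geq \mathrm{atime}(p)$ and $\tau \geq \mathrm{atime}(q)$. Assume that neither $p$ nor $q$ has been matched by Alg strictly before time $\tau$. Then exactly one of the following conditions holds:
- $\mathrm{budget}_\tau(p) + \mathrm{budget}_\tau(q) \leq \mathrm{dist}(p, q)$,
- $\mathrm{budget}_\tau(p) + \mathrm{budget}_\tau(q) > \mathrm{dist}(p, q)$ and $\mathrm{budget}_\tau(p) \geq \beta \cdot \mathrm{budget}_\tau(q)$,
- $\mathrm{budget}_\tau(p) + \mathrm{budget}_\tau(q) > \mathrm{dist}(p, q)$ and $\mathrm{budget}_\tau(q) \geq \beta \cdot \mathrm{budget}_\tau(p)$.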
3. Analysis
To analyze the performance of Alg, we look at matchings generated by Alg and by an optimal offline algorithm Opt. If points $p$ and $q$ were matched at time $\tau$ by Alg, then we say that Alg creates a (matching) edge $e = (p, q)$. Its cost is
$$\mathrm{cost}(e) = \mathrm{dist}(p, q) + (\tau - \mathrm{atime}(p)) + (\tau - \mathrm{atime}(q)).$$
We call $e$ an Alg-edge. The cost of an edge in the solution of Opt (an Opt-edge) is defined analogously. In an optimal solution, however, the matching time is always equal to the arrival time of the later of the two matched requests.
We now consider a dynamically changing graph consisting of requested points, Opt-edges and Alg-edges. For the analysis, we assume that it changes in the following way: all requested points and all Opt-edges are present in the graph from the beginning, but the Alg-edges are added to the graph in steps, in the order they are created by Alg.
At all times, the matching edges present in the graph form alternating paths or cycles (i.e., paths or cycles whose edges are interleaved Alg-edges and Opt-edges). Furthermore, any node-maximal alternating path starts and ends with Opt-edges. Assume now that a matching edge $e$ created by Alg is added to the graph. It may either connect the ends of two different alternating paths, thus creating a single longer alternating path, or connect the ends of one alternating path, generating an alternating cycle. In the former case, we call edge $e$ non-final; in the latter case, final. Note that at the end of the Alg execution, when all $m/2$ Alg-edges are added, the graph contains only alternating cycles.
We extend the notion of cost to alternating path and cycles. For any cycle , is simply the sum of costs of its edges: the cost of an Opt-edge on such cycle is the cost paid by Opt and the cost of an Alg-edge is that of Alg. We also define , and as the costs of Opt-edges, Alg-edges and non-final Alg-edges on cycle , respectively. Clearly, . We define the same notions for alternating paths; as a path does not contain final Alg-edges, .
An alternating path $P$ is called a $k$-step maximal alternating path if it exists in the graph after Alg matched $k$ pairs and it cannot be extended, i.e., it ends with two requests that are not yet matched by the first $k$ Alg-edges.
3.1. Tree construction
To facilitate the analysis, along with the graph, we create a dynamically changing forest $\mathcal{F}$ of binary trees, where each leaf of $\mathcal{F}$ corresponds to an Opt-edge and each internal (non-leaf) node of $\mathcal{F}$ to a non-final Alg-edge (and vice versa). After Alg matched $k$ pairs, each subtree of $\mathcal{F}$ corresponds to a $k$-step maximal alternating path or to an alternating cycle. More precisely, at the beginning, $\mathcal{F}$ consists of single nodes representing Opt-edges. Afterwards, whenever an Alg-edge is created, we perform the following operation on $\mathcal{F}$.
- When a non-final Alg-edge $e = (p, q)$ is added to the graph, we look at the two alternating paths $P$ and $Q$ that end with $p$ and $q$, respectively. We take the corresponding trees $T_P$ and $T_Q$ of $\mathcal{F}$. We add a node $w$ (representing edge $e$) to $\mathcal{F}$ and make $T_P$ and $T_Q$ its subtrees.
- When a final Alg-edge $e$ is added to the graph, it turns an alternating path $P$ into an alternating cycle $C$. We then simply say that the tree that corresponded to $P$ now corresponds to $C$.
An example of the graph and the associated forest is presented in Figure 1.
For any tree node , we define its weight as the cost of the corresponding matching edge, i.e., the cost of an Opt-edge for a leaf and the cost of a non-final Alg-edge for a non-leaf node. For any node , by we denote the tree rooted at . We extend the notion of weight in a natural manner to all subtrees of . In these terms, the weight of a tree in is equal to the total cost of the corresponding alternating path. (If represents an alternating cycle , then its weight is equal to the cost of minus the cost of the final Alg-edge from .)
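The construction above can be sketched in a few lines, under the assumption that the Opt matching is fixed in advance and the Alg-edges are replayed in creation order. Each current alternating path is tracked by its two endpoints; an Alg-edge whose endpoints end the same path is final, otherwise it merges two paths and becomes the parent of their trees. The function names (`build_forest`, `weight`, `leaf_weight`) and the tuple representation of trees are hypothetical choices made for this illustration.

```python
def build_forest(opt_edges, alg_edges):
    """Replay Alg-edges over a fixed Opt matching.

    opt_edges, alg_edges: lists of (u, v, cost); alg_edges in creation order.
    Trees are nested tuples (weight, left, right); leaves have None children.
    Returns a list of (tree, final_edge_cost), one entry per alternating cycle.
    """
    other_end = {}  # path endpoint -> opposite endpoint of the same path
    tree_at = {}    # path endpoint -> tree representing that path
    for u, v, cost in opt_edges:
        leaf = (cost, None, None)
        other_end[u], other_end[v] = v, u
        tree_at[u] = tree_at[v] = leaf

    cycles = []
    for p, q, cost in alg_edges:
        a, b = other_end.pop(p), other_end.pop(q)
        left, right = tree_at.pop(p), tree_at.pop(q)
        if a == q:
            # p and q end the same path: the edge is final and closes a cycle.
            cycles.append((left, cost))
            continue
        # Non-final edge: the merged path now runs between a and b.
        node = (cost, left, right)
        other_end[a], other_end[b] = b, a
        tree_at[a] = tree_at[b] = node
    return cycles

def weight(tree):
    """Total weight of all nodes of a tree."""
    if tree is None:
        return 0.0
    w, left, right = tree
    return w + weight(left) + weight(right)

def leaf_weight(tree):
    """Total weight of the leaves (the Opt-edges) of a tree."""
    w, left, right = tree
    return w if left is None else leaf_weight(left) + leaf_weight(right)

cycles = build_forest(
    opt_edges=[(0, 1, 1.0), (2, 3, 1.0)],
    alg_edges=[(1, 2, 2.0), (0, 3, 5.0)],  # first non-final, then final
)
```

In this toy input the only tree has weight 4.0 (two Opt-edges of cost 1.0 plus one non-final Alg-edge of cost 2.0), and the final Alg-edge of cost 5.0 is kept outside the tree, matching the remark that a cycle's tree weight excludes its final edge.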
Note that we consistently used terms “points” and “edges” for objects that Alg and Opt are operating on in the metric space . On the other hand, the term “nodes” will always refer to tree nodes in and we will not use the term “edge” to denote an edge in .
3.2. Outline of the analysis
Our approach to bounding the cost of Alg is now as follows. We look at the forest at the end of Alg execution. The corresponding graph contains only alternating cycles. The cost of non-final Alg-edges is then, by the definition, equal to the total weight of internal (non-leaf) nodes of , while the cost of Opt-edges is equal to the total weight of leaves of . Hence, our goal is to relate the total weight of any tree to the weight of its leaves.
The central piece of our analysis is showing that for any internal node with children and , it holds that , where is a constant depending on parameters and (see Corollary 3.5). Using this relation, we will bound the total weight of any tree by times the total weight of its leaves. This implies the same bound on the ratio between non-final Alg-edges and Opt-edges on each alternating cycle.
Finally, we show that the cost of final Alg-edges incurs at most an additional constant factor in the total cost of Alg.
3.3. Cost of non-final ALG-edges
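As described in Section 3.1, when Alg adds a $k$-th Alg-edge $e$ to the graph, and this edge is non-final, $e$ joins two $(k-1)$-step maximal alternating paths $P$ and $Q$. We will bound $\mathrm{cost}(e)$ by a constant (depending on $\alpha$ and $\beta$) times $\min\{\mathrm{cost}(P), \mathrm{cost}(Q)\}$. We start with bounding the waiting cost of Alg related to one endpoint of $e$.
Lemma 3.1.
Let $e = (p, q)$ be the $k$-th Alg-edge added at time $\tau$, such that $e$ is non-final. Let $P = (a_1, a_2, \ldots, a_\ell)$ (where $a_\ell = p$) be the $(k-1)$-step maximal alternating path ending at $p$. Then, $\mathrm{wait}_\tau(p) \leq \max\{\alpha^{-1}, \beta/(\beta-1)\} \cdot \mathrm{cost}(P)$.
Proof 3.2.
First we lower-bound the cost of an alternating path $P$. We look at any edge $(a_i, a_{i+1})$ from $P$. Its cost (no matter whether paid by Alg or Opt) is certainly at least $\mathrm{dist}(a_i, a_{i+1}) + |\mathrm{atime}(a_i) - \mathrm{atime}(a_{i+1})|$. Therefore, using the triangle inequality (on distances and times), we obtain
$$\mathrm{cost}(P) \geq \sum_{i=1}^{\ell-1} \left( \mathrm{dist}(a_i, a_{i+1}) + |\mathrm{atime}(a_i) - \mathrm{atime}(a_{i+1})| \right) \geq \mathrm{dist}(a_1, p) + |\mathrm{atime}(a_1) - \mathrm{atime}(p)|. \tag{1}$$
Therefore, in our proof we will simply bound $\mathrm{wait}_\tau(p)$ using either $\mathrm{dist}(a_1, p)$ or $|\mathrm{atime}(a_1) - \mathrm{atime}(p)|$.
Recall that Alg matches $p$ at time $\tau$. Consider the state of $a_1$ at time $\tau$. If $a_1$ has not been presented to Alg yet ($\mathrm{atime}(a_1) \geq \tau$), then $\mathrm{wait}_\tau(p) = \tau - \mathrm{atime}(p) \leq \mathrm{atime}(a_1) - \mathrm{atime}(p)$, and the lemma follows.
In the remaining part of the proof, we assume that $a_1$ was already presented to the algorithm ($\tau > \mathrm{atime}(a_1)$). As $P$ is a $(k-1)$-step maximal alternating path, $a_1$ is not matched by Alg right after Alg creates the $(k-1)$-th matching edge. The earliest time when $a_1$ may become matched is when Alg creates the next, $k$-th matching edge, i.e., at time $\tau$. Therefore $a_1$ is not matched before time $\tau$.
Now observe that there must be a reason for which requests $p$ and $a_1$ have not been matched with each other before time $\tau$. Roughly speaking, either the sum of budgets of requests $p$ and $a_1$ does not suffice to cover the cost of $\mathrm{dist}(p, a_1)$ or one of them waits significantly longer than the other. Formally, we apply Observation 2 to the pair $(p, a_1)$, obtaining three possible cases. In each of the cases we bound $\mathrm{wait}_\tau(p)$ appropriately.
Case 1 (insufficient budgets).
If $\mathrm{budget}_\tau(p) + \mathrm{budget}_\tau(a_1) \leq \mathrm{dist}(p, a_1)$, then by non-negativity of $\mathrm{budget}_\tau(a_1)$, it follows that $\mathrm{wait}_\tau(p) = \alpha^{-1} \cdot \mathrm{budget}_\tau(p) \leq \alpha^{-1} \cdot \mathrm{dist}(p, a_1)$.
Case 2 ($p$ waited much longer than $a_1$).
If $\mathrm{budget}_\tau(p) + \mathrm{budget}_\tau(a_1) > \mathrm{dist}(p, a_1)$ and $\mathrm{budget}_\tau(p) \geq \beta \cdot \mathrm{budget}_\tau(a_1)$, then $\mathrm{wait}_\tau(p) \geq \beta \cdot \mathrm{wait}_\tau(a_1)$. Therefore, $\mathrm{wait}_\tau(p) \leq \frac{\beta}{\beta-1} \cdot (\mathrm{wait}_\tau(p) - \mathrm{wait}_\tau(a_1)) = \frac{\beta}{\beta-1} \cdot (\mathrm{atime}(a_1) - \mathrm{atime}(p))$.
Case 3 ($a_1$ waited much longer than $p$).
If $\mathrm{budget}_\tau(p) + \mathrm{budget}_\tau(a_1) > \mathrm{dist}(p, a_1)$ and $\mathrm{budget}_\tau(a_1) \geq \beta \cdot \mathrm{budget}_\tau(p)$, then $\mathrm{wait}_\tau(a_1) \geq \beta \cdot \mathrm{wait}_\tau(p)$. Thus, $\mathrm{wait}_\tau(p) \leq \frac{1}{\beta-1} \cdot (\mathrm{wait}_\tau(a_1) - \mathrm{wait}_\tau(p)) = \frac{1}{\beta-1} \cdot (\mathrm{atime}(p) - \mathrm{atime}(a_1))$.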
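Lemma 3.3.
Let $e = (p, q)$ be the $k$-th Alg-edge, such that $e$ is non-final. Let $P$ and $Q$ be the $(k-1)$-step maximal alternating paths ending at $p$ and $q$, respectively. Then,
$$\mathrm{cost}(e) \leq (1+\alpha) \cdot (1+\beta) \cdot \max\{\alpha^{-1}, \beta/(\beta-1)\} \cdot \min\{\mathrm{cost}(P), \mathrm{cost}(Q)\}.$$
Proof 3.4.
Let $\tau$ be the time when $p$ is matched with $q$ by Alg. Using the definition of $\mathrm{cost}(e)$, we obtain
$$\mathrm{cost}(e) = \mathrm{dist}(p, q) + \mathrm{wait}_\tau(p) + \mathrm{wait}_\tau(q) \leq (1+\alpha) \cdot (\mathrm{wait}_\tau(p) + \mathrm{wait}_\tau(q)) \leq (1+\alpha) \cdot (1+\beta) \cdot \min\{\mathrm{wait}_\tau(p), \mathrm{wait}_\tau(q)\}. \tag{2}$$
The first inequality follows by the budget sufficiency condition of Alg and the second one by the budget balance condition.
By Lemma 3.1, $\mathrm{wait}_\tau(p) \leq \max\{\alpha^{-1}, \beta/(\beta-1)\} \cdot \mathrm{cost}(P)$ and $\mathrm{wait}_\tau(q) \leq \max\{\alpha^{-1}, \beta/(\beta-1)\} \cdot \mathrm{cost}(Q)$, which combined with (2) immediately yield the lemma.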
Recall now the iterative construction of forest from Section 3.1: whenever a non-final matching edge created by Alg joins two alternating paths and , we add a new node to , such that and make trees and its children. These trees correspond to paths and , and satisfy and . Therefore, Lemma 3.3 immediately implies the following equivalent relation on tree weights.
Corollary 3.5.
Let be an internal node of the forest whose children are and . Then, .
This relation can be used to express the total weight of a tree of in terms of the total weight of its leaves. The proof of the following technical lemma is deferred to Section 4. Here, we present how to use it to bound the cost of Alg on non-final edges of a single alternating cycle.
Lemma 3.6.
Let be a weighted full binary tree and be any constant. Assume that for each internal node with children and , their weights satisfy . Then,
[TABLE]
where is the set of leaves of and is their total weight.
Lemma 3.7.
Let be an alternating cycle obtained from combining matchings of Alg and Opt. Then , where .
Proof 3.8.
As described in Section 3.1, is associated with a tree from forest , such that Opt-edges of correspond to the set of leaves of (denoted ) and non-final Alg-edges of correspond to internal (non-leaf) nodes of . Hence, and .
By Corollary 3.5, the weight of any internal tree node with children satisfies . Therefore, we may apply Lemma 3.6 to tree , obtaining , and thus
[TABLE]
The last inequality follows as , the number of leaves, is equal to the number of Opt-edges on cycle , which is clearly at most .
3.4. Cost of final ALG-edges
In the previous section, we derived a bound on the cost of all non-final Alg-edges. The following lemma shows that the cost of final Alg-edges contribute at most a constant factor to the competitive ratio.
Lemma 3.9.
Let be a final Alg-edge matched at time and be the alternating cycle containing . Then .
Proof 3.10.
Fix a final Alg-edge , where . By the budget sufficiency condition of Alg,
[TABLE]
Our goal now is to bound in terms of or . Observe that whenever Alg matches two requests, the budget sufficiency condition of Alg or one of the inequalities of the budget balance condition is satisfied with equality. We apply this observation to pair .
- If the budget sufficiency condition holds with equality, , and therefore .
- If the budget balance condition holds with equality, . Then,
[TABLE]
Hence, in either case it holds that
[TABLE]
Finally, we bound in terms of costs of other edges of . These edges form a path , where and . By the triangle inequality applied to distances and time differences (in the same way as in (1)), we obtain that
[TABLE]
The lemma follows immediately by combining (3), (4) and (5).
3.5. The competitive ratio
Finally, we optimize constants and used throughout the previous sections and bound the competitiveness of Alg.
Theorem 3.11.
For $\alpha = 1/2$ and $\beta = 2$, the competitive ratio of Alg is $O(m^{\log_2 5.5}) = O(m^{2.46})$, where $m$ is the number of requests in the input sequence.
Proof 3.12.
The union of matchings constructed by Alg and Opt can be split into a set of disjoint cycles. It is sufficient to show that we have the desired performance guarantee on each cycle from .
Fix a cycle . Let be the final Alg-edge of . By Lemma 3.9, . Therefore, the competitive ratio of Alg is at most
[TABLE]
where the second inequality follows by Lemma 3.7.
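The choice of parameters can be sanity-checked numerically. The sketch below rests on two assumptions that should be verified against the lemma statements: that the constant relating a non-final Alg-edge to the cheaper of the two joined paths has the form $(1+\alpha)(1+\beta)\max\{1/\alpha, \beta/(\beta-1)\}$ (the product of the factors contributed by budget sufficiency, budget balance and the waiting-time bound), and that a per-node constant $\xi$ translates into the exponent $\log_2(\xi/2 + 1)$. The second assumption is adopted only because it reproduces the stated exponent of about 2.46. The function name `xi` and the grid resolution are illustrative.

```python
import math

def xi(alpha, beta):
    # Assumed form of the per-edge constant: (1 + alpha) from budget
    # sufficiency, (1 + beta) from budget balance, and the max term from
    # bounding a waiting time by the cost of an alternating path.
    return (1 + alpha) * (1 + beta) * max(1 / alpha, beta / (beta - 1))

# Coarse grid search over alpha in (0, 2] and beta in (1, 4].
grid = [(a / 100, 1 + b / 100) for a in range(1, 201) for b in range(1, 301)]
best_alpha, best_beta = min(grid, key=lambda ab: xi(*ab))

# Exponent under the assumed relation between xi and the tree lemma.
exponent = math.log2(xi(best_alpha, best_beta) / 2 + 1)
```

On this grid the minimum is attained at `alpha = 0.5`, `beta = 2.0`, where the assumed constant equals 9 and the resulting exponent rounds to 2.46.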
4. Relating weights in trees (proof of Lemma 3.6)
We start with the following technical claim that will facilitate the inductive proof of Lemma 3.6.
Lemma 4.1.
Fix any constant and let . Then, for all .
Proof 4.2.
Fix any and let . We observe that and . Moreover, the function is convex as it is a sum of two convex functions. As , by convexity, for any .
To prove the lemma, assume without loss of generality that . By the monotonicity, , and therefore
[TABLE]
The last inequality follows as .
Proof 4.3 (Proof of Lemma 3.6).
We scale weights of all nodes, so that the average weight of each leaf is , i.e., we define a scaled weight function ws as
[TABLE]
Note that ws also satisfies . Moreover, since we scaled all weights in the very same way, , and hence to show the lemma, it suffices to bound the term .
For any node and the corresponding subtree rooted at , we define . We inductively show that for any node of , it holds that
[TABLE]
For the induction basis, assume that is a leaf of . Then,
[TABLE]
where the last inequality follows as and .
For the inductive step, let be a non-leaf node of and let and be its children. Then,
[TABLE]
The first inequality follows by the lemma assumption and the second one by the inductive assumptions for and . The last inequality is a consequence of Lemma 4.1 and the final equality follows by the additivity of function size.
Recall that we scaled weights so that . Therefore, applying (6) to the whole tree yields . Hence,
[TABLE]
which concludes the proof.
5. Conclusions
We showed a deterministic algorithm Alg for the MPMD problem whose competitive ratio is $O(m^{\log_2 5.5})$. The currently best lower bound (holding even for randomized solutions) is $\Omega(\log n / \log \log n)$ [3]. A natural research direction would be to narrow this gap.
It is not known whether the analysis of our algorithm is tight. However, one can show that its competitive ratio is at least $\Omega(m^{\log_2 1.5}) = \Omega(m^{0.58})$. To this end, assume that all requests arrive at the same time. For such input, Opt does not pay for delays and simply returns the min-cost perfect matching. On the other hand, Alg computes the same matching as a greedy routine (i.e., it greedily connects two nearest, not yet matched requests). Hence, even if we neglect the delay costs of Alg, its competitive ratio would be at least the approximation ratio of the greedy algorithm for min-cost perfect matching. The latter was shown to be $\Theta(m^{\log_2 1.5})$ by Reingold and Tarjan [40].
The reasoning above indicates an inherent difficulty of the problem. In order to beat the $\Omega(m^{\log_2 1.5})$ barrier, an online algorithm has to handle settings when all requests are given simultaneously more effectively. In particular, for such and similar input instances it has to employ a non-local and non-greedy policy of choosing requests to match.
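To make the gap between greedy and optimal matching concrete, the following sketch compares the two on a tiny instance in which all requests are assumed to arrive simultaneously, so only distances matter. The four points on a line are invented for the illustration, and the exhaustive search is only meant for a handful of points; both function names are hypothetical.

```python
from itertools import combinations

def greedy_matching_cost(points, dist):
    """Repeatedly match the two closest unmatched points (even count assumed)."""
    unmatched, total = set(range(len(points))), 0
    while unmatched:
        i, j = min(combinations(sorted(unmatched), 2),
                   key=lambda ij: dist(points[ij[0]], points[ij[1]]))
        total += dist(points[i], points[j])
        unmatched -= {i, j}
    return total

def optimal_matching_cost(points, dist):
    """Min-cost perfect matching by exhaustive search (even count assumed)."""
    def best(remaining):
        if not remaining:
            return 0
        first, rest = remaining[0], remaining[1:]
        # Try every partner for the first remaining point.
        return min(
            dist(points[first], points[j])
            + best(tuple(k for k in rest if k != j))
            for j in rest
        )
    return best(tuple(range(len(points))))

points = [0, 11, 20, 31]
line = lambda a, b: abs(a - b)
greedy = greedy_matching_cost(points, line)
optimal = optimal_matching_cost(points, line)
```

Here the greedy routine first joins the two middle points (distance 9) and is then forced to pay 31 for the outer pair, for a total of 40, while the optimal matching pairs neighbours for a total of 22. Nesting such gadgets recursively is the standard way to make the ratio grow with the number of points.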
References
- [1] Susanne Albers and Helge Bals. Dynamic TCP acknowledgment: Penalizing long delays. SIAM Journal on Discrete Mathematics, 19(4):938–951, 2005.
- [2] Antonios Antoniadis, Neal Barcelo, Michael Nugent, Kirk Pruhs, and Michele Scquizzato. A o(n)-competitive deterministic algorithm for online matching on a line. In Proc. 12th Workshop on Approximation and Online Algorithms (WAOA), pages 11–22, 2014.
- [3] Itai Ashlagi, Yossi Azar, Moses Charikar, Ashish Chiplunkar, Ofir Geri, Haim Kaplan, Rahul Makhijani, Yuyi Wang, and Roger Wattenhofer. Min-cost bipartite perfect matching with delays. 2017. URL: https://web.stanford.edu/~iashlagi/papers/mbpmd.pdf.
- [4] Yossi Azar, Ashish Chiplunkar, and Haim Kaplan. Polylogarithmic bounds on the competitiveness of min-cost (bipartite) perfect matching with delays. 2016. URL: https://arxiv.org/abs/1610.05155.
- [5] Yossi Azar, Ashish Chiplunkar, and Haim Kaplan. Polylogarithmic bounds on the competitiveness of min-cost perfect matching with delays. In Proc. 28th ACM-SIAM Symp. on Discrete Algorithms (SODA), pages 1051–1061, 2017.
- [6] Yossi Azar, Amir Epstein, Łukasz Jeż, and Adi Vardi. Make-to-order integrated scheduling and distribution. In Proc. 27th ACM-SIAM Symp. on Discrete Algorithms (SODA), pages 140–154, 2016.
- [7] Nikhil Bansal, Niv Buchbinder, Anupam Gupta, and Joseph Naor. A randomized $O(\log^{2} k)$-competitive algorithm for metric bipartite matching. Algorithmica, 68(2):390–403, 2014.
- [8] Nikhil Bansal, Niv Buchbinder, Aleksander Mądry, and Joseph Naor. A polylogarithmic-competitive algorithm for the $k$-server problem. Journal of the ACM, 62(5):40:1–40:49, 2015.
