
Selecting with History
Tom Hess and Sivan Sabato
Ben-Gurion University of the Negev
Beer Sheva, Israel
tomhe,sabatos@cs.bgu.ac.il
Abstract
We define a new selection problem, Selecting with History, which extends the secretary problem to a setting with historical information. We propose a strategy for this problem and calculate its success probability in the limit of a large sequence.
In the classical secretary problem (Dynkin, 1963; Gilbert and Mosteller, 1966), numbers appear in a random order. The algorithm is allowed to select a single number. If it decides to select a number, it must do so immediately, before observing the next numbers, and it cannot later change its decision. The goal of the algorithm is to select the maximal number with the highest probability, where the set of numbers is selected by an adversary and the order of their appearance is random. Gilbert and Mosteller (1966) show that, for any input size $n$, there is a number $t_n$ such that the optimal strategy is to observe the first $t_n$ numbers, set $\theta$ to be the maximal number among those, and then select the first number in the rest of the sequence which is larger than $\theta$. They show that $t_n/n \to 1/e$ and that the probability of success of the optimal strategy also tends to $1/e$ as $n \to \infty$.
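The cutoff rule described above can be checked numerically. The following is a minimal sketch, not taken from the paper: the function names and the variable `r` (the number of skipped items) are assumptions introduced here. It uses the classical exact formula for the success probability of a cutoff rule, as surveyed in Ferguson (1989), and searches for the best cutoff by brute force.

```python
from fractions import Fraction
import math


def cutoff_success_probability(n: int, r: int) -> Fraction:
    """Exact success probability of the rule: skip the first r numbers, then
    take the first number that exceeds all of them (n numbers, random order)."""
    if r == 0:
        # With nothing skipped, the rule takes the first number.
        return Fraction(1, n)
    # The maximum sits at position i with probability 1/n; it is then selected
    # iff the best of the first i-1 numbers lies among the first r (prob r/(i-1)).
    return Fraction(r, n) * sum(Fraction(1, i - 1) for i in range(r + 1, n + 1))


def best_cutoff(n: int) -> int:
    """Brute-force search for the cutoff that maximizes the success probability."""
    return max(range(n), key=lambda r: cutoff_success_probability(n, r))


for n in (10, 100):
    r = best_cutoff(n)
    p = float(cutoff_success_probability(n, r))
    print(f"n={n}: cutoff={r}, cutoff/n={r / n:.3f}, success={p:.4f}, 1/e={1 / math.e:.4f}")
```

Exact rational arithmetic keeps the comparison between cutoffs free of rounding issues, at the cost of speed for large inputs.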
In this note we define a new selection problem, Selecting with History (SwH), which extends the secretary problem to a setting with historical information. We propose a strategy for this problem, and calculate its success probability in the limit of a large sequence.
Let be integers, such that divides . Let be a finite set of real numbers of size . In this problem, the numbers in are ordered according to a uniformly random order. The algorithm observes the first numbers (the history). Then, the algorithm observes the last numbers (the selection sequence) one by one, and should select the maximal number in the selection sequence with the highest probability. As in the secretary problem, the algorithm may only select a number immediately after observing it, and cannot regret this selection later. The secretary problem is thus equivalent to with . When , one can ignore the history and simply apply the optimal secretary problem strategy to the selection sequence. However, this does not exploit the information from the history. Instead, we propose the following strategy for . This strategy is parametrized by .
During the first numbers in the selection sequence, select the first number that exceeds the th-largest value in the history. If no such number was found in this part of the selection sequence, select from the rest of the sequence the first number that exceeds the maximal number observed so far in the selection sequence.
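The two-phase strategy above can be sketched in code. This is an illustrative sketch only: the length of the first phase and the rank of the history threshold are passed as explicit arguments (`phase_len` and `rank`, both hypothetical names), and the values used in the example call are arbitrary rather than the paper's choices. The helper `estimate_success` is likewise an assumption; it estimates the success probability by Monte Carlo over random orderings.

```python
import random
from typing import Optional, Sequence


def select_with_history(history: Sequence[float], selection: Sequence[float],
                        phase_len: int, rank: int) -> Optional[int]:
    """Return the index (into `selection`) of the chosen number, or None."""
    if not 1 <= rank <= len(history):
        raise ValueError("rank must be between 1 and len(history)")
    # Threshold estimated from the history: its rank-th largest value.
    threshold = sorted(history, reverse=True)[rank - 1]
    # Phase 1: take the first number that exceeds the history threshold.
    for i in range(min(phase_len, len(selection))):
        if selection[i] > threshold:
            return i
    # Phase 2: take the first number that exceeds everything seen so far in
    # the selection sequence (rejected numbers never raise this maximum).
    best_so_far = max(selection[:phase_len], default=float("-inf"))
    for i in range(phase_len, len(selection)):
        if selection[i] > best_so_far:
            return i
    return None


def estimate_success(n_total: int, history_len: int, phase_len: int, rank: int,
                     trials: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of the probability of picking the selection maximum."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        values = list(range(n_total))  # only the rank order matters
        rng.shuffle(values)
        history, selection = values[:history_len], values[history_len:]
        pick = select_with_history(history, selection, phase_len, rank)
        if pick is not None and selection[pick] == max(selection):
            wins += 1
    return wins / trials


# Arbitrary illustrative parameters, not the values analyzed in the paper.
print(estimate_success(n_total=200, history_len=100, phase_len=40, rank=1))
```

Because only the rank order of the numbers matters, the simulation shuffles the integers `0..n_total-1` instead of drawing real values.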
This strategy is inspired by a strategy proposed in Gilbert and Mosteller (1966) for a setting where a selection sequence is drawn i.i.d. from a known distribution. Whereas under a known distribution the first threshold can be set based on this knowledge, here we estimate it based on the history.
As in the secretary problem, the probability of success of this strategy depends only on the rank order of the numbers, and not on their specific values. For that divides , we denote by the probability that the proposed strategy succeeds in selecting the maximal number from the selection sequence, for any of size . This probability depends on , which we leave as an implicit parameter of . For convenience, we also let , where denotes the success probability of the optimal secretary problem strategy on an input sequence of size .
Define
[TABLE]
By the definition of , . The following lemma gives the value of for , as a function of .
Lemma 1.
If , then
[TABLE]
Proof.
Let . We calculate based on its definition, and then take the limit . Let be the set of input numbers, where . Denote by the event that the strategy selects the maximal number in the selection sequence, when the strategy is applied to a random ordering of . Let be the set of numbers in the history, and let be the set of numbers in the selection sequence. Let be the set of first numbers in the selection sequence. Let be the ’th largest number in , and let be the largest number in . The strategy described above selects the first number observed from which is larger than if one exists. Otherwise, it selects the first number observed from that is larger than (if one exists). Let . We have . Note that the probabilities all depend (implicitly) on . Let . For any ,
[TABLE]
Define , , and suppose that for some , . Assuming all these limits exist, we have
[TABLE]
If in addition , then, taking on the inequality above, we get
[TABLE]
We now give expressions for and . First, for , we calculate . Define the random variable which satisfies . If , this means that out of the numbers , exactly are in , and also . Therefore . Since and its content is allocated uniformly at random, we have
[TABLE]
Therefore
[TABLE]
Taking the limit for (recalling ) we get
[TABLE]
Second, to find , we now calculate . If , then all have , therefore no element will be selected from . The probability of success is thus exactly as the probability of success of the secretary problem strategy with input size and threshold . Denote this probability . We have, following the analysis in Ferguson (1989) for the secretary problem,
[TABLE]
Hence ,
[TABLE]
To find for , let be the location in of the maximal number . Note that if the strategy does not select anything before reaching location , it will certainly select by the definition of the strategy. Distinguish two cases:
1. If , then is selected as long as all other items that exceed are located after . Hence, for
[TABLE]
2. If , then is selected as long as all other items that exceed are located after , and also the maximal item in the first items is in the first items, so that is the first item in that is larger than . Hence, for ,
[TABLE]
3. Neither of the conditions above can hold if , since numbers cannot be located after in this case. Hence ,.
Therefore
[TABLE]
We have
[TABLE]
and . Therefore
[TABLE]
Taking the limit on both sides and defining , this gives, for ,
[TABLE]
Lastly, we are left to show an upper bound such that . Recall that if then . For an integer , denote . Note that if and only if , which occurs if and only if . Therefore We now give an upper bound on , using a concentration bound for sampling without replacement from a population. Denote the ordered numbers in the selection sequence by . Then is a sum of uniformly random draws without replacement from the sequence , where . We have . Hence ,. Setting for , we have . Hence ,
[TABLE]
By Bernstein’s inequality for sampling without replacement (Boucheron et al., 2013), setting and ,
[TABLE]
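For reference, one standard form of Bernstein's inequality for sampling without replacement is sketched below. The symbols $c_i$, $M$, $m$, $\mu$, $\sigma^2$, $b$, $S$ and $t$ are assumptions introduced here and need not match the notation of the proof. The bound is the classical Bernstein inequality for independent draws, which carries over to draws without replacement by Hoeffding's comparison argument.

```latex
% Population c_1, ..., c_M with mean \mu and variance \sigma^2, where
%   \mu = \frac{1}{M}\sum_{i=1}^{M} c_i, \qquad
%   \sigma^2 = \frac{1}{M}\sum_{i=1}^{M} (c_i - \mu)^2, \qquad
%   c_i - \mu \le b \ \text{for all } i.
% X_1, ..., X_m are drawn uniformly without replacement, and S = \sum_{j=1}^{m} X_j.
\[
  \Pr\bigl[\, S - m\mu \ge t \,\bigr]
  \;\le\;
  \exp\!\left( - \frac{t^2}{2 m \sigma^2 + \tfrac{2}{3}\, b\, t} \right)
  \qquad \text{for all } t > 0 .
\]
```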
Noting that , and , we get that for some constant , Setting the RHS to , we get and , as required. Combining Eq. (1), Eq. (2), Eq. (3), Eq. (4) and the limit above, we get the equality in the statement of the lemma. ∎
The value of for a given can be calculated numerically. We propose to select . This gives, e.g. , , . Compare this to .
In a previous version of this manuscript we proposed to use the strategy for SwH above to improve the competitive ratio of the submodular secretary problem under resource constraints. Unfortunately, our analysis turned out to contain an error, which we have not yet been able to resolve.
References
- S. Boucheron, G. Lugosi, and P. Massart. Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press, 2013.
- E. B. Dynkin. The optimum choice of the instant for stopping a Markov process. In Sov. Math. Dokl., volume 4(52), pages 627–629, 1963.
- T. S. Ferguson. Who solved the secretary problem? Statistical Science, pages 282–289, 1989.
- J. P. Gilbert and F. Mosteller. Recognizing the maximum of a sequence. Journal of the American Statistical Association, 61(313):35–73, 1966.
