Consider a deck of n numbered cards. You play the following game: you turn over the first card and retain it. Then you turn the next one: if it ranks less than the first one, discard it; if it ranks greater then keep it and make it the new benchmark. Repeat this step for each subsequent card. At the end of the deck, count the number of cards you have retained. For a given ‘n’, what is the expected number of cards? If there were 11 cards? If there were 12,367 cards?
We’ll start with a few small values of n: trivially when n=0, the number of cards retained, x, will also be 0. For n=1, x will also be 1. For n=2, the two cards will either be in descending or ascending order, so x could be either 1 or 2, with equal probability, so X (big x, the average value of x) will be 1.5.
Since no cards will be retained after the maximum card has been seen, it is useful to consider where in the deck the maximum card might be and what the consequences are:
Consider the case with three cards. If the 3 card comes up first, which it will do in 1/3 of cases, x will be 1. If the 3 card comes up second, x will be 2. If the 3 card comes up last, x may be 2 or 3, depending on the order of the 1 and 2 cards.
In other words, when the maximum 3 card is first, there are 0 cards before it so the average number of cards retained before reaching the maximum card is 0, so the count will be 1 more than that, 1.
If the maximum card is second, there is 1 card before it so the average number of cards retained before reaching the maximum card is 1, so the count will be 1 more than that, 2.
If the maximum card is third, there are 2 cards before it so the average number of cards retained before reaching the maximum card is 1.5, so the count will be 1 more than that, 2.5.
Those three cases are equally likely, so we can express X3 in terms of previous X values:
X3 = ((X0) + 1 + (X1) +1 + (X2) +1)/3
We can generalise this so that any Xn can be expressed as an equation involving previous X values by extending the idea of considering where the maximum card is and how many cards are retained prior to reaching it:
Xn = ((X0) + (X1) + … + (Xn-1) + n)/n
Now a very interesting thing happens if you compare the equation for Xn with that of Xn-1
Xn-1 = ((X0) + (X1) + … + (Xn-2) + n-1)/(n-1)
In the first equation you multiply both sides by n, and in the second equation by (n-1) to get the following two equations:
n(Xn) = (X0) + (X1) + … + (Xn-1) + n
(n-1)(Xn-1) = (X0) + (X1) + … + (Xn-2) + n-1
Now if you take the difference between those two equations almost all the terms cancel out:
n(Xn) - (n-1)(Xn-1) = (Xn-1) + 1
Which then simplifies to:
(Xn) – (Xn-1) = 1/n
This is great as it means that since X0 = 0, the sequence of Xn is merely the partial sum of the harmonic series:
1 + 1/2 + 1/3 + 1/4 +1/5 …
… a series about which we know a great deal.
We know for instance that the sequence does not converge but tends to infinity. This rings true when we consider the numbered cards, as we would expect Xn to keep creeping up as the number of cards increased and it would seem very odd for it to converge to an upper bound.
What is surprising I feel, when considering it in terms of the cards, is just how slowly Xn does increase.
It seems quite reasonable that Xn first exceeds 3 when n is 11, that is when there are 11 numbered cards, the average number of cards retained is around 3.
However, when we calculate when the expected number of retained cards first exceeds 10, the number of cards would be 12367 (!).