veridis quo
I posed the problem of finding the execution time of increment sort without first figuring it out myself. What folly! Here goes:
Obviously the best-case is O(n) to check all elements of a pre-sorted list, and the worst-case is O(∞), as is common among randomized algorithms. But for the average case, not only does the running time depend on the sizes and spread of the elements in mysterious ways, but it turns out that merely determining the expected number of iterations is hopelessly complicated for all but the simplest of inputs (at least, using only high-school mathematics).
So let’s look at the simplest non-trivial input, the list [1,0]. With just two elements, we’re essentially flipping a coin to decide which element to increment, repeating until the second element catches up to the first. Since we only require that the number of tails is one more than the number of heads, coin-toss sequences are isomorphic to downward paths in Pascal’s triangle from \( \binom{1}{1} \) to \( \binom{2k+1}{k+1} \) [Edit: where k is the number of heads].
The total number of paths to \( \binom{2k+1}{k+1} \) in Pascal’s triangle is, of course, exactly this binomial coefficient. However, note that valid paths must stay on the left side of the triangle until the last minute, so we’re actually interested in paths to \( \binom{2k}{k} \) in a “modified” triangle that only counts left-side paths. It’s not too hard to convince yourself that these numbers are
$$ \binom{n}{r}’ = \binom{n}{r} \frac{n – 2r + 1}{n – r + 1} $$
which can be proved via induction, after verifying the base-case identities \( \binom{n}{0}’ = 1 \) and \( \binom{2n+1}{n+1}’ = 0 \).
Now, the probability of choosing any particular path to row \(2k+1\) is \( 2^{-(2k+1)} \), and so the expected number of iterations to sort [1,0] is
$$ \sum_{k=0}^\infty \frac{ 2k+1 }{ 2^{2k+1} (k+1) } \binom{2k}{k} $$
Stirling’s super-handy approximation gives
$$ \binom{2k}{k} \sim \frac{4^k}{\sqrt{\pi k}} $$
and simplifies the expected value sum to approximately
$$ \sum_{k=0}^\infty \frac{2k+1 }{ 2(k+1)\sqrt{\pi k} } $$
Hm. This most certainly does not converge. Which means increment sort is O(∞) on average!
Well then. The astounding ineffectiveness of this algorithm is just beginning to dawn on me.
Comments (3)
May 9th, 2010 at 6:22 am
I can’t really follow your maths — where did the thing about coin-toss sequences being the same as $\binom{2k+1}{k+1}$? I assume $k$ is the number of coin tosses.
The number of ways of tossing $n$ coins and getting $k$ heads is $\binom{n}{k}$. That’s where the Pascal triangle stuff comes in – the number at the $k$’th column in the $n$’th row of Pascal’s triangle is precisely $\binom{n}{k}$.
You don’t even need to get that complicated though. Increment sort of [1,0] will take $n$ steps only if it looks at the wrong element $n-1$ times, and then finally looks at the correct element once. This set-up is within grasping distance of high-school maths: it’s the geometric distribution. Using that fact, the expected number of iterations is $\frac{1}{0.5} = 2$.
May 9th, 2010 at 6:12 pm
Oops, I wasn’t clear enough about the relationship to traversals of Pascal’s triangle. I used 2k+1 just to force the number of coin-tosses to be odd. A successful sort of
[1,0]always has k heads and k+1 tails, for some positive integer k. That’s where $\binom{2k+1}{k+1}$ comes from.And again, this binomial coefficient needs to be adjusted to include only valid coin-toss sequences, i.e. those that wouldn’t have terminated any earlier.
Judging from your last paragraph though, it looks like you might not have grasped the profound ineffectiveness of increment sort yet. Remember: wrong elements are also incremented when chosen, which moves the list farther away from sorted-ness. That’s why Pascal’s machinery is invoked.
June 11th, 2010 at 6:09 am
He finally found the answer!