Wednesday, March 10, 2010

Wednesday Math, Vol. 112: Quicksort

It's common though misleading to think of computers as smart. They are phenomenally good at two parts of the way we often measure intelligence, they move at crazy fast speeds and they never make a mistake, which is to say they will do what they are told. But a computer has no gift for original thought, so presented with a problem in a form it has never seen before, a computer program has no way of moving forward unless the programmer was clever enough to consider the possibility beforehand.

Computer science became a major branch of mathematics in the 20th Century, and it now rivals older parts of math like calculus in terms of importance. Much of the field involves finding ways to do tasks that are more efficient than the "common sense" approach a human might naturally take.

We don't think that much about our methods of doing simple tasks. For example, let's say I have a pile of quizzes on my desk and I want to sort them alphabetically. What method do I use? Personally, I sort the papers first into three piles, A through I, J through Q, R through Z, then I sort the piles of papers one by one, taking a quiz from the unsorted pile and putting it into its correct position in the sorted group of papers I hold in my hand. Once I sort A-I, I sort J-Q and put that sorted stack after, then sort R-Z and put that stack at the end. If there are n papers in the stack, this should take n²/6 + n comparison operations. In computer science, we would say this is a method that works in the order of n².

Coming up with faster ways to do things on a computer can pay big dividends. A British mathematician named Tony Hoare invented an algorithm called Quicksort, a method nearly no human would use for sorting on his or her own but which fits a computer's skills very well.

Let's start with a list of numbers we want to sort.

31, 94, 69, 62, 25, 9, 20, 13, 11, 37, 5

The idea of the first pass of Quicksort is to put 31 in its proper position on the list, to put everything bigger than 31 behind 31 and everything smaller than 31 in front of it.

We start count from the front and from the back simultaneously, marked with a green star and a red star. When we find a number smaller than 31 at the back we stop and look for numbers bigger than 31 near the front. In this case, we have a 94 at the front and a 5 at the back. Switch them. Move the green star forward and the red star backward. As you can see we make several switches, 69 for 11, 62 for 13, until the green star and red star are at the same position, the number 20. Since 20 is less than 31, we now switch those two. The list has been split into three parts: The numbers less than 31 in front, followed by 31 itself, followed by the numbers bigger than 31. This is called a divide and conquer strategy. We will do the same divide and conquer on the two smaller lists and on each split after that. Instead of having an algorithm that works in the order of n² operations, Quicksort should sort the list in about n × logn comparisons.

Here come the big payoffs. If the list has 100 things in it, a standard sorting method will make about 5,000 comparisons, while Quicksort will make about 800 comparisons. If the list had a million things in it, which is not an impossible size for a computer, sorting it "normally" will take about 500 billion comparisons, while Quicksort would expect to do only about 30 million comparisons. The time savings get better in terms of percentage as the lists get longer.

You might think of both 30 million and 500 billion as being numbers too large to deal with, but if you have a computer on your desk that is less than ten years old, it can probably perform many millions of such operations in a second, so the 30 million operation program might run in no more than a few seconds, while the 500 billion operation program could take hours to finish, maybe even half a day.

Yay, computer science! Yay, Tony Hoare! You are so cool, I won't even make any obvious jokes about your last name.


Anonymous said...

Finally a math post even I can understand. Took me two plus years visiting here each Wednesday to find it. Put that math problem in your computer.

ken said...

I love Quicksort. So much computer science in such a small package.

Matty Boy said...

Z&M: I gotta believe you understood the gist of N/O and O/K a few weeks back, but if you say it's the first, we'll go with that. If this one went well, I guess I'll try MergeSort next week.

Ken: QuickSort is very pretty stuff. Only going through the first pass doesn't get to the power of recursion that it uses.