Main content

## Computer science

# Overview of quicksort

Like merge sort, quicksort uses divide-and-conquer, and so it's a recursive algorithm. The way that quicksort uses divide-and-conquer is a little different from how merge sort does. In merge sort, the divide step does hardly anything, and all the real work happens in the combine step. Quicksort is the opposite: all the real work happens in the divide step. In fact, the combine step in quicksort does absolutely nothing.

Quicksort has a couple of other differences from merge sort. Quicksort works in place. And its worst-case running time is as bad as selection sort's and insertion sort's: \Theta, left parenthesis, n, squared, right parenthesis. But its average-case running time is as good as merge sort's: \Theta, left parenthesis, n, log, start base, 2, end base, n, right parenthesis.

So why think about quicksort when merge sort is at least as good? That's because the constant factor hidden in the big-Θ notation for quicksort is quite good. In practice, quicksort outperforms merge sort, and it significantly outperforms selection sort and insertion sort.

Here is how quicksort uses divide-and-conquer. As with merge sort, think of sorting a subarray

`array[p..r]`

, where initially the subarray is `array[0..n-1]`

.**Divide**by choosing any element in the subarray`array[p..r]`

. Call this element the**pivot**.Rearrange the elements in`array[p..r]`

so that all elements in`array[p..r]`

that are less than or equal to the pivot are to its left and all elements that are greater than the pivot are to its right. We call this procedure**partitioning**. At this point, it doesn't matter what order the elements to the left of the pivot are in relation to each other, and the same holds for the elements to the right of the pivot. We just care that each element is somewhere on the correct side of the pivot.As a matter of practice, we'll always choose the rightmost element in the subarray,`array[r]`

, as the pivot. So, for example, if the subarray consists of [9, 7, 5, 11, 12, 2, 14, 3, 10, 6], then we choose 6 as the pivot. After partitioning, the subarray might look like [5, 2, 3, 6, 12, 7, 14, 9, 10, 11]. Let`q`

be the index of where the pivot ends up.**Conquer**by recursively sorting the subarrays`array[p..q-1]`

(all elements to the left of the pivot, which must be less than or equal to the pivot) and`array[q+1..r]`

(all elements to the right of the pivot, which must be greater than the pivot).**Combine**by doing nothing. Once the conquer step recursively sorts, we are done. Why? All elements to the left of the pivot, in`array[p..q-1]`

, are less than or equal to the pivot and are sorted, and all elements to the right of the pivot, in`array[q+1..r]`

, are greater than the pivot and are sorted. The elements in`array[p..r]`

can't help but be sorted!Think about our example. After recursively sorting the subarrays to the left and right of the pivot, the subarray to the left of the pivot is [2, 3, 5], and the subarray to the right of the pivot is [7, 9, 10, 11, 12, 14]. So the subarray has [2, 3, 5], followed by 6, followed by [7, 9, 10, 11, 12, 14]. The subarray is sorted.

The base cases are subarrays of fewer than two elements, just as in merge sort. In merge sort, you never see a subarray with no elements, but you can in quicksort, if the other elements in the subarray are all less than the pivot or all greater than the pivot.

Let's go back to the conquer step and walk through the recursive sorting of the subarrays. After the first partition, we have subarrays of [5, 2, 3] and [12, 7, 14, 9, 10, 11], with 6 as the pivot.

To sort the subarray [5, 2, 3], we choose 3 as the pivot. After partitioning, we have [2, 3, 5]. The subarray [2], to the left of the pivot, is a base case when we recurse, as is the subarray [5], to the right of the pivot.

To sort the subarray [12, 7, 14, 9, 10, 11], we choose 11 as the pivot. After partitioning, we have [7, 9, 10] to the left of the pivot and [14, 12] to the right. Then the subarrays are sorted,
resulting in [7, 9, 10], followed by 11, followed by [12, 14].

Here is how the entire quicksort algorithm unfolds. Array locations in blue have been pivots in previous recursive calls, and so the values in these locations will not be examined or moved again:

This content is a collaboration of Dartmouth Computer Science professors Thomas Cormen and Devin Balkcom, plus the Khan Academy computing curriculum team. The content is licensed CC-BY-NC-SA.

## Want to join the conversation?

- Why select the right element as the pivot? Why not the median of three method, which is supposed to do it better?(32 votes)
- Pivot selection is an important part of quick sort and there are many techniques, all with pros and cons.
**Why does the pivot matter**?

On average quick sort runs in O(n log n) but if it consistently chooses bad pivots its performance degrades to O(n^2)

This happens if the pivot is consistently chosen so that all (or too many of) the elements in the array are < the pivot or > than the pivot.

(A classic case is when the first or last element is chosen as a pivot and the data is already sorted, or nearly sorted)**What are some techniques to choose a pivot**?

Choose the left most or rightmost element.

Pros: Simple to code, fast to calculate

Cons: If the data is sorted or nearly sorted, quick sort will degrade to O(n^2)

Choose the middle element:

Pros: Simple to code, fast to calculate, but slightly slower than the above methods

Cons: Still can degrade to O(n^2). Easy for someone to construct an array that will cause it to degrade to O(n^2)

Choose the median of three:

Pros: Fairly simple to code, reasonably fast to calculate, but slightly slower than the above methods

Cons: Still can degrade to O(n^2). Fairly easy for someone to construct an array that will cause it to degrade to O(n^2)

Choose the pivot randomly (using built in random function):

Pros: Simple to code. Harder for someone to construct an array that will cause it to degrade to O(n^2)

Cons: Selecting a random pivot is fairly slow. Still can degrade to O(n^2).

Choose the pivot randomly (using a custom built random function):

Pros: Much harder for someone to construct an array that will cause it to degrade to O(n^2), if they don't know how you are choosing the random numbers.

Cons: May be complicated to code. Selecting a random pivot is fairly slow. Still theoretically possible that it can degrade to O(n^2).

Use the median of medians method to select a pivot

Pros: The pivot is guaranteed to be good. Quick sort is now O(n log n) worst case !

Cons: Complicated code. Typically, a lot slower than the above methods.**Which method should I use**?

-If it is unlikely that the data will be sorted, and you are willing to accept O(n^2) in the rare cases when the array is sorted then use the leftmost or rightmost element.

-If there is a reasonable chance your data is sorted use the middle element or median of threes.

-If you are somewhat worried about malicious users giving you bad arrays to sort (used as a Denial of Service attack) then use random pivots.

-If you are really worried about malicious users or you need to guarantee that the quicksort runs is O(n log n) then use the median of medians. At this point you may want to seriously consider using a different sorting method like merge sort.

Hope this makes sense(190 votes)

- The second paragraph says, "constant factor hidden in the big-Θ notation for quicksort is quite good"

can someone tell me what is this hidden constant factor. Is it do with the time it takes to partition versus the time it takes to merge in merge sort.(8 votes)- Here's what they are talking about:

Both merge sort and quicksort are Θ(n log n) algorithms (for the average case).

So quicksort's running time will be ~k1 * n log n and merge sort's running time will be ~k2 * n log n (for the average case). When they are talking about the "constant factor"s, they are talking about k1 and k2. The values of k1 and k2 depend on the implementation of the algorithms, the computer used etc. In practice, if you ran quick sort and merge sort on the same computer, you would find that k1 is much smaller than k2.

Hope this makes sense(20 votes)

- After the first pivot - '6' is chosen, I can understand the left array being 5,2,3 since that is the order that they'll be visited in the original array. But how does the right subarray become 12,7,14,9,10,11 ?

Shouldn't it be 9,7,11,12,14,10?

I realize that in the bigger context of sorting, the subarray does not matter. But I'm just curious to find out how the subarray became 12,7,14,9,10,11.(13 votes)- It’s because quicksort doesn’t care relation among elements inside subarray. As long as they are smaller/equal or greater than the chosen pivot, they are placed left or right of the pivot accordingly.(1 vote)

- While working on the challenge, I came across something strange with the recursion calls and how to use the pivot point,
`q`

. Here is my quickSort function so far:

The above code produces the correct output but is not accepted by the grader.`var quickSort = function(array, p, r) {`

if (2 <= r-p) {

var q = partition(array, p, r);

quickSort(array, p, q-1);

quickSort(array, q, r);

}

};

If I change the quickSort recursion calls to the example below:

The wrong output is produced because there is an index in the array not being included in the sorting process.`quickSort(array, p, q-1);`

quickSort(array, q+1, r);

Now, I change it to this:

And I get an Oh Noes message that tells me there is an "InternalError: too much recursion". Why is this happening? How should I get past this issue?`quickSort(array, p, q);`

quickSort(array, q+1, r);(6 votes)- Hey, so you have this almost right. Your error, I believe, is stemming from your if statement, but your second edit of the quickSort calls is also the correct call.

If statement should be:`if ( p < r )`

This ensure that the array is of adequate length, because if the array is 1 or zero, p would either equal r or not exist.

Your quickSort calls should be:`quickSort(array, p, q-1);`

quickSort(array, q+1, r);

The partition places the pivot in the correct spot and returns the index. You then use the pivot index to sort the two remaining arrays, and since the pivot is already "sorted" you just sort the subarrays to indexes one below and one above the pivot. Once all the recursion occurs, everything is sorted in place and it should return the whole array sorted.(13 votes)

- In the Challenge, Implement quicksort, I played with ( < ); until I got it to work with (p < r), which I really do not understand. Can someone please explain why it works? Thanx.d(3 votes)
- "The quickSort function should recursively sort the subarray array[p..r]."

If p = r then you have a one element array that is, by definition, already sorted. So, you only need to sort subarrays that have more than one element.

How many elements does array[p..r] have ?

It has r-p+1 elements.

So we need to sort when r-p+1 > 1

Which is equivalent to:

r-p > 0,

r > p,

p < r,

etc.(5 votes)

- So this algorithm basically splits it in half, the halves are bigger/smaller than the middle one, then you sort them and put them together?(2 votes)
- That's the basic idea.

However, we don't really know which element is the "middle one", so we just pick an element. As a result, the smaller and bigger chunks may not be the same size. That is, we may not actually be splitting the array in half.

If the data is randomly ordered it works pretty well i.e. the big and small chunks will be close to the same size. However, on already sorted data, it can fail miserably, as one chunk can have 1 element while the other chunk has the rest.(6 votes)

- No matter what I do, it always says maximum call stack exceeded. I don't understand how we can evaluate subarrays, because we have not had the pivot yet(2 votes)
- If you are getting max call stack exceeded, i.e. a stack overflow, it is likely because your recursion is not reaching the base case. Instead it is infinitely calling itself, kind of like an infinite loop.

Make sure your base case is set up right, and that your recursion makes the problem smaller each call. To debug, try printing out the parameters that the function was called with at the top of the function.(6 votes)

- Can I get a more precise easy to understand definition of working in place?(4 votes)
- How many comparisons in total were there in the example?(2 votes)
- I believe that there are a total of 20 comparisons made in the example.(3 votes)

- array[p..r]

What symbolic mearning do the letters p and r have here, what are they referring to?(1 vote)- p is the leftmost index of the subarray

r is the rightmost index of the subarray

q is the index of the pivot (after the subarray has been partitioned)(3 votes)