**What is an algorithm?** Our text defines an *algorithm* to be any well-defined computational procedure that takes some values as *input* and produces some values as *output*. Like a cooking recipe, an algorithm provides a step-by-step method for solving a computational problem. Unlike programs, algorithms are not dependent on a particular programming language, machine, system, or compiler. They are mathematical entities, which can be thought of as running on some sort of *idealized computer* with an infinite random-access memory and an unlimited word size. Algorithm design is all about the mathematical theory behind the design of good programs.

**Recurrences:** Another useful mathematical tool in algorithm analysis will be recurrences. They arise naturally in the analysis of divide-and-conquer algorithms. Recall that these algorithms have the following general structure.

**Divide:** Divide the problem into two or more subproblems (ideally of roughly equal sizes),

**Conquer:** Solve each subproblem recursively, and

**Combine:** Combine the solutions to the subproblems into a single global solution.
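The three steps above can be sketched concretely. Here is a small illustrative Python example of my own (summing a list by divide-and-conquer, not an algorithm from the text), just to show the pattern:

```python
def dc_sum(a):
    """Sum a nonempty list by divide-and-conquer (illustrative only)."""
    n = len(a)
    if n == 1:                  # base case: a trivial subproblem
        return a[0]
    mid = n // 2                # Divide: split into two halves
    left = dc_sum(a[:mid])      # Conquer: solve each half recursively
    right = dc_sum(a[mid:])
    return left + right         # Combine: merge the two subsolutions

print(dc_sum([3, 1, 4, 1, 5, 9, 2, 6]))  # → 31
```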

How do we analyze recursive procedures like this one? If there is a simple pattern to the sizes of the recursive calls, then the best way is usually by setting up a *recurrence*, that is, a function which is defined recursively in terms of itself. Here is a typical example. Suppose that we break the problem into two subproblems, each of size roughly *n/*2. (We will assume exactly *n/*2 for simplicity.) The additional overhead of splitting and merging the solutions is *O*(*n*). When the subproblems are reduced to size 1, we can solve them in *O*(1) time. We will ignore constant factors, writing *O*(*n*) just as *n*, yielding the following recurrence:

*T*(*n*) = 1 if *n* = 1,
*T*(*n*) = 2 *T*(*n/*2) + *n* if *n* > 1.
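As a quick sanity check (a Python sketch of my own, not from the text), we can evaluate the recurrence directly and compare it against the closed form *T*(*n*) = *n* lg *n* + *n*, which holds for powers of 2 and is easy to verify by induction:

```python
from functools import lru_cache
from math import log2

@lru_cache(maxsize=None)
def T(n):
    # The recurrence from the text, defined for n a power of 2.
    if n == 1:
        return 1
    return 2 * T(n // 2) + n

# Compare against the closed form T(n) = n lg n + n.
for k in range(0, 11):
    n = 2 ** k
    assert T(n) == n * log2(n) + n
```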

Note that, since we assume that *n* is an integer, this recurrence is not well defined unless *n* is a power of 2 (since otherwise *n/*2 will at some point be a fraction). To be formally correct, I should either write ⌊*n/*2⌋ or restrict the domain of *n*, but I will often be sloppy in this way.

There are a number of methods for solving the sort of recurrences that show up in divide-and-conquer algorithms. The easiest method is to apply the *Master Theorem*, given in CLRS. Here is a slightly more restrictive version, but adequate for a lot of instances. See CLRS for the more complete version of the Master Theorem and its proof.

**Theorem:** (Simplified Master Theorem) Let *a* ≥ 1, *b* > 1 be constants and let *T*(*n*) be the recurrence

*T*(*n*) = *aT*(*n/b*) + *cn*^*k*,

defined for *n* ≥ 0.

**Case 1:** If *a* > *b*^*k* then *T*(*n*) is Θ(*n*^(log_*b* *a*)).

**Case 2:** If *a* = *b*^*k* then *T*(*n*) is Θ(*n*^*k* log *n*).

**Case 3:** If *a* < *b*^*k* then *T*(*n*) is Θ(*n*^*k*).

Using this version of the Master Theorem we can see that in our recurrence *a* = 2, *b* = 2, and *k* = 1, so *a* = *b*^*k* and Case 2 applies. Thus *T*(*n*) is Θ(*n* log *n*).

There are many recurrences that cannot be put into this form. For example, the following recurrence is quite common: *T*(*n*) = 2 *T*(*n/*2) + *n* log *n*. This solves to *T*(*n*) = Θ(*n* log²*n*), but the Master Theorem (either this form or the one in CLRS) will not tell you this. For such recurrences, other methods are needed.
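The three-way case analysis above is mechanical, so it can be packaged into a small Python helper (a sketch of my own; the function name and output strings are not from CLRS):

```python
import math

def simplified_master(a, b, k):
    """Return the Theta class of T(n) = a*T(n/b) + c*n^k
    per the simplified Master Theorem (requires a >= 1, b > 1)."""
    if a > b ** k:
        # Case 1: T(n) is Theta(n^(log_b a))
        return f"Theta(n^{math.log(a, b):.3g})"
    elif a == b ** k:
        # Case 2: T(n) is Theta(n^k log n)
        return f"Theta(n^{k} log n)"
    else:
        # Case 3: T(n) is Theta(n^k)
        return f"Theta(n^{k})"

print(simplified_master(2, 2, 1))  # our recurrence: Theta(n^1 log n)
print(simplified_master(4, 2, 1))  # e.g. 4 subproblems of size n/2
```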

**Lecture 3: Review of Sorting and Selection**

**Read:** Review Chapts. 6–9 in CLRS.

Lecture Notes 7 CMSC 451

**Review of Sorting:** Sorting is among the most basic problems in algorithm design. We are given a sequence of items, each associated with a given *key value*. The problem is to permute the items so that they are in increasing (or decreasing) order by key. Sorting is important because it is often the first step in more complex algorithms.

Sorting algorithms are usually divided into two classes, *internal sorting algorithms*, which assume that data is stored in an array in main memory, and *external sorting algorithms*, which assume that data is stored on disk or some other device that is best accessed sequentially. We will only consider internal sorting.

You are probably familiar with one or more of the standard simple Θ(*n*²) sorting algorithms, such as *InsertionSort*, *SelectionSort*, and *BubbleSort*. (By the way, these algorithms are quite acceptable for small lists of, say, fewer than 20 elements.) BubbleSort is the easiest one to remember, but it is widely considered to be the worst of the three.
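For reference, here is a minimal Python sketch of InsertionSort (the details are my own; see CLRS for the canonical pseudocode):

```python
def insertion_sort(a):
    # Theta(n^2) in the worst case, but fast in practice for small
    # or nearly-sorted lists; sorts the list in place.
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:   # shift larger elements right
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key                 # drop the key into its slot
    return a

print(insertion_sort([5, 2, 9, 1]))  # → [1, 2, 5, 9]
```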

The three canonical efficient comparison-based sorting algorithms are *MergeSort*, *QuickSort*, and *HeapSort*. All run in Θ(*n* log *n*) time. Sorting algorithms often have additional properties that are of interest, depending on the application. Here are two important properties.

**In-place:**The algorithm uses no additional array storage, and hence (other than perhaps the system’s recursion

stack) it is possible to sort very large lists without the need to allocate additional working storage.

**Stable:** A sorting algorithm is stable if two elements that are equal remain in the same relative position after sorting is completed. This is of interest, since in some sorting applications you sort first on one key and then on another. It is nice to know that two items that are equal on the second key remain sorted on the first key.
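Python's built-in sort is stable, so it can illustrate this two-key pattern directly (the records below are made up for the example):

```python
# Sort on the secondary key (name) first, then on the primary key
# (group number). Stability guarantees that within each group the
# names remain in sorted order.
records = [("Smith", 2), ("Jones", 1), ("Adams", 2), ("Baker", 1)]

by_name = sorted(records)                        # sort on name
by_group = sorted(by_name, key=lambda r: r[1])   # then on group

print(by_group)
# → [('Baker', 1), ('Jones', 1), ('Adams', 2), ('Smith', 2)]
```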

Here is a quick summary of the fast sorting algorithms. If you are not familiar with any of these, check out the descriptions in CLRS. They are shown schematically in Fig. 1.

**QuickSort:** It works recursively, by first selecting a random "pivot value" from the array. Then it partitions the array into elements that are less than and greater than the pivot. Then it recursively sorts each part. QuickSort is widely regarded as the fastest of the fast sorting algorithms (on modern machines). One explanation is that its inner loop compares elements against a single pivot value, which can be stored in a register for fast access. The other algorithms compare two elements in the array. This is considered an *in-place* sorting algorithm, since it uses no other array storage. (It does implicitly use the system's recursion stack, but this is usually not counted.) It is *not stable*. There is a stable version of QuickSort, but it is not in-place. This algorithm is Θ(*n* log *n*) in the *expected case*, and Θ(*n*²) in the worst case. If properly implemented, the probability that the algorithm takes asymptotically longer (assuming that the pivot is chosen randomly) is extremely small for large *n*.

Fig. 1: Common *O*(*n* log *n*) comparison-based sorting algorithms. [Schematic only: QuickSort partitions about a pivot *x* into elements < *x* and > *x* and sorts each part; MergeSort splits, sorts, and merges; HeapSort builds a heap (buildHeap) and repeatedly calls extractMax.]
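A minimal Python sketch of the QuickSort idea with a random pivot (my own illustration; note that it builds new lists for clarity, so unlike the version described above it is *not* in-place):

```python
import random

def quicksort(a):
    # Expected Theta(n log n); uses extra lists, so this sketch is
    # NOT the in-place partitioning version described in the text.
    if len(a) <= 1:
        return a
    pivot = random.choice(a)                     # random pivot value
    less    = [x for x in a if x < pivot]
    equal   = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return quicksort(less) + equal + quicksort(greater)

print(quicksort([5, 2, 9, 1, 5, 6]))  # → [1, 2, 5, 5, 6, 9]
```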

**MergeSort:** MergeSort also works recursively. It is a classical divide-and-conquer algorithm. The array is split into two subarrays of roughly equal size. They are sorted recursively. Then the two sorted subarrays are merged together in Θ(*n*) time.

MergeSort is the only *stable* sorting algorithm of these three. The downside is that MergeSort is the only algorithm of the three that requires additional array storage (ignoring the recursion stack), and thus it is *not in-place*. This is because the merging process merges the two arrays into a third array. Although it is possible to merge arrays in-place, it cannot be done in Θ(*n*) time.
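A Python sketch of MergeSort (again my own illustration, not the CLRS pseudocode), showing both the Θ(*n*) merge into a third list and the `<=` comparison that makes the sort stable:

```python
def merge(left, right):
    # Merge two sorted lists into one in Theta(n) time, using a third
    # list -- this extra storage is why MergeSort is not in-place.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:   # <= keeps equal keys in order (stable)
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

def mergesort(a):
    if len(a) <= 1:
        return a
    mid = len(a) // 2             # split into two roughly equal halves
    return merge(mergesort(a[:mid]), mergesort(a[mid:]))

print(mergesort([5, 2, 9, 1, 5, 6]))  # → [1, 2, 5, 5, 6, 9]
```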