CE100 Algorithms and Programming II

CE100 Algorithms and Programming II

Week-3 (Matrix Multiplication/ Quick Sort)

Spring Semester, 2021-2022

Download DOC-PDF, DOC-DOCX, SLIDE, PPTX

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Matrix Multiplication / Quick Sort

Outline (1)

  • Matrix Multiplication

    • Traditional

    • Recursive

    • Strassen

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Outline (2)

  • Quicksort

    • Hoare Partitioning

    • Lomuto Partitioning

    • Recursive Sorting

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Outline (3)

  • Quicksort Analysis

    • Randomized Quicksort

    • Randomized Selection

      • Recursive

      • Medians

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Matrix Multiplication (1)

  • Input: A=[aij],B=[bij]A=[a_{ij}],B=[b_{ij}]
  • Output: C=[cij]=ABC=[c_{ij}]=A \cdot B i,j=1,2,3,,n\Longrightarrow i,j=1,2,3, \dots, n

[c11c12c1nc21c22c2ncn1cn2cnn]=[a11a12a1na21a22a2nan1an2ann][b11b12b1nb21b22b2nbn1an2bnn]\begin{bmatrix} c_{11} & c_{12} & \dots & c_{1n} \\ c_{21} & c_{22} & \dots & c_{2n} \\ \vdots & \vdots & \vdots & \ddots \\ c_{n1} & c_{n2} & \dots & c_{nn} \\ \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \vdots & \ddots \\ a_{n1} & a_{n2} & \dots & a_{nn} \\ \end{bmatrix} \cdot \begin{bmatrix} b_{11} & b_{12} & \dots & b_{1n} \\ b_{21} & b_{22} & \dots & b_{2n} \\ \vdots & \vdots & \vdots & \ddots \\ b_{n1} & a_{n2} & \dots & b_{nn} \\ \end{bmatrix}

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Matrix Multiplication (2)

alt:"alt" center

  • cij=1knaik.bkjc_{ij}=\sum \limits_{1\leq k \leq n}^{}a_{ik}.b_{kj}
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Matrix Multiplication: Standard Algorithm

Running Time: Θ(n3)\Theta(n^3)

for i=1 to n do
    for j=1 to n do
        C[i,j] = 0
        for k=1 to n do
            C[i,j] = C[i,j] + A[i,k] + B[k,j]
        endfor
    endfor
endfor
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Matrix Multiplication: Divide & Conquer (1)

IDEA: Divide the nxnnxn matrix into 2x22x2 matrix of (n/2)x(n/2)(n/2)x(n/2) submatrices.

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Matrix Multiplication: Divide & Conquer (2)

[c11c12c21c22]=[a11a12a21a22][b11b12b21b22]\begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \cdot \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}

8 mults and 4 adds of (n/2)*(n/2) submatrices={c11=a11b11+a12b21c21=a21b11+a22b21c12=a11b12+a12b22c22=a21b12+a22b22\text{8 mults and 4 adds of (n/2)*(n/2) submatrices}= \begin{cases} c_{11}=a_{11}b_{11}+a_{12}b_{21} \\ c_{21}=a_{21}b_{11}+a_{22}b_{21} \\ c_{12}=a_{11}b_{12}+a_{12}b_{22} \\ c_{22}=a_{21}b_{12}+a_{22}b_{22} \end{cases}

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Matrix Multiplication: Divide & Conquer (3)

MATRIX-MULTIPLY(A, B)
    // Assuming that both A and B are nxn matrices
    if n == 1 then 
        return A * B
    else  
        //partition A, B, and C as shown before
        C[1,1] = MATRIX-MULTIPLY (A[1,1], B[1,1]) + 
                 MATRIX-MULTIPLY (A[1,2], B[2,1]); 

        C[1,2] = MATRIX-MULTIPLY (A[1,1], B[1,2]) + 
                MATRIX-MULTIPLY (A[1,2], B[2,2]); 

        C[2,1] = MATRIX-MULTIPLY (A[2,1], B[1,1]) + 
        MATRIX-MULTIPLY (A[2,2], B[2,1]);

        C[2,2] = MATRIX-MULTIPLY (A[2,1], B[1,2]) + 
        MATRIX-MULTIPLY (A[2,2], B[2,2]);
    endif      

    return C
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Matrix Multiplication: Divide & Conquer Analysis

T(n)=8T(n/2)+Θ(n2)T(n) = 8T(n/2) + \Theta(n^2)

  • 88 recursive calls 8T()\Longrightarrow 8T(\cdots)
  • each problem has size n/2n/2 T(n/2)\Longrightarrow \cdots T(n/2)
  • Submatrix addition Θ(n2)\Longrightarrow \Theta(n^2)
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Matrix Multiplication: Solving the Recurrence

  • T(n)=8T(n/2)+Θ(n2)T(n) = 8T(n/2) + \Theta(n^2)

    • a=8a=8, b=2b=2
    • f(n)=Θ(n2)f(n)=\Theta(n^2)
    • nlogba=n3n^{log_b^a}=n^3
  • Case 1: nlogbaf(n)=Ω(nε)T(n)=Θ(nlogba)\frac{n^{log_b^a}}{f(n)}=\Omega(n^{\varepsilon}) \Longrightarrow T(n)=\Theta(n^{log_b^a})

Similar with ordinary (iterative) algorithm.

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Matrix Multiplication: Strassen’s Idea (1)

Compute c11,c12,c21,c22c_{11},c_{12},c_{21},c_{22} using 77 recursive multiplications.

In normal case we need 88 as below.

[c11c12c21c22]=[a11a12a21a22][b11b12b21b22]\begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \cdot \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}

8 mults and 4 adds of (n/2)*(n/2) submatrices={c11=a11b11+a12b21c21=a21b11+a22b21c12=a11b12+a12b22c22=a21b12+a22b22\text{8 mults and 4 adds of (n/2)*(n/2) submatrices}= \begin{cases} c_{11}=a_{11}b_{11}+a_{12}b_{21} \\ c_{21}=a_{21}b_{11}+a_{22}b_{21} \\ c_{12}=a_{11}b_{12}+a_{12}b_{22} \\ c_{22}=a_{21}b_{12}+a_{22}b_{22} \end{cases}

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Matrix Multiplication: Strassen’s Idea (2)

  • Reminder:
    • Each submatrix is of size (n/2)(n/2)(n/2)*(n/2)
    • Each add/sub operation takes Θ(n2)\Theta(n^2) time
  • Compute P1P7P1 \dots P7 using 77 recursive calls to matrix-multiply

P1=a11(b12b22)P2=(a11+a12)b22P3=(a21+a22)b11P4=a22(b21b11)P5=(a11+a22)(b11+b22)P6=(a12a22)(b21+b22)P7=(a11a21)(b11+b12)\begin{align*} P_1 & = a_{11} * (b_{12} - b_{22} ) \\ P_2 & = (a_{11} + a_{12} ) * b_{22} \\ P_3 & = (a_{21} + a_{22} ) * b_{11} \\ P_4 & = a_{22} * (b_{21} - b_{11} ) \\ P_5 & = (a_{11} + a_{22} ) * (b_{11} + b_{22} ) \\ P_6 & = (a_{12} - a_{22} ) * (b_{21} + b_{22} ) \\ P_7 & = ( a_{11} - a_{21} ) * (b_{11} + b_{12} ) \end{align*}

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Matrix Multiplication: Strassen’s Idea (3)

P1=a11(b12b22)P2=(a11+a12)b22P3=(a21+a22)b11P4=a22(b21b11)P5=(a11+a22)(b11+b22)P6=(a12a22)(b21+b22)P7=(a11a21)(b11+b12)\begin{align*} P_1 &= a_{11} * (b_{12} - b_{22} ) \\ P_2 &= (a_{11} + a_{12} ) * b_{22} \\ P_3 &= (a_{21} + a_{22} ) * b_{11} \\ P_4 &= a_{22} * (b_{21} - b_{11} ) \\ P_5 &= (a_{11} + a_{22} ) * (b_{11} + b_{22} ) \\ P_6 &= (a_{12} - a_{22} ) * (b_{21} + b_{22} ) \\ P_7 &= ( a_{11} - a_{21} ) * (b_{11} + b_{12} ) \end{align*}

  • How to compute cijc_{ij} using P1P7P1 \dots P7 ?

c11=P5+P4P2+P6c12=P1+P2c21=P3+P4c22=P5+P1P3P7\begin{align*} c_{11} & = P_5 + P_4 – P_2 + P_6 \\ c_{12} & = P_1 + P_2 \\ c_{21} & = P_3 + P_4 \\ c_{22} & = P_5 + P_1 – P_3 – P_7 \end{align*}

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Matrix Multiplication: Strassen’s Idea (4)

  • 77 recursive multiply calls
  • 1818 add/sub operations
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Matrix Multiplication: Strassen’s Idea (5)

e.g. Show that c12=P1+P2c_{12} = P_1+P_2 :

c12=P1+P2=a11(b12b22)+(a11+a12)b22=a11b12a11b22+a11b22+a12b22=a11b12+a12b22\begin{align*} c_{12} & = P_1 + P_2 \\ &= a_{11}(b_{12}–b_{22})+(a_{11}+a_{12})b_{22} \\ &= a_{11}b_{12}-a_{11}b_{22}+a_{11}b_{22}+a_{12}b_{22} \\ &= a_{11}b_{12}+a_{12}b_{22} \end{align*}

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Strassen’s Algorithm

  • Divide: Partition AA and BB into (n/2)(n/2)(n/2)*(n/2) submatrices. Form terms to be multiplied using ++ and -.

  • Conquer: Perform 77 multiplications of (n/2)(n/2)(n/2)*(n/2) submatrices recursively.

  • Combine: Form CC using ++ and on (n/2)(n/2)(n/2)*(n/2)submatrices.

Recurrence: T(n)=7T(n/2)+Θ(n2)T(n) = 7T(n/2) + \Theta(n^2)

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Strassen’s Algorithm: Solving the Recurrence (1)

  • T(n)=7T(n/2)+Θ(n2)T(n) = 7T(n/2) + \Theta(n^2)

    • a=7a=7, b=2b=2
    • f(n)=Θ(n2)f(n)=\Theta(n^2)
    • nlogba=nlg7n^{log_b^a}=n^{lg7}
  • Case 1: nlogbaf(n)=Ω(nε)T(n)=Θ(nlogba)\frac{n^{log_b^a}}{f(n)}=\Omega(n^{\varepsilon}) \Longrightarrow T(n)=\Theta(n^{log_b^a})

T(n)=Θ(nlog27)T(n)=\Theta(n^{log_2^7})

23=8,22=42^3 = 8, 2^2=4 so log272.81\Longrightarrow log_2^7 \approx 2.81

or use https://www.omnicalculator.com/math/log

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Strassen’s Algorithm: Solving the Recurrence (2)

  • The number 2.812.81 may not seem much smaller than 33

  • But, it is significant because the difference is in the exponent.

  • Strassen’s algorithm beats the ordinary algorithm on today’s machines for n30n \geq 30 or so.

  • Best to date: Θ(n2.376)\Theta(n^{2.376 \dots}) (of theoretical interest only)

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Maximum Subarray Problem

Input: An array of values

Output: The contiguous subarray that has the largest sum of elements

  • Input array:
    [13][3][25][20][3][16][23][18][20][7][12]max. contiguous subarray[22][4][7][13][-3][-25][20][-3][-16][-23]\overbrace{[18][20][-7][12]}^{\textrm{max. contiguous subarray}}[-22][-4][7]
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Maximum Subarray Problem: Divide & Conquer (1)

  • Basic idea:
    • Divide the input array into 2 from the middle
    • Pick the best solution among the following:
      • The max subarray of the left half
      • The max subarray of the right half
      • The max subarray crossing the mid-point
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Maximum Subarray Problem: Divide & Conquer (2)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Maximum Subarray Problem: Divide & Conquer (3)

  • Divide: Trivial (divide the array from the middle)

  • Conquer: Recursively compute the max subarrays of the left and right halves

  • Combine: Compute the max-subarray crossing the midpointmid-point

    • (can be done in Θ(n)\Theta(n) time).
    • Return the max among the following:
      • the max subarray of the left-subarray\text{left-subarray}
      • the max subarray of the rightsubarray\text{rightsubarray}
      • the max subarray crossing the mid-point\text{mid-point}

TODO : detailed solution in textbook...

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Conclusion : Divide & Conquer

  • Divide and conquer is just one of several powerful techniques for algorithm design.
  • Divide-and-conquer algorithms can be analyzed using recurrences and the master method (so practice this math).
  • Can lead to more efficient algorithms
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Quicksort (1)

  • One of the most-used algorithms in practice
  • Proposed by C.A.R. Hoare in 1962.
  • Divide-and-conquer algorithm
  • In-place algorithm
    • The additional space needed is O(1)
    • The sorted array is returned in the input array
    • Reminder: Insertion-sort is also an in-place algorithm, but Merge-Sort is not in-place.
  • Very practical
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Quicksort (2)

  • Divide: Partition the array into 2 subarrays such that elements in the lower part \leq elements in the higher part
  • Conquer: Recursively sort 2 subarrays
  • Combine: Trivial (because in-place)

Key: Linear-time (Θ(n))(\Theta(n)) partitioning algorithm

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Divide: Partition the array around a pivot element

  • Choose a pivot element xx

  • Rearrange the array such that:

    • Left subarray: All elements x\leq x
    • Right subarray: All elements x\geq x

    alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Conquer: Recursively Sort the Subarrays

Note: Everything in the left subarray ≤ everything in the right subarray

alt:"alt" center

Note: Combine is trivial after conquer. Array already sorted.

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Two partitioning algorithms

  • Hoare’s algorithm:
    Partitions around the first element of subarray
    • (pivot=x=A[p])(pivot = x = A[p])

alt:"alt" center

  • Lomuto’s algorithm:
    Partitions around the last element of subarray
    • (pivot=x=A[r])(pivot = x = A[r])

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Hoare’s Partitioning Algorithm (1)

  • Choose a pivot element: pivot=x=A[p]pivot = x = A[p]

alt:"alt" center

  • Grow two regions:
    • from left to right: A[pi]A[p \dots i]
    • from right to left: A[jr]A[j \dots r]
      • such that:
    • every element in A[pi]A[p \dots i] \leq pivot
    • every element in A[pi]A[p \dots i] \geq pivot
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Hoare’s Partitioning Algorithm (2)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Hoare’s Partitioning Algorithm (3)

  • Elements are exchanged when
    • A[i]A[i] is too large to belong to the left region
    • A[j]A[j] is too small to belong to the right region
      • assuming that the inequality is strict
  • The two regions A[pi]A[p \dots i] and A[jr]A[j \dots r] grow until A[i]pivotA[j]A[i] \geq pivot \geq A[j]
  H-PARTITION(A, p, r)
        pivot = A[p]
          i = p - 1
          j = r - 1
          while true do
            repeat j = j - 1 until A[j] <= pivot
            repeat i = i - 1 until A[i] <= pivot
          if i < j then 
            exchange A[i] with A[j]
          else 
            return j
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Hoare’s Partitioning Algorithm Example (Step-1)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Hoare’s Partitioning Algorithm Example (Step-2)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Hoare’s Partitioning Algorithm Example (Step-3)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Hoare’s Partitioning Algorithm Example (Step-4)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Hoare’s Partitioning Algorithm Example (Step-5)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Hoare’s Partitioning Algorithm Example (Step-6)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Hoare’s Partitioning Algorithm Example (Step-7)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Hoare’s Partitioning Algorithm Example (Step-8)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Hoare’s Partitioning Algorithm Example (Step-9)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Hoare’s Partitioning Algorithm Example (Step-10)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Hoare’s Partitioning Algorithm Example (Step-11)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Hoare’s Partitioning Algorithm Example (Step-12)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Hoare’s Partitioning Algorithm - Notes

  • Elements are exchanged when
    • A[i]A[i] is too large to belong to the left region
    • A[j]A[j] is too small to belong to the right region
      • assuming that the inequality is strict
  • The two regions A[pi]A[p \dots i] and A[jr]A[j \dots r] grow until A[i]pivotA[j]A[i] \geq pivot \geq A[j]
  • The asymptotic runtime of Hoare’s partitioning algorithm Θ(n)\Theta(n)
  H-PARTITION(A, p, r)
        pivot = A[p]
          i = p - 1
          j = r - 1
          while true do
            repeat j = j - 1 until A[j] <= pivot
            repeat i = i - 1 until A[i] <= pivot
          if i < j then exchange A[i] with A[j]
          else return j
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Quicksort with Hoare’s Partitioning Algorithm

QUICKSORT (A, p, r)
  if p < r then
    q = H-PARTITION(A, p, r)
    QUICKSORT(A, p, q)
    QUICKSORT(A, q + 1, r)
  endif

Initial invocation: QUICKSORT(A,1,n)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Hoare’s Partitioning Algorithm: Pivot Selection

  • if we select pivot to be A[r]A[r] instead of A[p]A[p] in H-PARTITION

alt:"alt" center

alt:"alt" center

  • Consider the example where A[r]A[r] is the largest element in the array:
    • End of H-PARTITION: i=j=ri = j = r
    • In QUICKSORT: q=rq = r
      • So, recursive call to:
        • QUICKSORT(A, p, q=r)
          • infinite loop
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Correctness of Hoare’s Algorithm (1)

We need to prove 33 claims to show correctness:

  • Indices ii and jj never reference AA outside the interval A[pr]A[p \dots r]
  • Split is always non-trivial; i.e., jrj \neq r at termination
  • Every element in A[pj]A[p \dots j] \leq every element in A[j+1r]A[j+1 \dots r] at termination

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Correctness of Hoare’s Algorithm (2)

  • Notations:

    • kk: #\# of times the while-loop iterates until termination
    • imi_m: the value of index i at the end of iteration mm
    • jmj_m: the value of index j at the end of iteration mm
    • xx: the value of the pivot element
  • Note: We always have i1=pi_1= p and pj1rp \leq j_1 \leq r
    because x=A[p]x = A[p]

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Correctness of Hoare’s Algorithm (3)

Lemma 1: Either ik=jki_k = j_k or ik=jk+1i_k = j_k+1 at termination

Proof of Lemma 1:

  • The algorithm terminates when iji \geq j (the else condition).

  • So, it is sufficient to prove that ikjk1i_k – j_k \leq 1

  • There are 22 cases to consider:

    • Case 1: k=1k = 1, i.e. the algorithm terminates in a single iteration
    • Case 2: k>1k > 1, i.e. the alg. does not terminate in a single iter.

    By contradiction, assume there is a run with ikjk>1i_k – j_k > 1

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Correctness of Hoare’s Algorithm (4)

Original correctness claims:

  • Indices ii and jj never reference A outside the interval A[pr]A[p \dots r]
  • Split is always non-trivial; i.e., jrj \neq r at termination

Proof:

  • For k=1k = 1:
    • Trivial because i1=j1=pi_1 = j_1 = p (see Case 1 in proof of Lemma 2)
  • For k>1k > 1:
    • ik>pi_k > p and jk<rj_k < r (due to the repeat-until loops moving indices)
    • ikri_k \leq r and jkpj_k \geq p (due to Lemma 1 and the statement above)

The proof of claims (a) and (b) complete

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Correctness of Hoare’s Algorithm (5)

Lemma 2: At the end of iteration mm, where m<km<k (i.e. m is not the last iteration), we must have:
A[pim]xA[p \dots i_m] \leq x and A[jmr]xA[j_m \dots r] \geq x

Proof of Lemma 2:

  • Base case: m=1m=1 and k>1k>1 (i.e. the alg. does not terminate in the first iter.)

Ind. Hyp.: At the end of iteration m1m-1, where m<km<k (i.e. m is not the last iteration), we must have:
A[pim1]xA[p \dots i_m-1] \leq x and A[jm1r]xA[j_m-1 \dots r] \geq x

General case: The lemma holds for mm, where m<km < k

Proof of base case complete!

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Correctness of Hoare’s Algorithm (6)

Original correctness claim:

  • (c) Every element in A[j]A[ \dots j] \leq every element in A[j+r]A[j+ \dots r] at termination

Proof of claim (c)

  • There are 33 cases to consider:
    • Case 1: k=1k=1, i.e. the algorithm terminates in a single iteration
    • Case 2: k>1k>1 and ik=jki_k = j_k
    • Case 3: k>1k>1 and ik=jk+1i_k = j_k + 1
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Lomuto’s Partitioning Algorithm (1)

  • Choose a pivot element: pivot=x=A[r]pivot = x = A[r]

alt:"alt" center

  • Grow two regions:
    • from left to right: A[pi]A[p \dots i]
    • from left to right: A[i+1j]A[i+1 \dots j]
      • such that:
        • every element in A[pi]pivotA[p \dots i] \leq pivot
        • every element in A[i+1j]>pivotA[i+1 \dots j] > pivot
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Lomuto’s Partitioning Algorithm (2)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Lomuto’s Partitioning Algorithm Ex. (Step-1)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Lomuto’s Partitioning Algorithm Ex. (Step-2)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Lomuto’s Partitioning Algorithm Ex. (Step-3)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Lomuto’s Partitioning Algorithm Ex. (Step-4)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Lomuto’s Partitioning Algorithm Ex. (Step-5)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Lomuto’s Partitioning Algorithm Ex. (Step-6)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Lomuto’s Partitioning Algorithm Ex. (Step-7)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Lomuto’s Partitioning Algorithm Ex. (Step-8)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Lomuto’s Partitioning Algorithm Ex. (Step-9)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Lomuto’s Partitioning Algorithm Ex. (Step-10)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Lomuto’s Partitioning Algorithm Ex. (Step-11)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Lomuto’s Partitioning Algorithm Ex. (Step-12)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Lomuto’s Partitioning Algorithm Ex. (Step-13)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Lomuto’s Partitioning Algorithm Ex. (Step-14)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Lomuto’s Partitioning Algorithm Ex. (Step-15)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Quicksort with Lomuto’s Partitioning Algorithm

QUICKSORT (A, p, r)
  if p < r then
    q = L-PARTITION(A, p, r)
    QUICKSORT(A, p, q - 1)
    QUICKSORT(A, q + 1, r)
  endif

Initial invocation: QUICKSORT(A,1,n)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Comparison of Hoare’s & Lomuto’s Algorithms (1)

  • Notation: n=rp+1n=r-p+1
    • pivot=A[p]pivot=A[p] (Hoare)
    • pivot=A[r]pivot=A[r] (Lomuto)
  • #\# of element exchanges: e(n)e(n)
    • Hoare: 0e(n)n20 \geq e(n) \geq \lfloor \frac{n}{2} \rfloor
      • Best: k=1k=1 with i1=j1=pi_1=j_1=p (i.e., A[p+1r]>pivotA[p+1 \dots r]>pivot)
      • Worst: A[p+1p+n21]pivotA[p+n2r]A[p+1 \dots p+ \lfloor \frac{n}{2} \rfloor - 1] \geq pivot \geq A[p+ \lceil \frac{n}{2} \rceil \dots r]
    • Lomuto : 1e(n)n1 \leq e(n) \leq n
      • Best: A[pr1]>pivotA[p \dots r -1]>pivot
      • Worst: A[pr1]pivotA[p \dots r-1] \leq pivot
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Comparison of Hoare’s & Lomuto’s Algorithms (2)

  • #\# of element comparisons: ce(n)c_e(n)

    • Hoare: n+1ce(n)n+2n+1 \leq c_e(n) \leq n+2
      • Best: ik=jki_k=j_k
      • Worst: ik=jk+1i_k=j_k+1
    • Lomuto: ce(n)=n1c_e(n)=n-1
  • #\# of index comparisons: ci(n)c_i(n)

    • Hoare: 1ci(n)n2+1(ci(n)=e(n)+1)1 \leq c_i(n) \leq \lfloor \frac{n}{2} \rfloor + 1 | (c_i(n)=e(n)+1)
    • Lomuto: ci(n)=n1c_i(n)=n-1
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Comparison of Hoare’s & Lomuto’s Algorithms (3)

  • #\# of index increment/decrement operations: a(n)a(n)

    • Hoare: n+1a(n)n+2(a(n)=ce(n))n+1 \leq a(n) \leq n+2 | (a(n)=c_e(n))
    • Lomuto: na(n)2n1(a(n)=e(n)+(n1))n \leq a(n) \leq 2n-1 | (a(n)=e(n)+(n-1))
  • Hoare’s algorithm is in general faster

  • Hoare behaves better when pivot is repeated in A[pr]A[p \dots r]

    • Hoare: Evenly distributes them between left & right regions
    • Lomuto: Puts all of them to the left region
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Analysis of Quicksort (1)

QUICKSORT (A, p, r)
  if p < r then
    q = H-PARTITION(A, p, r)
    QUICKSORT(A, p, q)
    QUICKSORT(A, q + 1, r)
  endif

Initial invocation: QUICKSORT(A,1,n)

alt:"alt" center

Assume all elements are distinct in the following analysis

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Analysis of Quicksort (2)

  • H-PARTITION always chooses A[p]A[p] (the first element) as the pivot.
  • The runtime of QUICKSORT on an already-sorted array is Θ(n2)\Theta(n^2)
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Example: An Already Sorted Array

Partitioning always leads to 22 parts of size 11 and n1n-1

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Worst Case Analysis of Quicksort

  • Worst case is when the PARTITION algorithm always returns imbalanced partitions (of size 11 and n1n-1) in every recursive call.
    • This happens when the pivot is selected to be either the min or max element.
    • This happens for H-PARTITION when the input array is already sorted or reverse sorted

T(n)=T(1)+T(n1)+Θ(n)=T(n1)+Θ(n)=Θ(n2)\begin{align*} T(n) &= T(1) + T(n-1) + Θ(n) \\ &= T(n-1) + Θ(n) \\ &= Θ(n2) \end{align*}

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Worst Case Recursion Tree

T(n)=T(1)+T(n1)+cnT(n) = T(1) + T(n-1) + cn

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Best Case Analysis (for intuition only)

  • If we’re extremely lucky, H-PARTITION splits the array evenly at every recursive call

T(n)=2T(n/2)+Θ(n)=Θ(nlgn)\begin{align*} T(n) &= 2T(n/2) + \Theta(n) \\ &= \Theta(nlgn) \end{align*}

(same as merge sort)

  • Instead of splitting 0.5:0.50.5:0.5, if we split 0.1:0.90.1:0.9 then we need solve following equation.

T(n)=T(n/10)+T(9n/10)+Θ(n)=Θ(nlgn)\begin{align*} T(n) &= T(n/10) + T(9n/10) + \Theta(n) \\ &= \Theta(nlgn) \end{align*}

RTEU CE100 Week-3
CE100 Algorithms and Programming II

“Almost-Best” Case Analysis

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Balanced Partitioning (1)

  • We have seen that if H-PARTITION always splits the array with 0.1to0.90.1-to-0.9 ratio, the runtime will be Θ(nlgn)\Theta(nlgn).

  • Same is true with a split ratio of 0.01to0.990.01-to-0.99, etc.

  • Possible to show that if the split has always constant (Θ(1))(\Theta(1)) proportionality, then the runtime will be Θ(nlgn)\Theta(nlgn).

  • In other words, for a constant α(0<α0.5)\alpha | (0 < \alpha ≤ 0.5):

    • αto(1α)\alpha–to–(1-\alpha) proportional split yields Θ(nlgn)\Theta(nlgn) total runtime
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Balanced Partitioning (2)

  • In the rest of the analysis, assume that all input permutations are equally likely.
    • This is only to gain some intuition
    • We cannot make this assumption for average case analysis
    • We will revisit this assumption later
  • Also, assume that all input elements are distinct.
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Balanced Partitioning (3)

  • Question: What is the probability that H-PARTITION returns a split that is more balanced than 0.1to0.90.1-to-0.9?
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Balanced Partitioning (4)

Reminder: H-PARTITION will place the pivot in the right partition unless the pivot is the smallest element in the arrays.

Question: If the pivot selected is the mth smallest value (1<mn)(1 < m ≤ n) in the input array, what is the size of the left region after partitioning?

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Balanced Partitioning (5)

  • Question: What is the probability that the pivot selected is the mthm^{th} smallest value in the array of size nn?

    • 1/n1/n (since all input permutations are equally likely)
  • Question: What is the probability that the left partition returned by H-PARTITION has size mm, where 1<m<n1<m<n?

    • 1/n1/n (due to the answers to the previous 2 questions)
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Balanced Partitioning (6)

  • Question: What is the probability that H-PARTITION returns a split that is more balanced than 0.1to0.90.1-to-0.9?

Probability=q=0.1n+10.9n11n=1n(0.9n10.1n1+1)=0.81n0.8 for large n\begin{align*} Probability &=\sum \limits_{q=0.1n+1}^{0.9n-1}\frac{1}{n} \\ &=\frac{1}{n}(0.9n-1-0.1n-1+1) \\ &= 0.8-\frac{1}{n} \\ & \approx 0.8 \text{ for large n} \end{align*}

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Balanced Partitioning (7)

  • The probability that H-PARTITION yields a split that is more balanced than 0.1to0.90.1-to-0.9 is 80%80\% on a random array.

  • Let Pα>P_{\alpha>} be the probability that H-PARTITION yields a split more balanced than αto(1α)\alpha-to-(1-\alpha), where 0<α0.50 < \alpha \leq 0.5

  • Repeat the analysis to generalize the previous result

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Balanced Partitioning (8)

  • Question: What is the probability that H-PARTITION returns a split that is more balanced than αto(1α)\alpha-to-(1-\alpha)?

Probability=q=αn+1(1α)n11n=1n((1α)n1αn1+1)=(12α)1n(12α) for large n\begin{align*} Probability & =\sum \limits_{q=\alpha n+1}^{(1-\alpha)n-1}\frac{1}{n} \\ & =\frac{1}{n}((1-\alpha)n-1- \alpha n-1+1) \\ & = (1-2\alpha)-\frac{1}{n} \\ & \approx (1-2\alpha) \text{ for large n} \end{align*}

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Balanced Partitioning (9)

  • We found Pα>=12αP_{\alpha >}=1-2\alpha
    • Ex: P0.1>=0.8P_{0.1>}=0.8 and P0.01>=0.98P_{0.01>}=0.98
  • Hence, H-PARTITION produces a split
    • more balanced than a
      • 0.1to0.90.1-to-0.9 split 80%80\% of the time
      • 0.01to0.990.01-to-0.99 split 98%98\% of the time
    • less balanced than a
      • 0.1to0.90.1-to-0.9 split 20%20\% of the time
      • 0.01to0.990.01-to-0.99 split 2%2\% of the time
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Intuition for the Average Case (1)

  • Assumption: All permutations are equally likely

    • Only for intuition; we’ll revisit this assumption later
  • Unlikely: Splits always the same way at every level

  • Expectation:

    • Some splits will be reasonably balanced
    • Some splits will be fairly unbalanced
  • Average case: A mix of good and bad splits

    • Good and bad splits distributed randomly thru the tree
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Intuition for the Average Case (2)

  • Assume for intuition: Good and bad splits occur in the alternate levels of the tree
    • Good split: Best case split
    • Bad split: Worst case split
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Intuition for the Average Case (3)

Compare 2-successive levels of avg case vs. 1 level of best case

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Intuition for the Average Case (4)

  • In terms of the remaining subproblems, two levels of avg case is slightly better than the single level of the best case

  • The avg case has extra divide cost of Θ(n)\Theta(n) at alternate levels

  • The extra divide cost Θ(n)\Theta(n) of bad splits absorbed into the Θ(n)\Theta(n) of good splits.

  • Running time is still Θ(nlgn)\Theta(nlgn)

    • But, slightly larger hidden constants, because the height of the recursion tree is about twice of that of best case.
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Intuition for the Average Case (5)

  • Another way of looking at it:

    • Suppose we alternate lucky, unlucky, lucky, unlucky, \dots
    • We can write the recurrence as:
      • L(n)=2U(n/2)+Θ(n)L(n) = 2U(n/2) + \Theta(n) lucky split (best)
      • U(n)=L(n1)+Θ(n)U(n) = L(n-1) + \Theta(n) unlucky split (worst)
    • Solving:

    L(n)=2(L(n/21)+Θ(n/2))+Θ(n)=2L(n/21)+Θ(n)=Θ(nlgn)\begin{align*} L(n) & = 2(L(n/2-1) + \Theta(n/2)) + \Theta(n) \\ & = 2L(n/2-1) + \Theta(n) \\ & = Θ(nlgn) \end{align*}

  • How can we make sure we are usually lucky for all inputs?

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Summary: Quicksort Runtime Analysis (1)

  • Worst case: Unbalanced split at every recursive call

T(n)=T(1)+T(n1)+Θ(n)T(n)=Θ(n2)\begin{align*} T(n) & = T(1) + T(n-1) + \Theta(n) \\ T(n) & = \Theta(n2) \end{align*}

  • Best case: Balanced split at every recursive call (extremely lucky)

T(n)=2T(n/2)+Θ(n)T(n)=Θ(nlgn)\begin{align*} T(n) & = 2T(n/2) + \Theta(n) \\ T(n) & = \Theta(nlgn) \end{align*}

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Summary: Quicksort Runtime Analysis (2)

  • Almost-best case: Almost-balanced split at every recursive call

T(n)=T(n/10)+T(9n/10)+Θ(n)or T(n)=T(n/100)+T(99n/100)+Θ(n)or T(n)=T(αn)+T((1αn)+Θ(n)\begin{align*} T(n) &=T(n/10)+T(9n/10)+ \Theta(n) \\ \text{or } T(n) &= T(n/100) + T(99n/100) + Θ(n) \\ \text{or } T(n) &= T(\alpha n) + T((1-\alpha n)+ \Theta(n) \end{align*}

for any constant α,0<α0.5\alpha, 0 < \alpha \leq 0.5

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Summary: Quicksort Runtime Analysis (3)

  • For a random input array, the probability of having a split

    • more balanced than 0.1to0.9:80%0.1 – to – 0.9 : 80\%
    • more balanced than 0.01to0.99:98%0.01 – to – 0.99 : 98\%
    • more balanced than αto(1α):12α\alpha – to – (1-\alpha) : 1 – 2 \alpha
  • for any constant α,0<α0.5\alpha, 0 < \alpha \leq 0.5

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Summary: Quicksort Runtime Analysis (4)

  • Avg case intuition: Different splits expected at different levels

    • some balanced (good), some unbalanced (bad)
  • Avg case intuition: Assume the good and bad splits alternate

    • i.e. good split -> bad split -> good split -> …
    • T(n)=Θ(nlgn)T(n) = \Theta(nlgn)
      • (informal analysis for intuition)
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Randomized Quicksort

  • In the avg-case analysis, we assumed that all permutations of the input array are equally likely.
    • But, this assumption does not always hold
    • e.g. What if all the input arrays are reverse sorted?
      • Always worst-case behavior
  • Ideally, the avg-case runtime should be independent of the input permutation.
  • Randomness should be within the algorithm, not based on the distribution of the inputs.
    • i.e. The avg case should hold for all possible inputs
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Randomized Algorithms (1)

  • Alternative to assuming a uniform distribution:
    • Impose a uniform distribution
    • e.g. Choose a random pivot rather than the first element
  • Typically useful when:
    • there are many ways that an algorithm can proceed
    • but, it’s difficult to determine a way that is always guaranteed to be good.
    • If there are many good alternatives; simply choose one randomly.
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Randomized Algorithms (1)

  • Ideally:
    • Runtime should be independent of the specific inputs
    • No specific input should cause worst-case behavior
    • Worst-case should be determined only by output of a random number generator.
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Randomized Quicksort (1)

  • Using Hoare’s partitioning algorithm:
R-QUICKSORT(A, p, r)
  if p < r then
    q = R-PARTITION(A, p, r)
    R-QUICKSORT(A, p, q)
    R-QUICKSORT(A, q+1, r)
R-PARTITION(A, p, r)
  s = RANDOM(p, r)
  exchange A[p] with A[s]
  return H-PARTITION(A, p, r)
  • Alternatively, permuting the whole array would also work
    • but, would be more difficult to analyze
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Randomized Quicksort (2)

  • Using Lomuto’s partitioning algorithm:
R-QUICKSORT(A, p, r)
  if p < r then
    q = R-PARTITION(A, p, r)
    R-QUICKSORT(A, p, q-1)
    R-QUICKSORT(A, q+1, r)
R-PARTITION(A, p, r)
  s = RANDOM(p, r)
  exchange A[r] with A[s]
  return L-PARTITION(A, p, r)
  • Alternatively, permuting the whole array would also work
    • but, would be more difficult to analyze
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Notations for Formal Analysis

  • Assume all elements in A[pr]A[p \dots r] are distinct

    • Let n=rp+1n = r – p + 1
  • Let rank(x)=A[i]:pir and A[i]xrank(x) = |{A[i]: p \leq i \leq r \text{ and } A[i] \leq x}|

  • i.e. rank(x)rank(x) is the number of array elements with value less than or equal to xx

    • A={5,9,7,6,8,1,4}A=\{5,9,7,6,8,1,4\}
    • p=5,r=4p=5,r=4
    • rank(5)=3rank(5)=3
      • i.e. it is the 3rd3^{rd} smallest element in the array
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Formal Analysis for Average Case

  • The following analysis will be for Quicksort using Hoare’s partitioning algorithm.

  • Reminder: The pivot is selected randomly and exchanged with A[p]A[p] before calling H-PARTITION

  • Let xx be the random pivot chosen.

  • What is the probability that rank(x)=irank(x) = i for i=1,2,ni = 1, 2, \dots n ?

    • P(rank(x)=i)=1/nP(rank(x) = i) = 1/n
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Various Outcomes of H-PARTITION (1)

  • Assume that rank(x)=1rank(x)=1
    • i.e. the random pivot chosen is the smallest element
    • What will be the size of the left partition (L)(|L|)?
    • Reminder: Only the elements less than or equal to xx will be in the left partition.

A={2p=x=pivot,L=19,7,6,8,5,4r}A=\{\overbrace{2}^{p=x=pivot}\underbrace{,}_{\Longrightarrow|L|=1 } 9,7,6,8,5,\overbrace{4}^r\}

p=2,r=4p=2,r=4
pivot=x=2pivot=x=2

TODO: convert to image...S6_P9

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Various Outcomes of H-PARTITION (2)

  • Assume that rank(x)>1rank(x)>1
    • i.e. the random pivot chosen is not the smallest element
    • What will be the size of the left partition (L)(|L|)?
    • Reminder: Only the elements less than or equal to xx will be in the left partition.
    • Reminder: The pivot will stay in the right region after H-PARTITION if rank(x)>1rank(x)>1

A={2p,4,L=rank(x)17,6,8,5,pivot9r}A=\{\overbrace{2}^{p}, 4 \underbrace{,}_{\Longrightarrow|L|=rank(x)-1}7,6,8,\overbrace{5,}^{pivot}\overbrace{9}^r\}

p=2,r=4p=2,r=4
pivot=x=5pivot=x=5

TODO: convert to image...S6_P10

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Various Outcomes of H-PARTITION - Summary (1)

  • x:pivotx: pivot

  • L:size of left region|L|: \text{size of left region}

  • P(rank(x)=i)=1/n for 1inP(rank(x) = i) = 1/n \text{ for } 1 \leq i \leq n

    • if rank(x)=1 then L=1\text{if } rank(x) = 1 \text{ then } |L| = 1
    • if rank(x)>1 then L=rank(x)1\text{if } rank(x) > 1 \text{ then } |L| = rank(x) - 1
  • P(L=1)=P(rank(x)=1)+P(rank(x)=2)P(|L| = 1) = P(rank(x) = 1) + P(rank(x) = 2)

    • P(L=1)=2/nP(|L| = 1) = 2/n
  • P(L=i)=P(rank(x)=i+1) for 1<i<nP(|L| = i) = P(rank(x) = i+1) \text{ for } 1< i < n

    • P(L=i)=1/n for 1<i<nP(|L| = i) = 1/n \text{ for } 1< i < n
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Various Outcomes of H-PARTITION - Summary (2)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Average - Case Analysis: Recurrence (1)

x=pivotx=pivot

T(n)=1n(T(1)+t(n1))rank:1+1n(T(1)+t(n1))rank:2+1n(T(2)+t(n2))rank:3+1n(T(i)+t(ni))rank:i+1+1n(T(n1)+t(1))rank:n+Θ(n)\begin{align*} T(n) & = \frac{1}{n}(T(1)+t(n-1) ) & rank:1 \\ & + \frac{1}{n}(T(1)+t(n-1) ) & rank:2 \\ & + \frac{1}{n}(T(2)+t(n-2) ) & rank:3 \\ & \vdots & \vdots \\ & + \frac{1}{n}(T(i)+t(n-i) ) & rank:i+1 \\ & \vdots & \vdots \\ & + \frac{1}{n}(T(n-1)+t(1) ) & rank:n \\ & + \Theta(n) \end{align*}

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Average - Case Analysis: Recurrence (2)

T(n)=1nq=1n1(T(q)+T(nq))+1n(T(1)+T(n1))+Θ(n)Note: 1n(T(1)+T(n1))=1n(Θ(1)+O(n2))=O(n)T(n)=1nq=1n1(T(q)+T(nq))+Θ(n)\begin{align*} T(n) &= \frac{1}{n}\sum \limits_{q=1}^{n-1}(T(q)+T(n-q))+\frac{1}{n}(T(1)+T(n-1))+\Theta(n)\\ & \text{Note: } \frac{1}{n}(T(1)+T(n-1))=\frac{1}{n}(\Theta(1)+O(n^2))=O(n) \\ T(n) &= \frac{1}{n}\sum \limits_{q=1}^{n-1}(T(q)+T(n-q))+\Theta(n) \end{align*}

for k=1,2,,n1k=1,2,\dots,n-1 each term T(k)T(k) appears twice once for q=kq = k and once for q=nkq = n−k

T(n)=2nk=1n1T(k)+Θ(n)T(n) = \frac{2}{n}\sum \limits_{k=1}^{n-1} T(k)+\Theta(n)

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Average - Case Analysis -Solving Recurrence: Substitution

  • Guess: T(n)=O(nlgn)T(n)=O(nlgn)
  • T(k)aklgkT(k) ≤ aklgk for k<nk<n, for some constant a>0a > 0

T(n)=2nk=1n1T(k)+Θ(n)2nk=1n1aklgk+Θ(n)2ank=1n1klgk+Θ(n)\begin{align*} T(n) &= \frac{2}{n} \sum \limits_{k=1}^{n-1} T(k)+\Theta(n) \\ & \leq \frac{2}{n} \sum \limits_{k=1}^{n-1} aklgk+\Theta(n) \\ & \leq \frac{2a}{n} \sum \limits_{k=1}^{n-1} klgk+\Theta(n) \end{align*}

  • Need a tight bound for klgk\sum klgk
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Tight bound for klgk\sum klgk (1)

  • Bounding the terms
    •  k=1n1klgkk=1n1nlgn=n(n1)lgnn2lgn\ \sum \limits_{k=1}^{n-1}klgk \leq \sum \limits_{k=1}^{n-1}nlgn = n(n-1)lgn \leq n^2lgn
    • This bound is not strong enough because
    • T(n)2ann2lgn+Θ(n)T(n) \leq \frac{2a}{n}n^2lgn+\Theta(n)
    • =2anlgn+Θ(n)=2anlgn+\Theta(n) \Longrightarrow couldn’t prove T(n)anlgnT(n) \leq anlgn
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Tight bound for klgk\sum klgk (2)

  • Splitting summations: ignore ceilings for simplicity

k=1n1klgkk=1n/21klgk+k=n/2n1klgk\sum \limits_{k=1}^{n-1}klgk \leq \sum \limits_{k=1}^{n/2-1}klgk + \sum \limits_{k=n/2}^{n-1}klgk

  • First summation: lgk<lg(n/2)=lgn1lgk < lg(n/2)=lgn-1
  • Second summation: lgk<lgnlgk < lgn
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Splitting: k=1n1klgkk=1n/21klgk+k=n/2n1klgk\sum \limits_{k=1}^{n-1}klgk \leq \sum \limits_{k=1}^{n/2-1}klgk + \sum \limits_{k=n/2}^{n-1}klgk (3)

k=1n1klgk(lg(n1))k=1n/21k+lgnk=n/2n1k=lgnk=1n1kk=1n/21k=12n(n1)lgn12n2(n21)=12n2lgn18n212n(lgn1/2)\begin{align*} & \sum \limits_{k=1}^{n-1}klgk \leq (lg(n-1))\sum \limits_{k=1}^{n/2-1}k + lgn \sum \limits_{k=n/2}^{n-1}k \\ &= lgn \sum \limits_{k=1}^{n-1}k- \sum \limits_{k=1}^{n/2-1}k \\ &= \frac{1}{2}n(n-1)lgn - \frac{1}{2} \frac{n}{2}(\frac{n}{2}-1) \\ &= \frac{1}{2}n^2lgn - \frac{1}{8}n^2 - \frac{1}{2}n(lgn-1/2) \\ \end{align*}

k=1n1klgk12n2lgn18n2 for lgn1/2n2\begin{align*} & \sum \limits_{k=1}^{n-1} klgk \leq \frac{1}{2}n^2lgn-\frac{1}{8}n^2 \ for \ lgn \geq 1/2 \Longrightarrow n \geq \sqrt{2} \end{align*}

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Substituting: - k=1n1klgk12n2lgn18n2\sum \limits_{k=1}^{n-1}klgk \leq \frac{1}{2}n^2lgn-\frac{1}{8}n^2 (4)

T(n)2ank=1n1klgk+Θ(n)2an(12n2lgn18n2)+Θ(n)=anlgn(a4nΘ(n))\begin{align*} T(n) & \leq \frac{2a}{n}\sum \limits_{k=1}^{n-1}klgk+\Theta(n)\\ & \leq \frac{2a}{n}(\frac{1}{2}n^2lgn-\frac{1}{8}n^2)+\Theta(n) \\ & = anlgn - (\frac{a}{4}n-\Theta(n)) \end{align*}

  • We can choose a large enough so that a4nΘ(n)\frac{a}{4}n \geq \Theta(n)

T(n)anlgnT(n)=O(nlgn)\begin{align*} T(n) & \leq anlgn \\ T(n) & = O(nlgn) \end{align*}

Q.E.D.

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Medians and Order Statistics

  • ith order statistic: ithi^{th} smallest element of a set of nn elements

  • minimum: first order statistic

  • maximum: nthn^{th} order statistic

  • median: “halfway point” of the set

i=(n+1)2 or i=(n+1)2\begin{align*} i & = \lfloor \frac{(n+1)}{2} \rfloor \\ \text{ or } \\ i & = \lceil \frac{(n+1)}{2} \rceil \end{align*}

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Selection Problem

  • Selection problem: Select the ithi^{th} smallest of nn elements

  • Naïve algorithm: Sort the input array AA; then return A[i]A[i]

    • T(n)=θ(nlgn)T(n) = \theta(nlgn)
      • using e.g. merge sort (but not quicksort)
  • Can we do any better?

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Selection in Expected Linear Time

  • Randomized algorithm using divide and conquer

  • Similar to randomized quicksort

    • Like quicksort: Partitions input array recursively
    • Unlike quicksort: Makes a single recursive call
      • Reminder: Quicksort makes two recursive calls
  • Expected runtime: Θ(n)\Theta(n)

    • Reminder: Expected runtime of quicksort: Θ(nlgn)\Theta(nlgn)
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Selection in Expected Linear Time: Example 1

  • Select the 2nd2^{nd} smallest element:

A={6,10,13,5,8,3,2,11}i=2\begin{align*} A & = \{6,10,13,5,8,3,2,11\} \\ i & = 2 \\ \end{align*}

  • Partition the input array:

A={2,3,5,left subarray13,8,10,6,11right subarray}\begin{align*} A & = \{\underbrace{2,3,5,}_{\text{left subarray} }\underbrace{13,8,10,6,11}_{\text{right subarray}}\} \end{align*}

  • make a recursive call to select the 2nd2^{nd} smallest element in left subarray
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Selection in Expected Linear Time: Example 2

  • Select the 7th7^{th} smallest element:

A={6,10,13,5,8,3,2,11}i=7\begin{align*} A & = \{6,10,13,5,8,3,2,11\} \\ i & = 7 \\ \end{align*}

  • Partition the input array:

A={2,3,5,left subarray13,8,10,6,11right subarray}\begin{align*} A & = \{\underbrace{2,3,5,}_{\text{left subarray} }\underbrace{13,8,10,6,11}_{\text{right subarray}}\} \end{align*}

  • make a recursive call to select the 4th4^{th} smallest element in right subarray
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Selection in Expected Linear Time (1)

R-SELECT(A,p,r,i)
  if p == r then 
    return A[p];
  q = R-PARTITION(A, p, r)
  k = q–p+1;
  if i <= k then 
    return R-SELECT(A, p, q, i);
  else
    return R-SELECT(A, q+1, r, i-k);

A={px(k smallest elements)qxr}x=pivot\begin{align*} A & = \{ \underbrace{ | }_{p} \dots \leq x \text{(k smallest elements)} \dots \underbrace{ | }_{q} \dots \geq x \dots \underbrace{ | }_{r} \} \\ x & = pivot \end{align*}

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Selection in Expected Linear Time (2)

A={pxLqxRr}x=pivot\begin{align*} A & = \{ \overbrace{ | }^{p} \underbrace{ \dots \leq x \dots }_{L} \overbrace{ | }^{q} \underbrace{ \dots \geq x \dots }_{R} \overbrace{ | }^{r} \} \\ x & = pivot \end{align*}

  • All elements in LL \leq all elements in RR
  • LL contains:
    • L=qp+1|L| = q–p+1 == k smallest elements of A[p...r]A[p...r]
    • if iL=ki \leq |L| = k then
      • search LL recursively for its ithi^{th} smallest element
    • else
      • search RR recursively for its (ik)th(i-k)^{th} smallest element
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Runtime Analysis (1)

  • Worst case:
    • Imbalanced partitioning at every level and the recursive call always to the larger partition

={1,2,3,4,5,6,7,8recursive call}i=8={2,3,4,5,6,7,8recursive call}i=7\begin{align*} & = \{1,\underbrace{2,3,4,5,6,7,8}_{\text{recursive call}} \} & i & = 8 \\ & = \{2,\underbrace{3,4,5,6,7,8}_{\text{recursive call}} \} & i & = 7 \end{align*}

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Runtime Analysis (2)

  • Worst case: Worse than the naïve method (based on sorting)

T(n)=T(n1)+Θ(n)T(n)=Θ(n2)\begin{align*} T(n) &= T(n-1) + \Theta(n) \\ T(n) &= \Theta(n^2) \end{align*}

  • Best case: Balanced partitioning at every recursive level

T(n)=T(n/2)+Θ(n)T(n)=Θ(n)\begin{align*} T(n) &= T(n/2) + \Theta(n) \\ T(n) &= \Theta(n) \end{align*}

  • Avg case: Expected runtime – need analysis T.B.D.
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Reminder: Various Outcomes of H-PARTITION

  • x:pivotx: pivot

  • L:size of left region|L|: \text{size of left region}

  • P(rank(x)=i)=1/n for 1inP(rank(x) = i) = 1/n \text{ for } 1 \leq i \leq n

    • if rank(x)=1 then L=1\text{if } rank(x) = 1 \text{ then } |L| = 1
    • if rank(x)>1 then L=rank(x)1\text{if } rank(x) > 1 \text{ then } |L| = rank(x) - 1
  • P(L=1)=P(rank(x)=1)+P(rank(x)=2)P(|L| = 1) = P(rank(x) = 1) + P(rank(x) = 2)

    • P(L=1)=2/nP(|L| = 1) = 2/n
  • P(L=i)=P(rank(x)=i+1) for 1<i<nP(|L| = i) = P(rank(x) = i+1) \text{ for } 1< i < n

    • P(L=i)=1/n for 1<i<nP(|L| = i) = 1/n \text{ for } 1< i < n
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Average Case Analysis of Randomized Select

  • To compute the upper bound for the avg case, assume that the ithi^{th} element always falls into the larger partition.

A={pxLeftPartitionqxRightPartitionr}x=pivot\begin{align*} A & = \{ \overbrace{ | }^{p} \underbrace{ \dots \leq x \dots }_{Left Partition} \overbrace{ | }^{q} \underbrace{ \dots \geq x \dots }_{Right Partition} \overbrace{ | }^{r} \} \\ x & = pivot \end{align*}

  • We will analyze the case where the recursive call is always made to the larger partition
    • This will give us an upper bound for the avg case
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Various Outcomes of H-PARTITION

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Average-Case Analysis of Randomized Select (1)

Recall:P(L=i)={2/nfor i=11/nfor i=2,3,,n1\text{Recall:} P(|L|=i) = \begin{cases} 2/n & \text{for } i=1 \\ 1/n & \text{for } i=2,3,\dots,n-1 \end{cases}

Upper bound: Assume ithi^{th} element always falls into the larger part.

T(n)1nT(max(1,n1))+1nq=1n1T(max(q,nq))+O(n)Note:1nT(max(1,n1))=1nT(n1)=1nO(n2)=O(n)(3 dot mean therefore) T(n)1nq=1n1T(max(q,nq))+O(n)\begin{align*} T(n) &\leq \frac{1}{n}T(max(1,n-1))+\frac{1}{n}\sum \limits_{q=1}^{n-1}T(max(q,n-q))+O(n) \\ Note: & \frac{1}{n}T(max(1,n-1)) = \frac{1}{n}T(n-1)=\frac{1}{n}O(n^2) = O(n) \\ \therefore \text{(3 dot mean therefore) } & T(n) \leq \frac{1}{n}\sum \limits_{q=1}^{n-1}T(max(q,n-q))+O(n) \end{align*}

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Average-Case Analysis of Randomized Select (2)

T(n)1nq=1n1T(max(q,nq))+O(n)\begin{align*} \therefore T(n) \leq \frac{1}{n}\sum \limits_{q=1}^{n-1}T(max(q,n-q))+O(n) \end{align*}

max(q,nq)={q if qn/2nq if q<n/2max(q, n–q) = \begin{cases} q & \text{ if } q \geq \lceil n/2 \rceil \\ n-q & \text{ if } q < \lceil n/2 \rceil \\ \end{cases}

  • nn is odd: T(k)T(k) appears twice for k=n/2+1,n/2+2,,n1k=\lceil n/2 \rceil+1,\lceil n/2 \rceil+2,\dots,n–1
  • nn is even:T(n/2)T(\lceil n/2 \rceil) appears once T(k)T(k) appears twice for k=n/2+1,n/2+2,,n1k = \lceil n/2 \rceil +1, \lceil n/2 \rceil+2,\dots,n–1
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Average-Case Analysis of Randomized Select (3)

  • Hence, in both cases:

q=1n1T(max(q,nq))+O(n)2q=n/2n1T(q)+O(n)T(n)2nq=n/2n1T(q)+O(n)\begin{align*} \sum \limits_{q=1}^{n-1} T(max(q,n-q))+O(n) & \leq 2\sum \limits_{q=\lceil n/2 \rceil}^{n-1} T(q)+O(n) \\ \therefore T(n) & \leq \frac{2}{n} \sum \limits_{q=\lceil n/2 \rceil}^{n-1}T(q)+O(n) \end{align*}

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Average-Case Analysis of Randomized Select (4)

T(n)2nq=n/2n1T(q)+O(n)\begin{align*} T(n) & \leq \frac{2}{n} \sum \limits_{q=\lceil n/2 \rceil}^{n-1}T(q)+O(n) \end{align*}

  • By substitution guess T(n)=O(n)T(n) = O(n)
  • Inductive hypothesis: T(k)ck,k<nT(k) \leq ck, \forall k<n

T(n)2nq=n/2n1ck+O(n)=2cn(k=1n1kk=1n/21k)+O(n)=2cn(12n(n1)12n2(n21))+O(n)\begin{align*} T(n) & \leq \frac{2}{n} \sum \limits_{q=\lceil n/2 \rceil}^{n-1}ck+O(n) \\ & = \frac{2c}{n} \Bigg(\sum \limits_{k=1}^{n-1}k-\sum \limits_{k=1}^{\lceil n/2 \rceil-1}k \Bigg)+ O(n) \\ & = \frac{2c}{n} \Bigg(\frac{1}{2}n(n-1)-\frac{1}{2} \lceil \frac{n}{2} \rceil \bigg( \frac{n}{2}-1 \bigg) \Bigg)+ O(n) \end{align*}

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Average-Case Analysis of Randomized Select (5)

T(n)2cn(12n(n1)12n2(n21))+O(n)c(n1)c4n+c2+O(n)=cnc4nc2+O(n)=cn((c4n+c2)+O(n))cn\begin{align*} T(n)& \leq \frac{2c}{n} \Bigg(\frac{1}{2}n(n-1)-\frac{1}{2} \lceil \frac{n}{2} \rceil \bigg( \frac{n}{2}-1 \bigg) \Bigg)+ O(n) \\ & \leq c(n-1)-\frac{c}{4}n+\frac{c}{2}+O(n) \\ & = cn - \frac{c}{4}n - \frac{c}{2} + O(n) \\ & = cn - \Bigg( \bigg( \frac{c}{4}n+\frac{c}{2}\bigg) + O(n) \Bigg) \\ & \leq cn \end{align*}

  • since we can choose c large enough so that (cn/4+c/2)(cn/4+c/2 ) dominates O(n)O(n)
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Summary of Randomized Order-Statistic Selection

  • Works fast: linear expected time

  • Excellent algorithm in practise

  • But, the worst case is very bad: Θ(n2)\Theta(n^2)

  • Blum, Floyd, Pratt, Rivest & Tarjan[1973] algorithms are runs in linear time in the worst case.

  • Generate a good pivot recursively

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Selection in Worst Case Linear Time

//return i-th element in set S with n elements
SELECT(S, n, i)  

  if n <= 5 then

    SORT S and return the i-th element

  DIVIDE S into ceil(n/5) groups
  //first ceil(n/5) groups are of size 5, last group is of size n mod 5

  FIND median set M={m ,, m_ceil(n/5)}
  // m_j : median of j-th group

  x = SELECT(M,ceil(n/5),floor((ceil(n/5)+1)/2))

  PARTITION set S around the pivot x into L and R

  if i <= |L| then
    return  SELECT(L, |L|, i)
  else
    return  SELECT(R, n–|L|, i–|L|)
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Selection in Worst Case Linear Time - Example (1)

  • Input: Array SS and index ii
  • Output: The ithi^{th} smallest value

25916811273942156321436203322314173304121319721103413723405291824123828263543\begin{array}{ccc} 25 & 9 & 16 & 8 & 11 & 27 & 39 & 42 & 15 & 6 32 & 14 & 36 & 20 & 33 & 22 & 31 & 4 & 17 & 3 & 30 & 41 \\ 2 & 13 & 19 & 7 & 21 & 10 & 34 & 1 & 37 & 23 & 40 & 5 & 29 & 18 & 24 & 12 & 38 & 28 & 26 & 35 & 43 \end{array}

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Selection in Worst Case Linear Time - Example (2)

Step 1: Divide the input array into groups of size 55

25916811273942156321436203322314173304121319721103413723405291824123828263543group size=5\overbrace{ \begin{array}{ccc} 25 & 9 & 16 & 8 & 11 \\ 27 & 39 & 42 & 15 & 6 \\ 32 & 14 & 36 & 20 & 33 \\ 22 & 31 & 4 & 17 & 3 \\ 30 & 41 & 2 & 13 & 19 \\ 7 & 21 & 10 & 34 & 1 \\ 37 & 23 & 40 & 5 & 29 \\ 18 & 24 & 12 & 38 & 28 \\ 26 & 35 & 43 \end{array} }^{\text{group size}=5}

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Selection in Worst Case Linear Time - Example (3)

Step 2: Compute the median of each group (Θ(n)\Theta(n))

251611Medians89394227615363332201422311734413019132213410173740292353828241218263543\begin{array}{ccc} 25 & 16 & \overbrace{11}^{Medians} & 8 & 9 \\ 39 & 42 & 27 & 6 & 15 \\ 36 & 33 & 32 & 20 & 14 \\ 22 & 31 & 17 & 3 & 4 \\ 41 & 30 & 19 & 13 & 2 \\ 21 & 34 & 10 & 1 & 7 \\ 37 & 40 & 29 & 23 & 5 \\ 38 & 28 & 24 & 12 & 18 \\ & 26 & 35 & 43 \end{array}

  • Let MM be the set of the medians computed:
    • M={11,27,32,17,19,10,29,24,35}M = \{11, 27, 32, 17, 19, 10, 29, 24, 35\}
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Selection in Worst Case Linear Time - Example (4)

Step 3: Compute the median of the median group MM

xSELECT(M,M,(M+1)/2)x \leftarrow SELECT(M,|M|,\lfloor (|M|+1)/2 \rfloor) where M=n/5|M|=\lceil n/5 \rceil

  • Let MM be the set of the medians computed:

    • M={11,27,32,17,19,10,29,24Median,35}M = \{11, 27, 32, 17, 19, 10, 29, \overbrace{24}^{Median}, 35\}
  • Median=24Median = 24

  • The runtime of the recursive call: T(M)=T(n/5)T(|M|)=T(\lceil n/5 \rceil)

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Selection in Worst Case Linear Time - Example (5)

Step 4: Partition the input array SS around the median-of-medians xx

25916811273942156321436203322314173304121319721103413723405291824123828263543\begin{array}{ccc} 25 & 9 & 16 & 8 & 11 & 27 & 39 & 42 & 15 & 6 32 & 14 & 36 & 20 & 33 & 22 & 31 & 4 & 17 & 3 & 30 & 41 \\ 2 & 13 & 19 & 7 & 21 & 10 & 34 & 1 & 37 & 23 & 40 & 5 & 29 & 18 & 24 & 12 & 38 & 28 & 26 & 35 & 43 \end{array}

Partition SS around x=24x = 24

Claim: Partitioning around x is guaranteed to be well-balanced.

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Selection in Worst Case Linear Time - Example (6)

  • MM : Median, MM^* : Median of Medians

413019M132213410172231173425161189382824M12183633322014374029235394227615263543\begin{array}{ccc} 41 & 30 & \overbrace{19}^{M} & 13 & 2 \\ 21 & 34 & 10 & 1 & 7 \\ 22 & 31 & 17 & 3 & 4 \\ 25 & 16 & 11 & 8 & 9 \\ 38 & 28 & \overbrace{24}^{M^*} & 12 & 18 \\ 36 & 33 & 32 & 20 & 14 \\ 37 & 40 & 29 & 23 & 5 \\ 39 & 42 & 27 & 6 & 15 \\ & 26 & 35 & 43 \end{array}

  • About half of the medians greater than x=24x=24 (about n/10n/10)
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Selection in Worst Case Linear Time - Example (7)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Selection in Worst Case Linear Time - Example (8)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Selection in Worst Case Linear Time - Example (9)

S={25916811273942156321436203322314173304121319721103413723405291824123828263543}S = \begin{array}{ccc} \{ 25 & 9 & 16 & 8 & 11 & 27 & 39 & 42 & 15 & 6 32 & 14 & 36 & 20 & 33 & 22 & 31 & 4 & 17 & 3 & 30 & 41 \\ 2 & 13 & 19 & 7 & 21 & 10 & 34 & 1 & 37 & 23 & 40 & 5 & 29 & 18 & 24 & 12 & 38 & 28 & 26 & 35 & 43 \} \end{array}

  • Partitioning SS around x=24x = 24 will lead to partitions
    of sizes 3n/10\sim 3n/10 and 7n/10\sim 7n/10 in the worst case.

Step 5: Make a recursive call to one of the partitions

if i <= |L| then 
  return SELECT(L,|L|,i)
else 
  return SELECT(R,n-|L|,i-|L|)
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Selection in Worst Case Linear Time

//return i-th element in set S with n elements
SELECT(S, n, i)  

  if n <= 5 then

    SORT S and return the i-th element

  DIVIDE S into ceil(n/5) groups
  //first ceil(n/5) groups are of size 5, last group is of size n mod 5

  FIND median set M={m ,, m_ceil(n/5)}
  // m_j : median of j-th group

  x = SELECT(M,ceil(n/5),floor((ceil(n/5)+1)/2))

  PARTITION set S around the pivot x into L and R

  if i <= |L| then
    return  SELECT(L, |L|, i)
  else
    return  SELECT(R, n–|L|, i–|L|)
RTEU CE100 Week-3
CE100 Algorithms and Programming II

Choosing the Pivot (1)

  1. Divide S into groups of size 5

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Choosing the Pivot (2)

  • Divide S into groups of size 5
  • Find the median of each group

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Choosing the Pivot (3)

  • Divide S into groups of size 5
  • Find the median of each group
  • Recursively select the median x of the medians

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Choosing the Pivot (4)

  • At least half of the medians x\geq x
  • Thus m=n/5/2m = \lceil \lceil n/5 \rceil /2 \rceil groups contribute 3 elements to R except possibly the last group and the group that contains xx, R3(m2)3n106|R|\geq 3(m–2)\geq \frac{3n}{10}–6

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Choosing the Pivot (5)

  • Similarly L3n106|L| \geq \frac{3n}{10}– 6
  • Therefore, SELECT is recursively called on at most n(3n106)=7n10+6n-(\frac{3n}{10}-6)=\frac{7n}{10}+6 elements

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Selection in Worst Case Linear Time (1)

alt:"alt" center

RTEU CE100 Week-3
CE100 Algorithms and Programming II

Selection in Worst Case Linear Time (2)

  • Thus recurrence becomes
    • T(n)T(n5)+T(7n10+6)+Θ(n)T(n) \leq T \big( \lceil \frac{n}{5} \rceil \big) + T\big( \frac{7n}{10}+6 \big) + \Theta(n)
  • Guess T(n)=O(n)T(n)=O(n) and prove by induction
  • Inductive step:

T(n)cn/5+c(7n/10+6)+Θ(n)cn/5+c+7cn/10+6c+Θ(n)=9cn/10+7c+Θ(n)=cn[c(n/107)Θ(n)]cn( for large c)\begin{align*} T(n) & \leq c \lceil n/5 \rceil + c(7n/10+6)+\Theta(n) \\ & \leq cn/5 + c + 7cn/10 + 6c + \Theta(n) \\ & = 9cn/10 + 7c + \Theta(n) \\ & = cn - [c(n/10-7)-\Theta(n)] \leq cn &\text{( for large c)} \end{align*}

  • Work at each level of recursion is a constant factor (9/10)(9/10) smaller
RTEU CE100 Week-3
CE100 Algorithms and Programming II

References

RTEU CE100 Week-3
CE100 Algorithms and Programming II

EndOfWeek3CourseModule-End-Of-Week-3-Course-Module-

RTEU CE100 Week-3