Design and Analysis of Algorithms

Table of Contents

Lecture 0 <2019-08-26 Mon>

  • Syllabus review
  • Course Webpage
  • Proof-heavy course
  • Design – methods used to create new algorithms
  • Analysis – Mathematical assessment of an algorithm’s correctness and efficiency
  • Correctness – Does it do what it’s supposed to do on all possible inputs?
  • Efficiency – Measure the algorithm’s running time, counting the number of basic operations

Lecture 1 <2019-08-26 Mon>

Asymptotic Notation – Starts on Slide 3

  • Big-\(O\)
  • Big-\(\Omega\)
  • Big-\(\Theta\)
  • Little-\(o\)
  • Little-\(\omega\)
  • Not Interchangeable
  • Asymptotic Upper Bound – Big-\(O\), \(O(g(n)) = \{f(n) : \exists c, n_0 > 0 \mid \forall n \geq n_0, 0 \leq f(n) \leq cg(n) \}\)
  • Asymptotic Lower Bound, Big-\(\Omega\), \(\Omega(g(n)) = \{f(n) : \exists c, n_0 \mid \forall n \ge n_0, 0 \leq cg(n) \leq f(n) \}\)
  • Asymptotic Tight Bound, Big-\(\Theta\), \(\Theta(g(n)) = \{f(n) : \exists c_1, c_2, n_0 > 0 \mid \forall n \geq n_0, 0 \leq c_1 g(n) \leq f(n) \leq c_2 g(n) \}\)
  • Upper Bound, not tight, little-\(o\)
  • Lower bound, not tight, little-\(\omega\)

Lecture 1, Cont. <2019-08-28 Wed>

Upper And Lower Bounds

  • 01.9
  • Analyze algorithms and problems in terms of time complexity (count of operations)
  • Sometimes in space complexity (memory usage)
  • upper and lower bounds are used
  • applies to general problems
  • 01.10 – upper is most common form
  • Algorithm \(A\) has an upper bound of \(f(n)\) for input of size \(n\) if there exists no input of size \(n\) such that \(A\) requires more than \(f(n)\) time
  • Consider sort, Quicksort and Bubblesort are \(\mathcal{O}(n^2)\), and Mergesort is \(\mathcal{O}(n \log{n})\)
    • Quicksort is used more often in practice, when randomized tends to be closer to \(n \log{n}\) time
  • Lower Bounds are like best-case results, typically not as interesting
  • 01.11 – Uppper bound of a problem
  • A problem has an upper bound of \(f(n)\) if there exists at least one algorithm that has an uppperbound \(f(n)\)
    • there exists an algorithm with time/space complexity of at most \(f(n)\) on all inputs of size \(n\)
  • Mergesort has worst-case \(\mathcal{O}(n \log{n})\), sorting has an upper bound of \(\mathcal{O}(n \log{n})\)
  • 01.12 – Lower bounds of a problem
  • A problem has a lower bound of \(f(n)\) if, for any algorithm \(A\), to solve it, there exists at least one input of size \(n\) that forces \(A\) to take at least \(f(n)\) time/space.
    • Called the pathological input, depends on \(A\) itself.
    • input of size \(n\) (reverse) that forces Bubblesort to take \(\Omega(n^2)\)
    • Since every sorting algorithm has an input of size \(n\) forcing \(\Omega(n \log{n})\) steps, the sorting problem has a time comlpexity lower bound of \(\Omega(n \log{n})\). Therefore, Mergesort is asymptotically optimal.
  • 01.13 Lower Bounds, 2
    • To argue a lower bound, use adversarial argument: algorithm simulates an arbitrary algorithm \(A\) to build pathological input
    • Needs to be a general form, pathological input depends on specific algorithm \(A\)
    • Can reduce one problem to another to establish lower bounds
      • Show that if we can compute convex hull in \(o(n \log{n})\) time, then we can sort in time \(o(n \log{n})\); cannot be true, so convex hull takes \(\Omega(n \log{n})\)


  • 01.14 – Efficiency
    • An algorithm is time- or space- efficient if worst case time/space complexity is \(\mathcal{O}(n^c)\) for constant \(c\) and input size \(n\), polynomial in the input size
    • Input size on in number of bits needed
      • graph of \(n\) nodes takes \(\mathcal{O}(n \log{n})\) bits to represent nodes, \(\mathcal{O}(n^2 \log{n})\) to represent edges., algorithm in \(\mathcal{O}(n^c)\) is efficient
      • Problem that includes as an input a number, \(k\) (e.g., threshold) onnly needs \(\mathcal{O}(\log{k})\) bits to represent, efficient must run in \(\mathcal{O}(\log^c{k})\)
      • If polynomial in \(k\), called pseudopolynomial

Recurrence Relations

  • 01.15 – Recurrence Relations
    • Non-recursive relations are relatively easy to analyse.
    • Recurrence relations are used for time complexity of recursive algorithms, bound relation asymptotically
    • For example, Mergesort
      • \(T(n) = 2 T(n/2) + \mathcal{O}(n)\)
      • Bound on \(T(n)\) needed
      • Use the Master theorem (See 235/156 notes)
      • Other approaches include 01.17–18


  • 01.19–24 – Graphs
    • Set of vertices \(V\), set of unordered pairs between vertices \(E\) (edges)
    • Directed graph has ordered pairs
    • Weighted is directed augmented with weights for edges
    • Adjacency list, each vertex has linked list listing edges, \(\mathcal{O}(n + m)\)
    • Adjacency matrix, \(n \times n\) – \(\mathcal{O}(n^2)\)

Lecture 1, Cont. <2019-08-30 Fri>


  • 01.24 – Adjacency matrices
  • \(n \times n\) matrix \(M\), \(M(i,j) = w\) for weight, \(\infty\) for non-edges, takes \(\Theta(n^2)\) space

Algorithmic Techniques, Dynamic Programming

  • 01.25
  • Technique for solving optimization problems, where a best solution is needed, evaluated by an objective function
  • Decomposes a problem into subproblems, optimally solves recursively, combines into final/optimal solution
  • Exponential number of subproblems, many overlap, re-use solutions
  • Number of distinct subproblems is polynomial
  • Works if they have optimal substructure property – optimal solution is made up of optimal solutions to subproblems
  • All-pairs shortest paths, knapsack
  • Can prove correctness if optimal substructure property exists
  • Must be careful, very carefully proven

Algorithmic Techniques, Greedy Algorithms

  • 01.26
  • Must consider optimal substructure property
  • Each step, commits to locally optimal
  • Tend to be very quick, does not use all possible subproblems
  • MST, Dijkstra’s

Divide And Conquer

  • 01.27
  • Not limited to optimization, solve each sub-problem recursively, then combine
  • Think Mergesort

Proof By Contradiction

  • 01.28
  • Assume opposite of premise to be proved, arrive at a contradiction of some other assumption
  • To prove there is no greatest even integer:
    • Assume for a reductio that there exists a greatest even integer \(N\), for all even integers, \(n\), we have \(N \geq n\). (1)
    • But \(M = N + 2\) is an even integer, since it’s the sum of two even integers, and \(M > N\).
    • Therefore, (1) is false, so there is no greatest even integer. \(\box\)

Proof by Induction

  • 01.29
  • Base case, and inductive step, involves non-negative integers
  • Useful for proving invariants in algorithms, properties that always hold at every step (especially at the final step)
  • See PHIL 411 notes

Proof by Construction

  • 01.30
  • Proof of existence by directly constructing it.
  • Will be used extensively for NP-completeness

Proof by Contraposition

  • 01.31
  • \(P \to Q \equiv \lnot Q \to \not P\)
  • Proof that if \(x^2\) is even, then \(x\) is even. Contraposition is the simplest
  • Useful for proving biconditionals as well
  • Useful for NP-completeness

Lecture 2: Medians and Order Statistics <2019-08-30 Fri>

  • Given an array \(A\) of \(n\) distinct numbers, $i$th order statistic of \(A\) is its $i$th smallest element
  • \(i = 1\) – minimum
  • \(i = n\) – max
  • \(i = \lfloor (n+1)/2 \rfloor\) – (lower) median
  • Given array \(A\) of \(n\) elements and a number \(i \in \{1, \dots, n\}\) find the $i$th order statistic of \(A\)
  • Can we do better?
  • Minimum, linear, normal just scan through the array, keeping track of the smallest so far
    • small is smallest from \(A_1\) to \(A_i\) – the loop invariant
    • Loop invariant can be shown easily via induction
    • Correctness by observing \(i = n\) at the end, but before return
    • \(\Theta(n)\) – Cannot do better
    • Must find a lower bound
    • Must make at least \(n - 1\) comparisons
      • no element can be eliminated until shown it’s greater than at least one other element
      • All still eligible to be smallest are in a bucket, removed after shown to be \(>\) another.
      • Each comparison removes at most one element, from the bucket, therefore at least \(n - 1\) comparisons are needed to remove all but one from the bucket
      • Follows by pigeonhole principle
  • Minimum and Maximum (02.06)
    • Given \(A\) with \(n\) elementts, find both min and max
    • Obvious algorithm, non asymptotic time complexity
    • Do better by checking for both
    • Look at 02.07

Lecture 2: Medians and Order Statistics, Continued <2019-09-04 Wed>

The simultaneous Minimum and Maximum (starts 02.06)

  • find-min and find-max are optimal, as they are at \(n - 1\) comparisons for \(n\) elements.
  • Algorithm presented on 02.07 – be familiar with the algorithm (retype later)

    (defun min-and-max (list-of-nums)
      (let ((len (length list-of-nums))
            (max (max (first list-of-nums)
                      (second list-of-nums)))
            (min (min (first list-of-nums)
                      (second list-of-nums))))
        (do ((i 1 (1+ i)))
            ((= i (1- (floor len 2))))
          (setq max (max max
                         (nth list-of-nums (* 2 i))
                         (nth list-of-nums (- (* 2 i) 1)))
                min (min min
                         (nth list-of-nums (* 2 i))
                         (nth list-of-nums (- (* 2 i) 1)))))
        (when (oddp len)
          (setq max (max max (nth list-of-nums (1- len)))
                min (min min (nth list-of-nums (1- len)))))
        (values max min)))
  • For this algorithm, in \(\mathcal{O}(\frac{3}{2}n)\)

Selection of $i$th smallest value (02.10)

  • Select $i$th smallest value. Obvious solution (sort, return $i$th element), complexity is \(\Theta(n \log n)\)
  • Can do better, running in linear time
  • Uses divide-and-conquer, similar to binary search
  • Definition on 02.12
  • Uses partition like in quicksort (deterministic)
  • \(q\) from algorithm is in its final sorted position
  • Select calls partition, choses pivot element \(q\), reorders \(A\) such that all elements \(< A_q\) to the left of \(A_q\) and all elements \(> A_q\) to the right
  • If \(A_q\) is the element, return, if it’s greater learch left, less, search right.
  • Partition chooses a pivot element \(A_x\), exchange \(A_x\) and \(A_r\), then begin to loop to exchange appropriately

Lecture 2: Medians and Order Statistics, Continued <2019-09-06 Fri>

Selection, Continued

  • Recall algorithm, from 02.12
  • Requires partition, defined later on
  • Partition chooses the pivot element
  • Partition helps to preserve the invariant
  • Choce of pivot element is crucial

Choosing a Pivot element

  • choise is critical to low time-complexit
  • best choice of pivot element to partition
  • Use the median – circular to do this
  • Median of Medians
  • \(A\) of \(n\) elements, partition \(A\) into \(m = \lfloor n/3 \rfloor\) groups of 5 elements, atnd at most one othe group of \(n \mod 3\) elements
  • \(A^{\prime} = [ x_1, x_2, \ldots, x_{\lceil n / 5 \rceil}]\), where \(x_i\) is median of group \(i\) from sorting \(i\)
  • \(y = \mathrm{Select}(A^{\prime}, 1, \lceil n/5 \rceil, \lfloor(\lceil n/5 \rceil + 1)/2 \rfloor)\) is the pivot element, scan \(A[p \cdots r]\) and return \(y\) in as \(i\) – this is choose pivot element
  • As a team, do 02.19

Time Complexity

  • See 02.20
  • Recurrence form (02.22):
    • \(T(n)\) total time for size \(n\)
    • Pivot selection takes \(\mathcal{O}(n)\) to split and \(T(n/5)\) for finding median of medians, partitioning \(n\) elements takes \(\mathcal{O}(n)\) time
    • Recursive select takes \(T(3n/4)\) (in choose pivot element)
    • ∴, \(T(n) \leq T(n/5) + T(3n/4) + \mathcal{O}(n)\)
    • Or, rewritten as \(T(\alpha n) + T(\beta n) + O(n)\) where \(\alpha = 1/5\) and \(\beta = 3/4\)
    • By theorem from Lecture 1, when \(\alpha + \beta < 1\), \(T(n) \in \mathcal{O}(n)\)
  • And thus, Select has complexity \(\mathcal{O}(n)\)
  • Proof shown on 02.23
  • How can we lower-bound to 1/4 of the elements – see 02.21

Lecture 2: Medians and Order Statistics, Continued <2019-09-09 Mon>

Practice, Median of Medians, 02.19

  • \(A = [4, 9, 12, 17, 6, 5, 21, 14, 8, 11, 13, 29, 3]\)
  • select(A, 1, 13, 4)
    • Partition(A, 1, 13)
      • ChoosePivotElement(A, 1, 13)
        • \([4, 9, 12, 17, 6], [5, 21, 14, 8, 11], [13, 29, 3]\)
        • \([9, 11, 13]\)
        • \(11\) is median, \(x = 10\)
      • Proceed to swap
      • \(A = [4, 9, 6, 3, 8, 11, 12, 17, 21, 14, 13, 29]\), \(p = 6\), \(k = 5\)
  • Make sure to use lower median!

Lecture 3: Dynamic Programming <2019-09-09 Mon>

  • solving optimization problems
  • Decompose problem into subproblems, solving them recursively and combine the solutions into a final, optimal solution
  • exponential number of subproblems to solve, but with overlap, means solutions can be resued
  • Number of distinct subproblems is polynomial

The Rod cutting problem

  • Rod of length \(n\), cut into integer length smaller rods to maximize profit.
  • Rod of length \(i\) has price \(p_i\)
  • Cuts are free, profit base solely on prices charged for rods
  • Cuts only occur at integral boundaries, so there are \(n - 1\) positions to cut, total number of possible solutions is \(2^{n - 1}\)
  • Cuts \(i_1, \ldots, i_k\), (\(i_1 + \cdots i_k = n\)), and revenue \(r_n = p_{i_1} + \cdots p_{i_k}\) is maximized
  • Optimal substructure problem

Lecture 3: Dynamic Programming, Continued <2019-09-11 Wed>

Rod cutting, an implementation

  • Recall the optimal substructure property, an optimal solution is made up of optimal solutions to subproblems
  • Easy to prove via contradiction
  • \(r_n = \mathrm{max}(p_n, r_1 + r_{n - 1}, \ldots, r_{n - 1} + r_1)\)
  • Helpful to simply store previously calculated values

Lecture 3: Dynamic Programming, Continued <2019-09-13 Fri>

Rod Cutting, continued

  • Requires considering all possibilities (with one initial cut) for a given \(r_n\), which gives two subproblems.
  • Proof of correctness/proof of optimal substructure property by contradiction
    • Assume for a reductio that the solution is not optimal for the subproblem…
  • Some dynamic programming problems can be consider slightly differently.
  • \(r_n = \max_{1 \leq i \leq n} (p_i + r_{n-i})\) is an alternative form (03.05)
  • Rod cutting proof available in book, §15.1 (p. 362)
  • Cut-rod on (03.06)
  • Missing memoization

Time Complexity

  • \(T(0) = 1\), See 03.07
  • Non-memoized complexity is huge
  • Top-down with memoization
  • Bottom-up, fill in table
  • Have the same running time
  • Memoized (recursive) implementation at 03.10
  • Bottom up at 03.11 – easier to analyse

Reconstructing a solution

  • add a second array \(s\) to keep track of the optimal size of the first piece cut in each subproblem
  • Printing from this table is reasonably simple

Lecture 3: Dynamic Programming, Continued <2019-09-16 Mon>

Matrix Chain Multiplication

  • long chain of \(n\) matrices we want to multiply
  • \(p \times q\) by \(q \times r\) requires \(pqr\) steps
  • Brute force solution is \(\Omega(4^n / n^{3/2})\)
  • Steps:
    1. Characterize structure of an optimal solution
    2. Recursively define the value of an optimal solution
    3. Compute the value of an optimal solution
    4. Construct an optimal solution from computed information (03.18)
  • Step 1
    • \(A_{i \dots j}\) is the matrix from \(A_i A_{i + 1} \cdots A_j\)
    • Must split product and compute i to k and k+1 to j, then multiply the two together
    • \(k\) is the optimal parenthesization
  • Assume cost is not optimal, then there exists some other solution from \(i\) to \(k\) such that \(cost(s^{\prime}) < cost(s)\)
    • Then replace \(s\) with \(s^{\prime}\)
  • Step 2:
    • \(m[i, j]\) is min number of mults for Ai to Aj
    • \(m[i, j]\)
      1. \(i = j\) = 0
      2. \(i < j\), split at \(k\), \(m[i, j] = m[i, k] + m[k + 1, j] + p_{i - 1}p_{k}p_j\)
      3. \(k\) must be tried for all possible values, store in \(s[i, j]\) to find \(k\) for a given \(i, j\) pair. (03.20)
  • Step 3:
    • Is \(\Theta(n^2)\)
    • Bottom-up is on chain length
      • Linear to solve each problem, a quadratic number of subproblems, ∴, \(\Theta(n^3)\)
      • Implementation on 03.22

Lecture 3: Dynamic Programming, Continued <2019-09-18 Wed>

Matrix Chain Multiplication, Continued

  • Optimal substructure defined with regards to characterization of subproblems

Optimal Substructure

  • Consider shortest path?
  • Does shortest path have optimal substructure?
  • Yes!
  • But not for Longest Simple Path. LSP is in fact NP complete, because subproblems are not independent

Longest Common Subsequence

  • Sequence \(Z\) is a subsequence of anothe sequence \(x\) if there is a strictly increasing sequence \(i_1, \ldots, i_k\) of indices of \(X\) such that for all \(j = 1, \ldots, k\), \(x_{i_j} = z_j\)
  • As one reads through \(Z\), one can find a match to each symbol of \(Z\) in \(X\) in order.
  • \(Z\) is a common subsequence of \(X\) and \(Y\) if it is a subsequence of both
  • Goal of LCS is to find a maimum-length common subsequence of sequences \(X\) and \(Y\)

Characterizing Structure of LCS Optimal Solution

  • Given \(X\), the $i$th prefix of \(X\) is \(X_i\)
  • If \(X\) and \(Y\) have LCS \(Z\), then
    1. \(x_m = y_n \implies z_k = x_m = y_n\) and \(Z_{k - 1}\) is an LCS of \(X_{m - 1}\) and \(Y_{n - 1}\)
      • if \(z_k \neq x_m\), can lengthen \(Z\). Contradiction
      • If \(Z_{k - 1}\) is not LCS of \(X_{m - 1}\) and \(Y_{n - 1}\), then a longer CS of \(X_{m - 1}\) and \(Y_{n - 1}\) could have \(x_m\) appended to it to get CS of \(X\) and \(Y\) that is longer than \(Z\). Contradiction
    2. See 03.31!
  • Last symbol must be equal!

Recursive definition of Optimal Solution

  • \(c[i, j]\) defined on 03.32
  • Find the LCS of the prefixes, if \(x\) and \(y\)’s last chars match
  • If not, find LCS of \(X\) and \(Y_{n - 1}\), find LCS of \(X_{m - 1}\) and \(Y\) and use the longer.
  • Remember to use mathematical style

Define the Algorithm

  • 03.33
  • \(b\) is used to keep direction
  • Diagonal in an LCS is where it’s printed
  • But LCS is prented backwards, use recursion to print in correct order

Lecture 3: Dynamic Programming, Continued <2019-09-20 Fri>

  • exactly 4
  • 5 and 4 hinge on finding the correct subproblem

Lecture 3: Dynamic Programming, Continued <2019-09-23 Mon>

Optimal Binary Search Trees

  • Goal is to construct BSTs such that most frequently sought values are near the root, minimizing expected search time.
  • Given a sequence \(K\) of \(n\) distinct keys in sorted order
  • Key \(k_i\) has a probability \(p_i\) that it will be sought on a particular search
  • To handle searches for values not in \(k\), have \(n + 1\) dummy keys, \(d_0, d_1, \ldots, d_n\) to serve as leaves. \(d_i\) is reached with probability \(q_i\)
  • If \(depth_{T}(k_i)\) is distance from root of \(k_i\) in tree \(T\), then the expected cost of \(T\) is \(1 + \sum_{i=1}^{n} p_i depth_{T}(k_i) + \sum_{i=0}^{n} q_i depth_{t}(d_i)\)
  • An optimal BST is one with minimum expected search cost

Step 1: Structure of Optimal Solution

  • Observation: Since \(K\) is sorted and dummy keys are interspersed in order, any subtree of a BST must contain contiguous keys from \(i\) to \(j\) and leaves from \(d_{i - 1}\) to \(d_j\)
  • “Proof” in 03.44
  • Given keys \(k_i, \ldots, k_j\), say that it’s optimal BST roots at \(k_r\) for some \(i \leq r \leq j\)
  • Since we dont know optimal \(k_r\), all must be tried

Recursive definition

  • \(e[i, j]\) is expected cost from \(i\) to \(j\)
  • if \(j = i - 1\), then there is only the dummy key \(d_{i - 1}\) and \(e[i, i - 1] = q_{i - 1}\)
  • if \(j \geq i\), then choose root \(k_r\) from \(k_i\) to \(k_j\), optimally solving subproblems.
  • When combining optimal trees from subproblems and increase depth by one.
  • Define \(w(i, j) = \sum_{\ell = i}^{j} p_{\ell} + \sum_{\ell = i - 1}^{j} q_{\ell}\) as the sum of the probabilities of the nodes in the subtree built on \(k_i, \ldots, k_j\), meaning \(e[i, j] = p_r + (e[i, r - 1] + w(i, r - 1)) + (e[r + 1, j] + w(r + 1, j))\)
  • Becomes \(w(i, j) = w(i, r - 1) + p_r + w(r + 1, j)\)
  • Minimize, adding root array

Define the function

  • Pseudocode in 03.43
  • is in \(\mathcal{O}(n^3)\)

Lecture 4: Greedy Algorithms <2019-09-27 Fri>

  • A technique for solving optimization problems
  • Still examine subproblems, stil exploits optimal substructure
  • Greedy algorithms instead choose the best choice at the moment, rather than examine all subproblems
    • ability to do this is called the greedy choice property, this allows us to limit what subproblems we look at
  • MST, single-source shortest path

Activity Selection

  • scheduling classes in a classroom
  • Courses are candidates to be scheduled in that room, not all can have it
  • Maximize utilization of the room in terms of number of classes scheduled
  • Generally:
    • Given \(S = \{a_1, a_2, \ldots, a_n\}\) of \(n\) proposed activities that wish to use a resource that can serve only on activity at a time
    • \(a_i\) has a start time \(s_i\) and a finish time \(f_i\), such that \(O \leq s_i < f_i < \infty\)
    • If \(a_i\) is scheduled to use the resource, it occupies it during the interval \([s_i, f_i)\) then we can schedule both \(_i\) and \(a_j\) iff \(s_i \geq f_j\) or \(s_j \geq f_i\) (if holds, \(a_i\) and \(a_j\) are compatible)
    • Find a largest subset \(S^{\prime} \subseteq S\) such that all activities in \(S^{\prime}\) are pairwise compatible
    • Assume that activities are sorted by finish time.
  • Let \(S_{ij}\) be a set of activities that start after \(i\) and end before \(j\), and \(A_{ij}\) be the subset of \(S_{ij}\) that is maximal
  • See optimal substructure (04.05)
  • Recursive definition in 04.07, but we can optimize
  • Greedy choice – choose the one with the earliest finish time of all still compatible with currently schedule activities
  • Now, consider \(S_k = \{a_i \in S \mid s_i \geq f_k\}\) be the activities that start after \(a_k\) finishes
    • Then greedily choose \(a_1\) first and the first thing in each
  • Proof on 04.09
  • Argument of Optimal Substructure and Greedy Choice properties

Lecture 4: Greedy Algorithms, Continued <2019-09-30 Mon>

Rod cutting, redux

  • Subproblem \(n\) is rod of length \(n\)
  • Let \(r_n\) be value of optimal solution to subproblem \(n\)
  • \(r_n = p_i + r_{n - i}\)
  • Assume than \(r_{n - i}\) is not optimal. Falls in from there

Activity Selection, Continued

  • Optimal solutions are done with dynamic programming in \(\mathcal{O}(n^3)\)
  • But greedy solution is better

The Greedy Choice

  • Maximizes amount of something left over
  • State Greedy Choice Property
  • Argue property
  • argue optimal substructure
  • Greedy Choice Property means that there exists a greedy choice that leads to an optimal solution
    • Use a proof by construction
  • See 04.09

Lecture 4: Greedy Algorithms, Continued <2019-10-02 Wed>

Proof of greedy choice

  • Slide 04.09
  • Let \(A_k\) be an optimal solution to \(S_k\), and let \(a_j\) have earliest finish time of all in \(A_k\)
  • Make this choice, Activities in \(A^{\prime}\) are mutually compatible
  • Therefore, there exists an optimal solution taking the greedy choice
  • \(A_k\) is such that all elements in it are compatible with \(a_k\)
  • Thus, \(A_0 \cap S_1\) is optimal for \(S_1\), by contradiction:
    • Assume for a reductio that \(A_1^{\prime}\) exists such that \(|A_1^{\prime}| > |A_1|\) and all activities in \(A_1^{\prime}\) are mutually compatible
    • \(A_0^{\prime} = A_0 \setminus A_1 \cup A_1^{\prime}\)
    • \(|A_0^{\prime}| > |A_0|\), contradiction!
  • Greedy Choice propertiy is such that the optimal solution \(A_0\) contains \(a_1\)

Greedy Vs Dynamic Programming

  • Gredy algorithms leverage optimal substructure
  • When we can argue that greedy choice is part of a optimal solution, implying that we don’t need to explore all subproblems
  • Consider the knapsack problem
  • 0-1 is taken in entriety
  • Fractional is not
  • Fractional chooses \(v_i / w_i\) and choose items in descending order
    • knapsack can always be filled in the end
  • 0-1 is dynamic, but NP-complete, is \(\mathcal{O}(nW)\), and therefore pseudopolynomial

Lecture 4: Greedy Algorithms, Continued <2019-10-04 Fri>

Huffman Coding

  • encoding a file of symbols from some alphabet
  • Want to minimize size, based on frequency
  • Fixed-length uses \(\lceil \log_2{n} \rceil\) bits per symbol
  • Variable-length uses fewer bits for more frequent symbols
  • Prefix codes can be represented as a binary tree, as can prefix-free codes
  • Corresponds to a full binary tree
  • Pseudocode on 04.18
  • Computation is linear, log for heap ops, therefore, total, \(\mathcal{O}(n \log{n})\)

Greedy Choice

  • 04.20–22
  • Lemma: Let \(C\) be an alpabet in which symbol \(c \in C\) has frequency \(c.freq\) and let \(x, y \in C\) have lowest frequencies. Then there exists an optimal prefix code for \(C\) in which codewords \(x\) and \(y\) have the same length and differ only in the last bit. I.e., an optimal solution exists that merges lowest frequencies first.
  • Proof:
    • Let \(T\) be a tree representing an arbitrary optimal prefx code, and let \(A\) and \(b\) be siblings of maximum depth in \(T\)
    • Assume, without loss of generality, that \(x.freq \leq y.freq\) and \(a.freq \leq b.freq\)
    • Since \(x\) and \(y\) are the two least frequent nodes, we get \(x.freq \leq a.freq\) and \(y.freq \leq b.freq\)
    • Convert \(T\) to \(T^{\prime}\) by exchanging \(a\) and \(x\), then \(T^{\prime}\) to \(T^{\prime\prime}\) by exchanging \(b\) and \(y\)
    • Then, in \(T^{\prime\prime}\), \(x\) and \(y\) are siblings of maximum depth
    • Cost difference between \(T\) and \(T^{\prime}\) is \(B(T) - B(T^{\prime})\)
      • \(= \sum_{c \in C} c.freq \cdot d_T(c) - \sum_{c \in C} c.freq \cdot d_{T^{\prime}}(c)\)
      • \(= x.freq \cdot d_T(x) + a.freq \cdot d_T(a) - x.freq \cdot d_{T^{\prime}}(x) - a.freq \cdot d_{T^{\prime}}(a)\)
      • \((a.freq - x.freq)(d_T(a) d_T(x)) \geq 0\)
    • Since \(a.freq \geq x.freq\) and \(d_T(a) \geq d_T(x)\)
    • Similarly, \(B(T^{\prime}) - B(T^{\prime\prime}) \geq 0\) , so \(B(T^{\prime\prime}) \leq B(T)\), so \(T^{\prime\prime}\) is optimal. QED

Optimal Substructure

  • 04.23–25
  • Let \(C\) be an alphabet in which symbol \(c \in C\) has frequency \(c.freq\) and let \(x, y \in C\) have lowest frequencies
  • Let \(C^{\prime} = C \setminus \{x, y\} \cup {z}\) and \(z.freq = x.freq + y.freq\)
  • Let \(T^{\prime}\) be any tree representing an optimal prefix code for \(C^{\prime}\)
  • Then, $T, which is \(T^{\prime}\) with leaf \(z\) replaced by internal node with children \(x\) and \(y\), represents an optimal prefix code for \(C\)
  • See 04.24–25
  • Look back at it.

Lecture 5: Elementary Graph Algorithms <2019-10-07 Mon>

  • Graphs are ADTs applicable to numerous problems
    • can capture entities and the relationships between them
    • degree of the relationship
    • and more
  • Basics in graph theory, representation, algorithms for basic graph-theory problems

Breadth-First Search

  • Given a graph \(G = (V, E)\), and a source node \(s \in V\), BFS visits every vertex that is reachable from \(s\)
  • Uses a queue to search in a breadth-first manner
  • creates a structure called a BFS tree such that for each vertex in \(v\), the distance from \(s\) to \(v\) is a shortest path in \(G\)
  • Initializes each node’s color to WHITE
  • As nodes are visited, color to GREY when in the queue, then BLACK when finished.

Lecture 5: Elementary Graph Algorithms, Continued <2019-10-09 Wed>

Breadth-First Search, continued

  • BFS tree is shortest paths tree
  • Algorithm presented in 05.04
  • Array \(d\) is distance from start node
  • Array \(\pi\) is parent node.
  • \(\mathcal{O}(|V|^2)\) for adjacency matrix
  • With adjacency list, is \(\mathcal{O}(|V|)\), or actually, \(\mathcal{O}(|E| + |V|)\)
  • \(d[v]\) is shortest distance from \(s\) to \(v\) for unweighted shortest paths
  • Can print shortest path by recursively following \(\pi[v]\), \(\pi[\pi[v]]\), etc.
  • If \(d[v] == \infty\), then \(v\) is not reachable from \(s\)

Depth First Search

  • Graph traversal, follows path as deep as possible before backtracking
  • DFSis implemented recursively
  • Tracks discovery time and finish time of each node
  • All nodes coloured black at end
  • Produces a “forest”
  • Presented on 05.09 and 05.10
  • Time complexity equivalent to BFS, \(\Theta(|V| + |E|)\)
  • Vertex \(u\) is a proper descendent of vertex \(v\) in the DF forest iff \(d[v] < d[u] < f[u] < f[v]\)
    • Parenthesis structure If one prints “(u” when discovering \(u\) and “u)” when finishing \(u\), then the printed text will be a well-formed parenthesized expression
  • Types of tree edges
    • Tree edges are in depth-first forest
    • Back edges \((u, v)\) connect a vertex \(u\) to its ancestor \(v\) in the DF tree (includes self-loops)
    • Forward edges are non-tree edges connecting a node to one of its DF tree descendants
    • Cross edges goe between non-ancestral edges within a DF tree or between DF trees
  • A graph has a cycle iff DFS discovers a back edge (deadlock detection)
  • When dfs first explores an edge, look at \(v\)’s color
    • white implies tree edge
    • grey implies back edge
    • black implies forward or cross edge

Lecture 5: Elementary Graph Algorithms, Continued <2019-10-11 Fri>

Topological Sorting

  • Begins with a directed acyclic graph, edge implies event must occur before another
  • Topological sort of a dag \(G\) is a linear ordering of its vertices such that if \(G\) contains an edge \((u, v)\), then \(u\) appears before \(v\) in the ordering
  • Call DFS on dag \(G\)
  • As each vertex is finished, insert into the front of a linked list, return the linked list of vertices
  • Thus, topological sort is a descending sort of vertices based on DFS finish times
  • What is time complexity – \(\Theta(|V| + |E|)\)
  • See 05.15

Strongly Connected Components

  • Given \(G = \langle V, E \rangle\), a strongly connected component of \(G\) is a maximal set of vertices \(C \subseteq V\) such that for every pair of vertices \(u, v \in C\), \(u\) is reachable from \(v\) and \(v\) is reachable from \(u\)
  • Collapsing the edges within an SCC yields an acyclic component graph
  • Algorithm requisres transpose of \(G\), \(G^T\), which is \(G\) with the edges reversed. \(G^T\) and \(G\) have the same SCCs
  • Call DFS on \(G\)
  • Call DFS on \(G^T\), Looping through vertices in order of decreasing finishing times from first DFS call
  • Each DFS tree in second DFS run is an SCC in \(G\)
  • Is \(\Theta(|V| + |E|)\), because \(G^T\) can be computed in \(\Theta(|V| + |E|)\) if in adjacency list, and let the first run of DFS use Topological Sorting to generate the list
  • The first component node of \(G^T\) has not outgoing edges, since in \(G\) it has only incoming edges, second has edges into the first, etc. (Lemma 22.14 in book)
  • And, DFS on \(G^T\) visits components in topological order of \(G\)’s component graph

Lecture 6: Minimum-Weight Spanning Trees <2019-10-14 Mon>


  • Given a connected, undirected graph \(G = (V, E)\), a spanning tree is an acyclic subset \(T \subseteq E\), that connects all vertices in \(V\)
    • \(T\) is acyclic implies tree
    • \(T\) connects all vertices, spans \(G\)
  • If \(G\) is weighted, then \(T\)’s weight is \(w(T) = \sum_{(u, v) \in T} w(u, v)\)
  • An MST is a spanning tree of minimum weight, not necessarily unique.

Kruskal’s Algorithm

  • Greedy algorithm, makes locally best choice at each step
  • Starts by declaring each vertex to be its own tree (so all nodes together make a forest)
  • Iteratively identify the minimum-weight edge \((u, v)\) that connects two distinct trees and add it to the MST \(T\), merging the two trees

Lecture 6: Minimum-Weight Spanning Trees, Continued <2019-10-16 Wed>

Kruskal’s Algorithm

  • Relies on Disjoint Set data structure
  • Algorithm presented on 06.05
  • Algorithm is \(\mathcal{O}(|E| \log{|E|})\) (06.12)

Disjoint Set Data Structure

  • Given universe \(U\) of elements, DSDS maintains collection \(\mathcal{S} = \{S_1, \ldots, S_k\}\)
    • Each element in \(U\) is in exactly one set \(S_j\)
    • No set \(S_j\) is empty
  • Dynamically updated throughout program
  • Each set \(S \in \mathcal{S}\) has a representative element \(x \in S\)
  • Chapter 21
  • \(\alpha\) is the inverse Ackermann function
  • very near constant

Prim’s Algorithm

  • Still greedy
  • Maintains a single tree, not a forest
  • Starts with an arbitrary root \(r\)
  • Repeatedly finds minimum weight edge incident to the tree
  • algorithm on 06.14
  • Uses min queue
  • Proof in 06.18
    • Nodes already are those in \(V \setminus Q\)
    • For all vertices \(v \in Q\), if \(\pi[v] \neq \mathsc{Nil}\), then \(key[v] < \infty\)
  • Proof is same, more or less
  • Prove invariant via cut that respects \(A\) (no edges span the cut)
  • GRedy choise by looking for a safe edge \(e\)

Minimum Heap

  • Binary tree where key at each node is \(\leq\) keys of children
  • Any subtree also a heap
  • \(\theta(\log{n})\)
  • Build is \(\mathcal{O}(n)\)
  • Filter new min in \(\mathcal{O}(\log{n})\)

Lecture 6: Minimum Spanning Trees, Continued <2019-10-18 Fri>

Proof of correctness of both algorithms

  • Both use greedy approach
  • Maintain invariant that at any time, set of edges \(A\) selected so far is subset of some MST (optimal substructure)
  • Each iteration of the algorithms look for a safe edge \(e\) such that \(A \cup \{e\}\) is also a subset of an MST (greedy choice)
  • Prove invariant via use of cut \((S, V - S)\) that respects \(A\) (no edges span cut)
  • Cut is partition of graph into two disjoint subsets
  • Theorem Let \(A \subseteq E\) be included in some MST of \(G\), \((S, V - S)\) be a cut respecting \(A\) and \((u, v) \in E\) be a minimum-weight edge crossing the cut. Then \((u, v)\) is a safe edge for \(A\). (06.20)
  • Proof.
    • Let \(T\) be an MST including \(A\) and not including \((u, v)\)
    • Let \(p\) be path from \(u\) to \(v\) in \(T\), and let \((x, y)\) be an edge from \(p\) crossing cut (not \(A\)).
    • Since \(T\) is a spanning tree, so is \(T^{\prime} = (T \setminus \{(x, y)\}) \cup \{(u, v)\}\).
    • Both \((u, v)\) and \((x, y)\) cross cut, so \(w(u, v) \leq w(x, y)\)
    • So \(w(T^{\prime}) = w(T) - w(x, y) + w(u, v) \leq w(T)\)
    • Thus, \(T^{\prime}\) is an MST
    • And \((u, v)\) is a safe edge for \(A\) since \(A \cup \{(u, v)\} \subseteq T^{\prime}\)

Lecture 7: Single-Source Shortest Path <2019-10-18 Fri>

  • Given a weighted, directed graph \(G = (V, E)\) with weight function \(w: E \mapsto \mathbb{R}\)
  • The weight of a path \(p = \langle v_0, v_1, \ldots, v_k \rangle\) is the sum of the weights of its edges: \(w(p) = \sum_{i = 1}^{k} w(v_{i - 1}, v_i)\)
  • Shortest path weight from \(u\) to \(v\) is \(\delta(u, v)\) is the min of all paths between the two, infinite otherwise
  • A shortest path from \(u\) to \(v\) is any path \(p\) such that \(w(p) = \delta(u, v)\)
  • Network routing, driving directions


finding shortest pth from source \(s\) to every other node.
find shortest paths from every node to destination \(t\) (solved by running single-source on the transpose)
find shortest path from \(u\) to \(v\) (solveable with SSSP, nothing asymptoticaly faster)
find shortest paths between every pair of nodes (can be solved with repeated applications of single-source, but can be faster/simpler)

Optimal substructure

  • Shortest paths problem has optimal substructure property. Proof on 07.04.

Negative-Weight Edges and Cycles

  • What happens if \(G\) has edges with negative weights?
  • Dijkstra’s algorithm cannot handle this!! Bellman-Ford can, under the right circumstances, namely no negative-weight cycles
  • Cycles include
    • Negative-weight cycles
    • Zero-weight cycles
    • Positive-weight cycles

Lecture 7: Single-Source Shortest Paths, Continued <2019-10-23 Wed>


  • \(d[v]\) for \(v \in V\), is the upper bound on \(\delta(s, v)\)
  • Relaxation of an edge \((u, v)\) is the process of testing whether we can decrease \(d[v]\) and yielding a tighter upper bound
  • Initialize single source, initializes sets \(d[v]\) to \(\infty\), and the parent to non-existent
  • Relax is such that if the current \(d[v]\) is greater than \(d[u] + w(u, v)\), update \(d[v]\) to that, and set \(v\)’s parent to \(u\)

Lecture 7: Single-Source Shortest Paths, Continued <2019-10-25 Fri>


  • Provides core to shortest path
  • Basic method works simply, varied based on implementation


  • Works with negative-weight edges and detects negative-weight cycles
  • Makes \(|V| - 1\) passes over all edges, relaxing each edge during each path
  • No cycles implies all shortest paths have \(\leq |V| - 1\) edges, so that the number of relaxations is sufficient
  • Can be modified to tag negative-weight cycles
  • Algorithm shown on 07.13
  • Pass is through every single edge, \(|V| - 1\) times
  • 7-10 detects negative weight cycles
  • Time Complexity:
    • Initialize Single Source takes \(\mathcal{O}(|V|)\)
    • Relax itself is \(\mathcal{O}(1)\)
    • Negative Weight Cycle is \(\mathcal{O}(|E|)\)
    • Dual loop is \(\mathcal{O}(|V| |E|)\)
    • Thus, total time complexity is \(\mathcal{O}(|V| |E|)\)
  • Correctness
    • Assume no negative weight cycles
    • Since no cycles appear in SPs, every SP has at most \(|V| - 1\) edges
    • Then define sets \(S_0, \ldots, S_{|V| - 1}\)
      • \(S_k = \{ v \in V | \exists p = (s \to v) \mathrm{s.t.} \delta(s, v) = w(p) \land |p| \leq k\}\)
      • Invariant: After the $i$th iteration of outer relaxation loop, for all \(v \in S_i\), we have \(d[v] = \delta(s, v)\)
        • This is the path-relaxation property (Lemma 24.15)
        • Proven via induction on \(i\):
          • obvious for \(i = 0\)
          • If holds for \(v \in S_{i - 1}\), then relaxation and optimal substructure implies holding for \(v \in S_i\)
      • Thus, after \(|V| - 1\) iterations, \(d[v] = \delta(s, v)\) for all \(v \in V = S_{|V| - 1}\)
    • Detection of Negative-weight cycles (07.18)
      • Let \(c = \langle v_0, v_1, \ldots, v_k = v_0 \rangle\) be a neg-weight cycle reachable from \(s\): \(\sum_{i = 1}^k w(v_{i - 1}, v_i) < 0\)
      • If the algorithm incorrectly returns true, then for all nodes in the cycle \(i = 0, 1, \ldots, k\), \(d[v_i] \leq d[v_{i - 1}] + w(v_{i - 1}, v_i)\)
      • By summing we get thing
      • Since \(v_0 = v_k\), the two are the same, then contradiction follows (07.18)

Lecture 7: Single-Source Shortest Paths, Continued <2019-10-28 Mon>

  • Billboard problem and avocado problem are very similar

Shortest paths in a DAG

  • topo sort vertices, initialize single source, relax from left to right
  • Correctness follows from path-relaxation of Bellman-Ford, but relaxing in topo order implies edges relaxed in order
  • Total time complexity is \(\mathcal{O}(|V| + |E|)\)

Dijkstra’s Algorithm

  • Greedy
  • Faster than Bellman-Ford
  • Requires all edge weights to be non-negative
  • Maintains set \(S\) of vertices whose final shortest path weights from \(s\) have been determined
  • Repeatedly selects \(u \in V \setminus S\) with minimum SP estimate, adds \(u\) to \(S\) and relaxes all edges leaving \(u\)
  • Uses min-priority queue to repeatedly make greedy choice
  • Pseudocode on 07.25
  • Use array to implement min queue


  • Invariant: At the start of each iteration of the while loop, \(d[v] = \delta(s, v)\) for all \(v \in S\)
  • Proof: Let \(u\) be first node added to \(S\) where \(d[u] \neq \delta(s, u)\)
  • Let \(p = s \to x \to y \to u\) be SP to \(u\) and \(y\) first node on \(p\) in \(V \setminus S\)
  • Since \(y\)’s predecessor \(x \in S\), \(d[y] = \delta(s, y)\) due to relaxacion of \((x, y)\)

Lecture 7: Single-Source Shortest Paths, Continued <2019-10-30 Wed>

Dijkstra’s and Time Complexity

  • Initialize single source – \(\Theta(|V|)\)
  • Create Q – array, \(\mathcal{O}(|V|)\); heap, \(\mathcal{O}(|V|)\)
  • Extract-Min – \(\mathcal{O}(|V|)\) calls
    • Time Complexity – \(\mathcal{O}(|V|)\) if array; \(\mathcal{O}(\log{|V|})\) if heap
  • Relax – \(\mathcal{O}(|E|)\)
    • Time Complexity – \(\mathcal{O}(1)\) ; \(\mathcal{O}(\log{|V|})\)
  • Total Time Complexity
    • \(\mathcal{O}(|V|^2)\) if array
    • \(\mathcal{O}(|E| \log{|V|})\) if heap
  • If \(E\) is \(o(|V|^2 / \log{|V|})\),, prefer heap

Correctness, continued

  • Since \(y\) precedes \(u\) in \(p\), and edge weights are non negative \(d[y] = \delta(s, y) \leq \delta(s, u) \leq d[u]\)
  • Since \(u\) was chosen before \(y\) in line 3, \(d[u] \leq d[y]\), so \(d[y] = \delta(s, y) = \delta(s, u) = d[u]\), a contradiction
  • Since all vertices eventually end up in \(S\), we see that the algorithm is correct. “QED”

Lecture 8: All Pairs Shortest Paths <2019-11-04 Mon>

  • Similar to SSSP, but for all pairs
  • Given weighted directed graph, find \(\delta(u, v)\) for all \((u, v) \in V \times V\)
  • Run SSP \(|V|\) times
    • If no negative weights, use Dijkstra’s, gives \(\mathcal{O}(|V|^3)\)
    • If negative weights, use Bellman-Ford, for \(\mathcal{O}(|V|^4)\)
  • Can do better, and handle neg-weight edges
    • Matrix-multiplication algorithm, \(\Theta(|V|^3 \log |V|)\)
    • Floyd-Warshall, \(\Theta(|V|^3)\)

Adjacency Matrix Representation

  • Assume vertices are numbered, 1 to \(n\)
  • Input is \(n \times n\) matrix: \[ w_{ij} = \begin{cases} 0 & i = j \\ w(i, j) & (i, j) \in E \\ \infty & (i, j) \not\in E \end{cases} \]
  • Assume, for now, that neg weight cycles are absent
  • Distance matrices \(L\) and \(D\) are produced by algorithms, predecessor matrix \(\Pi\) where \(\pi_{ij}\) is the predecessor of \(j\) on a shortest path from \(i\) to \(j\) or NIL if \(i = j\) or no path exists (well-defined due to optimal substructure)

Printing from \(\Pi\)

  • defined recursively, printing after recursive call
  • See 08.04

Matrix Multiplication

  • Maintain a series of matrices \(L^{(m)} = \left( \ell_{ij}^{(m)} \right)\), where \(\ell_{ij}^{(m)}\) is the minimum weight of any path from \(i\) to \(j\) that uses at most \(m\) edges
    • When \(m = 0\), \(\ell_{ij}^{(0)} = 0\) if \(i = j\), \(\infty\) otherwise
  • Exploits opitmal substructure to get a recursive definition of \(\ell\)
  • Uses \(\min\), see 08.06

Lecture 8: All Pairs Shortest Paths, continued <2019-11-08 Fri>

Matrix Multiplication, continued

  • Allows better paths to come up
  • Recursive solution due optimal substructure property (05.06) \[ \ell_{ij}^{(m)} = \min_{1 \leq k \leq n} \left(\ell_{ik}^{(m - 1)} + w_{ik}\right) \]
  • Ignoring negative weight cycles means \(\delta(i, j) = \ell_{ij}^{(n - 1)}\) (because shortest path uses \(n - 1\) edges)
  • Bottom up computation \(L^{(m)}\) using \(L^{(1)} = W, L^{(2)}, \ldots, L^{(n - 1)}\) by iterative computation
  • Defined on 05.08–09
  • Faster version on 05.13

Lecture 8: All Pairs Shortest Paths, continued <2019-11-11 Mon>


  • Shaves : log factor
  • Start assuming no neg weight cycles, can detect neg cycles as before
  • Considers decomposition by intermediate vertex,
    • If a simple path \(p = \langle v_1, v_2, v_3, \ldots, v_{\ell - 1}, v_{\ell} \rangle\), then the set of intermediate vertices is \(\{v_2, v_3, \ldots, v_{\ell - 1}\}\)
  • Let \(V = \{1, \ldots, n\}\), and fix \(i, j \in V\)
  • For some \(1 \leq k \leq n\), consider set of vertices \(V_k = \{1, \ldots, k\}\)
  • Consider all paths from \(i\) to \(j\) whose intermediate vertices come from \(V_k\) and let \(p\) be a minimum-weight path from them
  • Is \(k \in p\)?
    • If not, then all intermediate vertices of \(p\) are in \(V_{k - 1}\) and an SP from \(i\) to \(j\) based on \(V_{k - 1}\) is also an SP from \(i\) to \(j\) based on \(V_k\)
    • If so, then we can decompose \(p\) into \(i \overset{p_1}{\leadsto} k \overset{p_2}{\leadsto} j\)
  • Note, \(V_0\) is no intermediate vertices, i.e., only the direct edges between pairs of vertices
  • Now \(D^{(k)}_{ij}\) is the weight of an SP from \(i\) to \(j\) using only \(V_k\) as intermediates
    • \(D^{(0)} = W\)
    • \(D^{(n)}\) is the shortest paths
  • \(k \neq p \implies D^{(k)}_{ij} = D^{(k - 1)}_{ij}\)
  • \(k = p \implies D^{(k - 1)}_{ik} + D^{(k - 1)}_{kj} = D^{(k)}_{ij}\)
  • Shortest path from \(i\) to \(j\) based on \(V_k\) is either going to be the same as that based on \(V_{k - 1}\) or will be going to go through \(k\)
  • \(D^{(k)} = \left(d_{ij}^{(k)}\right)\), where \(d_{ij}^{(k)}\) is the weight of the shortest path at \(k\) (08.17), is in \(\Theta(n^3)\)
  • Algorithm on (08.18)
  • Transitive Closure, whether paths exist between pairs of vertices for \(G = (V, E)\), transitive closure of \(G^{*} = (V, E^{*})\)
    • \(E^{*} = \{(i, j) : \exists i \overset{p}{\leadsto} j \in G\}\)
    • Can be solved with Floyd-Warshall but with bitwise ops instead of min/plus
  • Memory can potentially be saved, operation can be in place

Lecture 9: Lower Bounds <2019-11-11 Mon>

Upper Bound

  • Of an algorithm, has \(f(n)\) for \(n\) input; such that if there exiusts no input of size \(n\) such that \(A\) requires more than \(f(n)\) time
  • Of a problem, has an upper bound of \(f(n)\) if there exists at least one algorithm that has an upper bound of \(f(n)\)
  • Generally, tightest is preferred

Lower Bound

  • A problem has a lower bound \(f(n)\) if for any algorithm \(A\) to solve the problem there exists at least one input of size \(n\) that forces \(A\) to take at least \(f(n)\) time or space
  • Pathological input depends on the sepecific algorithm \(A\), consider Bubble sort, takes reverse order, giving \(\Omega(n^2)\)
  • Since every sorting algorithm has an input of size \(n\) forcing \(\Omega(n \log{n})\) steps, the sorting problem has time complexit lower bound of \(\Omega(n \log{n})\)
  • Adversarial argument, simulate arbitrary algorithm \(A\) to build pathological input
  • Reduce one problem to another to establish lower bounds

Lecture 9: Lower Bounds, continued <2019-11-13 Wed>

Decision trees

  • full binary tree that represent comparisons bteween elementts performed by a particular sorting algorithm operating on a certain sized input
  • Tree represents an algorithm’s behavior on all possible inputs of size \(n\)
  • Each node represents one comparison made by the algorithm
    • Each node is labeld as \(i:j\) which represents \(A[i] \leq A[j]\)
    • If, in particular input, holds, then flo moves left, otherwise to right
    • Each leaf represents a possible output of the algorithm, a permutation of the input
    • All permutations must be in the tree in order for algorithm to work property

Proof of lower bound (using an adversarial approach)

  • Length of path from root to output leaf is number of comparisons made by algorithm
  • Worst-case number of comparisons is the length of the longest path
  • Adversary chooses a deepest leaf to create worst-case input
  • Number of leaves is \(n!\)
  • A binary tree of height \(h\) has at most \(2^h\) leaves
  • Thus, \(2^h \geq n# \geq \sqrt{2\pi n}\left(\frac{n}{e}\right)^n\)
  • Take base-2 logsx
  • \(h \geq \log \sqrt{2\pi} + (1/2) \log n + n \log n - n \log e = \Omega(n \log n)\)
  • Every comparison-based sorting algorithm has some input that forces it to make \(\Omega(n \log n)\) comparisons. QED
  • And thus, mergesort and heapsort are asymptotically optimal.

Reductions, using Convex Hull

  • Use sorting lower bound to get lower bound on convex hull
  • Given a set $Q = \{p1, p2, \ldots, pn\} of \(n\) points, each from \(\mathbb{R}^2\), output \(CH(Q)\), which is the smallest convex polygon \(P\) such that each point from \(Q\) is on \(P\)’s boundary or in its interior
  • Output is an ordered list of points that make up the boundary edges such that from one to the next, the hull will be drawn
  • Reduce problem of sorting to convex hull
  • Given any instance of the sorting problem, \(A = \{x_1, \ldots, x_n\}\), we will transform it to an instance of convex hull.
  • Transformation is \(Q = \{(x_1, x_1^2), \ldots, (x_n, x_n^2)\}\)
  • Feed into convex hull, convex hull transformed to sorted array

Lecture 9: Lower Bounds, continued <2019-11-15 Fri>

Reductions, using Convex Hull, continued

  • Known Lower Bound of Sorting, reducing to Convex Hull
  • Since Sorting is in \(\Omega(n \log{n})\), by reduction, we will find that Convex Hull is \(\Omega(n \log{n})\)
  • Takes any instance of problem A (the problem reduced from) to some instance of problem B (the problem reduced to)
  • Because of mapping, transformation will include all points (as will all lie on a parabola)
  • By definition, only one chunk
  • Conversion is still linear time
  • Proves a lower bound of convex hull, but not necessarily the lower bound (it is the lower bound of Convex Hull)
  • Can easily be done via contradiction. (09.12)

Lecture 10: NP Completeness <2019-11-15 Fri>

  • So far, focused on problems with efficient algorithms (i.e., those that run in polynomial time)
    • Called efficient even if \(c\) is large, since it is likely that another, more efficient algorithm exists
    • Polynomial in terms of size of input, \(n\).
  • Some problems have a fastest known algorithm in superpolynomial time (i.e., exponential, sub-exponential (\(2^{n^{1/3}}\))
  • Some problems cannot be solved in any amount of time
  • Focus on lower bounds, but used to argue that some problems don’t have any efficient solution

Lecture 10: NP Completeness, continued <2019-11-18 Mon>

  • Focus on complexity classes P and NP
  • P is deterministic polynomial time, the set of problems that can be solved with a deterministic TM (deterministic algorithm in polynomial time)
  • NP is non-deterministic polynomial time, i.e., set of problems that can be solved by a non-deterministic TM in polynomial time (can explore many possible paths of computation at once)
  • NP is the set of problems whose solutions, if given, can be verified in polynomial time
  • Hamiltonian cycle – \(G = (V, E)\) contain a hamiltonian cycle (a simple cycle visiting every vertex exactly once?)
    • NP, since given a specific \(G\), plus yes/no answer, plus certificate, we can verify a yes in polynomial time using the certificate
    • What would be an appropriate certificate?
    • Not known if \(HAM-CYCLE \in P\)
  • Euler – does a directed graph \(G = (V, E)\) contain an Euler tour – a cycle that visits every edge in \(E\) exactly once and can visit vertices multiple times
    • Check if each vertex’s in-degree is equivalent to its out-degree
    • Certificate is just the tour, or even the graph it self
  • Any problem in \(P\) is also in \(NP\), since if we can efficiently solve the problem, we get poly-time verification for free, i.e., \(P \subseteq NP\)
  • Not known if \(P \subset NP\) unknown if there exists a problem in \(NP\) that’s not \(P\)
  • A subset of the problems in \(NP\) is the set of NP complete problems
    • Every problem in NPC is at least as hard as all others in NP
    • These problems are believed to be intractable (no efficient algorithm), but not yet proven to be so
    • If any NPC problem is in \(P\), then \(P = NP\) and life is glorious and scary

Proving NP completeness

  • Prove that a problem is NPC, we can just take a different approach
  • To prove \(B \in NPC\):
    1. Prove that \(B \in NP\) by identifying certificate that can be used to verify a YES answer in polynomial time
      • Use obvious choice
      • Need to argue that verification requires polynomial time
      • Certificate is not instance unless \(B \in P\)
    2. Show that \(B\) is as hard as any other NP problem by showing that if we can efficiently solve \(B\) then we can efficiently solve all problems in NP
  • Step one is easy
  • Second looks hard, but just use a reduction (probably to 3SAT)

Reducing to NP

  • Takes instance of \(A\) and transforms to \(B\) such that if \(B\) is solveable, then \(A\) is solvable
  • TC of reduction for \(A\) is reduction to \(B\) plus time to solve \(B\)
  • Start with \(A\) and \(B\)
    • Trasform any instance \(\alpha\) of \(A\) to some instance \(\beta\) of \(B\) such that
      1. Transformation must take polynomial time
      2. The answer for \(\alpha\) is yes iff the answer for \(\beta\) is yes.
  • If such a reduction exists, then \(B\) is at least as hard as \(A\), since if an efficient algo iexists, any \(A\) can be solved similarly
  • \(A \leq_{p} B\) (read \(A\) is reducible to \(B\) modulo polynomial-time conversion)

Only decision problems

  • Only focus on decision problems, simplifies things
  • Only output is yes or no
  • Not asked for shortest path or cycle or similar
  • Don’t ask for shortest path from \(i\) to \(j\), just ask if there exists a path from \(i\) to \(j\) with weight at most \(k\)
  • Optimization problem implies decision problem
  • They are no harder than the original optimization problem, show if decision is hard, so is optimization
  • Decision are especially convenient if you think about TM and languages

Lecture 10: NP Completeness, continued <2019-11-20 Wed>

Proving NP completeness

  • Same form as reduction for convex hull, but no need to trasform solution for \(B\) to a solution for \(A\)
  • As with the convex hull, the reduction’s time complexity must be strictly less than the lower bound we are proving for \(B\)’s algorithm.
    • The problem conversion must happen in \(\mathcal{O}(n^c)\), where \(c \in \mathbb{Z}\)
  • Must show that output of \(B\) is yes iff \(A\) would be yes
  • But if we want to prove that a problem \(B\) is NPC, do we have to reduce it to every problem in NP?
    • No! reductions are transitive!

Circuit Sat

  • First NPC problem
  • An instance is a boolean combinational circuit, no feedback, no memory
  • Is there a satisfying assignment such that the assignment of inputs that makes it output 1?
  • To prove, show circuit sat is NP, run circuit
    • Certificate is satisfying assignment
    • Any problem in NP reduces to Circuit Sat
  • Skipping second, leverages the existence of an algorithm that verifies certificates for some NP problem

Other NPC Problems

  • Circuit Sat
  • SAT
  • 3CNF-Sat
    • Clique
      • Vertex Cover
      • Ham Cycle
      • TSP
    • Subset Sum

Proving the problem

  • 10.18
  • Follow every step
    1. Prove that B is in NP
      1. Describe certificate that can verify yes (often simple and obvious, but not merely the instance)
      2. Describe how the certificate is verified
      3. Argue tht verification takes polynomial time
    2. Prove that the problem \(B\) is NP-hard
      1. Take any other NP-complete problem \(A\) and reduce it to \(B\)
        • Reduction must transform any instance of \(A\) to some instance of \(B\)
      2. Prove that the reduction takes polynomial time (an algorithm, analyze like any other)
      3. Prove that the reduction is valid
        • Answer is “yes” for \(A\) if and only if the answer is “yes” for \(B\)
        • Must argue both directions
        • Constructive proofs work well, e.g., “Assume the instance of VERTEX-COVER (problem \(A\)) has a vertex cover of size \(\leq k\). We will now construct from that a Hamiltonian cycle in problem \(B\).” (and vice versa)

Lecture 10: NP Completeness, continued <2019-11-22 Fri>

Satisfiability (SAT) through Circuit Satisfiability

  • Given a poolean formula \(\phi\) consisting of \(n\) variables, \(x_1, \ldots, x_n\) , \(m\) connectives (of the usual type) and parenthesis
  • SAT is \(B\)
    • Certificate is satisfying assignment
    • Certificate verified by evaluation of satisfied assignment
    • Verification clearly takes polynomial time (\(\mathcal{O}(n + m)\))
  • SAT is in fact in NP
  • Will show CIRCUIT-SAT \(\leq_{P}\) SAT reducing CIRCUIT-SAT to SAT, and thus show that SAT is NP hard
  • Map any instance \(C\) of circuit sat to some formula \(\phi\) in polynomial time such that sat iff circuit sat
    • Define a variable for each wire in \(C\)
    • Then define a clause of \(\phi\) for each gate that defines the function for that gate
    • And include the output wire as a separate clause
    • Given a satisfying assignment for \(\phi\), extraction of \(x_{1}_{}\), \(x_{2}\), \(x_{3}\) as satisfying assignment to \(C\)
    • \(\phi\)’s size is polynomial in size of \(C\)
    • If circuit has a satisfying assignment, final output of circuit is \(1\) and value on each internal wire matches the output of the gate that feeds it, thus \(\phi\) evaluates to 1
    • If \(\phi\) has a satisfying assignment, then each of \(\phi\)’s clauses is equivalent to each internal wire.
    • Since satisfying assignment for \(C\) impplies satisfying assignment for \(\phi\) and vice-versa, we get \(C\) has a satisfying assignment iff \(\phi\) does.

3-CNF Satisfiability

  • Given a boolean formula than is in 1-conjunctive normal form, which is a conjunction of clauses, each a disjunction of 3 literals
  • Assignment is certificate
  • Certificate verified by evaluation of satisfied assignment
  • Reduce SAT \(\leq_{P}\) 3CNFSAT
  • Again, need to map any instance of \(SAT\) to some instance of 3CNFSAT
    • Parenthesize \(\phi\) and build parse tree, that can be a circuit
    • Assign variables to wires in the circuit, \(\phi^{\prime}^{}\)
    • Truth table for \(\phi_{i}^{\prime}\) to get dnf of negation, convert to cnf (by DeMorgan’s) which is double prime
    • Add auxilary vars
    • Take conjunction of all of double prime
      • If has three literals, add as clause
      • If has two literals, add \(p\) and \(\lnot p\) to two copies and add the copies
      • Otherwise, add auxillary variables \(p\) and \(q\) in the correct manner
    • Can show the biconditional relatively easily

Lecture 10: NP Completeness, continued <2019-11-25 Mon>


  • Given an undirected graph \(G = (V, E)\) and a value \(k\), does \(G\) contain a clique (complete subgraph) of size \(k\)?
  • Certified by vertices, \(V^{\prime}\) of clique
    • Verified that \(V^{\prime}\) forms a clique
    • \(|V^{\prime}| \geq k\)
  • Reduce 3CNF to Clique
    • \(\langle \phi \rangle\) becomes \(\langle G, k \rangle\)
    • \(\phi = C_{1} \land \cdots \land C_{k}\)
    • Creates triple \(v^{r}_{1}\), \(v^{r}_{2}\), \(v^{r}_{3}\) added to \(V\) for every clause
    • Create an edge between any pair of vertices, \((v^{k}_{j}, v^{l}_{m})\)
      • When \(k \neq l\)
      • They are not the negation of each other
    • Obviously done in polynomial time
  • If has satisfying assignment, at least one literal in each clause is true
  • \(V^{\prime}\) is clique of size \(k'\)
  • If G has size \(k\) clique, can assign 1 to corresponding literal of each vertex
  • Each vertex in its own triple, so each clause has a literal set to \(1\)
  • Will not set both literal and negation to 1
  • Gets satisfying assignment
  • See 10.37.

Vertex Cover

  • A vertex covers all edges incident to it
  • A vertex cover of a graph is a set of vertices that cvers all edges in the graph
  • An undirected graph \(G\) and value \(k\)
  • Does \(G\) contain a vertex cover of size \(k\)
  • Set of vertices is certificate, verify size and all edges are covered.
  • Reduce Clique to Vertex Cover
  • Given \(\langle G = (V, E), k \rangle\) of Clique, vertex cover is \(\langle \bar{G}, |V| - k \rangle\) where \(\bar{G} = (V, \bar{E})\) is \(G\)’s complement.
  • Clear that conversion is polynomial
  • Proof of biconditional trivial by construction given complement (see 10.41)

Lecture 10: NP Completeness, continued <2019-12-02 Mon>

  • Review method of reduction
  • Review Vertex Cover reduction

Lecture 10: NP Completeness, continued <2019-12-04 Wed>

Subset Sum

  • Given a finite set \(S\) of positive integers and a positive integer target \(t\), is there a subset \(S^{\prime} \subseteq S\) whose elements sum exactly to \(t\)?
  • In NP
    • Certificate is \(S^{\prime}\), verification in polytime
  • Subset sum is NP Hard, from 3-CNF-SAT
    • 3-CNF-SAT \(\leq_{P}\) SUBSET-SUM
    • No clause contains variable and its negation
    • Each variable appears in at least one clause
    • Reduction
      • Let \(\phi\) have \(k\) clauses \(C_{1}, \ldots C_{k}\) over \(n\) variables \(x_{1}, \ldots x_{n}\)
      • Reduction creates two numbers in \(S\) for each variable \(x_{i}\) and two numbers for each clause \(C_{j}\)
      • Each number has \(n + k\) digits, most significant \(n\) tied to variables, least significant \(k\) tied to clauses
        • \(t\) has a one in each digit tied to a variable, and a 4 in each digit tied to a clause.
        • For each \(x_{i}\), \(S\) contains \(v_{i}\) and \(v_{i}^{\prime}\) in the $i$th most significant position, and \(0\) for other variables, put a 1 in \(C_{j}\)’s digit for \(v_{i}\) if it apears unnegated, a 1 in \(C_{j}\)’s digit for \(v_{i}^{\prime}\) if it appears negated
        • For each \(C_{j}\), $S contains integers \(s_{j}\) and \(s_{j}_{}^{\prime}\) where \(s_{j}\) has a 1 in the \(C_{j}\) digit and \(s_{j}^{\prime}\) has a 2 in the \(C_{j}\) digit
      • Greatest sum of any digit is 6, therefore no carry,
      • Variable digits provide mutual exclusion of variables (excluded middle)
      • slack from \(s_{k}\) and excluded middle of \(x_{n}\) provide proof

Lecture 10: NP Completeness, continued <2019-12-06 Fri>

Subset Sum, Continued

  • 4 is used because there can be 1, 2 or 3 literals that are true, the additional 1 makes the problem easier and allows the slack to work
  • See 10.46, 10.47 for if and only if

Lecture 11: Maximum Flow <2019-12-09 Mon>


  • Can use digraph as a flow network to model things (data communications networks, water etc. through pipes, assembly lines, etc.)
  • a flow network is a digraph with two special vertices, a source (\(s\)) that produces flow, and a sink (\(t\)) that takes the flow
  • Diredge is a conduit with a capacity
  • Vertices and conduit junctions
  • Except \(s\) and \(t\), flow must be conserved, the flow into a vertex must match flow out
  • Max flow problem: given a flow network, determine max amount of flow that can get from \(s\) to \(t\), also: bipartite matching

Flow networks

  • every edge \((u, v) \in E\) has non negative capacity, \(c(u, v) \geq 0\)
  • If \((u, v) \in E\), then \((v,u) \not\in E\)
  • If \((u, v) \in E\), then \(c(u, v) = 0\)
  • No self-loops
  • Assume that every vertex in \(V\) likes on some path from the source \(s \in V\) to the sink vertex \(t \in V\)


  • A flow in \(G\) is a function \(f: V \times V \mapsto \mathcal{R}\) that satisfies
    1. The capacity constraint: for all \(u, v \in V\), \(0 \leq f(u, v) \leq c(u, v)\), i.e., flow is non-negative and does not exceed capacity
    2. Flow conservation: for all \(u \in V \setminus \{s, t\}\): \[\sum_{v \in V }f(v, u) = \sum_{v \in V }f(u, v)\]
  • Value of flow \(f\) is net flow out of \(s\), net flow into \(t\) \[|f| = \sum_{v \in V} f(s, v) - \sum_{v \in V} f(v, s)\]
  • Maximum Flow Problem: Given the graph and capacities, find a flow of maximum value


  • flow/capacity on edges

Multiple Sources/Sinks

  • Supersource \(s\) and supersink \(t\), edge from \(s\) to every source, from every sink to \(t\), add edges with infinite capacity

Ford-Fulkerson Method

  • method, not algorithm
  • many ways to implement
  • Uses residual network – a network wich is \(G\) with capacities updated based on the amount of flow \(f\) already going through it
  • augmenting path – a simple path from \(s\) to \(t\) in a residual nework, if such a path exists, then more flow can be pushed through
  • cut – partition of \(V\) into \(S\) and \(t\) where \(s \in S\) and \(t \in T\) can measure flow and capacity crossing cut
  • Repeatedly finds augmenting path in residual, adds in flow along path, updates residual network

Residual Networks

  • given capacity \(c\) and flow \(f\), residuale network consists of edges with capacities showing how one can change flow in $G$x
  • Residual capacity of an edge is: \[c_{f}(u, v) = \begin{cases}c(u, v) - f(u, v) & (u, v) \in E\\f(v, u) & (v, u) \in E\\0 & \end{cases}\]
  • \(E_{f} = \{(u, v) \in V \times V : c_{f}(u, v) > 0\}\)

Lecture 11: Maximum Flow, Continued <2019-12-11 Wed>

Flow Augmentation

  • if \(f\) is a flow in \(G\) and \(f^{\prime}\) is a flow in \(G_{f}\), the augmentation is: \[(f \uparrow f^{\prime})(u, v) = \begin{cases} f(u, v) + f^{\prime}(u, v) - f^{\prime}(v, u) & (u, v) \in E \\0 & \end{cases}\]
  • \(f \uparrow f^{\prime}\) is a flow in \(G\) with value \(|f \uparrow f^{\prime}| = |f| + |f^{\prime}|\)
  • Amount of flow that can be put on a path \(p\) is \(p\)’s residual capacity, \(c_{f}(p) = \min\{c_{f}(u, v) \forall(u, v)\in p\}\)
  • Then, \(f_{p}(u, v)\) is based on the value of \(c_{f}(p)\)
  • Full Ford-Fulkerson algorithm presented on 11.16


  • Assume oll of \(G\)’s capacities are integers
  • If we chose augmenting path arbitrarily, \(|f|\) has an upper bound of \(|f*|\) steps


  • Uses Ford-Fulkerson but BFS to find paths, time complexity is \(\mathcal{O}(|V||E|^{2})\)

Maximum Bipartite Matching

  • In an undirected graph, a matching is a subset of edges \(M \subseteq E\) such that for all \(v \in V\), at most one edge from \(M\) is incident on \(v\)
  • If an edge from \(M\) is incident on \(v\), \(v\) is matched, otherwise, unmatched
  • Find a matching of maximum cardinality
  • Special case: \(G\) is bipartite, i.e., meaning \(V\) partitioned into two disjoint sets \(L\) and \(R\) and all edges cross tbetween the two
  • Matching machines to tasks, arranging marriages between interested parties
  • Can cast bipartite matching as max flow, given bipartite graph \(G\), flow network is:
    • \(V^{\prime} = V \cup \{s, t\}\)
    • \(E^{\prime} = \{(u, v), (u, v) \in E\} \cup \{(s, u) : u \in L\} \cup \{(v, t): v \in R\}\)
    • \(c(u, v) = 1, (u, v) \in E^{\prime}\)
  • Value of flow is number of edges

Lecture 12: Final Exam Review

  • Monday, Dec 16 13:00-15:00, AvH 106
  • Topics: MSTs, Single Source Shortest Paths, All Pairs Shortest Paths, Lower Bounds, NP Completeness, Max Flow
  • Two reference sheets (double sided, mecanicaly printed, standard letter)
  • Four Questions, 25 pts each
  • Guaranteed NP-completeness, Max Flow (Edmons Karp)

Licensed under Creative Commons Attribution-ShareAlike 4.0 International License