# Design and Analysis of Algorithms

## Table of Contents

- Lecture 0
- Lecture 1
- Lecture 1, Cont.
- Lecture 1, Cont.
- Lecture 2: Medians and Order Statistics
- Lecture 2: Medians and Order Statistics, Continued
- Lecture 2: Medians and Order Statistics, Continued
- Lecture 2: Medians and Order Statistics, Continued
- Lecture 3: Dynamic Programming
- Lecture 3: Dynamic Programming, Continued
- Lecture 3: Dynamic Programming, Continued
- Lecture 3: Dynamic Programming, Continued
- Lecture 3: Dynamic Programming, Continued
- Lecture 3: Dynamic Programming, Continued
- Lecture 3: Dynamic Programming, Continued
- Lecture 4: Greedy Algorithms
- Lecture 4: Greedy Algorithms, Continued
- Lecture 4: Greedy Algorithms, Continued
- Lecture 4: Greedy Algorithms, Continued
- Lecture 5: Elementary Graph Algorithms
- Lecture 5: Elementary Graph Algorithms, Continued
- Lecture 5: Elementary Graph Algorithms, Continued
- Lecture 6: Minimum-Weight Spanning Trees
- Lecture 6: Minimum-Weight Spanning Trees, Continued
- Lecture 6: Minimum-Weight Spanning Trees, Continued
- Lecture 7: Single-Source Shortest Paths
- Lecture 7: Single-Source Shortest Paths, Continued
- Lecture 7: Single-Source Shortest Paths, Continued
- Lecture 7: Single-Source Shortest Paths, Continued
- Lecture 7: Single-Source Shortest Paths, Continued
- Lecture 8: All Pairs Shortest Paths
- Lecture 8: All Pairs Shortest Paths, continued
- Lecture 8: All Pairs Shortest Paths, continued
- Lecture 9: Lower Bounds
- Lecture 9: Lower Bounds, continued
- Lecture 9: Lower Bounds, continued
- Lecture 10: NP Completeness
- Lecture 10: NP Completeness, continued

## Lecture 0

- Syllabus review
- Course Webpage
- Proof-heavy course
- **Design** – methods used to create new algorithms
- **Analysis** – mathematical assessment of an algorithm’s *correctness* and *efficiency*
- **Correctness** – does it do what it’s supposed to do on *all possible* inputs?
- **Efficiency** – measure the algorithm’s *running time*, counting the number of basic operations

## Lecture 1

### Asymptotic Notation – Starts on Slide 3

- Big-\(O\)
- Big-\(\Omega\)
- Big-\(\Theta\)
- Little-\(o\)
- Little-\(\omega\)
- These are *not* interchangeable.
- Asymptotic Upper Bound – Big-\(O\), \(O(g(n)) = \{f(n) : \exists c, n_0 > 0 \mid \forall n \geq n_0, 0 \leq f(n) \leq cg(n) \}\)
- Asymptotic Lower Bound – Big-\(\Omega\), \(\Omega(g(n)) = \{f(n) : \exists c, n_0 > 0 \mid \forall n \geq n_0, 0 \leq cg(n) \leq f(n) \}\)
- Asymptotic Tight Bound, Big-\(\Theta\), \(\Theta(g(n)) = \{f(n) : \exists c_1, c_2, n_0 > 0 \mid \forall n \geq n_0, 0 \leq c_1 g(n) \leq f(n) \leq c_2 g(n) \}\)
- Upper Bound, not tight, little-\(o\)
- Lower bound, not tight, little-\(\omega\)
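The Big-\(O\) definition can be checked numerically for a concrete pair of functions (a quick sketch; the constants \(c = 4\) and \(n_0 = 10\) are just one valid witness, chosen here for illustration):

```python
# Numeric sanity check of the Big-O definition: f(n) = 3n^2 + 10n
# is in O(n^2), witnessed by c = 4 and n0 = 10, since
# 10n <= n^2 for all n >= 10.
def f(n):
    return 3 * n**2 + 10 * n

def g(n):
    return n**2

c, n0 = 4, 10
assert all(0 <= f(n) <= c * g(n) for n in range(n0, 1000))
print(f(10), c * g(10))  # 400 400 -- tight exactly at n0
```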

## Lecture 1, Cont.

### Upper And Lower Bounds

- 01.9
- Analyze algorithms and problems in terms of time complexity (count of operations)
- Sometimes in space complexity (memory usage)
- upper and lower bounds are used
- applies to general problems
- 01.10 – upper is most common form
- Algorithm \(A\) has an **upper bound** of \(f(n)\) for input of size \(n\) if there exists **no input** of size \(n\) such that \(A\) requires more than \(f(n)\) time
- Consider sorting: Quicksort and Bubblesort are \(\mathcal{O}(n^2)\), and Mergesort is \(\mathcal{O}(n \log{n})\)
- Quicksort is used more often in practice; when randomized, it tends to run closer to \(n \log{n}\) time

- Lower Bounds are like best-case results, typically not as interesting
- 01.11 – Upper bound of a problem
- A problem has an upper bound of \(f(n)\) if there exists *at least one* algorithm that has an upper bound of \(f(n)\)
- That is, there exists an algorithm with time/space complexity of at most \(f(n)\) on **all** inputs of size \(n\)
- Mergesort has worst-case \(\mathcal{O}(n \log{n})\), so sorting has an upper bound of \(\mathcal{O}(n \log{n})\)
- 01.12 – Lower bounds of a problem
- A problem has a **lower bound** of \(f(n)\) if, for any algorithm \(A\) that solves it, there exists *at least one* input of size \(n\) that forces \(A\) to take at least \(f(n)\) time/space
- Called the pathological input; it depends on \(A\) itself
- E.g., a reverse-sorted input of size \(n\) forces Bubblesort to take \(\Omega(n^2)\)
- Since *every* sorting algorithm has an input of size \(n\) forcing \(\Omega(n \log{n})\) steps, the sorting problem has a **time complexity lower bound** of \(\Omega(n \log{n})\). Therefore, Mergesort is *asymptotically* optimal

- 01.13 Lower Bounds, 2
- To argue a lower bound, use an adversarial argument: simulate an *arbitrary* algorithm \(A\) to build a pathological input
- Must be in a general form, since the pathological input depends on the specific algorithm \(A\)
- Can reduce one problem to another to establish lower bounds
- Show that if we could compute the convex hull in \(o(n \log{n})\) time, then we could sort in \(o(n \log{n})\) time; this cannot be true, so convex hull takes \(\Omega(n \log{n})\)

### Efficiency

- 01.14 – Efficiency
- An algorithm is time- or space-efficient if its worst-case time/space complexity is \(\mathcal{O}(n^c)\) for a constant \(c\) and input size \(n\), i.e., polynomial in the input size
- Input size is measured in the number of bits needed to represent the input
- A graph of \(n\) nodes takes \(\mathcal{O}(n \log{n})\) bits to represent the nodes and \(\mathcal{O}(n^2 \log{n})\) bits to represent the edges; an algorithm in \(\mathcal{O}(n^c)\) is efficient
- A problem that includes as an input a number \(k\) (*e.g.*, a threshold) only needs \(\mathcal{O}(\log{k})\) bits to represent it, so to be efficient an algorithm must run in \(\mathcal{O}(\log^c{k})\) time
- If polynomial in \(k\) instead, it is called pseudopolynomial

### Recurrence Relations

- 01.15 – Recurrence Relations
- Non-recursive relations are relatively easy to analyse.
- Recurrence relations are used for time complexity of recursive algorithms, bound relation asymptotically
- For example, Mergesort
- \(T(n) = 2 T(n/2) + \mathcal{O}(n)\)
- Bound on \(T(n)\) needed
- Use the Master theorem (See 235/156 notes)
- Other approaches include 01.17–18
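The Mergesort recurrence can be unrolled numerically as a sanity check (a rough sketch; \(T(1) = 1\) is an assumed base case, and \(n\) is restricted to powers of two so \(n/2\) stays integral):

```python
# Unrolling T(n) = 2 T(n/2) + n with T(1) = 1. For n = 2^k this
# gives T(n) = 2^k (k + 1), so T(n) / (n log2 n) -> 1, matching
# the Theta(n log n) bound from the Master theorem.
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    if n == 1:
        return 1
    return 2 * T(n // 2) + n

for k in (4, 8, 12):
    n = 2**k
    print(n, T(n), n * math.log2(n))
```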

### Graphs

- 01.19–24 – Graphs
- Set of vertices \(V\), set of unordered pairs between vertices \(E\) (edges)
- Directed graph has ordered pairs
- A weighted graph augments the edges with weights (directed or undirected)
- Adjacency list, each vertex has linked list listing edges, \(\mathcal{O}(n + m)\)
- Adjacency matrix, \(n \times n\) – \(\mathcal{O}(n^2)\)
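The two representations can be sketched as follows (the example graph is illustrative, not from the slides):

```python
# Two standard representations of a small weighted directed graph
# with edges (0,1,w=5), (1,2,w=2), (0,2,w=9).
INF = float("inf")
n = 3
edges = [(0, 1, 5), (1, 2, 2), (0, 2, 9)]

# Adjacency list: O(n + m) space -- one list of (neighbor, weight)
# pairs per vertex.
adj_list = [[] for _ in range(n)]
for u, v, w in edges:
    adj_list[u].append((v, w))

# Adjacency matrix: O(n^2) space, infinity marks non-edges.
adj_matrix = [[INF] * n for _ in range(n)]
for u, v, w in edges:
    adj_matrix[u][v] = w

print(adj_list)       # [[(1, 5), (2, 9)], [(2, 2)], []]
print(adj_matrix[0])  # [inf, 5, 9]
```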

## Lecture 1, Cont.

### Graphs

- 01.24 – Adjacency matrices
- \(n \times n\) matrix \(M\), \(M(i,j) = w\) for weight, \(\infty\) for non-edges, takes \(\Theta(n^2)\) space

### Algorithmic Techniques, Dynamic Programming

- 01.25
- Technique for solving optimization problems, where a best solution is needed, evaluated by an objective function
- Decomposes a problem into subproblems, optimally solves recursively, combines into final/optimal solution
- Exponential number of subproblems, many overlap, re-use solutions
- Number of distinct subproblems is polynomial
- Works if the problem has the **optimal substructure property** – an optimal solution is made up of optimal solutions to subproblems
- Examples: all-pairs shortest paths, knapsack
- Can prove correctness if the optimal substructure property holds
- Must be careful; the property must be proven very carefully

### Algorithmic Techniques, Greedy Algorithms

- 01.26
- Must consider optimal substructure property
- At each step, commits to the locally optimal choice
- Tends to be very fast, since it does not examine all possible subproblems
- MST, Dijkstra’s

### Divide And Conquer

- 01.27
- Not limited to optimization, solve each sub-problem recursively, then combine
- Think Mergesort

### Proof By Contradiction

- 01.28
- Assume opposite of premise to be proved, arrive at a contradiction of some other assumption
- To prove there is no greatest even integer:
- Assume for a *reductio* that there exists a greatest even integer \(N\); then for all even integers \(n\), we have \(N \geq n\). (1)
- But \(M = N + 2\) is an even integer, since it’s the sum of two even integers, and \(M > N\).
- Therefore, (1) is false, so there is no greatest even integer. \(\Box\)

### Proof by Induction

- 01.29
- Base case, and inductive step, involves non-negative integers
- Useful for proving **invariants** in algorithms, properties that always hold at *every* step (especially at the final step)
- See PHIL 411 notes

### Proof by Construction

- 01.30
- Proof of existence by directly constructing it.
- Will be used extensively for NP-completeness

### Proof by Contraposition

- 01.31
- \(P \to Q \equiv \lnot Q \to \lnot P\)
- To prove that if \(x^2\) is even, then \(x\) is even, contraposition is the simplest approach
- Useful for proving biconditionals as well
- Useful for NP-completeness

## Lecture 2: Medians and Order Statistics

- Given an array \(A\) of \(n\) distinct numbers, $i$th order statistic of \(A\) is its $i$th smallest element
- \(i = 1\) – minimum
- \(i = n\) – max
- \(i = \lfloor (n+1)/2 \rfloor\) – (lower) median
- Given array \(A\) of \(n\) elements and a number \(i \in \{1, \dots, n\}\) find the $i$th order statistic of \(A\)
- Can we do better?
- Finding the minimum is linear: just scan through the array, keeping track of the smallest so far
- *small* is the smallest of \(A_1\) through \(A_i\) – the loop invariant
- Loop invariant can be shown easily via induction
- Correctness by observing \(i = n\) at the end, but before return
- \(\Theta(n)\) – Cannot do better
- Must find a lower bound
- Must make at least \(n - 1\) comparisons
- no element can be eliminated until shown it’s greater than at least one other element
- All still eligible to be smallest are in a bucket, removed after shown to be \(>\) another.
- Each comparison removes at most one element, from the bucket, therefore at least \(n - 1\) comparisons are needed to remove all but one from the bucket
- Follows by pigeonhole principle
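The scan above can be sketched as follows (a minimal version; the comparison counter is added only to illustrate that exactly \(n - 1\) comparisons are made, matching the lower bound):

```python
# find-min: scan once, maintaining the invariant that `small` is
# the smallest of A[0..i]. Uses exactly n - 1 comparisons.
def find_min(A):
    small = A[0]
    comparisons = 0
    for i in range(1, len(A)):
        comparisons += 1
        if A[i] < small:   # invariant: small == min(A[0..i]) after this
            small = A[i]
    return small, comparisons

print(find_min([7, 3, 9, 1, 4]))  # (1, 4)
```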

- Minimum and Maximum (02.06)
- Given \(A\) with \(n\) elements, find both the min and the max
- Obvious algorithm: run find-min and find-max separately, using \(2(n - 1)\) comparisons
- Do better by processing elements in pairs, checking each pair against both min and max
- Look at 02.07

## Lecture 2: Medians and Order Statistics, Continued

### The simultaneous Minimum and Maximum (starts 02.06)

- find-min and find-max are optimal, as they are at \(n - 1\) comparisons for \(n\) elements.
Algorithm presented on 02.07 – be familiar with the algorithm (retyped below):

```lisp
;; Simultaneous min and max in ~3n/2 comparisons: seed min/max from
;; the first pair, then process the rest of the list in pairs.
(defun min-and-max (nums)
  (let* ((vec (coerce nums 'vector))
         (len (length vec))
         (max-val (max (aref vec 0) (aref vec 1)))
         (min-val (min (aref vec 0) (aref vec 1))))
    ;; Remaining pairs occupy indices 2i and 2i+1.
    (do ((i 1 (1+ i)))
        ((>= i (floor len 2)))
      (let ((a (aref vec (* 2 i)))
            (b (aref vec (1+ (* 2 i)))))
        (setf max-val (max max-val a b)
              min-val (min min-val a b))))
    ;; If the length is odd, the last element is unpaired.
    (when (oddp len)
      (setf max-val (max max-val (aref vec (1- len)))
            min-val (min min-val (aref vec (1- len)))))
    (values max-val min-val)))
```

- This algorithm uses about \(\frac{3}{2}n\) comparisons, so it is in \(\mathcal{O}(n)\)

### Selection of $i$th smallest value (02.10)

- Select $i$th smallest value. Obvious solution (sort, return $i$th element), complexity is \(\Theta(n \log n)\)
- Can do better, running in linear time
- Uses divide-and-conquer, similar to binary search
- Definition on 02.12
- Uses partition like in quicksort (deterministic)
- \(q\) from algorithm is in its final sorted position
- Select calls partition, which chooses a **pivot element** \(A_q\) and reorders \(A\) such that all elements \(< A_q\) are to the left of \(A_q\) and all elements \(> A_q\) are to the right
- If \(A_q\) is the sought element, return it; if \(A_q\)’s rank is greater than \(i\), search left; if less, search right
- Partition chooses a pivot element \(A_x\), exchanges \(A_x\) and \(A_r\), then loops to exchange elements appropriately
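The Select/Partition interplay can be sketched as follows (a simplified version that uses the last element as pivot; the lecture's version instead chooses the pivot deterministically via median of medians, covered next):

```python
# Select with a Lomuto-style Partition. partition() places the pivot
# in its final sorted position q; select() then recurses into the
# side that must contain the i-th smallest element (i is 1-based).
def partition(A, p, r):
    x = A[r]                      # pivot element
    i = p - 1
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1                  # pivot's final sorted position

def select(A, p, r, i):
    if p == r:
        return A[p]
    q = partition(A, p, r)
    k = q - p + 1                 # rank of the pivot within A[p..r]
    if i == k:
        return A[q]
    elif i < k:
        return select(A, p, q - 1, i)
    else:
        return select(A, q + 1, r, i - k)

A = [4, 9, 12, 17, 6, 5, 21, 14, 8]
print(select(A, 0, len(A) - 1, 4))  # 4th smallest -> 8
```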

## Lecture 2: Medians and Order Statistics, Continued

### Selection, Continued

- Recall algorithm, from 02.12
- Requires partition, defined later on
- Partition chooses the pivot element
- Partition helps to preserve the invariant
- Choice of pivot element is crucial

### Choosing a Pivot element

- Choice is critical to achieving low time complexity
- Best choice of pivot element is the one that partitions most evenly
- Ideally use the median – but computing the median this way is circular
- Median of Medians:
- Given \(A\) of \(n\) elements, partition \(A\) into \(m = \lfloor n/5 \rfloor\) groups of 5 elements, and at most one other group of \(n \bmod 5\) elements
- \(A^{\prime} = [ x_1, x_2, \ldots, x_{\lceil n / 5 \rceil}]\), where \(x_i\) is the median of group \(i\), found by sorting group \(i\)
- \(y = \mathrm{Select}(A^{\prime}, 1, \lceil n/5 \rceil, \lfloor(\lceil n/5 \rceil + 1)/2 \rfloor)\) is the pivot element; scan \(A[p \cdots r]\) to find and return \(y\)’s index \(i\) – this is ChoosePivotElement
- As a team, do 02.19

### Time Complexity

- See 02.20
- Recurrence form (02.22):
- \(T(n)\) total time for size \(n\)
- Pivot selection takes \(\mathcal{O}(n)\) to split and \(T(n/5)\) for finding median of medians, partitioning \(n\) elements takes \(\mathcal{O}(n)\) time
- Recursive select takes \(T(3n/4)\) (in choose pivot element)
- ∴, \(T(n) \leq T(n/5) + T(3n/4) + \mathcal{O}(n)\)
- Or, rewritten as \(T(\alpha n) + T(\beta n) + O(n)\) where \(\alpha = 1/5\) and \(\beta = 3/4\)
- By theorem from Lecture 1, when \(\alpha + \beta < 1\), \(T(n) \in \mathcal{O}(n)\)

- And thus, Select has complexity \(\mathcal{O}(n)\)
- Proof shown on 02.23
- How can we guarantee at least 1/4 of the elements are eliminated each round? – see 02.21
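The whole procedure can be sketched as follows (a rough version assuming distinct elements, as the problem statement does; list comprehensions stand in for in-place partitioning, and lower medians are used throughout):

```python
# Deterministic selection with the median-of-medians pivot rule:
# split into groups of 5, take each group's (lower) median by
# sorting, recurse for the median of those medians, and partition
# around it -- giving the T(n/5) + T(3n/4) + O(n) recurrence.
def mom_select(A, i):
    """Return the i-th smallest element of list A (i is 1-based)."""
    if len(A) <= 5:
        return sorted(A)[i - 1]
    # Lower medians of groups of 5 (the last group may be smaller).
    groups = [A[j:j + 5] for j in range(0, len(A), 5)]
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]
    # Recursively select the median of the medians as pivot.
    pivot = mom_select(medians, (len(medians) + 1) // 2)
    # Partition around the pivot (assumes distinct elements).
    less = [x for x in A if x < pivot]
    greater = [x for x in A if x > pivot]
    k = len(less) + 1             # pivot's rank in A
    if i == k:
        return pivot
    elif i < k:
        return mom_select(less, i)
    else:
        return mom_select(greater, i - k)

A = [4, 9, 12, 17, 6, 5, 21, 14, 8, 11, 13, 29, 3]
print(mom_select(A, 4))  # 4th smallest -> 6
```

Running it on the practice array from 02.19 reproduces the medians \([9, 11, 13]\) and the pivot \(11\) on the first level of recursion.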

## Lecture 2: Medians and Order Statistics, Continued

### Practice, Median of Medians, 02.19

- \(A = [4, 9, 12, 17, 6, 5, 21, 14, 8, 11, 13, 29, 3]\)
- select(A, 1, 13, 4)
- Partition(A, 1, 13)
- ChoosePivotElement(A, 1, 13)
- \([4, 9, 12, 17, 6], [5, 21, 14, 8, 11], [13, 29, 3]\)
- \([9, 11, 13]\)
- \(11\) is the median of medians; its index in \(A\) is \(x = 10\)

- Proceed to swap
- \(A = [4, 9, 6, 3, 8, 11, 12, 17, 21, 14, 13, 29]\), \(p = 6\), \(k = 5\)

- ChoosePivotElement(A, 1, 13)

- Partition(A, 1, 13)
- Make sure to use lower median!

## Lecture 3: Dynamic Programming

- solving optimization problems
- Decompose problem into subproblems, solving them recursively and combine the solutions into a final, optimal solution
- Exponential number of subproblems to solve, but with overlap, so solutions can be reused
- Number of *distinct* subproblems is polynomial

### The Rod cutting problem

- Rod of length \(n\), cut into integer length smaller rods to maximize profit.
- Rod of length \(i\) has price \(p_i\)
- Cuts are free; profit is based solely on prices charged for rods
- Cuts only occur at integral boundaries, so there are \(n - 1\) positions to cut; the total number of possible solutions is \(2^{n - 1}\)
- Choose cuts \(i_1, \ldots, i_k\) (with \(i_1 + \cdots + i_k = n\)) such that revenue \(r_n = p_{i_1} + \cdots + p_{i_k}\) is maximized
- Has the optimal substructure property

## Lecture 3: Dynamic Programming, Continued

### Rod cutting, an implementation

- Recall the **optimal substructure property**: an optimal solution is made up of optimal solutions to subproblems
- Easy to prove via contradiction
- \(r_n = \mathrm{max}(p_n, r_1 + r_{n - 1}, \ldots, r_{n - 1} + r_1)\)
- Helpful to simply store previously calculated values

## Lecture 3: Dynamic Programming, Continued

### Rod Cutting, continued

- Requires considering all possibilities (with one initial cut) for a given \(r_n\), which gives two subproblems.
- Proof of correctness/proof of optimal substructure property by contradiction
- Assume for a *reductio* that the solution is not optimal for the subproblem…

- Some dynamic programming problems can be considered slightly differently.
- \(r_n = \max_{1 \leq i \leq n} (p_i + r_{n-i})\) is an alternative form (03.05)
- Rod cutting proof available in book, §15.1 (p. 362)
- Cut-rod on (03.06)
- Missing memoization

### Time Complexity

- \(T(0) = 1\), See 03.07
- Non-memoized complexity is huge
- Top-down with memoization
- Bottom-up, fill in table
- Have the same running time
- Memoized (recursive) implementation at 03.10
- Bottom up at 03.11 – easier to analyse

### Reconstructing a solution

- add a second array \(s\) to keep track of the optimal size of the first piece cut in each subproblem
- Printing from this table is reasonably simple
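The \(r\) and \(s\) arrays can be sketched together as follows (a bottom-up version; the price table is the standard CLRS sample, an assumption rather than slide content):

```python
# Bottom-up rod cutting with reconstruction: r[j] holds the best
# revenue for length j, and s[j] the optimal size of the first piece
# cut -- the second array described above.
def cut_rod(p, n):
    """p[i] is the price of a rod of length i (p[0] unused)."""
    r = [0] * (n + 1)
    s = [0] * (n + 1)
    for j in range(1, n + 1):
        best = float("-inf")
        for i in range(1, j + 1):       # r_j = max_i (p_i + r_{j-i})
            if p[i] + r[j - i] > best:
                best = p[i] + r[j - i]
                s[j] = i                # first piece of an optimal cut
        r[j] = best
    return r[n], s

def solution_pieces(s, n):
    """Walk the s table to list the pieces of an optimal cut."""
    pieces = []
    while n > 0:
        pieces.append(s[n])
        n -= s[n]
    return pieces

p = [0, 1, 5, 8, 9, 10, 17, 17, 20]     # CLRS Table 15.1 prices
revenue, s = cut_rod(p, 7)
print(revenue, solution_pieces(s, 7))   # 18 [1, 6]
```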

## Lecture 3: Dynamic Programming, Continued

### Matrix Chain Multiplication

- long chain of \(n\) matrices we want to multiply
- \(p \times q\) by \(q \times r\) requires \(pqr\) steps
- Brute force solution is \(\Omega(4^n / n^{3/2})\)
- Steps:
- Characterize structure of an optimal solution
- Recursively define the value of an optimal solution
- Compute the value of an optimal solution
- Construct an optimal solution from computed information (03.18)

- Step 1
- \(A_{i \dots j}\) is the matrix from \(A_i A_{i + 1} \cdots A_j\)
- Must split the product: compute \(A_{i \dots k}\) and \(A_{k + 1 \dots j}\), then multiply the two together
- The split point \(k\) determines the optimal parenthesization

- Assume the subsolution \(s\) used for \(A_{i \dots k}\) is not optimal; then there exists some other solution \(s^{\prime}\) such that \(cost(s^{\prime}) < cost(s)\)
- Then replace \(s\) with \(s^{\prime}\)

- Step 2:
- \(m[i, j]\) is the minimum number of multiplications needed to compute \(A_{i \dots j}\)
- \(m[i, j] = 0\) if \(i = j\)
- If \(i < j\), split at \(k\): \(m[i, j] = m[i, k] + m[k + 1, j] + p_{i - 1}p_{k}p_j\)
- \(k\) must be tried for all possible values; store the best in \(s[i, j]\) to find \(k\) for a given \(i, j\) pair. (03.20)

- \(m[i, j]\) is min number of mults for A
- Step 3:
- Number of subproblems is \(\Theta(n^2)\)
- Bottom-up is on chain length
- Linear to solve each problem, a quadratic number of subproblems, ∴, \(\Theta(n^3)\)
- Implementation on 03.22
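Steps 2–3 can be sketched as follows (the dimension sequence in the example is the standard CLRS one, an assumption rather than slide content):

```python
# Matrix-Chain-Order following the recurrence above: p is the
# dimension sequence, so A_i is p[i-1] x p[i]; m[i][j] is the minimum
# multiplication count and s[i][j] the optimal split point k.
import math

def matrix_chain_order(p):
    n = len(p) - 1                       # number of matrices
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):       # bottom-up on chain length
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = math.inf
            for k in range(i, j):        # try every split point
                cost = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if cost < m[i][j]:
                    m[i][j] = cost
                    s[i][j] = k
    return m, s

m, s = matrix_chain_order([30, 35, 15, 5, 10, 20, 25])
print(m[1][6])  # 15125
```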

## Lecture 3: Dynamic Programming, Continued

### Matrix Chain Multiplication, Continued

- Optimal substructure defined with regards to characterization of subproblems

### Optimal Substructure

- Consider shortest path?
- Does shortest path have optimal substructure?
- Yes!
- But not for the Longest Simple Path; LSP is in fact NP-complete, because its subproblems are not independent

### Longest Common Subsequence

- Sequence \(Z\) is a subsequence of another sequence \(X\) if there is a strictly increasing sequence \(i_1, \ldots, i_k\) of indices of \(X\) such that for all \(j = 1, \ldots, k\), \(x_{i_j} = z_j\)
- As one reads through \(Z\), one can find a match to each symbol of \(Z\) in \(X\), in order
- \(Z\) is a **common subsequence** of \(X\) and \(Y\) if it is a subsequence of both
- Goal of LCS is to find a maximum-length common subsequence of sequences \(X\) and \(Y\)

### Characterizing Structure of LCS Optimal Solution

- Given \(X\), the $i$th prefix of \(X\) is \(X_i\)
- If \(X\) and \(Y\) have LCS \(Z\), then
- \(x_m = y_n \implies z_k = x_m = y_n\) and \(Z_{k - 1}\) is an LCS of \(X_{m - 1}\) and \(Y_{n - 1}\)
- If \(z_k \neq x_m\), we could lengthen \(Z\). Contradiction
- If \(Z_{k - 1}\) is not an LCS of \(X_{m - 1}\) and \(Y_{n - 1}\), then a longer CS of \(X_{m - 1}\) and \(Y_{n - 1}\) could have \(x_m\) appended to it to get a CS of \(X\) and \(Y\) longer than \(Z\). Contradiction
- See 03.31!
- The last symbols must be equal!

### Recursive definition of Optimal Solution

- \(c[i, j]\) defined on 03.32
- If \(X\) and \(Y\)’s last characters match, find the LCS of the prefixes \(X_{m - 1}\) and \(Y_{n - 1}\)
- If not, find LCS of \(X\) and \(Y_{n - 1}\), find LCS of \(X_{m - 1}\) and \(Y\) and use the longer.
- Remember to use mathematical style

### Define the Algorithm

- 03.33
- \(b\) is used to keep direction
- Diagonal in an LCS is where it’s printed
- But the LCS is printed backwards; use recursion to print it in the correct order
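The \(c\) and \(b\) tables and the recursive printing can be sketched as follows (a minimal version; the test strings are the standard CLRS example, not from the slides):

```python
# LCS: c[i][j] is the LCS length of prefixes X_i and Y_j; b keeps
# the direction taken. Diagonal moves are where LCS symbols live;
# the recursive walk prints them in the correct (forward) order.
def lcs(X, Y):
    m, n = len(X), len(Y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    b = [[""] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
                b[i][j] = "diag"
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j] = c[i - 1][j]
                b[i][j] = "up"
            else:
                c[i][j] = c[i][j - 1]
                b[i][j] = "left"

    def walk(i, j):
        if i == 0 or j == 0:
            return ""
        if b[i][j] == "diag":
            return walk(i - 1, j - 1) + X[i - 1]
        if b[i][j] == "up":
            return walk(i - 1, j)
        return walk(i, j - 1)

    return walk(m, n)

print(lcs("ABCBDAB", "BDCABA"))  # BCBA
```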

## Lecture 3: Dynamic Programming, Continued

- exactly 4
- 5 and 4 hinge on finding the correct subproblem

## Lecture 3: Dynamic Programming, Continued

### Optimal Binary Search Trees

- Goal is to construct BSTs such that most frequently sought values are near the root, minimizing expected search time.
- Given a sequence \(K\) of \(n\) distinct keys in sorted order
- Key \(k_i\) has a probability \(p_i\) that it will be sought on a particular search
- To handle searches for values not in \(K\), have \(n + 1\) dummy keys \(d_0, d_1, \ldots, d_n\) to serve as leaves; \(d_i\) is reached with probability \(q_i\)
- If \(\mathrm{depth}_{T}(k_i)\) is the distance from the root to \(k_i\) in tree \(T\), then the expected cost of \(T\) is \(1 + \sum_{i=1}^{n} p_i \cdot \mathrm{depth}_{T}(k_i) + \sum_{i=0}^{n} q_i \cdot \mathrm{depth}_{T}(d_i)\)
- An optimal BST is one with minimum expected search cost

#### Step 1: Structure of Optimal Solution

- **Observation:** Since \(K\) is sorted and dummy keys are interspersed in order, any subtree of a BST must contain contiguous keys \(k_i, \ldots, k_j\) and leaves \(d_{i - 1}, \ldots, d_j\)
- “Proof” in 03.44
- Given keys \(k_i, \ldots, k_j\), its optimal BST is rooted at some \(k_r\), \(i \leq r \leq j\)
- Since we don’t know the optimal \(k_r\), all must be tried

#### Recursive definition

- \(e[i, j]\) is expected cost from \(i\) to \(j\)
- if \(j = i - 1\), then there is only the dummy key \(d_{i - 1}\) and \(e[i, i - 1] = q_{i - 1}\)
- if \(j \geq i\), then choose root \(k_r\) from \(k_i\) to \(k_j\), optimally solving subproblems.
- When combining optimal trees from subproblems, every node’s depth increases by one
- Define \(w(i, j) = \sum_{\ell = i}^{j} p_{\ell} + \sum_{\ell = i - 1}^{j} q_{\ell}\) as the sum of the probabilities of the nodes in the subtree built on \(k_i, \ldots, k_j\), meaning \(e[i, j] = p_r + (e[i, r - 1] + w(i, r - 1)) + (e[r + 1, j] + w(r + 1, j))\)
- Since \(w(i, j) = w(i, r - 1) + p_r + w(r + 1, j)\), this becomes \(e[i, j] = e[i, r - 1] + e[r + 1, j] + w(i, j)\)
- Minimize over \(r\), adding a root array to record the choices

#### Define the function

- Pseudocode in 03.43
- is in \(\mathcal{O}(n^3)\)
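The recurrence can be sketched as follows (the probabilities are the standard CLRS sample, an assumption rather than slide content; indices follow the 1-based convention above):

```python
# Optimal BST: e[i][j] is the expected search cost for keys k_i..k_j,
# w[i][j] the probability mass of that subtree, root[i][j] the chosen
# root r. Base case e[i][i-1] = q[i-1] covers the lone dummy key.
import math

def optimal_bst(p, q, n):
    # p[1..n] key probabilities, q[0..n] dummy-key probabilities.
    e = [[0.0] * (n + 1) for _ in range(n + 2)]
    w = [[0.0] * (n + 1) for _ in range(n + 2)]
    root = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 2):
        e[i][i - 1] = q[i - 1]          # only the dummy key d_{i-1}
        w[i][i - 1] = q[i - 1]
    for length in range(1, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            e[i][j] = math.inf
            w[i][j] = w[i][j - 1] + p[j] + q[j]
            for r in range(i, j + 1):   # try every root
                t = e[i][r - 1] + e[r + 1][j] + w[i][j]
                if t < e[i][j]:
                    e[i][j] = t
                    root[i][j] = r
    return e[1][n], root

p = [0, 0.15, 0.10, 0.05, 0.10, 0.20]
q = [0.05, 0.10, 0.05, 0.05, 0.05, 0.10]
cost, root = optimal_bst(p, q, 5)
print(round(cost, 2))  # 2.75
```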

## Lecture 4: Greedy Algorithms

- A technique for solving optimization problems
- Still examines subproblems, still exploits optimal substructure
- Greedy algorithms instead choose the best choice at the moment, rather than examine all subproblems
- ability to do this is called the greedy choice property, this allows us to limit what subproblems we look at

- MST, single-source shortest path

### Activity Selection

- scheduling classes in a classroom
- Courses are candidates to be scheduled in that room, not all can have it
- Maximize utilization of the room in terms of number of classes scheduled
- Generally:
- Given a set \(S = \{a_1, a_2, \ldots, a_n\}\) of \(n\) proposed activities that wish to use a resource that can serve only one activity at a time
- \(a_i\) has a start time \(s_i\) and a finish time \(f_i\), such that \(0 \leq s_i < f_i < \infty\)
- If \(a_i\) is scheduled to use the resource, it occupies it during the interval \([s_i, f_i)\); we can schedule both \(a_i\) and \(a_j\) iff \(s_i \geq f_j\) or \(s_j \geq f_i\) (if this holds, \(a_i\) and \(a_j\) are compatible)
- Find a largest subset \(S^{\prime} \subseteq S\) such that all activities in \(S^{\prime}\) are pairwise compatible
- Assume that activities are sorted by finish time.

- Let \(S_{ij}\) be the set of activities that start after \(a_i\) finishes and end before \(a_j\) starts, and let \(A_{ij}\) be a maximum-size mutually compatible subset of \(S_{ij}\)
- See optimal substructure (04.05)
- Recursive definition in 04.07, but we can optimize
- Greedy choice – choose the activity with the earliest finish time of all those still compatible with the currently scheduled activities
- Now, let \(S_k = \{a_i \in S \mid s_i \geq f_k\}\) be the activities that start after \(a_k\) finishes
- Then greedily choose \(a_1\) first, and then the first compatible activity in each resulting \(S_k\)

- Proof on 04.09
- Argument of Optimal Substructure
**and**Greedy Choice properties
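The greedy choice can be sketched as follows (the activities and times are the standard CLRS example, an assumption rather than slide content):

```python
# Greedy activity selection: with activities sorted by finish time,
# repeatedly take the first activity whose start time is at least
# the finish time of the last activity taken.
def select_activities(activities):
    """activities: list of (start, finish), sorted by finish time."""
    chosen = []
    last_finish = 0
    for s, f in activities:
        if s >= last_finish:      # compatible with everything chosen
            chosen.append((s, f))
            last_finish = f
    return chosen

acts = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9),
        (6, 10), (8, 11), (8, 12), (2, 14), (12, 16)]
print(select_activities(acts))  # [(1, 4), (5, 7), (8, 11), (12, 16)]
```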

## Lecture 4: Greedy Algorithms, Continued

### Rod cutting, redux

- Subproblem \(n\) is rod of length \(n\)
- Let \(r_n\) be value of optimal solution to subproblem \(n\)
- \(r_n = p_i + r_{n - i}\)
- Assume that \(r_{n - i}\) is not optimal; the contradiction follows from there

### Activity Selection, Continued

- Optimal solutions are done with dynamic programming in \(\mathcal{O}(n^3)\)
- But greedy solution is better

### The Greedy Choice

- Maximizes amount of something left over
- State Greedy Choice Property
- Argue property
- argue optimal substructure
- Greedy Choice Property means that there exists a greedy choice that leads to an optimal solution
- Use a proof by construction

- See 04.09

## Lecture 4: Greedy Algorithms, Continued

### Proof of greedy choice

- Slide 04.09
- Let \(A_k\) be an optimal solution to \(S_k\), and let \(a_j\) have the earliest finish time of all activities in \(A_k\)
- Swap the greedy choice into \(A_k\) in place of \(a_j\); the activities in the resulting \(A^{\prime}\) are mutually compatible
- Therefore, there exists an optimal solution taking the greedy choice
- \(A_k\) is such that all elements in it are compatible with \(a_k\)
- Thus, \(A_1 = A_0 \cap S_1\) is optimal for \(S_1\), by contradiction:
- Assume for a reductio that \(A_1^{\prime}\) exists such that \(|A_1^{\prime}| > |A_1|\) and all activities in \(A_1^{\prime}\) are mutually compatible
- Let \(A_0^{\prime} = (A_0 \setminus A_1) \cup A_1^{\prime}\)
- Then \(|A_0^{\prime}| > |A_0|\), a contradiction!

- The Greedy Choice Property here is that some optimal solution \(A_0\) contains \(a_1\)

### Greedy Vs Dynamic Programming

- Greedy algorithms leverage optimal substructure
- When we can argue that the greedy choice is part of an optimal solution, we don’t need to explore all subproblems
- Consider the knapsack problem
- 0-1: each item is taken in its entirety or not at all
- Fractional: items may be taken in part
- Fractional greedily chooses by value density \(v_i / w_i\), taking items in descending order
- The knapsack can always be filled exactly in the end

- 0-1 is solved with dynamic programming in \(\mathcal{O}(nW)\); the problem is NP-complete, and \(\mathcal{O}(nW)\) is pseudopolynomial
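The fractional strategy can be sketched as follows (the values and weights are illustrative, not from the slides):

```python
# Fractional knapsack: sort by value density v_i / w_i and take
# items greedily, splitting the last item so the knapsack is
# filled exactly.
def fractional_knapsack(items, capacity):
    """items: list of (value, weight); returns total value packed."""
    total = 0.0
    for v, w in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if capacity <= 0:
            break
        take = min(w, capacity)   # whole item, or the fraction that fits
        total += v * take / w
        capacity -= take
    return total

# Densities 6, 5, 4 with capacity 50: take items 1 and 2 whole,
# then 20/30 of item 3.
print(fractional_knapsack([(60, 10), (100, 20), (120, 30)], 50))  # 240.0
```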

## Lecture 4: Greedy Algorithms, Continued

### Huffman Coding

- encoding a file of symbols from some alphabet
- Want to minimize size, based on frequency
- Fixed-length uses \(\lceil \log_2{n} \rceil\) bits per symbol
- Variable-length uses fewer bits for more frequent symbols
- Prefix(-free) codes – where no codeword is a prefix of another – can be represented as a binary tree
- Corresponds to a full binary tree
- Pseudocode on 04.18
- The main loop runs a linear number of times with logarithmic heap operations; therefore, the total is \(\mathcal{O}(n \log{n})\)

#### Greedy Choice

- 04.20–22
- **Lemma:** Let \(C\) be an alphabet in which symbol \(c \in C\) has frequency \(c.freq\), and let \(x, y \in C\) have the lowest frequencies. Then there exists an optimal prefix code for \(C\) in which the codewords for \(x\) and \(y\) have the same length and differ only in the last bit. *I.e.*, an optimal solution exists that merges the lowest frequencies first.
- **Proof:**
- Let \(T\) be a tree representing an arbitrary optimal prefix code, and let \(a\) and \(b\) be siblings of maximum depth in \(T\)
- Assume, without loss of generality, that \(x.freq \leq y.freq\) and \(a.freq \leq b.freq\)
- Since \(x\) and \(y\) are the two least frequent nodes, we get \(x.freq \leq a.freq\) and \(y.freq \leq b.freq\)
- Convert \(T\) to \(T^{\prime}\) by exchanging \(a\) and \(x\), then \(T^{\prime}\) to \(T^{\prime\prime}\) by exchanging \(b\) and \(y\)
- Then, in \(T^{\prime\prime}\), \(x\) and \(y\) are siblings of maximum depth
- Cost difference between \(T\) and \(T^{\prime}\) is \(B(T) - B(T^{\prime})\)
- \(= \sum_{c \in C} c.freq \cdot d_T(c) - \sum_{c \in C} c.freq \cdot d_{T^{\prime}}(c)\)
- \(= x.freq \cdot d_T(x) + a.freq \cdot d_T(a) - x.freq \cdot d_{T^{\prime}}(x) - a.freq \cdot d_{T^{\prime}}(a)\)
- \(= (a.freq - x.freq)(d_T(a) - d_T(x)) \geq 0\)

- Since \(a.freq \geq x.freq\) and \(d_T(a) \geq d_T(x)\)
- Similarly, \(B(T^{\prime}) - B(T^{\prime\prime}) \geq 0\) , so \(B(T^{\prime\prime}) \leq B(T)\), so \(T^{\prime\prime}\) is optimal.
**QED**

#### Optimal Substructure

- 04.23–25

##### Lemma

- Let \(C\) be an alphabet in which symbol \(c \in C\) has frequency \(c.freq\) and let \(x, y \in C\) have lowest frequencies
- Let \(C^{\prime} = (C \setminus \{x, y\}) \cup \{z\}\) and \(z.freq = x.freq + y.freq\)
- Let \(T^{\prime}\) be any tree representing an optimal prefix code for \(C^{\prime}\)
- Then \(T\), which is \(T^{\prime}\) with leaf \(z\) replaced by an internal node with children \(x\) and \(y\), represents an optimal prefix code for \(C\)

##### Proof

- See 04.24–25
- Look back at it.

## Lecture 5: Elementary Graph Algorithms

- Graphs are ADTs applicable to numerous problems
- can capture entities and the relationships between them
- degree of the relationship
- and more

- Basics in graph theory, representation, algorithms for basic graph-theory problems

### Breadth-First Search

- Given a graph \(G = (V, E)\), and a source node \(s \in V\), BFS visits every vertex that is reachable from \(s\)
- Uses a queue to search in a breadth-first manner
- Creates a structure called a **BFS tree** such that for each vertex \(v\), the tree path from \(s\) to \(v\) is a shortest path in \(G\)
- Initializes each node’s color to WHITE
- As nodes are visited, they are colored GREY when in the queue, then BLACK when finished.

## Lecture 5: Elementary Graph Algorithms, Continued

### Breadth-First Search, continued

- BFS tree is shortest paths tree
- Algorithm presented in 05.04
- Array \(d\) is distance from start node
- Array \(\pi\) is parent node.
- \(\mathcal{O}(|V|^2)\) for adjacency matrix
- With an adjacency list, it is \(\mathcal{O}(|E| + |V|)\)
- \(d[v]\) is shortest distance from \(s\) to \(v\) for unweighted shortest paths
- Can print shortest path by recursively following \(\pi[v]\), \(\pi[\pi[v]]\),
*etc.* - If \(d[v] == \infty\), then \(v\) is not reachable from \(s\)

### Depth First Search

- Graph traversal, follows path as deep as possible before backtracking
- DFS is implemented recursively
- Tracks discovery time and finish time of each node
- All nodes coloured black at end
- Produces a “forest”
- Presented on 05.09 and 05.10
- Time complexity equivalent to BFS, \(\Theta(|V| + |E|)\)
- Vertex \(u\) is a proper descendant of vertex \(v\) in the DF forest iff \(d[v] < d[u] < f[u] < f[v]\)
- **Parenthesis structure:** if one prints “(u” when discovering \(u\) and “u)” when finishing \(u\), then the printed text will be a well-formed parenthesized expression

- Edge types:
- Tree edges are those in the depth-first forest
- Back edges \((u, v)\) connect a vertex \(u\) to its ancestor \(v\) in the DF tree (includes self-loops)
- Forward edges are non-tree edges connecting a node to one of its DF tree descendants
- Cross edges go between vertices that are not ancestrally related, within a DF tree or between DF trees

- A graph has a cycle iff DFS discovers a back edge (deadlock detection)
- When DFS first explores an edge \((u, v)\), look at \(v\)’s color:
- white implies tree edge
- grey implies back edge
- black implies forward or cross edge

## Lecture 5: Elementary Graph Algorithms, Continued

### Topological Sorting

- Begins with a directed acyclic graph, edge implies event must occur before another
- Topological sort of a dag \(G\) is a linear ordering of its vertices such that if \(G\) contains an edge \((u, v)\), then \(u\) appears before \(v\) in the ordering
- Call DFS on dag \(G\)
- As each vertex is finished, insert into the front of a linked list, return the linked list of vertices
- Thus, topological sort is a descending sort of vertices based on DFS finish times
- What is time complexity – \(\Theta(|V| + |E|)\)
- See 05.15
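The finish-time idea can be sketched as follows (a minimal version that assumes the input really is a dag, so no cycle check is done):

```python
# Topological sort: run DFS and prepend each vertex to the output
# list as it finishes -- a descending sort by DFS finish time.
def topological_sort(adj):
    n = len(adj)
    visited = [False] * n
    order = []

    def dfs(u):
        visited[u] = True
        for v in adj[u]:
            if not visited[v]:
                dfs(v)
        order.insert(0, u)        # prepend on finish

    for u in range(n):
        if not visited[u]:
            dfs(u)
    return order

# Small dag: 0 -> 1 -> 3, 0 -> 2 -> 3
print(topological_sort([[1, 2], [3], [3], []]))  # [0, 2, 1, 3]
```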

### Strongly Connected Components

- Given \(G = \langle V, E \rangle\), a **strongly connected component** of \(G\) is a maximal set of vertices \(C \subseteq V\) such that for every pair of vertices \(u, v \in C\), \(u\) is reachable from \(v\) and \(v\) is reachable from \(u\)
- Collapsing the edges within each SCC yields an acyclic component graph
- The algorithm requires the transpose of \(G\), \(G^T\), which is \(G\) with its edges reversed. \(G^T\) and \(G\) have the same SCCs
- Call DFS on \(G\)
- Call DFS on \(G^T\), looping through vertices in order of decreasing finish times from the first DFS call
- Each DFS tree in the second DFS run is an SCC of \(G\)
- Is \(\Theta(|V| + |E|)\), because \(G^T\) can be computed in \(\Theta(|V| + |E|)\) with an adjacency list, and the first run of DFS can order vertices by finish time as in topological sort
- The first component visited in \(G^T\) has no outgoing edges in the component graph, since in \(G\) it has only incoming edges; the second has edges only into the first, etc. (Lemma 22.14 in the book)
- Thus, DFS on \(G^T\) visits components in topological order of \(G\)’s component graph

## Lecture 6: Minimum-Weight Spanning Trees

### Intro

- Given a connected, undirected graph \(G = (V, E)\), a **spanning tree** is an acyclic subset \(T \subseteq E\) that connects all vertices in \(V\)
- \(T\) is acyclic implies tree
- \(T\) connects all vertices, i.e., **spans** \(G\)

- If \(G\) is weighted, then \(T\)’s weight is \(w(T) = \sum_{(u, v) \in T} w(u, v)\)
- An MST is a spanning tree of minimum weight, not necessarily unique.

### Kruskal’s Algorithm

- Greedy algorithm, makes locally best choice at each step
- Starts by declaring each vertex to be its own tree (so all nodes together make a forest)
- Iteratively identify the minimum-weight edge \((u, v)\) that connects two distinct trees and add it to the MST \(T\), merging the two trees

## Lecture 6: Minimum-Weight Spanning Trees, Continued

### Kruskal’s Algorithm

- Relies on Disjoint Set data structure
- Algorithm presented on 06.05
- Algorithm is \(\mathcal{O}(|E| \log{|E|})\) (06.12)

### Disjoint Set Data Structure

- Given universe \(U\) of elements, DSDS maintains collection \(\mathcal{S} = \{S_1, \ldots, S_k\}\)
- Each element in \(U\) is in exactly one set \(S_j\)
- No set \(S_j\) is empty

- Dynamically updated throughout program
- Each set \(S \in \mathcal{S}\) has a representative element \(x \in S\)
- Chapter 21
- \(\alpha\) is the inverse Ackermann function
- very near constant
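A sketch tying the two sections together: a disjoint-set structure with path compression and union by rank (whence the near-constant \(\alpha\) amortized cost), and Kruskal’s algorithm on top of it. Class and function names are my own, not the slides’:

```python
class DisjointSet:
    """Union-find with path compression and union by rank;
    amortized near-constant (inverse Ackermann) per operation."""
    def __init__(self, elements):
        self.parent = {x: x for x in elements}
        self.rank = {x: 0 for x in elements}

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path compression
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False  # already in the same tree
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx  # attach shorter tree under taller
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True

def kruskal(vertices, weighted_edges):
    """Kruskal's MST: scan edges in increasing weight order, adding
    each edge that joins two distinct trees. O(|E| log |E|)."""
    ds = DisjointSet(vertices)
    mst = []
    for w, u, v in sorted(weighted_edges):
        if ds.union(u, v):  # u, v were in different trees: safe edge
            mst.append((u, v, w))
    return mst
```

The \(\mathcal{O}(|E| \log{|E|})\) bound comes from the edge sort; the union-find operations are asymptotically negligible.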

### Prim’s Algorithm

- Still greedy
- Maintains a single tree, not a forest
- Starts with an arbitrary root \(r\)
- Repeatedly finds minimum weight edge incident to the tree
- algorithm on 06.14
- Uses min queue
- Proof in 06.18
- Nodes already in the tree are those in \(V \setminus Q\)
- For all vertices \(v \in Q\), if \(\pi[v] \neq \textsc{Nil}\), then \(key[v] < \infty\)

- Proof is same, more or less
- Prove invariant via a **cut** that respects \(A\) (no edge of \(A\) spans the cut)
- Greedy choice by looking for a safe edge \(e\)

### Minimum Heap

- Binary tree where key at each node is \(\leq\) keys of children
- Any subtree also a heap
- Basic operations (insert, extract-min) are \(\Theta(\log{n})\)
- Build is \(\mathcal{O}(n)\)
- Filter new min in \(\mathcal{O}(\log{n})\)
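Prim’s algorithm as described above can be sketched with Python’s `heapq` as the min-priority queue. This lazy-deletion variant is one common implementation choice, not necessarily the one on slide 06.14:

```python
import heapq
from collections import defaultdict

def prim(vertices, weighted_edges, root):
    """Prim's MST: grow a single tree from root, repeatedly taking the
    minimum-weight edge leaving the tree. Returns the MST weight."""
    adj = defaultdict(list)
    for w, u, v in weighted_edges:
        adj[u].append((w, v))
        adj[v].append((w, u))  # undirected graph

    in_tree = {root}
    heap = list(adj[root])  # candidate edges as (weight, endpoint)
    heapq.heapify(heap)
    total = 0
    while heap and len(in_tree) < len(vertices):
        w, v = heapq.heappop(heap)
        if v in in_tree:
            continue  # stale entry: endpoint joined the tree earlier
        in_tree.add(v)
        total += w
        for edge in adj[v]:
            if edge[1] not in in_tree:
                heapq.heappush(heap, edge)
    return total
```

Instead of decreasing keys, stale heap entries are skipped on extraction; the asymptotics stay \(\mathcal{O}(|E| \log{|V|})\).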

## Lecture 6: Minimum Spanning Trees, Continued

### Proof of correctness of both algorithms

- Both use greedy approach
- Maintain invariant that at any time, set of edges \(A\) selected so far is subset of some MST (optimal substructure)
- Each iteration of the algorithms look for a safe edge \(e\) such that \(A \cup \{e\}\) is also a subset of an MST (greedy choice)
- Prove invariant via use of cut \((S, V - S)\) that respects \(A\) (no edge of \(A\) spans the cut)
- Cut is partition of graph into two disjoint subsets
**Theorem** Let \(A \subseteq E\) be included in some MST of \(G\), \((S, V - S)\) be a cut respecting \(A\), and \((u, v) \in E\) be a minimum-weight edge crossing the cut. Then \((u, v)\) is a safe edge for \(A\). (06.20)

*Proof.*

- Let \(T\) be an MST including \(A\) and not including \((u, v)\)
- Let \(p\) be path from \(u\) to \(v\) in \(T\), and let \((x, y)\) be an edge from \(p\) crossing cut (not \(A\)).
- Since \(T\) is a spanning tree, so is \(T^{\prime} = (T \setminus \{(x, y)\}) \cup \{(u, v)\}\).
- Both \((u, v)\) and \((x, y)\) cross cut, so \(w(u, v) \leq w(x, y)\)
- So \(w(T^{\prime}) = w(T) - w(x, y) + w(u, v) \leq w(T)\)
- Thus, \(T^{\prime}\) is an MST
- And \((u, v)\) is a safe edge for \(A\) since \(A \cup \{(u, v)\} \subseteq T^{\prime}\)

## Lecture 7: Single-Source Shortest Path

- Given a weighted, directed graph \(G = (V, E)\) with weight function \(w: E \to \mathbb{R}\)
- The weight of a path \(p = \langle v_0, v_1, \ldots, v_k \rangle\) is the sum of the weights of its edges: \(w(p) = \sum_{i = 1}^{k} w(v_{i - 1}, v_i)\)
- The shortest-path weight from \(u\) to \(v\), \(\delta(u, v)\), is the minimum weight over all paths from \(u\) to \(v\), or \(\infty\) if no such path exists
- A shortest path from \(u\) to \(v\) is any path \(p\) such that \(w(p) = \delta(u, v)\)
- Network routing, driving directions

### Types

- Single-source
- finding the shortest path from source \(s\) to every other node
- Single-destination
- find shortest paths from every node to destination \(t\) (solved by running single-source on the transpose)
- Single-pair
- find shortest path from \(u\) to \(v\) (solvable with SSSP; nothing asymptotically faster is known)
- All-pairs
- find shortest paths between every pair of nodes (can be solved with repeated applications of single-source, but can be faster/simpler)

### Optimal substructure

- Shortest paths problem has optimal substructure property. Proof on 07.04.

### Negative-Weight Edges and Cycles

- What happens if \(G\) has edges with negative weights?
- Dijkstra’s algorithm cannot handle this!! Bellman-Ford can, under the right circumstances, namely no negative-weight cycles
- Cycles include
- Negative-weight cycles
- Zero-weight cycles
- Positive-weight cycles

## Lecture 7: Single-Source Shortest Paths, Continued

### Relaxation

- \(d[v]\), for \(v \in V\), is an upper bound on \(\delta(s, v)\)
- Relaxation of an edge \((u, v)\) is the process of testing whether we can decrease \(d[v]\) and yielding a tighter upper bound
- Initialize single source sets \(d[v] = \infty\) and the parent to non-existent for every \(v\), then sets \(d[s] = 0\)
- Relax is such that if the current \(d[v]\) is greater than \(d[u] + w(u, v)\), update \(d[v]\) to that, and set \(v\)’s parent to \(u\)
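A minimal sketch of the two routines just described; the dictionary representation is an assumption:

```python
import math

def initialize_single_source(vertices, s):
    """Set every upper bound d[v] to infinity (0 for the source s)
    and every parent to None."""
    d = {v: math.inf for v in vertices}
    parent = {v: None for v in vertices}
    d[s] = 0
    return d, parent

def relax(u, v, w, d, parent):
    """Tighten d[v] if the path through u gives a better bound."""
    if d[v] > d[u] + w:
        d[v] = d[u] + w
        parent[v] = u
```

Every shortest-path algorithm below differs only in the order and number of times it calls `relax`.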

## Lecture 7: Single-Source Shortest Paths, Continued

### Relaxation

- Provides core to shortest path
- Basic method works simply, varied based on implementation

### Bellman-Ford

- Works with negative-weight edges and detects negative-weight cycles
- Makes \(|V| - 1\) passes over all edges, relaxing each edge during each pass
- No cycles implies all shortest paths have \(\leq |V| - 1\) edges, so that the number of relaxations is sufficient
- Can be modified to tag negative-weight cycles
- Algorithm shown on 07.13
- Pass is through every single edge, \(|V| - 1\) times
- Lines 7–10 detect negative-weight cycles
- Time Complexity:
- Initialize Single Source takes \(\mathcal{O}(|V|)\)
- Relax itself is \(\mathcal{O}(1)\)
- Negative Weight Cycle is \(\mathcal{O}(|E|)\)
- Dual loop is \(\mathcal{O}(|V| |E|)\)
- Thus, total time complexity is \(\mathcal{O}(|V| |E|)\)

- Correctness
- Assume no negative weight cycles
- Since no cycles appear in SPs, every SP has at most \(|V| - 1\) edges
- Then define sets \(S_0, \ldots, S_{|V| - 1}\)
- \(S_k = \{ v \in V \mid \exists\, p = (s \leadsto v) \text{ s.t. } \delta(s, v) = w(p) \land |p| \leq k\}\)
**Invariant**: After the \(i\)th iteration of the outer relaxation loop, for all \(v \in S_i\), we have \(d[v] = \delta(s, v)\)
- This is the **path-relaxation property** (Lemma 24.15)
- Proven via induction on \(i\):
- obvious for \(i = 0\)
- If it holds for \(v \in S_{i - 1}\), then relaxation and optimal substructure imply it holds for \(v \in S_i\)
- Thus, after \(|V| - 1\) iterations, \(d[v] = \delta(s, v)\) for all \(v \in V = S_{|V| - 1}\)

- Detection of Negative-weight cycles (07.18)
- Let \(c = \langle v_0, v_1, \ldots, v_k = v_0 \rangle\) be a neg-weight cycle reachable from \(s\): \(\sum_{i = 1}^k w(v_{i - 1}, v_i) < 0\)
- If the algorithm incorrectly returns true, then for every node in the cycle, \(i = 1, \ldots, k\), \(d[v_i] \leq d[v_{i - 1}] + w(v_{i - 1}, v_i)\)
- Summing these inequalities around the cycle gives \(\sum_{i = 1}^k d[v_i] \leq \sum_{i = 1}^k d[v_{i - 1}] + \sum_{i = 1}^k w(v_{i - 1}, v_i)\)
- Since \(v_0 = v_k\), the two sums of \(d\) values are equal, so \(0 \leq \sum_{i = 1}^k w(v_{i - 1}, v_i)\), contradicting that the cycle has negative weight (07.18)
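Putting the passes and the cycle check together, a sketch of Bellman-Ford (edge-list representation assumed; this version returns `None` on a negative-weight cycle rather than tagging it):

```python
import math

def bellman_ford(vertices, weighted_edges, s):
    """Bellman-Ford: |V| - 1 passes relaxing every edge, then one more
    pass to detect negative-weight cycles. O(|V| |E|) total."""
    d = {v: math.inf for v in vertices}
    parent = {v: None for v in vertices}
    d[s] = 0
    for _ in range(len(vertices) - 1):  # |V| - 1 passes
        for u, v, w in weighted_edges:
            if d[v] > d[u] + w:  # relax (u, v)
                d[v] = d[u] + w
                parent[v] = u
    for u, v, w in weighted_edges:
        if d[v] > d[u] + w:  # any further improvement: negative cycle
            return None, None
    return d, parent
```

If no edge can be relaxed after \(|V| - 1\) passes, the path-relaxation property guarantees every \(d[v] = \delta(s, v)\).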

## Lecture 7: Single-Source Shortest Paths, Continued

- Billboard problem and avocado problem are very similar

### Shortest paths in a DAG

- topo sort vertices, initialize single source, relax from left to right
- Correctness follows from the path-relaxation property of Bellman-Ford: relaxing in topological order means every shortest path’s edges are relaxed in order along the path
- Total time complexity is \(\mathcal{O}(|V| + |E|)\)
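The three steps above fit in a short sketch (representation assumed; the embedded topological sort is the DFS-based one from Lecture 5):

```python
import math
from collections import defaultdict

def dag_shortest_paths(vertices, weighted_edges, s):
    """DAG shortest paths: topologically sort, then relax each vertex's
    outgoing edges left to right. O(|V| + |E|) total."""
    adj = defaultdict(list)
    for u, v, w in weighted_edges:
        adj[u].append((v, w))

    order, visited = [], set()
    def dfs(u):  # topological sort by DFS finish times
        visited.add(u)
        for v, _ in adj[u]:
            if v not in visited:
                dfs(v)
        order.insert(0, u)
    for u in vertices:
        if u not in visited:
            dfs(u)

    d = {v: math.inf for v in vertices}
    d[s] = 0
    for u in order:  # relax edges in topological order
        for v, w in adj[u]:
            if d[v] > d[u] + w:
                d[v] = d[u] + w
    return d
```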

### Dijkstra’s Algorithm

- Greedy
- Faster than Bellman-Ford
- Requires all edge weights to be non-negative
- Maintains set \(S\) of vertices whose final shortest path weights from \(s\) have been determined
- Repeatedly selects \(u \in V \setminus S\) with minimum SP estimate, adds \(u\) to \(S\) and relaxes all edges leaving \(u\)
- Uses min-priority queue to repeatedly make greedy choice
- Pseudocode on 07.25
- Use array to implement min queue
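The slides implement the min-queue with an array; here is a heap-based sketch of the same loop (lazy deletion instead of decrease-key), which is one common implementation choice:

```python
import heapq
import math
from collections import defaultdict

def dijkstra(vertices, weighted_edges, s):
    """Dijkstra: repeatedly extract the unfinished vertex with minimum
    shortest-path estimate and relax its outgoing edges.
    Requires non-negative edge weights."""
    adj = defaultdict(list)
    for u, v, w in weighted_edges:
        adj[u].append((v, w))

    d = {v: math.inf for v in vertices}
    d[s] = 0
    done = set()       # the set S of finished vertices
    heap = [(0, s)]    # min-priority queue keyed on d
    while heap:
        du, u = heapq.heappop(heap)
        if u in done:
            continue   # stale queue entry; u already finished
        done.add(u)
        for v, w in adj[u]:
            if d[v] > du + w:  # relax (u, v)
                d[v] = du + w
                heapq.heappush(heap, (d[v], v))
    return d
```

With the heap this runs in \(\mathcal{O}(|E| \log{|V|})\); the array version of the queue gives \(\mathcal{O}(|V|^2)\), as the complexity slide breaks down.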

#### Correctness

**Invariant**: At the start of each iteration of the while loop, \(d[v] = \delta(s, v)\) for all \(v \in S\)

**Proof:**

- Let \(u\) be the first node added to \(S\) where \(d[u] \neq \delta(s, u)\)
- Let \(p = s \leadsto x \to y \leadsto u\) be an SP to \(u\) and \(y\) the first node on \(p\) in \(V \setminus S\)
- Since \(y\)’s predecessor \(x \in S\), \(d[y] = \delta(s, y)\) due to relaxation of \((x, y)\)

## Lecture 7: Single-Source Shortest Paths, Continued

### Dijkstra’s and Time Complexity

- Initialize single source – \(\Theta(|V|)\)
- Create Q – array, \(\mathcal{O}(|V|)\); heap, \(\mathcal{O}(|V|)\)
- Extract-Min – \(\mathcal{O}(|V|)\) calls
- Time Complexity – \(\mathcal{O}(|V|)\) if array; \(\mathcal{O}(\log{|V|})\) if heap

- Relax – \(\mathcal{O}(|E|)\)
- Time Complexity – \(\mathcal{O}(1)\) ; \(\mathcal{O}(\log{|V|})\)

- Total Time Complexity
- \(\mathcal{O}(|V|^2)\) if array
- \(\mathcal{O}(|E| \log{|V|})\) if heap

- If \(|E|\) is \(o(|V|^2 / \log{|V|})\), prefer the heap

### Correctness, continued

- Since \(y\) precedes \(u\) on \(p\), and edge weights are non-negative, \(d[y] = \delta(s, y) \leq \delta(s, u) \leq d[u]\)
- Since \(u\) was chosen before \(y\) in line 3, \(d[u] \leq d[y]\), so \(d[y] = \delta(s, y) = \delta(s, u) = d[u]\), a contradiction
- Since all vertices eventually end up in \(S\), we see that the algorithm is correct. “QED”

## Lecture 8: All Pairs Shortest Paths

- Similar to SSSP, but for all pairs
- Given weighted directed graph, find \(\delta(u, v)\) for all \((u, v) \in V \times V\)
- Run SSSP \(|V|\) times
- If no negative weights, use Dijkstra’s, gives \(\mathcal{O}(|V|^3)\)
- If negative weights, use Bellman-Ford, for \(\mathcal{O}(|V|^4)\)

- Can do better, and handle neg-weight edges
- Matrix-multiplication algorithm, \(\Theta(|V|^3 \log |V|)\)
- Floyd-Warshall, \(\Theta(|V|^3)\)

### Adjacency Matrix Representation

- Assume vertices are numbered, 1 to \(n\)
- Input is \(n \times n\) matrix: \[ w_{ij} = \begin{cases} 0 & i = j \\ w(i, j) & (i, j) \in E \\ \infty & (i, j) \not\in E \end{cases} \]
- Assume, for now, that neg weight cycles are absent
- The algorithms produce distance matrices \(L\) and \(D\), plus a predecessor matrix \(\Pi\), where \(\pi_{ij}\) is the predecessor of \(j\) on a shortest path from \(i\) to \(j\), or *NIL* if \(i = j\) or no path exists (well-defined due to optimal substructure)

### Printing from \(\Pi\)

- defined recursively, printing after recursive call
- See 08.04

### Matrix Multiplication

- Maintain a series of matrices \(L^{(m)} = \left( \ell_{ij}^{(m)} \right)\), where \(\ell_{ij}^{(m)}\) is the minimum weight of any path from \(i\) to \(j\) that uses at most \(m\) edges
- When \(m = 0\), \(\ell_{ij}^{(0)} = 0\) if \(i = j\), \(\infty\) otherwise

- Exploits optimal substructure to get a recursive definition of \(\ell\)
- Uses \(\min\), see 08.06

## Lecture 8: All Pairs Shortest Paths, continued

### Matrix Multiplication, continued

- Each increase in \(m\) allows shorter paths using one more edge to emerge
- Recursive solution due to the optimal substructure property (05.06) \[ \ell_{ij}^{(m)} = \min_{1 \leq k \leq n} \left(\ell_{ik}^{(m - 1)} + w_{kj}\right) \]
- Absent negative-weight cycles, \(\delta(i, j) = \ell_{ij}^{(n - 1)}\) (every shortest path uses at most \(n - 1\) edges)
- Compute bottom up: \(L^{(1)} = W\), then \(L^{(2)}, \ldots, L^{(n - 1)}\) iteratively
- Defined on 05.08–09
- Faster version on 05.13
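A sketch of the slow version: each "extension" is a min-plus matrix product against \(W\). (The faster version on the slides replaces the loop with repeated squaring, \(\Theta(n^3 \log n)\); function names here are my own.)

```python
import math

def extend(L, W):
    """One min-plus product: L'[i][j] = min_k (L[i][k] + W[k][j]),
    extending every shortest path by up to one more edge."""
    n = len(L)
    return [[min(L[i][k] + W[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

def apsp_matrix(W):
    """Slow version: compute L^(1) = W, L^(2), ..., L^(n-1) by
    iterated extension; Theta(n^4) overall."""
    n = len(W)
    L = W
    for _ in range(n - 2):  # L^(1) = W already covers one-edge paths
        L = extend(L, W)
    return L  # L^(n-1): shortest-path weights
```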

## Lecture 8: All Pairs Shortest Paths, continued

### Floyd-Warshall

- Shaves off the log factor
- Start assuming no neg weight cycles, can detect neg cycles as before
- Considers decomposition by *intermediate vertex*
- If \(p = \langle v_1, v_2, v_3, \ldots, v_{\ell - 1}, v_{\ell} \rangle\) is a simple path, then its set of intermediate vertices is \(\{v_2, v_3, \ldots, v_{\ell - 1}\}\)

- Let \(V = \{1, \ldots, n\}\), and fix \(i, j \in V\)
- For some \(1 \leq k \leq n\), consider set of vertices \(V_k = \{1, \ldots, k\}\)
- Consider all paths from \(i\) to \(j\) whose intermediate vertices come from \(V_k\) and let \(p\) be a minimum-weight path from them
- Is \(k \in p\)?
- If not, then all intermediate vertices of \(p\) are in \(V_{k - 1}\) and an SP from \(i\) to \(j\) based on \(V_{k - 1}\) is also an SP from \(i\) to \(j\) based on \(V_k\)
- If so, then we can decompose \(p\) into \(i \overset{p_1}{\leadsto} k \overset{p_2}{\leadsto} j\)

- Note, \(V_0\) is no intermediate vertices, i.e., only the direct edges between pairs of vertices
- Now \(D^{(k)}_{ij}\) is the weight of an SP from \(i\) to \(j\) using only \(V_k\) as intermediates
- \(D^{(0)} = W\)
- \(D^{(n)}\) is the shortest paths

- \(k \not\in p \implies D^{(k)}_{ij} = D^{(k - 1)}_{ij}\)
- \(k \in p \implies D^{(k)}_{ij} = D^{(k - 1)}_{ik} + D^{(k - 1)}_{kj}\)
- The shortest path from \(i\) to \(j\) based on \(V_k\) is either the same as that based on \(V_{k - 1}\) or goes through \(k\)
- Computing \(D^{(k)} = \left(d_{ij}^{(k)}\right)\), where \(d_{ij}^{(k)}\) is the shortest-path weight using intermediates from \(V_k\) (08.17), takes \(\Theta(n^3)\) in total
- Algorithm on (08.18)
- Transitive closure: whether a path exists between each pair of vertices; for \(G = (V, E)\), the transitive closure is \(G^{*} = (V, E^{*})\)
- \(E^{*} = \{(i, j) : \exists i \overset{p}{\leadsto} j \in G\}\)
- Can be solved with Floyd-Warshall but with bitwise ops instead of min/plus

- Memory can potentially be saved, operation can be in place
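The whole recurrence fits in a few lines; as the last bullet notes, the update can be done in place over a single matrix (function name is my own):

```python
import math

def floyd_warshall(W):
    """Floyd-Warshall: D[i][j] = min(D[i][j], D[i][k] + D[k][j]),
    growing the allowed set of intermediate vertices one k at a time.
    In-place over one matrix; Theta(n^3)."""
    n = len(W)
    D = [row[:] for row in W]  # D^(0) = W
    for k in range(n):         # now allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D
```

Swapping `min`/`+` for `or`/`and` over booleans gives the transitive-closure variant mentioned above.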

## Lecture 9: Lower Bounds

### Upper Bound

- Of an algorithm \(A\): \(f(n)\) is an upper bound if there exists no input of size \(n\) on which \(A\) requires more than \(f(n)\) time
- Of a problem, has an upper bound of \(f(n)\) if there exists at least one algorithm that has an upper bound of \(f(n)\)
- Generally, tightest is preferred

### Lower Bound

- A problem has a *lower bound* \(f(n)\) if for any algorithm \(A\) to solve the problem there exists at least one input of size \(n\) that forces \(A\) to take at least \(f(n)\) time or space
- The pathological input depends on the specific algorithm \(A\); for Bubble sort, reverse-sorted input forces \(\Omega(n^2)\)
- Since every sorting algorithm has an input of size \(n\) forcing \(\Omega(n \log{n})\) steps, the sorting problem has a time complexity lower bound of \(\Omega(n \log{n})\)
- Adversarial argument, simulate arbitrary algorithm \(A\) to build pathological input
- Reduce one problem to another to establish lower bounds

## Lecture 9: Lower Bounds, continued

### Decision trees

- A full binary tree that represents the comparisons between elements performed by a particular sorting algorithm operating on input of a given size
- Tree represents an algorithm’s behavior on all possible inputs of size \(n\)
- Each node represents one comparison made by the algorithm
- Each node is labeled \(i:j\), representing the comparison \(A[i] \leq A[j]\)
- If, for a particular input, the comparison holds, control moves to the left child; otherwise to the right
- Each leaf represents a possible output of the algorithm, a permutation of the input
- All permutations must appear in the tree for the algorithm to work correctly

### Proof of lower bound (using an adversarial approach)

- Length of path from root to output leaf is number of comparisons made by algorithm
- Worst-case number of comparisons is the length of the longest path
- Adversary chooses a deepest leaf to create worst-case input
- Number of leaves is \(n!\)
- A binary tree of height \(h\) has at most \(2^h\) leaves
- Thus, \(2^h \geq n! \geq \sqrt{2\pi n}\left(\frac{n}{e}\right)^n\) (by Stirling’s approximation)
- Take base-2 logs
- \(h \geq \log \sqrt{2\pi} + (1/2) \log n + n \log n - n \log e = \Omega(n \log n)\)
- Every comparison-based sorting algorithm has *some* input that forces it to make \(\Omega(n \log n)\) comparisons. QED
- And thus, mergesort and heapsort are asymptotically optimal.
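A quick numeric sanity check (not from the slides) that \(\log_2 n!\) really is within a constant factor of \(n \log_2 n\), as the derivation claims:

```python
import math

# h >= log2(n!) = Omega(n log n): verify the ratio log2(n!) / (n log2 n)
# stays bounded between constants as n grows.
for n in [8, 64, 512]:
    lg_fact = math.log2(math.factorial(n))  # exact log2(n!)
    ratio = lg_fact / (n * math.log2(n))
    assert 0.5 < ratio <= 1.0
```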

### Reductions, using Convex Hull

- Use sorting lower bound to get lower bound on convex hull
- Given a set \(Q = \{p_1, p_2, \ldots, p_n\}\) of \(n\) points, each from \(\mathbb{R}^2\), output \(CH(Q)\), the smallest convex polygon \(P\) such that each point of \(Q\) is on \(P\)’s boundary or in its interior
- Output is an ordered list of the boundary points such that traversing them in order draws the hull
- Reduce problem of sorting to convex hull
- Given any instance of the sorting problem, \(A = \{x_1, \ldots, x_n\}\), we will transform it to an instance of convex hull.
- Transformation is \(Q = \{(x_1, x_1^2), \ldots, (x_n, x_n^2)\}\)
- Feed \(Q\) into convex hull; the hull output is then transformed back into the sorted array
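The reduction can be sketched end to end. The hull routine below is Andrew’s monotone chain, a stand-in for whatever hull algorithm is assumed in lecture; distinct keys are assumed:

```python
def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in
    counterclockwise order starting from the leftmost point."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):  # z-component of (a - o) x (b - o)
        return (a[0]-o[0])*(b[1]-o[1]) - (a[1]-o[1])*(b[0]-o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def sort_via_hull(xs):
    """Sorting reduced to convex hull: lift each x to (x, x^2).
    All points lie on a parabola, so all are hull vertices, and
    reading the hull from the leftmost point yields sorted order."""
    hull = convex_hull([(x, x * x) for x in xs])
    start = hull.index(min(hull))  # leftmost point = smallest x
    ordered = hull[start:] + hull[:start]
    return [p[0] for p in ordered]
```

The lift and the read-off are both linear, so a faster-than-\(n \log n\) hull algorithm would contradict the sorting lower bound.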

## Lecture 9: Lower Bounds, continued

### Reductions, using Convex Hull, continued

- Known Lower Bound of Sorting, reducing to Convex Hull
- Since Sorting is in \(\Omega(n \log{n})\), by reduction, we will find that Convex Hull is \(\Omega(n \log{n})\)
- Takes *any* instance of problem A (the problem reduced from) to *some* instance of problem B (the problem reduced to)
- Because of the mapping, the hull will include all points (all lie on a parabola)
- By definition, the hull is a single piece
- Conversion is still linear time
- Proves *a* lower bound of convex hull, but not necessarily *the* lower bound (here it happens to be *the* lower bound of Convex Hull)
- Can easily be done via contradiction. (09.12)

## Lecture 10: NP Completeness

- So far, focused on problems with efficient algorithms (*i.e.*, those that run in polynomial time, \(\mathcal{O}(n^c)\) for some constant \(c\))
- Called efficient even if \(c\) is large, since it is likely that another, more efficient algorithm exists
- Polynomial in terms of **size** of input, \(n\).

- Some problems have a fastest known algorithm in superpolynomial time (*i.e.*, exponential, or sub-exponential such as \(2^{n^{1/3}}\))
- Some problems cannot be solved in any amount of time
- Focus stays on lower bounds, here used to argue that some problems don’t have any efficient solution

## Lecture 10: NP Completeness, continued

- Focus on complexity classes P and NP
- P is deterministic polynomial time, the set of problems that can be solved with a deterministic TM (deterministic algorithm in polynomial time)
- NP is non-deterministic polynomial time, i.e., set of problems that can be solved by a non-deterministic TM in polynomial time (can explore many possible paths of computation at once)
- NP is the set of problems whose solutions, if given, can be verified in polynomial time
- Hamiltonian cycle – does \(G = (V, E)\) contain a Hamiltonian cycle (a simple cycle visiting every vertex exactly once)?
- NP, since given a specific \(G\), plus yes/no answer, plus certificate, we can verify a yes in polynomial time using the certificate
- What would be an appropriate certificate?
- Not known if \(HAM-CYCLE \in P\)

- Euler – does a directed graph \(G = (V, E)\) contain an Euler tour – a cycle that visits every edge in \(E\) exactly once and can visit vertices multiple times
- Check if each vertex’s in-degree is equivalent to its out-degree
- Certificate is just the tour, or even the graph itself
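For HAM-CYCLE above, an appropriate certificate is a proposed vertex ordering, and the verification is clearly polynomial; a sketch (representation is my assumption):

```python
def verify_ham_cycle_certificate(vertices, edges, certificate):
    """Polynomial-time verification of a YES answer for HAM-CYCLE.
    The certificate is a proposed ordering of the vertices: check it is
    a permutation of V and that consecutive vertices (wrapping around)
    are joined by edges."""
    if sorted(certificate) != sorted(vertices):
        return False  # not a permutation of V
    edge_set = set(edges)
    n = len(certificate)
    return all((certificate[i], certificate[(i + 1) % n]) in edge_set
               or (certificate[(i + 1) % n], certificate[i]) in edge_set
               for i in range(n))
```

Verifying is easy even though *finding* a Hamiltonian cycle has no known polynomial algorithm, which is exactly the P-versus-NP gap being described.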

- Any problem in \(P\) is also in \(NP\), since if we can efficiently solve the problem, we get poly-time verification for free, *i.e.*, \(P \subseteq NP\)
- Not known if \(P \subset NP\), i.e., unknown whether there exists a problem in \(NP\) that is not in \(P\)
- A subset of the problems in \(NP\) is the set of NP complete problems
- Every problem in NPC is at least as hard as all others in NP
- These problems are believed to be intractable (no efficient algorithm), but not yet proven to be so
- If any NPC problem is in \(P\), then \(P = NP\) and life is glorious and scary

### Proving NP completeness

- To prove that a problem is NPC, we take an indirect approach
- To prove \(B \in NPC\):
- Prove that \(B \in NP\) by identifying a certificate that can be used to verify a YES answer in polynomial time
- Use the obvious choice of certificate
- *Need* to argue that verification requires only polynomial time
- Certificate is *not* the instance itself unless \(B \in P\)
- Show that \(B\) is as hard as any other NP problem by showing that if we can efficiently solve \(B\) then we can efficiently solve all problems in NP
- Step one is easy
- Second looks hard, but just use a reduction (e.g., from 3SAT)

### Reducing to NP

- Takes an instance of \(A\) and transforms it to an instance of \(B\) such that if \(B\) is solvable, then \(A\) is solvable
- TC of reduction for \(A\) is reduction to \(B\) plus time to solve \(B\)
- Start with \(A\) and \(B\)
- Transform any instance \(\alpha\) of \(A\) to some instance \(\beta\) of \(B\) such that
- Transformation *must* take polynomial time
- The answer for \(\alpha\) is yes iff the answer for \(\beta\) is yes.
- If such a reduction exists, then \(B\) is at least as hard as \(A\), since if an efficient algorithm for \(B\) exists, any instance of \(A\) can be solved efficiently as well
- \(A \leq_{p} B\) (read \(A\) is reducible to \(B\) modulo polynomial-time conversion)

### Only decision problems

- Only focus on decision problems, simplifies things
- Only output is yes or no
- Not asked for shortest path or cycle or similar
- Don’t ask for shortest path from \(i\) to \(j\), just ask if there exists a path from \(i\) to \(j\) with weight at most \(k\)
- Optimization problem implies decision problem
- Decision problems are no harder than the corresponding optimization problems, so showing a decision problem is hard shows the optimization problem is hard too
- Decision problems are especially convenient if you think in terms of Turing machines and languages