# Data Structures & Algorithms

## Table of Contents

- Lecture 1: Algorithms and Analysis
- Lecture 2: Brute Force Approach
- Lecture 3: Knapsack, Hamiltonian Path and Backtracking
- Lecture 4: Divide and Conquer: repeated squaring, Karatsuba multiplication
- Lecture 5: Karatsuba and Strassen Matrix Multiplication
- Lecture 6: Closest Pair, Convex Hull and Presorting & Gaussian Elimination and Applications
- Lecture 7: Gaussian Elimination, LU Decomposition, Linear Programming and Coding Theory
- Lecture 8: Huffman Coding and AVL Trees
- Lecture 9: AVL Trees, 2-3 Trees and Balancing

## Lecture 1: Algorithms and Analysis

### Pseudocode

- Selection Sort
- Input: A collection \(A = \{a_1, \ldots, a_n\}\)
- Output: \(A\), Sorted
- For (\(i = 1\ldots n-1\))
  - min \(\gets a_i\)
  - For (\(j = i+1 \ldots n\))
    - if (min \(> a_j\))
      - min \(\gets a_j\)
  - swap min, \(a_i\)
- output \(A\)
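The pseudocode above can be sketched in runnable Python. One assumption here: the sketch tracks the *index* of the minimum rather than its value, so that the final swap is well-defined.

```python
def selection_sort(a):
    """Sort list a in place, mirroring the selection sort pseudocode above."""
    n = len(a)
    for i in range(n - 1):
        min_idx = i                     # index of the smallest element seen so far
        for j in range(i + 1, n):
            if a[min_idx] > a[j]:       # the comparison counted in the analysis
                min_idx = j
        a[i], a[min_idx] = a[min_idx], a[i]
    return a
```

The inner comparison runs \(n(n-1)/2\) times regardless of the input order, matching the analysis below.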

### Analysis

- Identify the input
- Identify the input size
- collection size
- tree/graph number of nodes/edges

- Identify the elementary operation(s)
  - most common or most expensive
  - loop-control comparisons, assignments, and incrementations are usually ignored

- Analyze how many times the elementary operation is executed with respect to the input size
- Provide an asymptotic characterization from step 4

### Example

- Collection A
- Size or cardinality of \(A\), \(n\)
- Comparison on line 4
- \(\sum_{i = 1}^{n-1} (n - i) = \sum_{i = 1}^{n - 1} n - \sum_{i = 1}^{n - 1} i = n(n - 1) - \frac{n(n - 1)}{2} = \frac{n(n-1)}{2}\)
- \(O(n^2)\), \(\Theta(n^2)\) (Prefer \(\Theta\))

### Brute Force

- Given: a set of \(n\) points, \(P = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}\)
- Find: the two points \(p\) and \(q\) that are closest to each other with regard to \(L_2\) norm (\(\sqrt{(x_1-x_2)^2 + (y_1 - y_2)^2}\))

#### Looping Method

- \(minDist \gets \infty\)
- For (\(i = 1 \ldots n-1\))
  - For (\(j = i+1 \ldots n\))
    - If (distance(\(P_i\), \(P_j\)) \(< minDist\))
      - \(p \gets P_i\)
      - \(q \gets P_j\)
      - \(minDist \gets distance(P_i, P_j)\)
- output \(p\), \(q\)

#### Simpler

- \(minDist \gets \infty\)
- For each pair of points \(p = (x, y)\), \(q = (x^\prime, y^\prime)\)
  - \(d \gets\) distance(\(p\), \(q\))
  - if (\(d < minDist\))
    - \(minDist \gets d\)
- output \(minDist\)

##### Analysis

- \(\binom{n}{2} = \frac{n!}{(n - 2)!2!} = \frac{n(n-1)}{2} = \Theta(n^2)\)
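A minimal Python sketch of the brute-force pairwise scan; the function name and the tuples-of-points input format are assumptions, not from the lecture.

```python
from itertools import combinations
from math import dist, inf

def closest_pair(points):
    """Check all C(n, 2) pairs: Theta(n^2) distance computations."""
    min_dist, best = inf, None
    for p, q in combinations(points, 2):
        d = dist(p, q)                  # L2 norm between the two points
        if d < min_dist:
            min_dist, best = d, (p, q)
    return best, min_dist
```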

### Common Big Os

- Constant
- Logarithmic
- Linear
- Iterated Linear
- Quadratic
- Polynomial
- Exponential (\(2^n\))
- Factorial

## Lecture 2: Brute Force Approach

- Examining all possibilities
- optimisation to find the best solution
- decision problems: find a solution, or determine none exists
- generally infeasible or inefficient
- may involve:
- generating combinatorial objects
- backtracking

- Ex: What percentage of 52-card shuffles has been generated by humans?
- Ex: Compute \(a^n \mod m\) for a very large \(n\), naively:

```c
int a, n, m;       /* inputs, assumed initialized */
int result = 1;
for (int i = 1; i <= n; i++) {
    result = (result * a) % m;   /* n modular multiplications */
}
```

- Pairs: 2 for loops
- triples: 3 for loops
- for each subset of k elements: write k for loops where k is a variable
- Given a set of n elements \(A = \{a_1, a_2, \ldots, a_n\}\) and an integer \(k\), \(1 \leq k \leq n\), output all subsets of \(A\) with cardinality \(k\).
- \(\binom{n}{k} \in O(n^k)\)

- It suffices to consider combinations of integers \(1, 2, \ldots, n\).
- Algorithm:
- Start with \(a_1, a_2, \ldots, a_k\)
- Assume we have \(a_1, a_2, \ldots, a_k\), we want to generate the next combination
- Locate the last element \(a_i\) in the current combination such that \(a_i \neq n - k + i\)
- Replace \(a_i\) with \(a_i + 1\), \(a_j\) with \(a_i + j - i\) for \(j = i+1, i+2, \ldots k\)

- Ex: \(k = 3\) on \(\{1, 2, 3, 4, 5\}\)
- \(\{1, 4, 5\}\)
- \(\{2, 3, 4\}\)
- \(\{2, 3, 5\}\)
- \(\{2, 4, 5\}\)
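The successor rule above can be sketched in Python; 0-indexed positions are used here while keeping the notes' 1-indexed values, and the function name is an assumption.

```python
def next_combination(a, n):
    """Given a k-combination a (sorted list of values from 1..n), return the
    next combination in lexicographic order, or None if a is the last one."""
    k = len(a)
    # Find the last position i whose value can still grow:
    # in the notes' 1-indexed terms, the last a_i with a_i != n - k + i.
    i = k - 1
    while i >= 0 and a[i] == n - k + i + 1:
        i -= 1
    if i < 0:
        return None                      # a = {n-k+1, ..., n}: final combination
    b = a[:]
    b[i] += 1                            # replace a_i with a_i + 1
    for j in range(i + 1, k):
        b[j] = b[i] + (j - i)            # a_j = a_i + (j - i)
    return b
```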

- Goal: generate all permutations of a set of size \(n\), \(A = \{a_1, a_2, \ldots, a_n\}\)
- Steinhaus-Johnson-Trotter Algorithm

- Generate all \(n!\) permutations in lexicographic order
- Start with \(1, 2, 3, \ldots, n\)
- Given \(a_1, a_2, \ldots, a_n\)
- Find the rightmost pair \(a_i\), \(a_{i + 1}\) such that \(a_i < a_{i + 1}\)
- Find the smallest element to the right of this pair that is larger than \(a_i\), call it \(a^{\prime}\)
- Swap \(a_i\) and \(a^{\prime}\), then sort all elements to the right of position \(i\)

- ex: \(n = 6\)
- \(\{1, 6, 3, 5, 4, 2\}\)
- \(\{1, 6, 4, 2, 3, 5\}\)
- \(\{1, 6, 4, 2, 5, 3\}\)
- \(\{1, 6, 4, 3, 2, 5\}\)
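The lexicographic successor rule, sketched in Python; since the suffix after position \(i\) is non-increasing, the smallest larger element can be found by scanning from the right.

```python
def next_permutation(a):
    """Return the next permutation of a in lexicographic order, or None."""
    a = a[:]
    # Rightmost pair a[i] < a[i+1]
    i = len(a) - 2
    while i >= 0 and a[i] >= a[i + 1]:
        i -= 1
    if i < 0:
        return None                      # fully descending: last permutation
    # Smallest element to the right of i that is larger than a[i]
    j = len(a) - 1
    while a[j] <= a[i]:
        j -= 1
    a[i], a[j] = a[j], a[i]
    a[i + 1:] = sorted(a[i + 1:])        # sort the suffix, as in the notes
    return a
```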

- Ordered combinations: given \(n\) distinct elements and a length \(k\), generate all permutations of length \(k\), \(P(n, k)\)
- Generate all subsets of size k, for each, generate all permutations of it

- Permutations with repetition: given \(n\) distinct elements, generate all ordered sequences of length \(k\), where each element can appear multiple times. \(n^k\)
- Count from 0 to \(n^k - 1\), converting to base \(n\), map to the set.
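The base-\(n\) counting idea can be sketched directly; the function name is an assumption.

```python
def sequences(elements, k):
    """All n^k length-k sequences with repetition: count 0..n^k - 1 in base n
    and map each digit to an element."""
    n = len(elements)
    out = []
    for num in range(n ** k):
        seq = []
        for _ in range(k):
            num, digit = divmod(num, n)   # peel off base-n digits, least first
            seq.append(elements[digit])
        out.append(tuple(reversed(seq)))  # most significant digit first
    return out
```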

### Satisfiability (SAT)

- Given a predicate \(P(x_1, x_2, \ldots x_n)\) on \(n\) boolean variables \(x_1, x_2, \ldots x_n\)
- Determine if there exists an assignment of \(x_1, x_2, \ldots, x_n\) such that \(P\) evaluates to true
- (x1 or x2 or x3) and (x1 or not x2) and (x2 and not x3) and (x3 and not x1) and (not x1 or not x2 or not x3)
- by brute force: No. All \(2^n\) possibilities were checked

- Possible solutions:
- First:
- Input: A predicate \(P(x_1,\ldots, x_n)\)
- Output: True if satisfiable, false otherwise
- For each truth assignment \(T\)
  - if \(\mathrm{apply}(P, T)\)
    - output True
- Output False
- \(O(2^n)\)
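A sketch of the exhaustive approach, with the lecture's predicate written exactly as the notes state it (including the "and" inside the third and fourth parenthesized groups), which is why brute force reports it unsatisfiable.

```python
from itertools import product

def sat_brute_force(P, n):
    """Try all 2^n truth assignments; P is a function of n booleans."""
    for t in product([False, True], repeat=n):
        if P(*t):
            return True
    return False

def P(x1, x2, x3):
    """The example predicate from the notes, transcribed literally."""
    return ((x1 or x2 or x3) and (x1 or not x2) and (x2 and not x3)
            and (x3 and not x1) and (not x1 or not x2 or not x3))
```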

- Second: Backtracking, pruning, think trees


## Lecture 3: Knapsack, Hamiltonian Path and Backtracking

- Backtracking requires recursion
- SAT traversal, w/ backtracking: SAT(P, \(\overrightarrow{t}\))
- Input: A predicate \(P(x_1, \ldots, x_n)\), a partial truth assignment \(\overrightarrow{t} = t_1, \ldots, t_k\)
- Output: True if P is satisfiable, false otherwise
- If \(k = n\)
  - Return \(P(\overrightarrow{t})\)
- Else
  - \(t_{k + 1} \gets 0\)
  - If SAT(P, \(\overrightarrow{t}\))
    - return True
  - Else
    - \(t_{k + 1} \gets 1\)
    - return SAT(P, \(\overrightarrow{t}\))
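A Python sketch mirroring the recursion above; it extends the partial assignment one variable at a time. Real pruning would also evaluate clauses against the partial assignment to cut branches early, which is not shown here.

```python
def sat(P, n, t=()):
    """Backtracking SAT skeleton: t is the partial assignment t_1..t_k."""
    if len(t) == n:
        return P(*t)                       # all variables assigned: evaluate P
    # Try t_{k+1} = 0, then backtrack and try t_{k+1} = 1
    return sat(P, n, t + (False,)) or sat(P, n, t + (True,))
```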

### Hamiltonian Path/Cycle

- given an undirected graph \(G = (V, E)\)
- output true if G has a Hamiltonian path, false otherwise
- A Hamiltonian Path is a path in \(G\) that visits every vertex exactly once.
- A Hamiltonian Cycle is a path in \(G\) that visits every vertex exactly once, save for the first and last, which must be the same.
- Given a graph \(G = (V, E)\) and a vertex \(v \in V\), the neighborhood of \(v\) is \(N(v) = \{x | (v, x) \in E\}\)
- Naive approach: Go by permutations
- Smart approach: Go by neighborhood
- Given a Graph and a partial path:
- Return true if length of path \(p = n - 1\), false otherwise

- Algorithm (HamWalk(G, \(p = v_1, \ldots, v_k\))):
  - Input: \(G = (V, E)\) and a partial path \(p\)
  - Output: true if \(G\) contains a Hamiltonian path extending \(p\), false otherwise
  - If \(k = n\), return True
  - for each \(v \in N(v_k)\)
    - if (\(v \not\in p\))
      - if HamWalk(G, \(p + v\))
        - return True
  - return false
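A sketch of HamWalk in Python, representing \(G\) as an adjacency dict; the notes' recursion needed its result propagated (a recursive call whose return value is discarded does nothing), which the `if` below makes explicit.

```python
def ham_walk(G, p):
    """G: dict mapping vertex -> set of neighbors; p: partial path (list).
    True iff p extends to a Hamiltonian path of G."""
    if len(p) == len(G):
        return True
    for v in G[p[-1]]:                    # neighbors of the last vertex on p
        if v not in p:
            if ham_walk(G, p + [v]):      # propagate success upward
                return True
    return False

def has_ham_path(G):
    """Try every vertex as the starting point of the path."""
    return any(ham_walk(G, [v]) for v in G)
```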

### 0-1 Knapsack Problem

- Given a collection of items \(A = \{a_1, \ldots, a_n\}\), a weight function \(w: A \to \mathbb{R}^+\), a value function \(v: A \to \mathbb{R}^+\), and a capacity \(W\)
- Output \(S \subseteq A\) such that \(\sum_{s \in S} w(s) \leq W\) but \(\sum_{s \in S} v(s)\) is maximized
- Ex: \(W = 8\)

| item | value | weight |
| --- | --- | --- |
| 1 | 15 | 1 |
| 2 | 10 | 5 |
| 3 | 9 | 3 |
| 4 | 5 | 4 |

- Solutions
  - 1, 2, 3
  - 1, 3, 4
- For Each Subset in A
- if total weight of subset is less than max, and total value is greater than the best value
- update solution
- update best value

- Optimally:
- Use backtracking, pruning a branch as soon as its weight exceeds the capacity, and skipping all of that branch's children

- Algorithm
- Input: Knapsack \(A = \{a_1, \ldots, a_n\}\), weight function \(w(a_i)\), value function \(v(a_i)\), max weight \(W\), and a partial solution \(S \subseteq A\) containing only elements with index \(\leq j\)
- Output: A solution \(S^{\prime} \subseteq A\) at least as good as \(S\).
- If \(j = n\):
- Return \(S\)

- \(S_{best} \gets S\)
- For \(k = j+1 \ldots n\)
  - \(S^{\prime} \gets S \cup \{a_k\}\)
  - If \(w(S^{\prime}) \leq W\):
    - \(T \gets\) Knapsack(\(A\), \(w\), \(v\), \(W\), \(S^{\prime}\), \(k\))
    - If \(v(T) > v(S_{best})\)
      - \(S_{best} \gets T\)
- Return \(S_{best}\)
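A 0-indexed Python sketch of the backtracking pseudocode above; representing the weight and value functions as dicts is an assumption for the sake of a runnable example. The data in the test is the lecture's table.

```python
def knapsack(A, w, v, W, S=(), j=0):
    """Backtracking 0-1 knapsack. S holds chosen items; j is the next index
    to consider (0-indexed counterpart of the notes' j)."""
    if j == len(A):
        return S
    best = S
    for k in range(j, len(A)):
        Sp = S + (A[k],)
        if sum(w[x] for x in Sp) <= W:        # prune: never recurse on overweight S'
            T = knapsack(A, w, v, W, Sp, k + 1)
            if sum(v[x] for x in T) > sum(v[x] for x in best):
                best = T
    return best
```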

## Lecture 4: Divide and Conquer: repeated squaring, Karatsuba multiplication

### Traveling Salesman Problem

- Given: A collection of cities \(C = \{c_1, \ldots, c_n\}\)
- Output an ordering of cities \(\pi(C) = (c_1^{\prime}, \ldots, c_n^{\prime})\)
- such that the total distance travelled \((\sum_{i = 1}^{n - 1} (\delta(c_i^{\prime}, c_{i+1}^{\prime}))) + \delta(c_1^{\prime}, c_n^{\prime})\) is minimized
- This is over permutations of the cities of a complete graph with weighted edges; every solution (a permutation of the set) is therefore a Hamiltonian cycle

- an optimization problem: backtracking is applicable but not the most appropriate; requires being smart
- Question: Ham Path or TSP: Which is harder or more complex? Which requires more resources?
- They’re equivalent as they’re both NP-complete

### Convex Hull

- Given a field of points, find a polygon that encloses all points and is also convex and has a minimal count of sides
- Given a set of \(n\) points, compute a convex hull that contains them
- brute force is \(O(n^3)\)
- the smart way is based on presorting, \(O(n \log n)\)

### Assignment Problem

- Given a collection of tasks \(T\) and persons \(P\) and an \(n \times n\) cost matrix \(C\) such that \(C_{i, j}\) is the cost of assigning person \(p_i\) to task \(t_j\).
- Output an assignment (a bijection) \(f: P \to T\) such that the \(\sum_{p_i \in P} c(i, f(p_i))\) is minimized.
- Naive approach is \(O(n!)\)
- Better Solutions
- Hungarian Algorithm – Polynomial Time
- Solving a Linear Program – Much faster, simpler

### Integer Mod

- Given integer \(a\), \(n\), \(m\)
- Compute \(a^n \mod{m}\)
- Naively: perform \(n\) multiplications (input size is \(N = \log n\) bits; elementary op is multiplication, executed \(n = 2^N\) times, so \(O(2^N)\))
- Ex: \(a^{77} = a^{64} \cdot a^{8} \cdot a^{4} \cdot a\)
- Repeated Squaring, Binary Exponentiation

- Algo: \(a, m, n = (b_k, \ldots, b_0)\)
- \(term \gets a\)
- \(prod \gets 1\)
- if \(b_0 = 1\)
- \(prod \gets a\)

- for \(i = 1 \ldots k\)
- \(term \gets term^2 \mod{m}\)
- if \(b_i = 1\)
- \(prod \gets (prod \cdot term) \mod{m}\)

- output \(prod\)
- \(O(\log n)\) multiplications, which is linear in the input size \(k\) (the number of bits), i.e. \(O(k)\)
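The repeated-squaring algorithm above, sketched in Python; instead of taking the bits \(b_k, \ldots, b_0\) as an explicit list, the sketch peels them off `n` with shifts, which visits them in the same order.

```python
def power_mod(a, n, m):
    """Compute a^n mod m by repeated squaring over the bits of n."""
    term = a % m        # term = a^(2^i) mod m as i advances
    prod = 1
    while n > 0:
        if n & 1:                       # current bit b_i is 1
            prod = (prod * term) % m
        term = (term * term) % m        # square for the next bit
        n >>= 1
    return prod
```

Python's built-in `pow(a, n, m)` does the same thing, which makes a handy cross-check.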

### Karatsuba Multiplication

- ad + bc = (a + b)(c + d) - ac - bd
- Given 2 \(n\) bit numbers, \(a, b\)
- split into 4 numbers
- \(a = a_1 \cdot 10^{n/2} + a_0\)
- \(b = b_1 \cdot 10^{n/2} + b_0\)
- \(ab = a_1 b_1 10^n + (a_1 b_0 + a_0 b_1)10^{n/2} + a_0 b_0 = a_1 b_1 10^n + \left[(a_1 + a_0)(b_1 + b_0) - a_1 b_1 - a_0 b_0\right] 10^{n/2} + a_0 b_0\)

- Do this recursively
- Is \(O(n^{\log_2 3})\)
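A sketch of the recursion using the decimal split above; three recursive multiplications per level give the \(O(n^{\log_2 3})\) bound.

```python
def karatsuba(a, b):
    """Karatsuba: ab = a1*b1*10^n + [(a1+a0)(b1+b0) - a1*b1 - a0*b0]*10^(n/2) + a0*b0."""
    if a < 10 or b < 10:
        return a * b                          # base case: single-digit factor
    n = max(len(str(a)), len(str(b)))
    half = n // 2
    a1, a0 = divmod(a, 10 ** half)            # split around 10^(n/2)
    b1, b0 = divmod(b, 10 ** half)
    high = karatsuba(a1, b1)                  # a1*b1
    low = karatsuba(a0, b0)                   # a0*b0
    mid = karatsuba(a1 + a0, b1 + b0) - high - low   # = a1*b0 + a0*b1
    return high * 10 ** (2 * half) + mid * 10 ** half + low
```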

## Lecture 5: Karatsuba and Strassen Matrix Multiplication

- Karatsuba alternatives
  - Toom-Cook: \(O(n^{1.465})\)
  - Fast Fourier transform (Schönhage-Strassen): \(O(n\log n \log \log n)\)
  - Fürer: \(O(n \log n \cdot 2^{\log^* n})\)

- \(\log^* 64 = 4\)

### Matrix Multiplication

- \(A \cdot B = C\)
- \(c_{ij} = \sum_{k = 1}^{n} a_{ik}\cdot b_{kj}\)
- Traditional Method is \(O(n^3)\) for multiplications, \(O(n^3)\) for additions
- Input: a matrix of size \(n^2 = N\); in terms of \(N\), this is \(O(N^{1.5})\)

### Strassen’s Matrix Multiplication Algorithm

- For \(2 \times 2\) block matrices:
  - \(M_1 = (A_{00} + A_{11})(B_{00} + B_{11})\)
  - \(M_2 = (A_{10} + A_{11})B_{00}\)
  - \(M_3 = A_{00}(B_{01} - B_{11})\)
  - \(M_4 = A_{11}(B_{10} - B_{00})\)
  - \(M_5 = (A_{00} + A_{01})B_{11}\)
  - \(M_6 = (A_{10} - A_{00})(B_{00} + B_{01})\)
  - \(M_7 = (A_{01} - A_{11})(B_{10} + B_{11})\)
  - \(C_{00} = M_1 + M_4 - M_5 + M_7\), \(C_{01} = M_3 + M_5\), \(C_{10} = M_2 + M_4\), \(C_{11} = M_1 - M_2 + M_3 + M_6\)
  - 7 multiplications, 18 additions for Strassen's
  - 8 multiplications, 4 additions for traditional

- For an \(n \times n\) matrix, split into quadrants \(A_{00}, A_{01}, A_{10}, A_{11}\), each \(\frac{n}{2} \times \frac{n}{2}\), and the same for \(B\)
- Use the previous identities recursively
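The seven products can be sanity-checked numerically on a single \(2 \times 2\) level; this sketch applies the identities once rather than recursing, with matrices as nested tuples (an assumption for brevity).

```python
def strassen_2x2(A, B):
    """One level of Strassen on 2x2 matrices given as ((a00, a01), (a10, a11))."""
    (a00, a01), (a10, a11) = A
    (b00, b01), (b10, b11) = B
    m1 = (a00 + a11) * (b00 + b11)
    m2 = (a10 + a11) * b00
    m3 = a00 * (b01 - b11)
    m4 = a11 * (b10 - b00)
    m5 = (a00 + a01) * b11
    m6 = (a10 - a00) * (b00 + b01)
    m7 = (a01 - a11) * (b10 + b11)
    return ((m1 + m4 - m5 + m7, m3 + m5),      # C00, C01
            (m2 + m4, m1 - m2 + m3 + m6))      # C10, C11
```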

- \(M(n)\) is the number of scalar multiplications by strassen on 2 matrices of \(n \times n\)
- \(= 7M(\frac{n}{2})\)
- \(M(2) = 7\)

- By Master Theorem:
- \(a = 7, b = 2, d = 0\)
- \(7 > 2^0\)
- \(\Theta(n^{\log_2 7})\)

- Alternatives:
- Coppersmith-Winograd: \(O(n^{2.375477})\)
- Williams: \(O(n^{2.3727})\) (2012)

- Algorithm Details
- design and implement splitting, or have your functions handle arbitrary indices
- matrix add and subtract
- matrix combine
- \(M_1, \ldots, M_7\) functions
- pad-with-zeros function, for when \(n\) is not a power of 2 and unpad

## Lecture 6: Closest Pair, Convex Hull and Presorting & Gaussian Elimination and Applications

### Closest Pairs, Revisited

- Divide and Conquer
- Divide the points into 2 sets along the \(x\) axis
- Conquer by recursively finding the two closest points in each partition
- Combine by considering combinations that cross the split
- \(D(n)\) the number of distance calculations for divide and conquer
- \(2D\left(\frac{n}{2}\right) + \frac{k}{2}n\)
- By Master Theorem \(a = 2\), \(b = 2\), \(d = 1\)
- \(a = b^d\), thus \(\mathcal{O}(n\log n)\)

- Use sorting to help reduce work

### Gaussian Elimination

- \(3x + 2y = 35\)
- \(x + 3y = 27\)
- \(a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n = b_1\)
- \(\vdots\)
- \(a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nn} x_n = b_n\)
- Provides an \(n \times n\) matrix, and with \(\overrightarrow{b}\) added on the rhs, you get an \(n \times (n + 1)\) augmented matrix
- Goal: transform the augmented matrix into an upper triangular matrix
- Tools: We can perform any operation (any op that does not change the solution)
- row multiply by any non-zero constant
- reorder rows
- add any constant multiple of one row to another

- This can be done with Gauss-Jordan elimination
- Gaussian Elimination Algorithm
- Input \(n \times n\) matrix \(A\), \(1 \times n\) solution vector \(\overrightarrow{b}\)
- \(A^{\prime} \gets \left[ A | \overrightarrow{b} \right]\)
- For \(i = 1 \ldots n - 1\)
  - \(\mathrm{pivotrow} \gets i\)
  - for \(j = i + 1 \ldots n\)
    - if \(|A^{\prime}_{ji}| > |A^{\prime}_{\mathrm{pivotrow},i}|\)
      - \(\mathrm{pivotrow} \gets j\)
  - for \(k = i \ldots n + 1\)
    - swap \(A^{\prime}_{ik}\), \(A^{\prime}_{\mathrm{pivotrow},k}\)
  - for \(j = i + 1 \ldots n\)
    - \(t \gets \frac{A^{\prime}_{ji}}{A^{\prime}_{ii}}\)
    - for \(k = i \ldots n + 1\)
      - \(A^{\prime}_{jk} \gets A^{\prime}_{jk} - tA^{\prime}_{ik}\)
- return \(A^{\prime}\)

- Most common op: the multiplication in the innermost loop (for \(i\), for \(j\), for \(k\))
- \(\sum_{i = 1}^{n - 1} \sum_{j = i + 1}^{n} \sum_{k = i}^{n + 1} 1 = \sum_{i = 1}^{n - 1} (n - i)(n - i + 2) = \frac{n(n-1)(2n+5)}{6}\)
- Therefore \(\mathcal{O}(n^3)\)

- Back Solving
- Input \(n \times (n + 1)\) augmented upper triangular matrix \(A\)
- Output a solution vector \(\overrightarrow{x}\)
- For \(i = n \ldots 1\)
- \(t \gets A_{i n+1}\)
- For \(j = i + 1 \ldots n\)
- \(t \gets t - x_j A_{ij}\)

- \(x_i \gets \frac{t}{A_{ii}}\)

- return \(\overrightarrow{x}\)
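Elimination with partial pivoting plus back solving can be sketched together in Python; the function name is an assumption, and singular systems (a zero pivot) are not handled here.

```python
def gaussian_solve(A, b):
    """Solve Ax = b: partial-pivot elimination, then back substitution."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]       # augmented n x (n+1)
    for i in range(n - 1):
        # Partial pivoting: divide by the largest |entry| in column i
        pivot = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[pivot] = M[pivot], M[i]
        for j in range(i + 1, n):
            t = M[j][i] / M[i][i]
            for k in range(i, n + 1):
                M[j][k] -= t * M[i][k]                 # eliminate below the pivot
    # Back solving on the upper triangular system
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        t = M[i][n] - sum(x[j] * M[i][j] for j in range(i + 1, n))
        x[i] = t / M[i][i]
    return x
```

The test systems below are illustrative examples, not from the lecture.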

- Gotchas:
- divide by bigger numbers or use a language with good numerics support
- There do exist systems with infinitely many solutions: an indeterminate system
- There are some with no solution, a degenerate system

- After Gaussian Elimination:
- if last row is all zeros, infinite solutions
- if the last row is all zeros except the augmented entry, no solutions

## Lecture 7: Gaussian Elimination, LU Decomposition, Linear Programming and Coding Theory

### LU Decomposition

- Uses Gaussian elimination to factor \(A\) into two matrices
- L is lower triangular, 1’s on the diagonal
- U is upper Triangular
- Works even when \(\overrightarrow{b}\) is not known in advance: decompose once, then solve for many \(\overrightarrow{b}\) vectors

### Matrix Inverses

- A Matrix \(A^{-1}\) is a matrix such that \(A \cdot A^{-1} = I\)
- A matrix is invertible (non-singular) iff its determinant is non-zero.
- \([A | I]\) and row ops to \([I | A^{-1}]\) (Gauss-Jordan)

- The determinant is \(\mathrm{det}(A) = \sum_{\sigma \in S_n} \mathrm{sign}(\sigma)\prod_{i = 1}^{n} A_{i,\sigma(i)}\)
- The Determinant of an Upper Triangular Matrix is equal to the product of its diagonal, \(\mathrm{det}(A) = \prod_{i = 1}^{n} a_{ii}\)
- The determinant is invariant with regard to adding/subtracting a constant multiple of a row to another
- The sign flips whenever rows are exchanged.
- multiplying a row by a non-zero constant \(c\) multiplies the determinant by \(c\)

### Linear Programming

- Not the modern sense of programming
- A mathematical model for linear optimization
- there’s an objective function: cost(minimize)/benefit(maximize)
- constraints: linear equations that **cannot** be violated
- \(n\) variables, \(x_1, \ldots, x_n\)
- Objective: output a solution vector \(\overrightarrow{x}\) that is feasible and optimizes the objective function
- Standard Form:
- maximize \(\overrightarrow{c}^{T} \cdot \overrightarrow{x}\)
- Subject To \(A\overrightarrow{x} \leq \overrightarrow{b}\), \(x_i \geq 0\)

- Ex:
- Maximize \(4x + 3y\)
- \(x \geq 0\), \(y \geq 0\), \(x \leq 9\), \(2y + x \leq 12\), \(2y - x \leq 8\)

- For LP in standard form, the optimal solution always lies at a critical point (in this situation, all vertices bounding the feasible region)
- this is the simplex method or “gradient descent”
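The critical-point claim can be checked on the example above by brute force: intersect every pair of constraint boundaries, keep the feasible intersections (the vertices), and evaluate the objective at each. The `(coefficients, bound)` encoding and the helper `vertices` are assumptions for this sketch, not lecture material.

```python
from itertools import combinations

# Each constraint written as a1*x + a2*y <= c, including the sign bounds.
constraints = [((-1, 0), 0),    # -x <= 0  (x >= 0)
               ((0, -1), 0),    # -y <= 0  (y >= 0)
               ((1, 0), 9),     # x <= 9
               ((1, 2), 12),    # x + 2y <= 12
               ((-1, 2), 8)]    # -x + 2y <= 8

def vertices(cons):
    """Intersect each pair of boundary lines (Cramer's rule); keep feasible points."""
    pts = []
    for ((a1, b1), c1), ((a2, b2), c2) in combinations(cons, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue                      # parallel boundaries never intersect
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        if all(a * x + b * y <= c + 1e-9 for (a, b), c in cons):
            pts.append((x, y))
    return pts

best = max(vertices(constraints), key=lambda p: 4 * p[0] + 3 * p[1])
```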

#### Use of Linear Programs

- Knapsack to ILP (Integer Linear Program)
- Maximize the sum of all values of items taken
- Subject to the total weight constraint, \(W\), and an Integer Constraint: each item \(a_i\) can be taken, or not (0 or 1) \[ x_i = \left\{ \begin{matrix}0 & \mathrm{not\ taken}\\1 & \mathrm{taken}\end{matrix} \right. \]
- Maximize \(\sum_{i = 1}^{n} x_iv_i\), subject to \(\sum_{i = 1}^{n}x_iw_i \leq W\) and \(x_i \in \{0, 1\} \forall x_i\)
- Can do a relaxation, converting an ILP to an LP, such that \(0 \leq x_i \leq 1\), but this may generate fractional solutions that are invalid for the original ILP

- In General, ILP are NP-Complete
- Some Problems, however, can be relaxed and preserve all optimal solutions (totally unimodular)
- In General, Simplex Method may be inefficient
- though good in practice
- there are, however, proven polynomial-time algorithms
- Khachiyan (1979)
- Interior Point Method (1984)

### Coding Theory

- Claude Shannon
- Compressing Data for efficient storage and/or transmission
- lossless: tarballs (encodes all information)
- lossy: JPEGs, MPEGs, MP3s, Oggs (loses information)

- Adding redundancies to make it more reliable
- sending \(n + \log n\) bits, where \(n\) is the number of data bits and \(\log n\) is the redundancy
- Reed-Solomon codes

#### Huffman Coding

- Lossless compression scheme for text and text-like data
- Let \(\Sigma\) be a fixed alphabet of characters appearing in a file, \(|\Sigma| = n\), (n distinct characters)
- A block encoding scheme encodes every character with a binary codeword of equal length, ex: ASCII (256 chars, in codewords of 8 bits)
- in general, \(n\) characters require codewords of length \(\log_2 n\) in a block encoding scheme

- But some characters may have a higher frequency, and others a lower frequency, thus variable length codewords
- high-freq gets short codewords
- low-freq gets long codewords
- \(\{0, 01, 101, 010\}\) (ambiguous: 0 is a prefix of 01 and 010)
- variable-length codewords require a prefix-free code (no codeword appears as the prefix of another)
- \(\{10, 010, 110, 0110\}\)

- Goal: Encode a file using prefix-free variable-length encoding such that common characters get short codewords, less common get longer codewords

## Lecture 8: Huffman Coding and AVL Trees

### Huffman Coding Continued

- Create \(n\) binary tree nodes with the characters.
- Combine 2 trees at a time with the smallest frequencies
- Do this until 1 Huffman tree exists
- Characters only appear as leaves, thus the path from the root to any leaf is unique
- high frequency leaves are shallow, low frequency leaves are deeper
- Minimize average codeword length
- Make sure to generate from root!
- compression ratio is (original codeword length minus average codeword length) over original codeword length
- in general english text files get 25-40% compression ratios
- pack is the most common
- Algorithm:
- Input: An alphabet of \(n\) symbols, \(\Sigma\), and a frequency mapping \(\mathrm{freq}: \Sigma \to [0, 1]\)
- Output: A Huffman Tree
- \(H \gets \mathrm{minHeap}\)
- For each \(x \in \Sigma\)
- \(T_x \gets \mathrm{single\ node\ tree}\)
- \(wt(T_x) \gets \mathrm{freq}(x)\)
- \(\mathrm{insert}(H, T_x)\)

- While \(\mathrm{size}(H) > 1\)
  - \(T_r \gets \mathrm{new\ tree\ node}\)
  - \(T_a \gets \mathrm{get}(H)\)
  - \(T_b \gets \mathrm{get}(H)\)
  - \(\mathrm{left}(T_r) \gets T_a\)
  - \(\mathrm{right}(T_r) \gets T_b\)
  - \(\mathrm{wt}(T_r) \gets \mathrm{wt}(T_a) + \mathrm{wt}(T_b)\)
  - \(\mathrm{insert}(H, T_r)\)

- Return \(\mathrm{get}(H)\)

- Analysis
- While Loop executes \(n - 1\) times
- Input alphabet and frequency function, input size \(n = |\Sigma|\)
- heap insertion is performed \(2n - 1\) times
- each heap operation is \(\mathcal{O}(\log n)\) when heap comparison operations are counted, so the total is \(\mathcal{O}(n \log n)\)
- Or \(\widetilde{\mathcal{O}}(n)\)
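A compact sketch of the algorithm above using Python's `heapq` as the min-heap; the tie-breaking counter in each heap entry is an implementation detail (it keeps `heapq` from ever comparing two tree nodes), not part of the lecture's pseudocode.

```python
import heapq
from itertools import count

def huffman(freq):
    """Build a Huffman tree from {symbol: frequency}; returns nested 2-tuples."""
    tie = count()
    H = [(f, next(tie), x) for x, f in freq.items()]
    heapq.heapify(H)
    while len(H) > 1:
        fa, _, a = heapq.heappop(H)            # two smallest-frequency trees
        fb, _, b = heapq.heappop(H)
        heapq.heappush(H, (fa + fb, next(tie), (a, b)))   # reinsert the merge
    return H[0][2]

def codewords(tree, prefix=""):
    """Walk the tree: left edge = 0, right edge = 1; leaves are symbols."""
    if not isinstance(tree, tuple):
        return {tree: prefix or "0"}
    left, right = tree
    return {**codewords(left, prefix + "0"), **codewords(right, prefix + "1")}
```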

### BSTs

- Every node has a unique key, \(k\)
- Every node in a node \(u\)’s left subtree has a key value less than \(u\)’s key
- Every node in a node \(u\)’s right subtree has a key value greater than \(u\)’s key
- In general, the depth of a BST with \(n\) elements has an \(\Omega(\log n)\) lower bound and an \(\mathcal{O}(n)\) upper bound
- Goal: Have a BST that is **balanced**: \(d \in \mathcal{O}(\log n)\)

### AVL Trees

- Accomplishes the goal
- The height of a node is the length of the longest path from that node down to a leaf
- The balance factor of the node \(u\) is as follows \(\mathrm{bf}(u) = \mathrm{h}(u_l) - \mathrm{h}(u_r)\)
- The height of an empty tree is always \(-1\), the height of a single node tree is \(0\)
- balance factors of \(-1\), \(0\), \(1\) indicate a balanced tree
- positive balance factors indicate left skew
- negative balance factors indicate right skew
- left rotation pulls the tree to the left
- right rotation pulls the tree to the right
- Can have left and right rotations within subtrees, such as RL or LR rotations
- Suppose that \(r\) has \(T_1\) on its left and \(c\) on its right, where \(c\) has \(T_2\) on its left and \(T_3\) on its right, with \(\mathrm{bf}(r) = -2\) and \(\mathrm{bf}(c) = -1\)
  - perform a generalized left rotation: \(c\) becomes the new root with \(r\) on its left and \(T_3\) on its right, and \(r\) gets \(T_1\) on its left and \(T_2\) on its right
- Suppose that \(r\) has \(c\) on left, \(T_3\) on right. \(c\) has \(T_1\) and \(T_2\) on left and right respectively
- Perform generalized right rotation, \(c\) has \(T_1\) on left, \(r\) on right. \(r\) has \(T_2\) and \(T_3\) on left and right respectively

- LR double rotation:

`(r (c T_1 (g T_2 T_3)) T_4)` -> `(r (g (c T_1 T_2) T_3) T_4)` -> `(g (c T_1 T_2) (r T_3 T_4))`

- Swinging over lets you cut one off and put it on the other side of the tree

## Lecture 9: AVL Trees, 2-3 Trees and Balancing

### AVL Trees

- Insert 8, 4, 7, 3, 2, 5, 1, 10, 6
`(8)`

`(8 4)`

`(8 (4 nil 7))`

`(8 (4 3 7)) -> (8 (7 4 3)) -> (7 (4 3) 8)`

`(7 (4 (3 2)) 8) -> (7 (3 2 4) 8)`

`(7 (3 2 (4 nil 5)) 8) -> (7 (4 (3 2) 5) 8) -> (4 (3 2) (7 5 8))`

`(4 (3 (2 1)) (7 5 8)) -> (4 (2 1 3) (7 5 8))`

`(4 (2 1 3) (7 5 (8 nil 10)))`

`(4 (2 1 3) (7 (5 nil 6) (8 nil 10)))`

- Observations:
- Search is \(\mathcal{O}(d)\)
- \(d \in \mathcal{O}(\log n)\)

- Insertion is \(\mathcal{O}(d)\)
- \(d \in \mathcal{O}(\log n)\)

- A single rotation is \(\mathcal{O}(1)\)
- Deletion is \(\mathcal{O}(d)\)
- \(\mathcal{O}(d)\) to find replacement
- May need to rotate all the way up to the root. (\(\mathcal{O}(\log n)\))

- In the worst case \(\Theta(\log n)\), but guaranteed to be \(\mathcal{O}(\log n)\) in all cases


### B-Trees

- Each node has at most \(m\) children
- Every node except the root has at least \(\left\lceil\frac{m}{2}\right\rceil\) children
- The root has at least 2 children unless it is a leaf
- Non-leaf nodes with \(k\) children have \(k - 1\) keys
- All leaves are at the same level and non-empty
- commonly used in databases
- Ex:
- Each node has at most 3 children
- Each node has at least 2 children

### 2-3 trees

- This is a specialization of the B-tree where \(m = 3\)
- 2 kinds of nodes:
- 2-nodes: A single key \(k\) and 2 children, \(\ell, r\)
- 3-nodes: 2 keys \(k_1, k_2\) such that \(k_1 < k_2\) and 3 children \(\ell, c, r\)

- Ex:
`(10 (5 (7 4) 7) ((15 20) 12 17 (21 25)))`

- Search is simple, but varies based on node size
- Insertion is always done at a leaf
- sometimes requires changing node size and ordering
- sometimes requires reordering of node keys

- Ex: 8 4 7 3 2 5 1 10 6
`(8)`

`((4 8))`

`((4 7 8)) -> (7 4 8)`

`(7 (3 4) 8)`

`(7 (2 3 4) 8) -> ((3 7) (2 4) 8) -> ((3 7) 2 4 8)`

`((3 7) 2 (4 5) 8)`

`((3 7) (1 2) (4 5) 8)`

`((3 7) (1 2) (4 5) (8 10))`

`((3 5 7) (1 2) (4 6) (8 10)) -> (5 (3 (1 2) 4) (7 6 (8 10)))`

- Analysis:
- Min key count depth \(d\)
- all be 2-nodes (1 key per node)
- all nodes are present at all levels (always 2 children)
- \(2^{d + 1} - 1\) nodes
- \(2^{d + 1} - 1\) keys

- Max key count for depth \(d\)
  - every node is a 3-node (2 keys per node, 3 children per node)
  - \(\frac{3^{d + 1} - 1}{2}\) nodes, \(3^{d + 1} - 1\) keys

- \(d \leq \log_2 (n + 1) - 1\)
- \(\log_3(n + 1) - 1 \leq d\)
- Therefore \(d \in \Theta(\log n)\)

- Don’t even think about deletion. It’s a pain.

### Other Balanced BSTs

- Splay Trees
- Tries
- Red-Black Trees (the default TreeSet implementation in Java)
- AA Trees
- Scapegoat Trees
- Treap

### Heaps

- A (min) heap is a binary tree data structure with the following properties
- all nodes are present at all levels except possibly the last level
- at the last level, it is full to the left
- the key of any node is less than both of its children's keys
- The minimum is therefore always at the root
- The max is always at a leaf, toward the bottom of the tree

- 2 efficient operations
- remove min
- insert
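Python's `heapq` module implements exactly this structure as an array-backed binary min-heap, which makes the two operations easy to demonstrate:

```python
import heapq

heap = []
for key in [7, 3, 9, 1, 5]:
    heapq.heappush(heap, key)   # insert: sift up, O(log n)

minimum = heap[0]               # the minimum is always at the root (index 0)
in_order = [heapq.heappop(heap) for _ in range(len(heap))]   # repeated remove-min
```

Repeatedly removing the min yields the keys in sorted order, which is heapsort in miniature.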