# Automata

## Table of Contents

- Lecture 1
- Lecture 2: Chapter 0
- Lecture 3: Chapter 1
- Lecture 4: Chapter 1, Continued
- Lecture 5: Chapter 1, continued
- Lecture 6: Chapter 1, continued
- Lecture 7: Chapter 1, continued
- Lecture 8: Chapter 1, continued
- Lecture 9: Chapter 1, continued
- Lecture 10: Chapter 1, continued
- Lecture 11: Chapter 1, continued
- Lecture 12: Chapter 1, Continued
- Lecture 13: Chapter 2, Context Free Languages
- Lecture 14: Review
- Lecture 15: Context Free Grammars
- Lecture 16: PDA (Pushdown Automaton)
- Lecture 17: PDAs, Continued
- Lecture 18: Equivalence of PDA and CFG
- Lecture 19: PDA to CFG
- Lecure 20: The Pumping Lemma for CFGs
- Lecture 21: The Pumping Lemma, Continued
- Lecture 22: Context-Free Languages, continued
- Lecture 23: Chapter 3, The Turing Machine
- Lecture 24: Chapter 3, The Turing Machine
- Lecture 25: Chapter 3, The Turing Machine – Extensions

## Lecture 1

- will study computability theory – whether a problem can be solved by a computer (ch 4, 5, 6)
- not studying complexity theory (if a problem can be solved, how much time/space it will take to solve) (ch 7, 8, 9, 10, CSCE 423, 424)

- Requires some preliminaries – ch 1-3
- finite automata
- push-down automata
- Turing machines

- Ch 0 will be a review

## Lecture 2: Chapter 0

- review of
- sequence
- set
- string
- language

### Sequence

- a list of objects in some order (may be infinite)
- objects are referred to as elements or members
- parens and commas for notation
- two sequences are equal iff they have the same number of elements, the same elements, and the elements are in the same order
- \(A = (a_1, a_2, \ldots, a_n)\)
- \(B = (b_1, b_2, \ldots, b_m)\)
- equal when \(n = m\) and \(a_1 = b_1\), \(a_2 = b_2\), \(\ldots\), \(a_n = b_n\)
- Order matters
- repetition matters
- \(|A|\) is the number of elements in A

### Sets

- A group of unordered objects (also called elements or members)
- braces are used, an empty set is the null set
- Two sets are equal iff they contain the same group of distinct elements – \(A = B \leftrightarrow \forall x (x \in A \land x \in B)\)
- Size again denoted w/ vertical bars, but is number of
**distinct**elements in A - A is a subset of B if every element of A is also an element of B
- \(P(A)\) – the powerset of A – all subsets of A
- Cross product of A and B – \(A \times B = \{(a, b) | a \in A \land B \in B \}\)

### Strings

- A finite sequence of symbols over an alphabet
- An alphabet is a set of symbols
- empty string is denoted with \(\epsilon\)
- Concatenation \(xy\), \(x = 01\), \(y = 10\), \(xy = 0110\), the empty string is ignored
- Repetition \(x^i\) – \(x\) concatenated \(i\) times, \(x=01\), \(x^2 = 0101\), \(x^0 = \epsilon\)
- Reverse \(x^R\) – \(x = 01\), \(x^R = 10\), \(x = 001, x^R = 100\), \(x = \epsilon, x^R = \epsilon\)
- \(x\) is a palindrome if \(x = x^R\)
- dictionary order – sort by first symbol, then second for those with the same first, and so forth
- shortlex order – sort by length, and then in each length, sort by dictionary order

### Language

- A language is a set of string over an alphabet
- empty language denoted by empty set
- size denoted similarly

## Lecture 3: Chapter 1

- Regular Languages
- deterministic FA
- non-deterministic FA
- regular expressions
- what exactly is a regular language – how do they connect to problems that are solveable by the three models
- models used by many search/parse/editing tools
- testing and verification/model checking
- modelling OS/network protocols

### Deterministic Finite Automata

- state transition diagram – inputs are edges from a state (a node) to a state (node)
- a finite automaton – a state diagram, plus a start state and one or more final states
- start state and final states together define the set of strings we’re interested in
- start state indicated by arrow pointing from nowhere
- final/accept states are indicated by a double circle
- languages recognised by finite automaton
- for a finite automaton \(M\) and input \(x\), if \(M\) stops at a final state after reading \(x\), we say \(M\) accepts \(x\), otherwise, \(M\) rejects \(x\)
- the language \(L(M)\) recognised by a finite automaton \(M\) contains all the strings accepted by \(M\) (all strings that can go from start to end)
- DFA is 5-tuple \((Q, \Sigma, \delta, q_0, F)\)
- \(Q\)
- set of states
- \(\Sigma\)
- set of input symbols
- \(\delta\)
- transition function \(\delta : Q \times \Sigma \to Q\)
- \(q_0\)
- initial state
- \(F\)
- set of final/accepting states

- State transition function can be described as diagram
- As a function
- as a table – first row is start state, starred rows is final state

- swapping final states and non-final states, you get the complement of the original
- when all states are accepting, it accepts all strings
- when no states are accepting, it accepts nothing

## Lecture 4: Chapter 1, Continued

### Design of DFA

- If a page contains a keyword – page, a string, and keyword, a substring
- contains unl, uno, or unk; also contains cse
- this is \(L\), build DFA, minimize DFA to get minimum DFA

- Designing the DFA – 3 steps for \(M\) from \(L\)
- Steps:
- Determine information to remember about part of an input string that \(M\) has already read
- Represent information as a set of states
- determine start and final states

- Assign transitions (\(\delta: Q \times \Sigma \to Q\))

- Steps:
- \(L = \{x over \{0, 1\} | x no end 01\}\)
- Whether \(01\) are the last two symbols
- Symbols
- 0
- 1
- 01
- 0 is start
- 0, 1 are accept

- Transitions:
- \(\epsilon \to 0\)
- \((0, 0) \to 0\)
- \((0, 1) \to 01\)
- \((01, 0) \to 0\)
- \((01, 1) \to 1\)

- Strings containing 10
- States
- 1
- 0
- 10
- 1 is start
- 10 is accept

- Transitions
- \(\epsilon \to 0\)
- \((0,0) \to 0\)
- \((0,1) \to 1\)
- \((1,0) \to 10\)
- \((1,1) \to 1\)
- \((10,_) \to 10\)

- States

## Lecture 5: Chapter 1, continued

### DFA Design, continued

- L - over 0, 1, contains 10, doesn’t end with 01
- Combine from two others
- Requires same alphabet
- need to keep state of both machines
- use pairs, built by the cross product of the two sets \(Q\)
- State transition function built as:
- \(\delta_3((a, b), sym) = (\delta_1(a, sym), \delta_2(b, sym))\)
- again, the cross product
- ensure that invalid states are removed
- From start state, add each tranision iteratively, going by what is present, as if doing breadth-first search

- For a \(\cup\) combination:
- do as above
- but change final state, \(F_4 = \{(s_1, s_2) | s_1 \in F_1 \lor s_2 \in F_2\}\)

### DFA Minimization

- Three steps
- steps:
- Partition all states into two groups \(G_1\) – final states, \(G_2\) – all non-final states
- For each group, check whether all states in the group go to the same next group for each symbol.
- If not, partition into smaller groups
- repeating untill no mare partitions

- All states in a group can be combined to a single state
- groups with original final states become a single, new final state
- The group with the original start state becomes the new start state

- steps:

## Lecture 6: Chapter 1, continued

### Regular langugages

- a language is said to be regular if there exists some DFA recognizing it
- Theorem: For a regular language there exists a unique minimum DFA up to isomorphism
- def’n: two DFA are isomorphic if they have different state names but the same transition function

### NFAs

- why?
- many different models
- consider a dfa, all languages recognizable are regular, as do NFAs
- for some regular languages, NFAs are easier to design (ex all pages containing CSE, CS or CE)

- a transition may go to multiple states
- transitions may be on the empty string
- NFAs are describe similarly, but \(\delta: Q \times \Sigma_{\epsilon} \to \np(Q)\)
- So long as at least one of the states currently held is accepting, the string is accepted
- a decision tree/path forms

## Lecture 7: Chapter 1, continued

- all languages that can be recognized by a DFA are regular languages, likewise for NFA
- A proof that DFA and NFA are equivalent
- To show a languageg is regular, show either a DFA or an NFA
- DFAs are a special type of NFA
- So prove that NFAs are a subset of DFAs

### NFA to DFA

- Steps:
- Information – copies of the NFA states, as sets of individual states
- States – the powerset of \(Q\) from the NFA, only consider the distinct states, accepting if any member of the state set was accepting
- Transitions – \(\delta_d(R, sym) = \cup_{r \in R} \delta_n(r, sym)\)

- Remember to minimize the resulting DFA
- when handling epsilon transitions, use \(E(A) = A \cup \{ \mathrm{all states reached by following only epsilon transitions} \}\) for the next state, \(\delta_d(R, sym) = E(\cup_{r \in R}\delta_n(r,sym))\), start state is \(E(q_{n1})\)

## Lecture 8: Chapter 1, continued

### The Regular Expression

- Reasons
- new model
- used in many programming languages and software
- vi
- grep
- perl
- java
- python
- awk

- write a program to check whether a string matches a given regular expression

An alphabet \(\Sigma\)

RE language a {a} \(\epsilon\) {ε} ∅ ∅ R _{1}∪ R_{2}L(R _{1}) ∪ L(R_{2})R _{1}R_{2}L(R _{1}) ˆ L(R_{2})R _{1}^{*}L(R _{1})^{*}- Σ = {0, 1}
- \(11* \cup 0*00*\)

- \(*\) means zero or more
- \(\circ\) means a followed by b, \(A \circ B = \left\{ xy \mid x \in A \land y \in B \right\}\), essentialy the cross product
- if is the empty set, the concatenation is the empty set

- \(A^n\) is A concatenated with itself \(n\) times
- \(A^*\) is \(A^0 \cup A^1 \cup A^2 \cup A^3 \cup \cdots\)
- \(A^+\) is \(A^1 \cup A^2 \cup A^3 \cup \cdots\)

## Lecture 9: Chapter 1, continued

- Showing that REs accept regular expressions
- RE to DFA/NFA
- \(a\) - start state, accept a, to final state
- \(\epsilon\) - start state is final state, no transitions other than the initial epsilon
- \(\emptyset\) - start state, no transitions, no final state
- \(R_1 \cup R_2\) - given a machine for each, use an epsilon transition to each as an NFA
- \(R_1 \circ R_2\) - all final states of \(R_1\) have an epsilon transition to the initial state of \(R_2\)
- \(R_1^*\) - Empty state before initial, epsilon transition from each final state to the original start state

## Lecture 10: Chapter 1, continued

- DFA/NFA to RE
- Requires GNFA
- NFA + RE as transition labels

- Conversion:
- DFA to GNFA in a special form
- Start state only has outgoing transitions – often just add a new start state w/ ε transition
- one final state, only incoming transitions – Give all final states an ε transition to a new final state
- Only Any state to any state: at most one transition – merge transitions with union

- Remove states one by one, except the start and final state – label each transition with concat of each relevant transition
- The label between start and final is the regular expression

- DFA to GNFA in a special form

## Lecture 11: Chapter 1, continued

- HW1 Due 09-21, No class 09-24, HW reviewed by TA 09-26, Exam 1 09-28
- The Pumping lemma
- describes a common property of all regular languages, and some non-regular languages
- if a language has the property is it regular?
- it may be regular

- if it doesn’t have the property, than it isn’t regular
- based on pigeonhole principle
- if more than \(p\) pigeons are placed into \(p\) holes, then at least one hole will have at least one pigeon

- Consider a reg. lang. \(A\) and a DFA for \(A\) as well as a string \(s \in A\) of \(n\) symbols
- The path must then consist of \(n + 1\) states
- if \(n + 1 > |D|\), then the path repeats some states
- \(xz\) – \(x\) takes it to 9, \(z\) takes it from 9 to 13
- \(xyyz\) – \(x\) takes to 9

- For any integer \(i \geq 0\), \(xy^iz\) is accepted by the machine
- If \(A\) is a regular language, then there is a number \(p\) (the pumping length) such that for any string \(s\) in \(A\) with \(|s|\geq p\) it can be divided into two three pieces \(s = xyz\) satisfyining the following conditions
- for each \(i \geq 0\), \(xy^iz \in A\)
- \(|y| > 0\)
- \(|xy|\leq p\)

- Number of states is often the pumping number \(p\) (in particular of a minimized DFA)

- contrapositive of the pumping lemma
- normal: if \(A\) is regular, then for any long enough string, some part of its first \(p\) symbols can be pumped
- contrapositive: if for some long enough string \(s\) in \(A\) none of its first \(p\) symbols can be pumped, then \(A\) is not regular
- Language \(A\) is not regular if, for every arbitrary number \(p\), there exists a string \(s\) in \(A\) with \(|s|\geq p\) having the following property:
- for any decomposition of \(s = xyz\) in which \(|xy| \leq p\), and \(|y| > 0\),…

- Do it.
- Let \(p\) denote an arbitrary number
- pick a string \(s \in A\) with \(|s| \geq p\)
- identify all possible decomps of \(s\) into \(xyz\) with \(|xy| \leq p\) and \(|y| \geq 0\)
- show that for each decomp, there exists an \(i \geq 0\) s.t. \(xy^iz \not\in A\) (i may be different for each decomp, if it isn’t a good choice, try again)
- conclude that \(A\) is not regular.

- \(L_1 = \left\{ w \mid w \mathrm{ has the same number of a's and b's} \right\}\)
- let \(p\) be arbitrary
- Let \(s = (ab)^p\) and we have \((ab)^p \in L_1\), and \(|(ab)^p| = 2p > 1\)
- wrong
- \(s = xyz\), \(x = \epsilon\), \(y = aba\), \(z = b(ab)^{p-2}\)
- let \(i = 0\), then \(xy^iz = xz = b(ab)^{p-2} \not\in L_1\)
- Therefore, \(L_1\) isn’t regular

- Identify all possible decomps of \(s\) into \(xyz\) with \(|xy| \leq p\) and \(|y| > 0\)
- follow from here
- Try
- \(p\) is arbit.
- \(s = a^pb^p\) and we have \(a^pb^p \in L_1\)
- Ident all possible decomps of \(s\) into \(xyz\) with \(|xy| \leq p\) and \(|y| > 0\)
- \(x = \epsilon\), \(y = a^p\), \(z = b^p\)
- \(k = a^n\), \(p-k \geq n \geq 0\)
- \(y = a^k\), \(p \geq k \geq 1\)
- \(z = a^{p-n-k}b^p\)

- ex, \(y = 2\), thus, language isn’t regular.

- Consider \(L_2 = \{ww | w \in \{a,b\}^*\}\)
- \(p\) is arbitrary
- \(s = a^pba^pb\)
- \(y = a^k\), \(p \geq k \geq 1\), \(x = a^n\), \(p-k \geq n \geq 0\); \(z\) is the remainder
- falls through as before

## Lecture 12: Chapter 1, Continued

- proving non-regularity
- \(L_3 = \{a^mb^n | m > n\}\)
- \(p\) is arbitrary
- \(s = a^{p+1}b^p\)
- \(y=a^k\), \(p \geq k \geq 1\)
- \(x=a^n\), \(p-k \geq n \geq 0\), \(z = remaining\)
- \(i = 0\), \(xy^iz = xz = a^{p+1-k}b^p\), \(\not\in L_3\)

- Closure properties
- \(A \cup B\) – regular
- combine NFA/DFA – easy

- \(A \circ B\) – regular
- combine NFA/DFA – similarly easy

- \(A^*\) – regular
- Allow accepting empty string and repeats

- \(A \cap B\) – regular
- see notes for DFA, can be extended to NFA

- \(\bar{A}\) – regular
- switch final/non-final states

- Reverse of a language a is defined as follows: \(A^R = \{w^R | w \in A\}\)
- change direction of transitions, start state becomes final, if multiple final states, add a new state w/ epsilon transitions to old final states
- Also regular

- \(\Sigma = \{a, b\}\), \(\Gamma = \{0, 1\}\)
- \(h(a) = 00\), \(h(b) = 01\)
- if \(A\) is regular, than \(h(A)\) is also regular

- \(A \cup B\) – regular
- Closure properties can be used as for a contrapositive w/ the pumping lemma.
- \(L_4 = \{w | w \text{has different number of a's and b's}\}\)
- We see that \(\bar{L_4} = L_1\)
- we have already proven that L
_{1}is not regular, therefore L_{4}is not regular

- \(L_5 = \{a^nb^n | n \geq 0\}\)
- \(L_5 = L_1 \cap a*b*\)
- therefore, \(L_5\) is not regular – wrong! L
_{1}is not regular to begin with! - Subset property doesn’t apply either

## Lecture 13: Chapter 2, Context Free Languages

- is it possible to design a computer to solve a problem?
- chapter 1 , whether or not it’s possible to design a DFA/NFA/RE for a given language
- to simplify – consider only decision problems – is a number divisible by 5
- Is it possible to design a computer to solve a decision problem
- what does it mean to solve – determine yes or no given an input
- input must be encoded
- whether or not strings are accepted
- is it possible to design a computer to recognize lang of decision problem

- Models and Languages:
- DFA/NFA/RE – smallest group, regular languages
- PDA/CFG – Context Free Languages
- Real Computers – problems solvable by computer
- Turing Machines – Turing-recognizable languages

- The CFG
- mathematical model to describe languages
- widely used in many applications
- specify programming languages
- generate images (contextfreeart.org)
- automatically generate CS papers (pdos.csail.mit.edu/scigen)

- Ex:
- \(A \to 0A1\)
- \(A \to \epsilon\)
- LHS is a variable
- non variables, non empty string are terminals

- \((V, \Sigma, R, S)\)
- \(V\) – Set of Variables
- \(\Sigma\) – Set of Terminals (alphabet)
- \(R\) – Set of Rules
- \(S\) – Start Variable

- Rules may be combined – \(A \to 0A1 | \epsilon\)
- Start from variable – find a rule that matches, and continue by variable.
- \(\left\{ w \mathrm{over} \Sigma | S \Rightarrow^* w \right\}\)
- parse trees may be used to explain a CFG
- graphical/tree-based structure

- Ex:
- \(S \to S + S\)
- \(S \to a\)

- CFGs are more powerful that DFA/NFA/RE
- Convert REs to a CFG
- \(a \iff S \to a\)
- \(\epsilon \iff S \to \epsilon\)
- \(\emptyset \iff S \to S\) (or no rules)
- \(R_1 \cup R_2 \iff S_1 \to R_1, S_2 \to R_2, S \to S_1 | S_2\)
- \(R_1 \circ R_2 \iff S \to SR_1 SR_2\)
- \(R_1^* \iff S \to \epsilon | SR_1 S\)

- For NFA/DFA:
- \(S_i = sym S_j\)

- Convert REs to a CFG

## Lecture 14: Review

- be careful when designing DFAs
- use the examples
- make sure to minimize

## Lecture 15: Context Free Grammars

- \(L_1 = \left\{a^nb^n | n \geq 0 \right\}\)
- \(R \to \epsilon\)
- \(R \to a R b\)

- Two methods:
- Union Method – union of multiple smaller languages
- concatenation method – concatenation of multiple smaller languages
- recursive – \(L_1L_2L\)
- special cases

- \(L_2 = \left\{a^m b^n | m > n \geq 0\right\}\)
- \(S_2 \to a S_2\)
- \(S_2 \to a S_1\)

- \(L_2 = \left\{a^mb^n | m \neq n\right\}\)
- \(S_3 \to S_2 | B\)
- \(B \to B b\)
- \(B \to S_1 b\)
- \(B^{\prime} \to B b | b\)

- \(L_4 = \{ a^ib^ja^k | j > i + k, i \geq 0, k \geq 0 \}\)
- \(S_4 \to S_1 B^{\prime} S_1^R\)

- $L
_{5}= \{ x over \{a, b\} | x is a palindrome \}- \(S_5 \to \epsilon\)
- \(S_5 \to a\)
- \(S_5 \to b\)
- \(S_5 \to b S_5 b\)
- \(S_5 \to a S_5 a\)

- \(L_6 = \{ x over \{a, b\} | x is not a palindrome \\)}
- \(S_6 \to a S_6 a\)
- \(S_6 \to b S_6 b\)
- \(S_6 \to a T b\)
- \(S_6 \to b T a\)
- \(T \to aT | bT | \epsilon\)

- \(L_7 = \{ x over \{a, b\} | x has the same number of a's and b's \}\)
- \(S_7 = a S_7 a\)
- \(S_7 = b S_7 a\)
- \(S_7 = S_7 S_7\)
- \(S_7 = \epsilon\)

## Lecture 16: PDA (Pushdown Automaton)

- a new model
- Adds a stack to a DFA/NFA

- cfg – specify a programming language
- PDAs can be used to design the compiler (See CSCE 425)
- PDAs may be built upon either DFA or NFA, based upon NFA is more powerful
- Stacks have 2 ops
- push – push a symbol onto the top of the stack
- pop – pop the symbol off of the top of the stack

- Edges are now labeled differently: (\(a, B \to C\))
- if current state is \(P\), and \(a\) is input symbol, and \(B\) is the top of the stack symbol, then take transition, read input symbol, POP symbol B, and push \(C\), going to new state \(Q\)
- with \(\epsilon\) as \(B\), the transition may be taken no matter the current symbol, and no symbol is to be popped
- with \(\epsilon\) as \(C\), don’t push any symbol
- \(\epsilon, \epsilon \to \epsilon\) – don’t read, don’t change the stack
- shortcuts do exist

- Formal Description:
- NFA:
- \(Q\)
- States
- \(\Sigma\)
- Alphabet
- \(\delta\)
- \(Q \time \Sigma_\epsilon \to P(Q)\)
- \(q_0\)
- Start state
- \(F\)
- Final States

- PDA:
- \(Q\)
- States
- \(\Sigma\)
- Input Alphabet
- \(\Gamma\)
- Stack Alphabet – all possible stack symbols
- \(\delta\)
- \(Q \time \Sigma_\epsilon \times \Gamma_\epsilon \to P(Q \times \Gamma_\epsilon)\)
- \(F\)
- Final States

- NFA:
- \$ is used to denote the first thing (popping will be empty stack)
- An input string is accepted by a PDA if at least one copy of the PDA stops at a final state after reading the input string.
- Use a ’configuration’ to show how it works in homework:
- (current state, remaining input symbols, contents of the stack)
- \((1, aab, \epsilon) \Rightarrow (2, aab, \\)) ⇒ (2, ab, A\\() \Rightarrow (2, b, AA\\)) ⇒ (3, ε, A\\() \Rightarrow (4, \epsilon, \\))$

## Lecture 17: PDAs, Continued

- Designing a PDA for a language
- Split input string
- \(S_1 = S_2^R\)
- \(|S_1| \leq |S_2|\) or other comparison

- \(L_1 = \{ wcw^R | w over \{a,b\}\}\)
- \(Q = \{1,2,3,4}\)
- \(q_0 = 1\)
- \(F = \{4\}\)
- 2 makes a copy of the first part
- 3 compares the first part with the second
- transition to 4 on acceptance

- \(L_2 = \{ ww^R | w over \{a,b\}\}\)
- \(Q = \{1,2,3,4}\)
- \(q_0 = 1\)
- \(F = \{4\}\)
- 2 makes a copy of the first part
- epsilon, epsilon, epsilon transition between 2 and 3
- 3 compares the first part with the second
- transition to 4 on acceptance

- \(L_3 = \{ w | w = w^R over \{a, b\} \}\)
- add a transition between 2,3 tha reads either char but doesn’t change the stack

## Lecture 18: Equivalence of PDA and CFG

- PDA to CFG
- Special form of PDA as first step
- Only one final state – epsilon transitions from old final sates to new
- Stack must be empty at the final state
- Ex: \(\Gamma = \{ \\), A, B \}$
- Requires loop-back transitions to pop remaining stack symbols at the end

- A transiion is either a push transition (\(a, \epsilon \to x\)) or a pop transition (\(a, x \to \epsilon\))

- States P, Q – CFG Var \(A_pq\) (all strings that can go from p to q without changinging the top stack symbol at p, and the top stack at p has never been popped) – Three types:
- \(p\to r\) (\(a, \epsilon \to x\)) (push), \(s \to q\) (\(b, x \to e\)) (pop) – \(A_pq = a A_rs b\)
- \(p v q\) – \(A_pq \to A_pv A_vq\)
- \(p\) – \(A_pp \to \epsilon\)

## Lecture 19: PDA to CFG

- Continue given previous rules
- Generating rules from the three types of rules
- In \(A_pq \to a A_rs b\), \(A_rs\) must satisfy the same conditions as \(A_pq\)
- push to pop in order for generating rules

## Lecure 20: The Pumping Lemma for CFGs

- Review of the Pumping Lemma for Regular Langugages
- The loop

- Pumping Lemma for CFGs
- Assume each node has at most 3 child nodes
- If height is 1, it has at most 3 nodes
- If a tree has at least 3 leaf nodes, its height is at least 1
- \(S \to aSb\), \(S \to \epsilon\), \(ab\)
- Nuber of child nodes of a variable node is the number of symbols on the rhs of the rule
- Assume that each rule has at most \(b\) symbols on the RHS, if a string as at least \(B\) symbols, the height of the parse tree is at least \(b\)
- Generally,
- \(b\) is the max number of symbols on the RHS of a rule
- \(v\) is the total number of variables

- Consider a string \(s\) that can be derived by the grammar, if \(s\) has at least \(b^{v+1}\) symbols, then
- the height of the trea is at least \(v + 1\)
- and the number of nodes is at least \(v + 2\)
- The number of variables – the longest path has at least \(v + 1\) variable nodes
- According to the pigeonhole principle, some variable occurs at least once on the longest path
- the number of terminals – \(v + 1\)
- here are two subtrees under this parse tree – one for the earlier \(T\) and another for the later.
- The inner tree may be replaced with the outer, or likewise the outer with the inner

- Requires 2 pumped sequences?
- Lemma: If \(A\) is a CFL, then there exists a pumping length \(p\) such that for any stirng \(s \in A\) with \(|s| \geq p\), it can be devided into five pieces, \(s = uvxyz\) satisfying the following conditions:
- For any integer \(i \geq 0\), \(uv^ixy^iz \in A\)
- \(|vy| > 0\) (one of \(v\) and \(y\) could be empty, but not both at the same time)
- \(|vxy| \leq p\) (\(vxy\) are not necessarily the first \(p\) symbols since \(|u|\) is unknown)

- Contrapositive: if for any \(p\) there exists a string \(s \in A\) with \(|s| \geq p\), having the following property: For any decomposition of \(s = uvxyz\) in which \(|vy| > 0\) and \(|vxy| \leq p\), there is an integer \(i \geq 0\) for which \(uv^ixy^iz \not\in A\), then the language is not context free.

- Using the new Pumping Lemma
- Prove that \(L = \left\{ a^nb^nc^n | n \geq 0\right\}\)
- Let \(p\) be an arbitrary number,
- Consider \(s = a^pb^pc^p\) and consider all possible decompositions as specified
- Cases:
- \(vy\) has only \(a\)
- if we set \(i = 0\), we remove both \(v\) and \(y\)
- note, \(i \neq 0\) prevents validity

- \(vy\) consists of both \(a\) and \(b\)
- \(i = 0\), any \(i \neq 1\)

- \(vy\) has only \(b\)
- See above

- \(vy\) consists of both \(b\) and \(c\)
- See above

- \(vy\) has only \(c\)
- See above

- \(vy\) has only \(a\)
- Likewise, \(\{w over \{a, b, c\} | |w|_a = |w|_b = |w|_c\}\) follows, by the same string

## Lecture 21: The Pumping Lemma, Continued

- For a string \(s\) in a cfl, \(A\), if \(s\) is sufficiently long, then on the longest path from the root to a leaf in the parse tree, some variable appears more than once.
- If \(A\) is a cfl, then there exists a pumping length \(p\) such that for any stirng \(s \in A\), with \(|s| > p\), it can be dividied into five pieces, \(s = uvxyz\), satisfying the following conditions:
- For any integerr \(i \ge 0\), \(uv^ixy^iz \in A\)
- \(|vy| > 0\)
- \(vxy \leq p\)

- Describes a common property
- if has the property, it may be cf, but if it doesn’t, it isn’t CF, see contrapositive above
- Prove that the following language is not context free: \(L = \{ ww | w over \{a,b\}\}\)
- Given cases 1-8, if \(i \neq 1\), then \(s \not\in L\)
- For case 9:
- if \(v\) and \(y\) have the same number of $a$s, string \(a^pba^pb\) isn’t a good choice

- Let \(p\) be arbitrary, and \(s = a^pb^pa^pb^p\)
- For each case, if \(i = 0\), the string doesn’t hold. Any integer \(i \neq 1\) holds. Language is not CFG

- For \(L = \{ wcw | w over \{a,b\}\}\)
- First Proof:
- \(s = a^pb^pca^bp^b\)
- 11 cases
- For each case, if \(i = 0\), the string doesn’t fit

- Different Proof:
- List all possible decomps depending on whether \(vy\) contains symbol \(c\)
- 2 cases:
- \(c \in vy\)
- \(i = 0\), new string \(uxz\) doesn’t have \(c\), and thus not in \(L\), in fact, \(i \neq 1\) works.

- \(c \not\in vy\)
- Because \(|vxy| \leq p\), \(v\) and \(y\) must contain a different number of symbols. Therefore, \(s \not\in L\), and language not CF.

- \(c \in vy\)

- First Proof:

### Closure Property

- CLASS is closed under \(\odot\) if \(A,B \in CLASS\), then \(A\dotcirc B \in CLASS\)
- Properties:
- Union – Closed
- Concatenation – Closed
- Star – Closed
- Intersection – Not Closed
- Complement – Not Closed

### Proving \(L\) is CF

- Design a PDA
- Design a CFG
- Use closure property
- To prove not CF
- use pumping lemma
- and closure property

## Lecture 22: Context-Free Languages, continued

### CFG to PDA

- Ex:
- \(S \to S_1 S_2\)
- \(S_1 a S_1\)
- \(S_1 \to a\)
- \(S_2 \to a S_2 b\)
- \(S_2 \to \epsilon\)

- Generate a transition for each rule, and for each terminal
- terminals pop items, rules may push as well as pop
- Use left-most derivation to follow the PDA

### Group Activity

- Be careful when traversing a tree from a PDA

### Midterm Review

- pumping lemma was a problem
- \(s\) must be defined generally, and \(|s| \geq p\)
- Consider all possible decompositions, \(s = xyz\), \(s\) must be specific!

## Lecture 23: Chapter 3, The Turing Machine

- Initially only the standard turing machine, but from there, some variants
- Recall DFA/NFA and PDA/CFG
- Turing Machine
- Note: These are deterministic
- semi-infinite input tape, infinite to the right
- blank symbols after the original string, infinitely many
- finite number of states
- Input tape may be written to
- tape head can be moved left or right
- Transition, ex: from \(P\) to \(Q\), \(a \to b, c\)
- \(P\) – current state
- \(Q\) – next state
- \(a\) is the current tape symbol
- \(b\) is the new tape symbol
- \(c\) is either \(L\) or \(R\), \(L\) to move head to left, \(R\) to move to right
- Moving left at leftmost edge

- Recognizes Turing-recogniseable languages (i.e., \(a^nb^nc^n\))
- Has 2 final states – accept or reject
- When the machine enters the accept state, the machine accepts
- Enters the reject state or no more transitions to take
- Status may be described as a “configuration”, (symbols on lhs of tape head, current state, current symbol, symbol on rhs of tape head)
- Described as:
- \(Q\) – States
- \(\Sigma\) – Input Symbols
- \(\Gamma\) – Tape Alphabet, \(\Sigma \subseteq \Gamma\), Space \(\in \Gamma\)
- \(\delta\) – \(Q \times \Gamma \to Q \times \Gamma \times \{l, r\}\)
- \(q_{0}\) – Initial State
- \(q_{accept}\) – Accept State
- \(q_{reject}\) – Reject State

## Lecture 24: Chapter 3, The Turing Machine

- Design of a TM is like writing a program
- Algorithm is a high-level idea
- TM is the implementation
- \(\{a^{2^n} | n \geq 0\}\)
- Count number of $a$s, say \(m\)
- check if \(m\) is a power of 2
- Unary number to represent \(m\)
- use a binary number to represent \(m\)

- Unary numbers, single symbol, \(m\) is represented by \(m\) symbols
- binary, two symbols, etc
- \(a\) with unary number
- input string is already a unary number
- Checking whether a decimal number is a power of 2
- divide by 2 until an odd number. If the odd number is \(1\), \(m\) is a power of 2

- Divide by 2, by replacing every other \(a\) with a special symbol
- See turing machine from slides (TM1)

- \(a\) with binary counter
- remove an a from the string, increase counter by one
- count should be lsb first
- check to make sure that only the last bit is 1
- only count \(m - 1\), and then check only 1s in counter
- See TM2 from slides

- \(#x#y#z\) where \(x = y + z\) and $x, y, and \(z\) are unsigned binary numbers, and the MSB is 1, MSB is first bit
- sum = y + z ; if sum == x, accept, reject
- while y > 0, i–, z++; if x == z, accept, reject
- while y > 0, y–, x–; while z > 0, z–, x–, if x == 0, accept, reject
- mark left end with submachine
- mini machine for initial validity (MSB check)
- mini machine for checking if \(y = 0\)
- mini machine for \(y--\)
- find LSB, if 1, replace with 0
- if 0, replace with 1, if next is 1, replace with 0, else, if next is 0, loop from start of step (if 0, replace w/ 1…)

- mini machine for \(z++\)
- check if overflow, if so, shift \(z\) right and add a 1, else, continue

- Another machine to check if \(x = z\)

## Lecture 25: Chapter 3, The Turing Machine – Extensions

- Standard TM: Left, Right, state machine and rewrite rules
- Variants
- Stay
- bi-infinite tape
- jump option
- non-deterministic
- multiple tapes
- multiple tape heads
- 2 dimensional tape

- Stay option
- \(\delta : Q \times \Gamma \to Q \times \Gamma \times \{L, R, S\}\)
- simulated using an intermediate state
- which doesn’t change the intermediate symbol

- Bi-infinite tape
- Mapping
- cell-to-cell – each of T2’s tape is mapped to a cell in T1’s tape – every other mapping
- string mapping – write the string of T2’s tape on T1’s tape

- Will require special transitions, with one or more special intermediate state

- Mapping