# Automata

## Lecture 1

• will study computability theory – whether a problem can be solved by a computer (ch 4, 5, 6)
• not studying complexity theory (if a problem can be solved, how much time/space it will take to solve) (ch 7, 8, 9, 10, CSCE 423, 424)
• Requires some preliminaries – ch 1-3
1. finite automata
2. push-down automata
3. Turing machines
• Ch 0 will be a review

## Lecture 2: Chapter 0

• review of
• sequence
• set
• string
• language

### Sequence

• a list of objects in some order (may be infinite)
• objects are referred to as elements or members
• parens and commas for notation
• two sequences are equal iff they have the same number of elements, the same elements, and the elements are in the same order
• $$A = (a_1, a_2, \ldots, a_n)$$
• $$B = (b_1, b_2, \ldots, b_m)$$
• equal when $$n = m$$ and $$a_1 = b_1$$, $$a_2 = b_2$$, $$\ldots$$, $$a_n = b_n$$
• Order matters
• repetition matters
• $$|A|$$ is the number of elements in A

### Sets

• A group of unordered objects (also called elements or members)
• braces are used, an empty set is the null set
• Two sets are equal iff they contain the same group of distinct elements – $$A = B \leftrightarrow \forall x (x \in A \land x \in B)$$
• Size again denoted w/ vertical bars, but is number of distinct elements in A
• A is a subset of B if every element of A is also an element of B
• $$P(A)$$ – the powerset of A – all subsets of A
• Cross product of A and B – $$A \times B = \{(a, b) | a \in A \land B \in B \}$$

### Strings

• A finite sequence of symbols over an alphabet
• An alphabet is a set of symbols
• empty string is denoted with $$\epsilon$$
• Concatenation $$xy$$, $$x = 01$$, $$y = 10$$, $$xy = 0110$$, the empty string is ignored
• Repetition $$x^i$$ – $$x$$ concatenated $$i$$ times, $$x=01$$, $$x^2 = 0101$$, $$x^0 = \epsilon$$
• Reverse $$x^R$$ – $$x = 01$$, $$x^R = 10$$, $$x = 001, x^R = 100$$, $$x = \epsilon, x^R = \epsilon$$
• $$x$$ is a palindrome if $$x = x^R$$
• dictionary order – sort by first symbol, then second for those with the same first, and so forth
• shortlex order – sort by length, and then in each length, sort by dictionary order

### Language

• A language is a set of string over an alphabet
• empty language denoted by empty set
• size denoted similarly

## Lecture 3: Chapter 1

• Regular Languages
• deterministic FA
• non-deterministic FA
• regular expressions
• what exactly is a regular language – how do they connect to problems that are solveable by the three models
• models used by many search/parse/editing tools
• testing and verification/model checking
• modelling OS/network protocols

### Deterministic Finite Automata

• state transition diagram – inputs are edges from a state (a node) to a state (node)
• a finite automaton – a state diagram, plus a start state and one or more final states
• start state and final states together define the set of strings we’re interested in
• start state indicated by arrow pointing from nowhere
• final/accept states are indicated by a double circle
• languages recognised by finite automaton
• for a finite automaton $$M$$ and input $$x$$, if $$M$$ stops at a final state after reading $$x$$, we say $$M$$ accepts $$x$$, otherwise, $$M$$ rejects $$x$$
• the language $$L(M)$$ recognised by a finite automaton $$M$$ contains all the strings accepted by $$M$$ (all strings that can go from start to end)
• DFA is 5-tuple $$(Q, \Sigma, \delta, q_0, F)$$
$$Q$$
set of states
$$\Sigma$$
set of input symbols
$$\delta$$
transition function $$\delta : Q \times \Sigma \to Q$$
$$q_0$$
initial state
$$F$$
set of final/accepting states
• State transition function can be described as diagram
• As a function
• as a table – first row is start state, starred rows is final state
• swapping final states and non-final states, you get the complement of the original
• when all states are accepting, it accepts all strings
• when no states are accepting, it accepts nothing

## Lecture 4: Chapter 1, Continued

### Design of DFA

• If a page contains a keyword – page, a string, and keyword, a substring
• contains unl, uno, or unk; also contains cse
• this is $$L$$, build DFA, minimize DFA to get minimum DFA
• Designing the DFA – 3 steps for $$M$$ from $$L$$
• Steps:
1. Determine information to remember about part of an input string that $$M$$ has already read
2. Represent information as a set of states
• determine start and final states
3. Assign transitions ($$\delta: Q \times \Sigma \to Q$$)
• $$L = \{x over \{0, 1\} | x no end 01\}$$
• Whether $$01$$ are the last two symbols
• Symbols
• 0
• 1
• 01
• 0 is start
• 0, 1 are accept
• Transitions:
• $$\epsilon \to 0$$
• $$(0, 0) \to 0$$
• $$(0, 1) \to 01$$
• $$(01, 0) \to 0$$
• $$(01, 1) \to 1$$
• Strings containing 10
• States
• 1
• 0
• 10
• 1 is start
• 10 is accept
• Transitions
• $$\epsilon \to 0$$
• $$(0,0) \to 0$$
• $$(0,1) \to 1$$
• $$(1,0) \to 10$$
• $$(1,1) \to 1$$
• $$(10,_) \to 10$$

## Lecture 5: Chapter 1, continued

### DFA Design, continued

• L - over 0, 1, contains 10, doesn’t end with 01
• Combine from two others
• Requires same alphabet
• need to keep state of both machines
• use pairs, built by the cross product of the two sets $$Q$$
• State transition function built as:
• $$\delta_3((a, b), sym) = (\delta_1(a, sym), \delta_2(b, sym))$$
• again, the cross product
• ensure that invalid states are removed
• From start state, add each tranision iteratively, going by what is present, as if doing breadth-first search
• For a $$\cup$$ combination:
• do as above
• but change final state, $$F_4 = \{(s_1, s_2) | s_1 \in F_1 \lor s_2 \in F_2\}$$

### DFA Minimization

• Three steps
• steps:
1. Partition all states into two groups $$G_1$$ – final states, $$G_2$$ – all non-final states
2. For each group, check whether all states in the group go to the same next group for each symbol.
• If not, partition into smaller groups
• repeating untill no mare partitions
3. All states in a group can be combined to a single state
• groups with original final states become a single, new final state
• The group with the original start state becomes the new start state

## Lecture 6: Chapter 1, continued

### Regular langugages

• a language is said to be regular if there exists some DFA recognizing it
• Theorem: For a regular language there exists a unique minimum DFA up to isomorphism
• def’n: two DFA are isomorphic if they have different state names but the same transition function

### NFAs

• why?
• many different models
• consider a dfa, all languages recognizable are regular, as do NFAs
• for some regular languages, NFAs are easier to design (ex all pages containing CSE, CS or CE)
• a transition may go to multiple states
• transitions may be on the empty string
• NFAs are describe similarly, but $$\delta: Q \times \Sigma_{\epsilon} \to \np(Q)$$
• So long as at least one of the states currently held is accepting, the string is accepted
• a decision tree/path forms

## Lecture 7: Chapter 1, continued

• all languages that can be recognized by a DFA are regular languages, likewise for NFA
• A proof that DFA and NFA are equivalent
• To show a languageg is regular, show either a DFA or an NFA
• DFAs are a special type of NFA
• So prove that NFAs are a subset of DFAs

### NFA to DFA

• Steps:
1. Information – copies of the NFA states, as sets of individual states
2. States – the powerset of $$Q$$ from the NFA, only consider the distinct states, accepting if any member of the state set was accepting
3. Transitions – $$\delta_d(R, sym) = \cup_{r \in R} \delta_n(r, sym)$$
• Remember to minimize the resulting DFA
• when handling epsilon transitions, use $$E(A) = A \cup \{ \mathrm{all states reached by following only epsilon transitions} \}$$ for the next state, $$\delta_d(R, sym) = E(\cup_{r \in R}\delta_n(r,sym))$$, start state is $$E(q_{n1})$$

## Lecture 8: Chapter 1, continued

### The Regular Expression

• Reasons
• new model
• used in many programming languages and software
• vi
• grep
• perl
• java
• python
• awk
• write a program to check whether a string matches a given regular expression
• An alphabet $$\Sigma$$

RE language
a {a}
$$\epsilon$$ {ε}
R1 ∪ R2 L(R1) ∪ L(R2)
R1R2 L(R1) ˆ L(R2)
R1* L(R1)*
• Σ = {0, 1}
• $$11* \cup 0*00*$$
• $$*$$ means zero or more
• $$\circ$$ means a followed by b, $$A \circ B = \left\{ xy \mid x \in A \land y \in B \right\}$$, essentialy the cross product
• if is the empty set, the concatenation is the empty set
• $$A^n$$ is A concatenated with itself $$n$$ times
• $$A^*$$ is $$A^0 \cup A^1 \cup A^2 \cup A^3 \cup \cdots$$
• $$A^+$$ is $$A^1 \cup A^2 \cup A^3 \cup \cdots$$

## Lecture 9: Chapter 1, continued

• Showing that REs accept regular expressions
• RE to DFA/NFA
• $$a$$ - start state, accept a, to final state
• $$\epsilon$$ - start state is final state, no transitions other than the initial epsilon
• $$\emptyset$$ - start state, no transitions, no final state
• $$R_1 \cup R_2$$ - given a machine for each, use an epsilon transition to each as an NFA
• $$R_1 \circ R_2$$ - all final states of $$R_1$$ have an epsilon transition to the initial state of $$R_2$$
• $$R_1^*$$ - Empty state before initial, epsilon transition from each final state to the original start state

## Lecture 10: Chapter 1, continued

• DFA/NFA to RE
• Requires GNFA
• NFA + RE as transition labels
• Conversion:
1. DFA to GNFA in a special form
1. Start state only has outgoing transitions – often just add a new start state w/ ε transition
2. one final state, only incoming transitions – Give all final states an ε transition to a new final state
3. Only Any state to any state: at most one transition – merge transitions with union
2. Remove states one by one, except the start and final state – label each transition with concat of each relevant transition
3. The label between start and final is the regular expression

## Lecture 11: Chapter 1, continued

• HW1 Due 09-21, No class 09-24, HW reviewed by TA 09-26, Exam 1 09-28
• The Pumping lemma
• describes a common property of all regular languages, and some non-regular languages
• if a language has the property is it regular?
• it may be regular
• if it doesn’t have the property, than it isn’t regular
• based on pigeonhole principle
• if more than $$p$$ pigeons are placed into $$p$$ holes, then at least one hole will have at least one pigeon
• Consider a reg. lang. $$A$$ and a DFA for $$A$$ as well as a string $$s \in A$$ of $$n$$ symbols
• The path must then consist of $$n + 1$$ states
• if $$n + 1 > |D|$$, then the path repeats some states
• $$xz$$ – $$x$$ takes it to 9, $$z$$ takes it from 9 to 13
• $$xyyz$$ – $$x$$ takes to 9
• For any integer $$i \geq 0$$, $$xy^iz$$ is accepted by the machine
• If $$A$$ is a regular language, then there is a number $$p$$ (the pumping length) such that for any string $$s$$ in $$A$$ with $$|s|\geq p$$ it can be divided into two three pieces $$s = xyz$$ satisfyining the following conditions
1. for each $$i \geq 0$$, $$xy^iz \in A$$
2. $$|y| > 0$$
3. $$|xy|\leq p$$
• Number of states is often the pumping number $$p$$ (in particular of a minimized DFA)
• contrapositive of the pumping lemma
• normal: if $$A$$ is regular, then for any long enough string, some part of its first $$p$$ symbols can be pumped
• contrapositive: if for some long enough string $$s$$ in $$A$$ none of its first $$p$$ symbols can be pumped, then $$A$$ is not regular
• Language $$A$$ is not regular if, for every arbitrary number $$p$$, there exists a string $$s$$ in $$A$$ with $$|s|\geq p$$ having the following property:
• for any decomposition of $$s = xyz$$ in which $$|xy| \leq p$$, and $$|y| > 0$$,…
• Do it.
1. Let $$p$$ denote an arbitrary number
2. pick a string $$s \in A$$ with $$|s| \geq p$$
3. identify all possible decomps of $$s$$ into $$xyz$$ with $$|xy| \leq p$$ and $$|y| \geq 0$$
4. show that for each decomp, there exists an $$i \geq 0$$ s.t. $$xy^iz \not\in A$$ (i may be different for each decomp, if it isn’t a good choice, try again)
5. conclude that $$A$$ is not regular.
• $$L_1 = \left\{ w \mid w \mathrm{ has the same number of a's and b's} \right\}$$
• let $$p$$ be arbitrary
• Let $$s = (ab)^p$$ and we have $$(ab)^p \in L_1$$, and $$|(ab)^p| = 2p > 1$$
• wrong
• $$s = xyz$$, $$x = \epsilon$$, $$y = aba$$, $$z = b(ab)^{p-2}$$
• let $$i = 0$$, then $$xy^iz = xz = b(ab)^{p-2} \not\in L_1$$
• Therefore, $$L_1$$ isn’t regular
• Identify all possible decomps of $$s$$ into $$xyz$$ with $$|xy| \leq p$$ and $$|y| > 0$$
• Try
• $$p$$ is arbit.
• $$s = a^pb^p$$ and we have $$a^pb^p \in L_1$$
• Ident all possible decomps of $$s$$ into $$xyz$$ with $$|xy| \leq p$$ and $$|y| > 0$$
• $$x = \epsilon$$, $$y = a^p$$, $$z = b^p$$
• $$k = a^n$$, $$p-k \geq n \geq 0$$
• $$y = a^k$$, $$p \geq k \geq 1$$
• $$z = a^{p-n-k}b^p$$
• ex, $$y = 2$$, thus, language isn’t regular.
• Consider $$L_2 = \{ww | w \in \{a,b\}^*\}$$
• $$p$$ is arbitrary
• $$s = a^pba^pb$$
• $$y = a^k$$, $$p \geq k \geq 1$$, $$x = a^n$$, $$p-k \geq n \geq 0$$; $$z$$ is the remainder
• falls through as before

## Lecture 12: Chapter 1, Continued

• proving non-regularity
• $$L_3 = \{a^mb^n | m > n\}$$
• $$p$$ is arbitrary
• $$s = a^{p+1}b^p$$
• $$y=a^k$$, $$p \geq k \geq 1$$
• $$x=a^n$$, $$p-k \geq n \geq 0$$, $$z = remaining$$
• $$i = 0$$, $$xy^iz = xz = a^{p+1-k}b^p$$, $$\not\in L_3$$
• Closure properties
• $$A \cup B$$ – regular
• combine NFA/DFA – easy
• $$A \circ B$$ – regular
• combine NFA/DFA – similarly easy
• $$A^*$$ – regular
• Allow accepting empty string and repeats
• $$A \cap B$$ – regular
• see notes for DFA, can be extended to NFA
• $$\bar{A}$$ – regular
• switch final/non-final states
• Reverse of a language a is defined as follows: $$A^R = \{w^R | w \in A\}$$
• change direction of transitions, start state becomes final, if multiple final states, add a new state w/ epsilon transitions to old final states
• Also regular
• $$\Sigma = \{a, b\}$$, $$\Gamma = \{0, 1\}$$
• $$h(a) = 00$$, $$h(b) = 01$$
• if $$A$$ is regular, than $$h(A)$$ is also regular
• Closure properties can be used as for a contrapositive w/ the pumping lemma.
• $$L_4 = \{w | w \text{has different number of a's and b's}\}$$
• We see that $$\bar{L_4} = L_1$$
• we have already proven that L1 is not regular, therefore L4 is not regular
• $$L_5 = \{a^nb^n | n \geq 0\}$$
• $$L_5 = L_1 \cap a*b*$$
• therefore, $$L_5$$ is not regular – wrong! L1 is not regular to begin with!
• Subset property doesn’t apply either

## Lecture 13: Chapter 2, Context Free Languages

• is it possible to design a computer to solve a problem?
• chapter 1 , whether or not it’s possible to design a DFA/NFA/RE for a given language
• to simplify – consider only decision problems – is a number divisible by 5
• Is it possible to design a computer to solve a decision problem
• what does it mean to solve – determine yes or no given an input
• input must be encoded
• whether or not strings are accepted
• is it possible to design a computer to recognize lang of decision problem
• Models and Languages:
1. DFA/NFA/RE – smallest group, regular languages
2. PDA/CFG – Context Free Languages
3. Real Computers – problems solvable by computer
4. Turing Machines – Turing-recognizable languages
• The CFG
• mathematical model to describe languages
• widely used in many applications
• specify programming languages
• generate images (contextfreeart.org)
• automatically generate CS papers (pdos.csail.mit.edu/scigen)
• Ex:
• $$A \to 0A1$$
• $$A \to \epsilon$$
• LHS is a variable
• non variables, non empty string are terminals
• $$(V, \Sigma, R, S)$$
• $$V$$ – Set of Variables
• $$\Sigma$$ – Set of Terminals (alphabet)
• $$R$$ – Set of Rules
• $$S$$ – Start Variable
• Rules may be combined – $$A \to 0A1 | \epsilon$$
• Start from variable – find a rule that matches, and continue by variable.
• $$\left\{ w \mathrm{over} \Sigma | S \Rightarrow^* w \right\}$$
• parse trees may be used to explain a CFG
• graphical/tree-based structure
• Ex:
• $$S \to S + S$$
• $$S \to a$$
• CFGs are more powerful that DFA/NFA/RE
• Convert REs to a CFG
1. $$a \iff S \to a$$
2. $$\epsilon \iff S \to \epsilon$$
3. $$\emptyset \iff S \to S$$ (or no rules)
4. $$R_1 \cup R_2 \iff S_1 \to R_1, S_2 \to R_2, S \to S_1 | S_2$$
5. $$R_1 \circ R_2 \iff S \to SR_1 SR_2$$
6. $$R_1^* \iff S \to \epsilon | SR_1 S$$
• For NFA/DFA:
• $$S_i = sym S_j$$

## Lecture 14: Review

• be careful when designing DFAs
• use the examples
• make sure to minimize

## Lecture 15: Context Free Grammars

• $$L_1 = \left\{a^nb^n | n \geq 0 \right\}$$
• $$R \to \epsilon$$
• $$R \to a R b$$
• Two methods:
• Union Method – union of multiple smaller languages
• concatenation method – concatenation of multiple smaller languages
• recursive – $$L_1L_2L$$
• special cases
• $$L_2 = \left\{a^m b^n | m > n \geq 0\right\}$$
• $$S_2 \to a S_2$$
• $$S_2 \to a S_1$$
• $$L_2 = \left\{a^mb^n | m \neq n\right\}$$
• $$S_3 \to S_2 | B$$
• $$B \to B b$$
• $$B \to S_1 b$$
• $$B^{\prime} \to B b | b$$
• $$L_4 = \{ a^ib^ja^k | j > i + k, i \geq 0, k \geq 0 \}$$
• $$S_4 \to S_1 B^{\prime} S_1^R$$
• $L5 = \{ x over \{a, b\} | x is a palindrome \} • $$S_5 \to \epsilon$$ • $$S_5 \to a$$ • $$S_5 \to b$$ • $$S_5 \to b S_5 b$$ • $$S_5 \to a S_5 a$$ • $$L_6 = \{ x over \{a, b\} | x is not a palindrome \$$} • $$S_6 \to a S_6 a$$ • $$S_6 \to b S_6 b$$ • $$S_6 \to a T b$$ • $$S_6 \to b T a$$ • $$T \to aT | bT | \epsilon$$ • $$L_7 = \{ x over \{a, b\} | x has the same number of a's and b's \}$$ • $$S_7 = a S_7 a$$ • $$S_7 = b S_7 a$$ • $$S_7 = S_7 S_7$$ • $$S_7 = \epsilon$$ ## Lecture 16: PDA (Pushdown Automaton) • a new model • Adds a stack to a DFA/NFA • cfg – specify a programming language • PDAs can be used to design the compiler (See CSCE 425) • PDAs may be built upon either DFA or NFA, based upon NFA is more powerful • Stacks have 2 ops • push – push a symbol onto the top of the stack • pop – pop the symbol off of the top of the stack • Edges are now labeled differently: ($$a, B \to C$$) • if current state is $$P$$, and $$a$$ is input symbol, and $$B$$ is the top of the stack symbol, then take transition, read input symbol, POP symbol B, and push $$C$$, going to new state $$Q$$ • with $$\epsilon$$ as $$B$$, the transition may be taken no matter the current symbol, and no symbol is to be popped • with $$\epsilon$$ as $$C$$, don’t push any symbol • $$\epsilon, \epsilon \to \epsilon$$ – don’t read, don’t change the stack • shortcuts do exist • Formal Description: • NFA: $$Q$$ States $$\Sigma$$ Alphabet $$\delta$$ $$Q \time \Sigma_\epsilon \to P(Q)$$ $$q_0$$ Start state $$F$$ Final States • PDA: $$Q$$ States $$\Sigma$$ Input Alphabet $$\Gamma$$ Stack Alphabet – all possible stack symbols $$\delta$$ $$Q \time \Sigma_\epsilon \times \Gamma_\epsilon \to P(Q \times \Gamma_\epsilon)$$ $$F$$ Final States • \$ is used to denote the first thing (popping will be empty stack)
• An input string is accepted by a PDA if at least one copy of the PDA stops at a final state after reading the input string.
• Use a ’configuration’ to show how it works in homework:
• (current state, remaining input symbols, contents of the stack)
• $$(1, aab, \epsilon) \Rightarrow (2, aab, \$$) ⇒ (2, ab, A\$$) \Rightarrow (2, b, AA\$$) ⇒ (3, ε, A\$$) \Rightarrow (4, \epsilon, \$$)$## Lecture 17: PDAs, Continued • Designing a PDA for a language • Split input string • $$S_1 = S_2^R$$ • $$|S_1| \leq |S_2|$$ or other comparison • $$L_1 = \{ wcw^R | w over \{a,b\}\}$$ • $$Q = \{1,2,3,4}$$ • $$q_0 = 1$$ • $$F = \{4\}$$ • 2 makes a copy of the first part • 3 compares the first part with the second • transition to 4 on acceptance • $$L_2 = \{ ww^R | w over \{a,b\}\}$$ • $$Q = \{1,2,3,4}$$ • $$q_0 = 1$$ • $$F = \{4\}$$ • 2 makes a copy of the first part • epsilon, epsilon, epsilon transition between 2 and 3 • 3 compares the first part with the second • transition to 4 on acceptance • $$L_3 = \{ w | w = w^R over \{a, b\} \}$$ • add a transition between 2,3 tha reads either char but doesn’t change the stack ## Lecture 18: Equivalence of PDA and CFG • PDA to CFG • Special form of PDA as first step • Only one final state – epsilon transitions from old final sates to new • Stack must be empty at the final state • Ex: $$\Gamma = \{ \$$, A, B \}$
• Requires loop-back transitions to pop remaining stack symbols at the end
• A transiion is either a push transition ($$a, \epsilon \to x$$) or a pop transition ($$a, x \to \epsilon$$)
• States P, Q – CFG Var $$A_pq$$ (all strings that can go from p to q without changinging the top stack symbol at p, and the top stack at p has never been popped) – Three types:
• $$p\to r$$ ($$a, \epsilon \to x$$) (push), $$s \to q$$ ($$b, x \to e$$) (pop) – $$A_pq = a A_rs b$$
• $$p v q$$ – $$A_pq \to A_pv A_vq$$
• $$p$$ – $$A_pp \to \epsilon$$

## Lecture 19: PDA to CFG

• Continue given previous rules
• Generating rules from the three types of rules
• In $$A_pq \to a A_rs b$$, $$A_rs$$ must satisfy the same conditions as $$A_pq$$
• push to pop in order for generating rules

## Lecure 20: The Pumping Lemma for CFGs

• Review of the Pumping Lemma for Regular Langugages
• The loop
• Pumping Lemma for CFGs
• Assume each node has at most 3 child nodes
• If height is 1, it has at most 3 nodes
• If a tree has at least 3 leaf nodes, its height is at least 1
• $$S \to aSb$$, $$S \to \epsilon$$, $$ab$$
• Nuber of child nodes of a variable node is the number of symbols on the rhs of the rule
• Assume that each rule has at most $$b$$ symbols on the RHS, if a string as at least $$B$$ symbols, the height of the parse tree is at least $$b$$
• Generally,
• $$b$$ is the max number of symbols on the RHS of a rule
• $$v$$ is the total number of variables
• Consider a string $$s$$ that can be derived by the grammar, if $$s$$ has at least $$b^{v+1}$$ symbols, then
• the height of the trea is at least $$v + 1$$
• and the number of nodes is at least $$v + 2$$
• The number of variables – the longest path has at least $$v + 1$$ variable nodes
• According to the pigeonhole principle, some variable occurs at least once on the longest path
• the number of terminals – $$v + 1$$
• here are two subtrees under this parse tree – one for the earlier $$T$$ and another for the later.
• The inner tree may be replaced with the outer, or likewise the outer with the inner
• Requires 2 pumped sequences?
• Lemma: If $$A$$ is a CFL, then there exists a pumping length $$p$$ such that for any stirng $$s \in A$$ with $$|s| \geq p$$, it can be devided into five pieces, $$s = uvxyz$$ satisfying the following conditions:
1. For any integer $$i \geq 0$$, $$uv^ixy^iz \in A$$
2. $$|vy| > 0$$ (one of $$v$$ and $$y$$ could be empty, but not both at the same time)
3. $$|vxy| \leq p$$ ($$vxy$$ are not necessarily the first $$p$$ symbols since $$|u|$$ is unknown)
• Contrapositive: if for any $$p$$ there exists a string $$s \in A$$ with $$|s| \geq p$$, having the following property: For any decomposition of $$s = uvxyz$$ in which $$|vy| > 0$$ and $$|vxy| \leq p$$, there is an integer $$i \geq 0$$ for which $$uv^ixy^iz \not\in A$$, then the language is not context free.
• Using the new Pumping Lemma
• Prove that $$L = \left\{ a^nb^nc^n | n \geq 0\right\}$$
• Let $$p$$ be an arbitrary number,
• Consider $$s = a^pb^pc^p$$ and consider all possible decompositions as specified
• Cases:
1. $$vy$$ has only $$a$$
• if we set $$i = 0$$, we remove both $$v$$ and $$y$$
• note, $$i \neq 0$$ prevents validity
2. $$vy$$ consists of both $$a$$ and $$b$$
• $$i = 0$$, any $$i \neq 1$$
3. $$vy$$ has only $$b$$
• See above
4. $$vy$$ consists of both $$b$$ and $$c$$
• See above
5. $$vy$$ has only $$c$$
• See above
• Likewise, $$\{w over \{a, b, c\} | |w|_a = |w|_b = |w|_c\}$$ follows, by the same string

## Lecture 21: The Pumping Lemma, Continued

• For a string $$s$$ in a cfl, $$A$$, if $$s$$ is sufficiently long, then on the longest path from the root to a leaf in the parse tree, some variable appears more than once.
• If $$A$$ is a cfl, then there exists a pumping length $$p$$ such that for any stirng $$s \in A$$, with $$|s| > p$$, it can be dividied into five pieces, $$s = uvxyz$$, satisfying the following conditions:
1. For any integerr $$i \ge 0$$, $$uv^ixy^iz \in A$$
2. $$|vy| > 0$$
3. $$vxy \leq p$$
• Describes a common property
• if has the property, it may be cf, but if it doesn’t, it isn’t CF, see contrapositive above
• Prove that the following language is not context free: $$L = \{ ww | w over \{a,b\}\}$$
• Given cases 1-8, if $$i \neq 1$$, then $$s \not\in L$$
• For case 9:
• if $$v$$ and $$y$$ have the same number of $a$s, string $$a^pba^pb$$ isn’t a good choice
• Let $$p$$ be arbitrary, and $$s = a^pb^pa^pb^p$$
• For each case, if $$i = 0$$, the string doesn’t hold. Any integer $$i \neq 1$$ holds. Language is not CFG
• For $$L = \{ wcw | w over \{a,b\}\}$$
• First Proof:
• $$s = a^pb^pca^bp^b$$
• 11 cases
• For each case, if $$i = 0$$, the string doesn’t fit
• Different Proof:
• List all possible decomps depending on whether $$vy$$ contains symbol $$c$$
• 2 cases:
• $$c \in vy$$
• $$i = 0$$, new string $$uxz$$ doesn’t have $$c$$, and thus not in $$L$$, in fact, $$i \neq 1$$ works.
• $$c \not\in vy$$
• Because $$|vxy| \leq p$$, $$v$$ and $$y$$ must contain a different number of symbols. Therefore, $$s \not\in L$$, and language not CF.

### Closure Property

• CLASS is closed under $$\odot$$ if $$A,B \in CLASS$$, then $$A\dotcirc B \in CLASS$$
• Properties:
• Union – Closed
• Concatenation – Closed
• Star – Closed
• Intersection – Not Closed
• Complement – Not Closed

### Proving $$L$$ is CF

• Design a PDA
• Design a CFG
• Use closure property
• To prove not CF
• use pumping lemma
• and closure property

## Lecture 22: Context-Free Languages, continued

### CFG to PDA

• Ex:
• $$S \to S_1 S_2$$
• $$S_1 a S_1$$
• $$S_1 \to a$$
• $$S_2 \to a S_2 b$$
• $$S_2 \to \epsilon$$
• Generate a transition for each rule, and for each terminal
• terminals pop items, rules may push as well as pop
• Use left-most derivation to follow the PDA

### Group Activity

• Be careful when traversing a tree from a PDA

### Midterm Review

• pumping lemma was a problem
• $$s$$ must be defined generally, and $$|s| \geq p$$
• Consider all possible decompositions, $$s = xyz$$, $$s$$ must be specific!

## Lecture 23: Chapter 3, The Turing Machine

• Initially only the standard turing machine, but from there, some variants
• Recall DFA/NFA and PDA/CFG
• Turing Machine
• Note: These are deterministic
• semi-infinite input tape, infinite to the right
• blank symbols after the original string, infinitely many
• finite number of states
• Input tape may be written to
• tape head can be moved left or right
• Transition, ex: from $$P$$ to $$Q$$, $$a \to b, c$$
• $$P$$ – current state
• $$Q$$ – next state
• $$a$$ is the current tape symbol
• $$b$$ is the new tape symbol
• $$c$$ is either $$L$$ or $$R$$, $$L$$ to move head to left, $$R$$ to move to right
• Moving left at leftmost edge
• Recognizes Turing-recogniseable languages (i.e., $$a^nb^nc^n$$)
• Has 2 final states – accept or reject
• When the machine enters the accept state, the machine accepts
• Enters the reject state or no more transitions to take
• Status may be described as a “configuration”, (symbols on lhs of tape head, current state, current symbol, symbol on rhs of tape head)
• Described as:
• $$Q$$ – States
• $$\Sigma$$ – Input Symbols
• $$\Gamma$$ – Tape Alphabet, $$\Sigma \subseteq \Gamma$$, Space $$\in \Gamma$$
• $$\delta$$ – $$Q \times \Gamma \to Q \times \Gamma \times \{l, r\}$$
• $$q_{0}$$ – Initial State
• $$q_{accept}$$ – Accept State
• $$q_{reject}$$ – Reject State

## Lecture 24: Chapter 3, The Turing Machine

• Design of a TM is like writing a program
• Algorithm is a high-level idea
• TM is the implementation
• $$\{a^{2^n} | n \geq 0\}$$
• Count number of $a$s, say $$m$$
• check if $$m$$ is a power of 2
• Unary number to represent $$m$$
• use a binary number to represent $$m$$
• Unary numbers, single symbol, $$m$$ is represented by $$m$$ symbols
• binary, two symbols, etc
• $$a$$ with unary number
• input string is already a unary number
• Checking whether a decimal number is a power of 2
• divide by 2 until an odd number. If the odd number is $$1$$, $$m$$ is a power of 2
• Divide by 2, by replacing every other $$a$$ with a special symbol
• See turing machine from slides (TM1)
• $$a$$ with binary counter
• remove an a from the string, increase counter by one
• count should be lsb first
• check to make sure that only the last bit is 1
• only count $$m - 1$$, and then check only 1s in counter
• See TM2 from slides
• $$#x#y#z$$ where $$x = y + z$$ and \$x, y, and $$z$$ are unsigned binary numbers, and the MSB is 1, MSB is first bit
• sum = y + z ; if sum == x, accept, reject
• while y > 0, i–, z++; if x == z, accept, reject
• while y > 0, y–, x–; while z > 0, z–, x–, if x == 0, accept, reject
• mark left end with submachine
• mini machine for initial validity (MSB check)
• mini machine for checking if $$y = 0$$
• mini machine for $$y--$$
• find LSB, if 1, replace with 0
• if 0, replace with 1, if next is 1, replace with 0, else, if next is 0, loop from start of step (if 0, replace w/ 1…)
• mini machine for $$z++$$
• check if overflow, if so, shift $$z$$ right and add a 1, else, continue
• Another machine to check if $$x = z$$

## Lecture 25: Chapter 3, The Turing Machine – Extensions

• Standard TM: Left, Right, state machine and rewrite rules
• Variants
• Stay
• bi-infinite tape
• jump option
• non-deterministic
• multiple tapes
• $$\delta : Q \times \Gamma \to Q \times \Gamma \times \{L, R, S\}$$