TOC Notes 2020
Computation
Compiled by
Department of Computing
Abasyn University Peshawar
2020
This question goes back to the 1930s when mathematical logicians first began to explore the
meaning of computation. Technological advances since that time have greatly increased our
ability to compute and have brought this question out of the realm of theory into the world of
practical concern.
In each of the three areas (automata, computability, and complexity) this question is interpreted
differently, and the answers vary according to the interpretation. Following this, we explore
each area separately. Here, we introduce these parts in reverse order because starting from
the end you can better understand the reason for the beginning.
This is the central question of complexity theory. Remarkably, we don't know the answer to it,
though it has been intensively researched for the past 35 years. In one of the important
achievements of complexity theory thus far, researchers have discovered an elegant scheme
for classifying problems according to their computational difficulty. It is analogous to the
periodic table for classifying elements according to their chemical properties. Using this
scheme, we can demonstrate a method for giving evidence that certain problems are
computationally hard, even if we are unable to prove that they are.
You have several options when you confront a problem that appears to be computationally
hard.
The theories of computability and complexity are closely related. In complexity theory, the
objective is to classify problems as easy ones and hard ones, whereas in computability theory
the classification of problems is by those that are solvable and those that are not.
Computability theory introduces several of the concepts used in complexity theory.
Another model, called the context-free grammar, is used in programming languages and
artificial intelligence.
Central Question in Automata Theory: Do these models have the same power, or
can one model solve more problems than the other?
Definition 1:
A finite automaton is a 5-tuple (Q, Σ, δ, q0, F), where
1. Q is a finite set called the states,
2. Σ is a finite set called the alphabet,
3. δ: Q × Σ → Q is the transition function,
4. q0 ∈ Q is the start state, and
5. F ⊆ Q is the set of accept states.
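To make Definition 1 concrete, here is a minimal sketch in Python of a finite automaton stored directly as its 5-tuple. The particular machine (it accepts binary strings ending in 1) and all names are our own illustration, not part of the notes.

# The 5-tuple (Q, Σ, δ, q0, F) of Definition 1 as plain Python data.
# Illustrative machine: accepts binary strings that end in 1.
Q = {"q0", "q1"}                  # finite set of states
Sigma = {"0", "1"}                # finite alphabet
delta = {                         # transition function δ: Q × Σ → Q
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q0", ("q1", "1"): "q1",
}
q0 = "q0"                         # start state, q0 ∈ Q
F = {"q1"}                        # accept states, F ⊆ Q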
2.2 Sets
Importance: languages are sets
A set is a collection of "things," called the elements or members of the set. It is
essential to have a criterion for determining, for any given thing, whether it is or is not
a member of the given set. This criterion is called the membership criterion of the set.
There are two common ways of indicating the members of a set:
o List all the elements, e.g. {a, e, i, o, u}
o Provide some sort of an algorithm or rule, such as a grammar
Notation:
o To indicate that x is a member of set S, we write x ∈ S
o We denote the empty set (the set with no members) as {} or ∅
o If every element of set A is also an element of set B, we say that A is a subset
of B, and write A ⊆ B
o If every element of set A is also an element of set B, but B also has some
elements not contained in A, we say that A is a proper subset of B, and write
A ⊂ B
2.5 Functions
A function is a rule which relates the values of one variable quantity to the values of another
variable quantity, and does so in such a way that the value of the second variable quantity is
uniquely determined by (i.e. is a function of) the value of the first variable quantity.
2.6 Graphs
Automata are graphs. A graph consists of two sets:
A set V of vertices (or nodes), and
A set E of edges (or arcs).
o An edge consists of a pair of vertices in V. If the edges are ordered, the graph
is a digraph (a contraction of "directed graph").
o A walk is a sequence of edges, where the finish vertex of each edge is the start
vertex of the next edge. e.g.: (a, e), (e, i), (i, o), (o, u).
o A path is a walk with no repeated edges.
o A simple path is a path with no repeated vertices.
2.7 Trees
Trees are used in some algorithms.
A tree is a kind of digraph:
o It has one distinguished vertex called the root;
o There is exactly one path from the root to each vertex; and
o The level of a vertex is the length of the path to it from the root.
Terminology:
o If there is an edge from A to B, then A is the parent of B, and B is the child of A.
o A leaf is a node with no children.
o The height of a tree is the largest level number of any vertex.
Alphabet
An alphabet is a finite, nonempty set of symbols; Σ (sigma) is the usual name for an alphabet.
For example
{0, 1} is an alphabet with two symbols,
{a, b} is another alphabet with two symbols, and
the English alphabet is also an alphabet.
Σ = {0, 1}, or Σ = the binary alphabet
Σ = {a, b, …, z}, or Σ = the set of all lower-case letters
String
o A string is a finite sequence of symbols chosen from some alphabet. e.g. 01101
is a string from the binary alphabet Σ = {0, 1}.
o The number of symbols in a string is called the length of the string. e.g. 01101
has length 5 over an alphabet of two symbols. The standard notation for the
length of a string w is |w|. e.g. |011| = 3 and |ε| = 0
o The empty string (also called null string) is the string with length zero. That is,
it has no symbols. It is denoted with ε (epsilon), that may be chosen from any
alphabet whatsoever. Thus | ε | = 0.
o Powers of an Alphabet: If Σ is an alphabet, we define Σᵏ to be the set of strings
of length k, each of whose symbols is in Σ. e.g. Σ⁰ = {ε}. If Σ = {0,1}, then
Σ¹ = {0,1}
Σ² = {00,01,10,11}
Σ³ = {000,001,010,011,100,101,110,111}
o Kleene Star: The set of all strings over an alphabet Σ is denoted Σ*. For
instance, {0,1}* = {ε,0,1,00,01,10,11,000,…}. Put another way,
Σ* = Σ⁰ ∪ Σ¹ ∪ Σ² ∪ Σ³ ∪ …
Σ⁺ = Σ¹ ∪ Σ² ∪ Σ³ ∪ …
o Concatenation of Strings: Let x and y be strings. Then xy denotes the string
obtained by concatenating x with y, that is, xy is the string obtained by
appending the sequence of symbols of y to that of x. e.g. if x = aab and y =
bbab, then xy = aabbbab. Note that in general xy ≠ yx.
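Since these definitions are purely mechanical, they are easy to check with a few lines of code. A small sketch (Python; the names are ours) that enumerates Σ² and demonstrates that concatenation is not commutative:

from itertools import product

sigma = ["0", "1"]

def power(sigma, k):
    # Σ^k: all strings of length k whose symbols are in Σ
    return {"".join(p) for p in product(sigma, repeat=k)}

print(power(sigma, 2))    # {'00', '01', '10', '11'} = Σ²
x, y = "aab", "bbab"
print(x + y)              # aabbbab = xy
print(y + x)              # bbabaab = yx, so xy ≠ yx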
Language Examples
o L1 = {w ∈ {a, b}* : each a in w is immediately preceded and immediately
followed by a b}.
o L2 = {w ∈ {a, b}* : w has abab as a substring}.
o L3 = {w ∈ {a, b}* : w has neither aa nor bb as a substring}.
o L4 = {w ∈ {a, b}* : w has an even number of occurrences of the substring ab}.
o L5 = {w ∈ {a, b}* : w has both ab and ba as substrings}.
o L6 = {w ∈ {a, b}* : w contains a’s and b’s and ends in bb}.
o L7 = {w ∈ {a, b}* : w has different first and last letters: if the word begins
with an a, to be accepted it must end with a b, and vice versa}.
o L8 = {w ∈ {a, b}* : w starts with a and has an odd number of a’s, or starts with
b and has an even number of b’s}.
o L9 = {w ∈ {a, b}* : w has three consecutive b’s (not necessarily at the end)}.
o L10 = {w ∈ {a, b}* : w contains an odd number of a’s and an odd number of b’s}
There are three fundamental concepts that we will be working with in this course:
Languages
o A language is a subset of the set of all possible strings formed from a given set
of symbols.
o There must be a membership criterion for determining whether a particular
string is in the set.
Grammars
o A grammar is a formal system for accepting or rejecting strings.
o A grammar may be used as the membership criterion for a language.
Automata
o An automaton is a simplified, formalized model of a computer.
o An automaton may be used to compute the membership function for a
language.
o Automata can also compute other kinds of things.
Assignment No. 1
One designated state is the start state. Some states (possibly including the start state) can be
designated as final states. Final states are represented by double-circle.
Arcs between states represent state transitions -- each such arc is labeled with the symbol that
triggers the transition.
Example 3.1: Design a DFA to accept all strings with an even number of 0’s and an even number of 1’s.
Example input string: 1 0 0 1 1 1 0 0
Operation
Start with the "current state" set to the start state and a "read head" at the beginning of the
input string; while there are still characters in the string:
Sample trace:
Definition δ*: The fact that δ is a function implies that every vertex has an outgoing arc for
each member of Σ. We can also define an extended transition function δ* as δ*: Q × Σ* → Q.
Basis: δ*(q, ε) = q
Induction: Suppose w = xa; then δ*(q, w) = δ(δ*(q, x), a)
If a DFA M = (Q, Σ, δ, q0, F) is used as a membership criterion, then the set of strings
accepted by M is a language. That is, L(M) = {w ∈ Σ* : δ*(q0, w) ∈ F}.
Languages that can be defined by DFAs are called regular languages.
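As an illustration, here is a sketch of δ* and the membership test for the DFA of Example 3.1 (even number of 0’s and even number of 1’s). The state names ee, eo, oe, oo, recording the parities of 0’s and 1’s seen so far, are our own labels.

# δ for the even-0's/even-1's DFA; names record (parity of 0's, parity of 1's)
delta = {
    ("ee", "0"): "oe", ("ee", "1"): "eo",
    ("eo", "0"): "oo", ("eo", "1"): "ee",
    ("oe", "0"): "ee", ("oe", "1"): "oo",
    ("oo", "0"): "eo", ("oo", "1"): "oe",
}
start, accept = "ee", {"ee"}

def delta_star(q, w):
    # Basis: δ*(q, ε) = q.  Induction: δ*(q, xa) = δ(δ*(q, x), a).
    for a in w:
        q = delta[(q, a)]
    return q

def accepts(w):
    # w is in L(M) iff δ*(q0, w) is in F
    return delta_star(start, w) in accept

print(accepts("10011100"))   # True: four 0's, four 1's (the example input above)
print(accepts("1011"))       # False: one 0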
A state may have two or more arcs emanating from it labeled with the same symbol (Figure
5). When the symbol occurs in the input, either arc may be followed.
These are all the same as for a DFA except for the definition of δ:
Transitions on ε (or λ) are allowed in addition to transitions on elements of Σ, and
The range of δ is 2^Q rather than Q. This means that the values of δ are not single
elements of Q, but rather are sets of elements of Q (that is, elements of 2^Q).
Definition: The fact that δ is a function implies that every vertex has an outgoing arc for each
member of Σ. We can also define an extended transition function δ* as δ*: Q × Σ* → 2^Q.
Basis: δ*(q, ε) = {q}
Induction: Suppose w = xa and δ*(q, x) = {p1, p2, …, pk}; then δ*(q, w) = δ(p1, a) ∪ δ(p2, a) ∪ … ∪ δ(pk, a)
Example 3.2:
δ*(q0, aba) = ?
δ*(q0, ε) = {q0}
⋮
For any state p of an NFA we define the ε-closure of p to be set ε-closure(p) consisting of all
states q such that there is a path from p to q whose spelling is ε. This means that either q = p,
or that all the edges on the path from p to q have the label ε.
Definition
Basis: State p is in ε-closure of p.
Induction: If state p is in ε-closure(q), and there is a transition from state p to state r labelled ε,
then r is also in ε-closure(q)
Definition: The extended transition function δ*: Q × Σ* → 2^Q for an ε-NFA is defined as:
Basis: δ*(q, ε) = ε-closure(q)
Induction: Suppose w = xa; then δ*(q, xa) = ⋃ { ε-closure(r) : r ∈ δ(p, a), p ∈ δ*(q, x) }
Example 3.3
δ*(q0, +12.5) = ?
δ*(q0, ε) = ε-closure({q0}) = {q0, q1}
⋮
Assignment No. 2
For any NFA, we can construct an equivalent DFA (see below). So NFAs are not more
powerful than DFAs. DFAs and NFAs define the same class of languages -- the regular
languages.
To translate an NFA into a DFA, the trick is to label each state in the DFA with a set of states
from the NFA. Each state in the DFA summarizes all the states that the NFA might be in. If
the NFA contains |Q| states, the resultant DFA could contain as many as 2^|Q| states. (Usually
far fewer states will be needed.)
Q = states = {1, 2, 3, 4, 5}
Start state: { 1 }
Accepting state(s): { 5 }
The ε-closure of a set of states, R, of the NFA will be denoted by E(R). E(R) = R ∪ { q | there
is an r in R with an ε transition to q }. In the example, E({1}) = {1} ∪ { 2 } = {1,2}
State    a    b
{1,2}
3. Compute the transition function for the DFA from the start state.
a. For one of the inputs, say 'a', consider all possible states that can be reached in
the NFA from any one of the states in {1,2} on input 'a'. These are states that
are reachable along paths labeled 'a', also allowing any edges labeled ε.
State    a          b
{1,2}    {3,4,5}    ∅
b. Next compute the transitions from the start state with input 'b'. But when the
NFA transitions are examined there are no paths from either state in {1,2} with
label 'b'. So the subset of states that can be reached is the empty set, ∅.
4. If a new state is reached when the transitions on 'a' and 'b' are computed, the process
has to be repeated for this new state. For example, {3, 4, 5} is a new state for the DFA and
so we must compute transitions from this state.
5. Continue filling in the table as long as any states are entered that do not yet have a
row. For example, neither {5} nor {4, 5} has a row yet. So pick one and compute its
transitions.
The final states of the DFA are the sets that contain 5 since that is the only final state of the
NFA. The final table and corresponding DFA state diagram are:
State      a          b
{1,2}      {3,4,5}    ∅
{3,4,5}    {5}        {4,5}
{5}        ∅          ∅
{4,5}      {5}        {5}
∅          ∅          ∅
Converted DFA and its transitions.
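The construction above is mechanical, so it can be programmed directly. A rough sketch (Python; the three-state NFA used here is our own tiny illustration, not the five-state machine of the worked example):

# Subset construction with ε-closure E(R).  "" marks an ε-transition.
nfa = {
    (1, ""): {2},            # ε-transition, as in E({1}) = {1, 2}
    (2, "a"): {2, 3},
    (3, "b"): {3},
}
alphabet = ["a", "b"]

def eclose(states):
    # E(R): R plus everything reachable via ε-transitions
    stack, closure = list(states), set(states)
    while stack:
        for q in nfa.get((stack.pop(), ""), set()):
            if q not in closure:
                closure.add(q)
                stack.append(q)
    return frozenset(closure)

def subset_construction(start):
    table, todo = {}, [eclose({start})]
    while todo:                      # repeat until no new DFA state appears
        R = todo.pop()
        if R in table:
            continue
        table[R] = {}
        for a in alphabet:
            step = set().union(*(nfa.get((q, a), set()) for q in R))
            table[R][a] = eclose(step)
            todo.append(table[R][a])
    return table

for R, row in subset_construction(1).items():
    print(set(R) or "∅", {a: (set(T) or "∅") for a, T in row.items()})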
Assignment No. 3
Grammar
V is a finite set of (meta) symbols, or variables.
T is a finite set of terminal symbols.
S ∈ V is a distinguished element of V called the start symbol.
P is a finite set of productions.
The above is true for all grammars. We will distinguish among different kinds of grammars
based on the form of the productions. If the productions of a grammar all follow a certain
pattern, we have one kind of grammar. If the productions all fit a different pattern, we have a
different kind of grammar.
Different types of grammars can be defined by putting additional restrictions on the left-hand
side of productions, the right-hand side of productions, or both.
We know that languages can be defined by grammars. Now we will begin to classify
grammars; and the first kinds of grammars we will look at are the regular grammars.
To be a right-linear grammar, every production of the grammar must have one of the two
forms V → T*V or V → T*.
You do not get to mix the two. For example, consider a grammar with the following
productions:
S → λ
S → aX
X → Sb
This grammar is neither right-linear nor left-linear, hence it is not a regular grammar. We have
no reason to suppose that the language it generates is a regular language (one that is recognized
by a DFA).
In fact, the grammar generates the language whose strings are of the form aⁿbⁿ. This language
cannot be recognized by a DFA. (Why not?)
V → T*V
or
V → T*
That is, the left-hand side must consist of a single variable, and the right-hand side consists of
any number of terminals (members of T), optionally followed by a single variable. (The
"right" in "right-linear grammar" refers to the fact that, following the arrow, a variable can
occur only as the rightmost symbol of the production.)
iii) F = {A | A → λ ∈ P} ∪ {f} if f ∈ Q,
{A | A → λ ∈ P} otherwise
A → xyzB
A → B
A → x
S → λ
S → 0B
S → 1A
A → 0C
A → 1S
B → 0S
B → 1C
C → 0A
C → 1B
We won't pay much attention to left-linear grammars, because they turn out to be equivalent
to right-linear grammars. Given a left-linear grammar for language L, we can construct a
right-linear grammar for the same language, as follows:
Step 1: Construct a right-linear grammar for the (different) language Lᴿ.
Method: Replace each production A → x of L with a production A → xᴿ, and replace each
production A → Bx with a production A → xᴿB.

Step 2: Construct an NFA for Lᴿ from the right-linear grammar. This NFA should have just
one final state.
Method: We talked about deriving an NFA from a right-linear grammar on an earlier page.
If the NFA has more than one final state, we can make those states nonfinal, add a new final
state, and put transitions from each previously final state to the new final state.

Step 3: Reverse the NFA for Lᴿ to obtain an NFA for L.
Method: 1. Construct an NFA to recognize language Lᴿ. 2. Ensure the NFA has only a single
final state. 3. Reverse the direction of the arcs. 4. Make the initial state final and the final
state initial.

Step 4: Construct a right-linear grammar for L from the NFA for L.
Method: This is the technique we just talked about on an earlier page.
4.2 Closure I
A set is closed under an operation if, whenever the operation is applied to members of the set,
the result is also a member of the set.
For example, the set of integers is closed under addition, because x+y is an integer whenever
x and y are integers. However, integers are not closed under division: if x and y are integers,
x/y may or may not be an integer.
L1 ∪ L2 Strings in either L1 or L2
L1 ∩ L2 Strings in both L1 and L2
L1L2 Strings composed of one string from L1 followed by one string from L2
-L1 All strings (over the same alphabet) not in L1
L1* Zero or more strings from L1 concatenated together
L1 - L2 Strings in L1 that are not in L2
L1R Strings in L1, reversed
4.3.6 Reverse of L1
Start with an automaton with just one final state.
Make the initial state final and the final state initial.
Reverse the direction of every arc.
In these constructions you form a completely new machine, whose states are each labeled with
an ordered pair of state names: the first element of each pair is a state from L1, and the second
element of each pair is a state from L2. (Usually you won't need a state for every such pair,
just some of them.)
Begin by creating a start state whose label is (start state of L1, start state of L2).
Repeat the following until no new arcs can be added:
Find a state (A, B) that lacks a transition for some x in Σ.
Add a transition on x from state (A, B) to state (δ(A, x), δ(B, x)). (If this state
doesn't already exist, create it.)
The same construction is used for both intersection and set difference. The distinction is in
how the final states are selected.
Intersection: Mark a state (A, B) as final if both (i) A is a final state in L1, and (ii) B is a final
state in L2.
Set difference: Mark a state (A, B) as final if A is a final state in L1, but B is not a final state
in L2.
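A sketch of this pair construction (Python, names ours; both input DFAs are assumed complete, i.e. δ is defined for every state/symbol pair):

def product_dfa(d1, s1, f1, d2, s2, f2, alphabet, mode="intersection"):
    # States of the new machine are pairs (A, B); the transition on x is
    # (A, B) -> (δ1(A, x), δ2(B, x)), creating states only as needed.
    start = (s1, s2)
    delta, todo, seen = {}, [start], {start}
    while todo:
        A, B = todo.pop()
        for x in alphabet:
            nxt = (d1[(A, x)], d2[(B, x)])
            delta[((A, B), x)] = nxt
            if nxt not in seen:          # create the state if it doesn't exist yet
                seen.add(nxt)
                todo.append(nxt)
    if mode == "intersection":           # final iff A final in M1 and B final in M2
        finals = {p for p in seen if p[0] in f1 and p[1] in f2}
    else:                                # set difference: A final in M1, B not final in M2
        finals = {p for p in seen if p[0] in f1 and p[1] not in f2}
    return start, delta, finals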
Assignment No. 4
The pumping lemma for regular languages is another way of proving that a given (infinite)
language is not regular. (The pumping lemma cannot be used to prove that a given language is
regular.)
If L is an infinite regular language, then there exists some positive integer m such that any
string w ∈ L whose length is m or greater can be decomposed into three parts, w = xyz, where
|xy| ≤ m, |y| ≥ 1, and xyⁱz ∈ L for every i ≥ 0.
We can view this as a game wherein our opponent makes moves 1 and 3 (choosing m and
choosing xyz) and we make moves 2 and 4 (choosing w and choosing i). Our goal is to show
that we can always beat our opponent. If we can show this, we have proved that L is not
regular.
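For example, to show that L = {aⁿbⁿ : n ≥ 0} is not regular: the opponent chooses m; we choose
w = aᵐbᵐ, which is in L and has length at least m. Whatever decomposition w = xyz the opponent
chooses with |xy| ≤ m and |y| ≥ 1, the substring y consists only of a’s, say y = aᵏ with k ≥ 1. We
choose i = 0: then xy⁰z = aᵐ⁻ᵏbᵐ has fewer a’s than b’s, so it is not in L. We win every play of
the game, hence L is not regular.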
Since T* ⊆ (V ∪ T)* and T*V ⊆ (V ∪ T)*, it follows that every right-linear grammar is also
a context-free grammar. Similarly, left-linear grammars and linear grammars are also
context-free grammars. A context-free language (CFL) is a language that can be defined by a
context-free grammar.
Is L(G) a regular language? Yes -- the language L(G) is regular because it can be defined by
the regular grammar:
Example 5.3: We have shown that L = {aⁿbᵏ | k > n ≥ 0} is not regular. Here is a context-free
grammar for this language.
Example 5.4: The language L = {wwᴿ | w ∈ {a, b}*}, where each string in L is a palindrome,
is not regular. Here is a context-free grammar for this language.
Example 5.5: The language L = {w | w ∈ {a, b}*, na(w) = nb(w)}, where each string in L has
an equal number of a's and b's, is not regular. Consider the following grammar:
1. Does every string generated by this grammar have an equal number of a's and b's?
2. Is every string consisting of an equal number of a's and b's generated by this
grammar?
Consider the linear grammar: ({S, B}, {a, b}, S, {S → aS, S → B, B → bB, B → λ}).
Each of {S, aS, aB, abB, abbB, abb} is a sentential form. Because this grammar is linear, each
sentential form has at most one variable. Hence there is never any choice about which variable
to expand next.
With this grammar, there is a choice of variables to expand. Here is a sample derivation:
If we always expanded the leftmost variable first, we would have a leftmost derivation:
Conversely, if we always expanded the rightmost variable first, we would have a rightmost
derivation:
This tree represents not just the given derivation, but all the different orders in which the same
productions could be applied to produce the string abbc.
A partial derivation tree is any subtree of a derivation tree such that, for any node of the
subtree, either all of its children are also in the subtree, or none of them are.
The yield of the tree is the final string obtained by reading the leaves of the tree from left to
right, ignoring the λ’s (unless all the leaves are λ, in which case the yield is λ). The yield of the
above tree is the string abbc, as expected.
The yield of a partial derivation tree that contains the root is a sentential form.
5.4.1 Ambiguity
The following grammar generates strings having an equal number of a's and b's.
G = ({S}, {a, b}, S, S → aSb | bSa | SS | λ)
The string "abab" can be generated from this grammar in two distinct ways, as shown by the
following derivation trees:
Each derivation tree can be turned into a unique rightmost derivation, or into a unique
leftmost derivation. Each leftmost or rightmost derivation can be turned into a unique
derivation tree. So these representations are largely interchangeable.
Grammars are used in compiler construction. Ambiguous grammars are undesirable because
the derivation tree provides considerable information about the semantics of a program;
conflicting derivation trees provide conflicting information.
Ambiguity is a property of a grammar, and it is usually (but not always) possible to find an
equivalent unambiguous grammar.
Assignment No. 5
If the empty string does belong to a language, then we can eliminate λ from all productions
save for the single production S → λ. In this case we can also eliminate any occurrences of S
from the right-hand side of productions.
Chomsky Normal Form is particularly useful for programs that have to manipulate grammars.
Grammars in Greibach Normal Form are typically ugly and much longer than the CFG from
which they were derived. Greibach Normal Form is useful for proving the equivalence of
CFGs and NPDAs. When we discuss converting a CFG to an NPDA, or vice versa, we will
use Greibach Normal Form.
THEOREM
Any context-free language is generated by a context-free grammar in Chomsky normal form.
PROOF IDEA
We can convert any grammar G into Chomsky normal form. The conversion has several stages
wherein rules that violate the conditions are replaced with equivalent ones that are satisfactory.
First, we add a new start variable. Then, we eliminate all ε-rules of the form A → ε. We also
eliminate all unit rules of the form A → B; in both cases we patch up the grammar so that it
still generates the same language.
PROOF
First, we add a new start variable S0 and the rule S0 → S, where S was the original start variable.
This change guarantees that the start variable doesn't occur on the right-hand side of a rule.
Second, we take care of all ε-rules. We remove an ε-rule A → ε, where A is not the start variable.
Then for each occurrence of an A on the right-hand side of a rule, we add a new rule with that
occurrence deleted. In other words, if R → uAv is a rule in which u and v are strings of variables
and terminals, we add rule R → uv. We do so for each occurrence of an A, so the rule R →
uAvAw causes us to add R → uvAw, R → uAvw, and R → uvw. If we have the rule R → A, we
add R → ε unless we had previously removed the rule R → ε. We repeat these steps until we
eliminate all ε-rules not involving the start variable.
Third, we handle all unit rules. We remove a unit rule A → B. Then, whenever a rule B → u
appears, we add the rule A → u unless this was a unit rule previously removed. As before, u is
a string of variables and terminals. We repeat these steps until we eliminate all unit rules.
Finally, we convert all remaining rules into the proper form. We replace each rule A → u1u2
… uk, where k ≥ 3 and each ui is a variable or terminal symbol, with the rules A → u1A1,
A1 → u2A2, A2 → u3A3, …, and Ak−2 → uk−1uk. The Ai's are new variables. If k = 2, we replace
any terminal ui in the preceding rule(s) with the new variable Ui and add the rule Ui → ui.
Example 5.7:
Consider the following CFG and convert it to Chomsky’s Normal Form by using the conversion
procedure just given. The series of grammars presented illustrates the steps in the conversion.
Rules shown in bold have just been added. Rules shown in gray have just been removed.
1. The original CFG is shown on the left. The result of applying the first step to make a new
start variable appears on the right.
S → ASA | aB          S0 → S
A → B | S             S → ASA | aB
B → b | ε             A → B | S
                      B → b | ε
2. Remove the rule B → ε, shown on the left, and then A → ε, shown on the right.
S0 → S                S0 → S
S → ASA | aB | a      S → ASA | aB | a | SA | AS | S
A → B | S | ε         A → B | S
B → b                 B → b
3a. Remove unit rules S → S, shown on the left, and S0 → S, shown on the right.
4. Convert the remaining rules into the proper form by adding additional variables and rules.
The final grammar in Chomsky normal form is equivalent to G6, which follows. (Actually the
procedure given in the Theorem produces several variables Ui along with several rules Ui → a.
We simplified the resulting grammar by using a single variable U and the rule U → a.)
S0 → AA1 | UB | a | SA | AS
S → AA1 | UB | a | SA | AS
A → b | AA1 | UB | a | SA | AS
A1 → SA
U → a
B → b
5.8 Parsing
There are two ways to use a grammar:
Use the grammar to generate strings of the language. This is easy -- start with the start
symbol, and apply derivation steps until you get a string composed entirely of
terminals.
Use the grammar to recognize strings; that is, test whether they belong to the
language. For CFGs, this is usually much harder.
A language is a set of strings, and any well-defined set must have a membership criterion. A
context-free grammar can be used as a membership criterion -- if we can find a general
algorithm for using the grammar to recognize strings.
Parsing a string is finding a derivation (or a derivation tree) for that string. Parsing a string is
like recognizing a string. An algorithm to recognize a string will give us only a yes/no answer;
an algorithm to parse a string will give us additional information about how the string can be
formed from the grammar. The only realistic way to recognize a string of a context-free
grammar is to parse it.
Systematic approaches are easy to find. Almost any exhaustive search technique will do.
We can (almost) make the search finite by terminating every search path at the point that it
generates a sentential form containing more than |w| terminals.
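The bounded exhaustive search just described is short enough to sketch directly (Python; the grammar, which generates aⁿbⁿ for n ≥ 1 and has no λ- or unit productions, is our own illustration):

# Expand sentential forms exhaustively, pruning any form that already
# contains more than |w| terminals.
grammar = {"S": ["aSb", "ab"]}     # S → aSb | ab
terminals = set("ab")

def recognize(w, start="S"):
    frontier, seen = {start}, set()
    while frontier:
        form = frontier.pop()
        if form == w:
            return True
        seen.add(form)
        for i, sym in enumerate(form):       # expand the leftmost variable
            if sym in grammar:
                for rhs in grammar[sym]:
                    new = form[:i] + rhs + form[i + 1:]
                    if new not in seen and sum(c in terminals for c in new) <= len(w):
                        frontier.add(new)
                break
    return False

print(recognize("aabb"))   # True
print(recognize("aab"))    # False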
Note: for the time being, we will ignore the possibility that λ is in the language.
If λ belongs to the language, we need to keep the production S → λ. This creates a problem if
S occurs on the right-hand side of some production, because then we have a way of decreasing
the length of a sentential form. All we need to do in this case is to add a new start symbol, say
S0, and to replace the production S → λ with the pair of productions
S0 → λ
S0 → S
There are ways to further restrict context-free grammars so that strings may be parsed in linear
or near-linear time. These restricted grammars are covered in courses in compiler construction
but will not be considered here. All such methods do reduce the power of the grammar, thus
limiting the languages that can be recognized. There is no known linear or near-linear
algorithm for parsing strings of a general context-free grammar.
M = (K, Σ, Γ, Δ, s, F) where,
K = finite state set
Σ = finite input alphabet
Γ = finite stack alphabet
s ∈ K: start state
F ⊆ K: final states
Figure 12: Pushdown Automaton (PDA)
Δ is now a function of three arguments. The first two are the same as before: the state, and
either λ or a symbol from the input alphabet. The third argument is the symbol on top of the
stack. Just as the input symbol is "consumed" when the function is applied, the stack symbol
is also "consumed" (removed from the stack). We must have the finite qualifier because the
full subset is infinite by virtue of the Γ* component.
The meaning of the transition relation is that, for σ ∈ Σ, if ((p, σ, α), (q, β)) ∈ Δ, then from
state p, reading σ with α on top of the stack, the PDA may move to state q, replacing α with β.
For example, the transition ((p, u, λ), (q, a)) pushes the symbol a onto the stack, and
((p, u, a), (q, λ)) pops a from the stack.
Let the symbol "⊢" indicate a move of the PDA, and suppose that δ(q1, a, x) = {(q2, y), ...}.
Then the following move is possible:
(q1, aw, xZ) ⊢ (q2, w, yZ)
where w indicates the rest of the string following the a, and Z indicates the rest of the stack
contents underneath the x. This notation says that in moving from state q1 to state q2, an a is
consumed from the input string aw, and the x at the top (left) of the stack xZ is replaced with
y, leaving yZ on the stack.
We have the notation "⊢" to indicate a single move of a PDA. We will also use "⊢*" to
indicate a sequence of zero or more moves, and "⊢⁺" to indicate a sequence of one or
more moves. As expected, the yields relation ⊢* is the reflexive, transitive closure of ⊢.
A string w is accepted by the PDA if (q0, w, ε) ⊢* (f, ε, ε) for some f ∈ F. Namely, from the start state with
empty stack, we
• process the entire string,
• end in a final state
• end with an empty stack.
The language accepted by the PDA, L(M) = {w | (q0, w, z) ⊢* (f, ε, ε), f ∈ F}, is the set of
all accepted strings. The empty stack is our key new requirement relative to finite state
machines. To recognize string w, begin with the instantaneous description (q0, w, z),
where
q0 is the start state, w is the string to be recognized, and z is the initial stack symbol.
Starting with this instantaneous description, make zero or more moves, just as you would with
an NFA. There are two kinds of moves that you can make:
λ-transitions. If you are in state q1, x is the top (leftmost) symbol in the stack, and δ(q1,
λ, x) = {(q2, w2), ...}, then you can replace the symbol x with the string w2 and move to
state q2.
Nonempty transitions. If you are in state q1, a is the next unconsumed input symbol, x
is the top (leftmost) symbol in the stack, and δ(q1, a, x) = {(q2, w2), ...}, then you can
remove the a from the input string, replace the symbol x with the string w2, and move
to state q2.
In final state acceptability, a PDA accepts a string when, after reading the entire string, the
PDA is in a final state. From the starting state, we can make moves that end up in a final state
with any stack values. The stack values are irrelevant as long as we end up in a final state.
For a PDA (Q, Σ, Γ, δ, q0, z, F), the language accepted by the set of final states F is
L(PDA) = {w | (q0, w, z) ⊢* (q, ε, x), q ∈ F}, where x is any remaining stack string.
If you are in a final state when you reach the end of the string (and maybe make some λ
transitions after reaching the end), then the string is accepted by the PDA. It doesn't matter
what is left on the stack.
A transition of the form ((p, λ, λ), (p, λ)) is never useful, because it consults neither the
input nor the stack and will leave the previous configuration intact.
Both NPDAs and DPDAs may have λ-transitions; but a DPDA may have a λ-transition
only if no other transition is possible.
Formally: If δ(q, λ, b) ≠ ∅, then δ(q, c, b) = ∅ for every c ∈ Σ.
Example 6.1:
Every finite automaton can be viewed as a pushdown automaton that never operates on its stack.
Let M = (K, ∑, ∆, s, F) be a finite automaton and let M’ = (K, ∑, Γ, ∆’, s, F), where Γ = ∅ and
∆’ = {((p, u, e), (q, e)) : (p, u, q) ∈ ∆}.
Then M and M’ accept the same language.
Example 6.2:
Design a PDA to accept the language L = {wcwᴿ : w ∈ {a, b}*}.
Let M = (K, ∑, Γ, ∆, s, F) where K = { s, f }, ∑ = {a, b, c}, Γ = {a, b} and F = {f} and ∆
contains the following transitions.
1. ((s, a, e), (s, a)) push a
2. ((s, b, e), (s, b)) push b
3. ((s, c, e), (f, e)) change state
4. ((f, a, a), (f, e)) pop a
5. ((f, b, b), (f, e)) pop b
State Unread Input Stack Transition
s abbcbba e -
s bbcbba a 1
s bcbba ba 2
s cbba bba 2
f bba bba 3
f ba ba 5
f a a 5
f e e 4
Observe that this PDA is deterministic in the sense that there are no choices in transitions.
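Because there are no choices, the machine is easy to simulate. A sketch (Python; encoding and names ours) of the five transitions above, with the stack kept as a string whose top is the leftmost character:

trans = [
    (("s", "a", ""), ("s", "a")),    # 1. push a
    (("s", "b", ""), ("s", "b")),    # 2. push b
    (("s", "c", ""), ("f", "")),     # 3. change state
    (("f", "a", "a"), ("f", "")),    # 4. pop a
    (("f", "b", "b"), ("f", "")),    # 5. pop b
]

def accepts(w, state="s", stack=""):
    # Accept iff the input is consumed, we end in f, and the stack is empty.
    # Every transition here consumes one input symbol, so recursion terminates.
    if not w:
        return state == "f" and not stack
    for (p, u, pop), (q, push) in trans:
        if p == state and u == w[0] and stack.startswith(pop):
            if accepts(w[1:], q, push + stack[len(pop):]):
                return True
    return False

print(accepts("abbcbba"))   # True, matching the trace above
print(accepts("abcab"))     # False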
[Figures: two PDAs accepting {aⁿbⁿ : n ≥ 0}, one non-deterministic and one deterministic.]
The idea in both machines is to stack the a's and match off the b's. The first one is non-
deterministic in the sense that it could prematurely guess that the a's are done and start matching
off b's. The second version is deterministic in that the first b acts as a trigger to start matching
off. Note that we must make both states final in the second version in order to accept ε.
A transition whose stack component is λ, written ((p, x, λ), (q, β)) with x = σ ∈ Σ or x = ε,
means to make the move without consulting the stack; it says nothing about whether the stack
is empty or not. Nevertheless, one can maintain knowledge of an empty stack by using a
dedicated stack symbol, c, representing the "stack bottom", with the property that it is pushed
onto an empty stack by a transition from the start state with no other outgoing or incoming
transitions.
Example 6.4:
Design a PDA to accept the language having equal numbers of a’s and b’s. #σ(w) = the number
of occurrences of σ in w
The language is L = {w ∈ {a,b}*: #a(w) = #b(w) }.
• The PDA keeps a special symbol c on the bottom of the stack as a marker.
• Either a string of a’s or a string of b’s is kept by M on its stack.
Let M = (K, ∑, Γ, ∆, s, F), where K = {s, q, f}, ∑ = {a, b}, Γ = {a, b, c},
F = {f} and ∆ is listed below:
1. ((s, e, e), (q, c))
2. ((q, a, c), (q, ac))
3. ((q, a, a), (q, aa))
4. ((q, a, b), (q, e))
5. ((q, b, c), (q, bc))
6. ((q, b, b), (q, bb))
7. ((q, b, a), (q, e))
8. ((q, e, c), (f, e))
a. { aⁿbⁿ : n ≥ 0 }
b. { aᵐbⁿ : 0 ≤ m < n }
In the deterministic case, when the function is applied, the automaton moves to a new state
q ∈ Q and pushes a new string of symbols x ∈ Γ* onto the stack. Since we are dealing with a
nondeterministic pushdown automaton, the result of applying δ is a finite set of (q, x) pairs. If
we were to draw the automaton, each such pair would be represented by a single arc.
As with an NFA, we do not need to specify δ for every possible combination of arguments.
For any case where δ is not specified, the transition is to ∅, the empty set.
Consider the example of following NPDA.
Example 6.6: Q = {q0, q1, q2, q3}, Σ = {a, b}, Γ = {0, 1}, δ, q0, z = 0, F = {q3}, where,
Example 6.7:
Construct a PDA to accept L = {wwᴿ : w ∈ {a, b}*}.
M = (K, ∑, Γ, ∆, s, F) where K = {s, f}, ∑ = {a, b}, Γ = {a, b}, and F = {f} and ∆ contains the
following five transitions:
1. ((s, a, e), (s, a))
2. ((s, b, e), (s, b))
3. ((s, e, e), (f, e))
4. ((f, a, a), (f, e))
5. ((f, b, b), (f, e))
The machine guesses when it has reached the middle of the input string and changes
from state s to f in a non-deterministic fashion.
Whenever the machine is in state s, it can non-deterministically choose either to push
the next input symbol into the stack, or to switch to state f without consuming any input.
This PDA is identical to the PDA in Example 6.2 except for the ε-transition. Nevertheless, there
is a significant difference in that this PDA must guess when to stop pushing symbols, jump to
the final state and start matching off of the stack.
Therefore, this machine is decidedly non-deterministic. In a general programming model (like
Turing Machines), we have the luxury of preprocessing the string to determine its length and
thereby knowing when the middle is coming.
Assignment No. 6
Of the two directions, the first uses a very simple, intuitive construction: a PDA that mimics a
leftmost derivation. The converse step is far more complicated, and its construction
involves two steps:
• Convert a PDA into a simple PDA.
The notion of simple means that the stack is always consulted, and pushed or popped
by one symbol only, or kept the same size. This part is quite straightforward.
• Convert the simple PDA into a grammar.
To say that G and M are equivalent means that L(M) = L(G), or, considering an arbitrary
string w ∈ Σ*:
S ⇒* w ⇔ (0,w,ε) ⊢* (1,ε,ε)
(Proof ⇒):
Induction on the length of the leftmost derivation.
S ⇒ⁿ α′ ⇒ wα
Then, since the last step was leftmost, we can write:
α′ = xAβ for x ∈ Σ*
and then
xzβ = wα, where A → z is the production applied (A)
By induction, since S ⇒ⁿ xAβ:
(1,x,S) ├* (1,ε,Aβ)
Furthermore, applying the transition of type 2, we have:
(1,ε,Aβ) ├ (1,ε,zβ)
Putting these two together:
(1,x,S) ├* (1,ε,zβ) (B)
Looking at (A) we see that the string x must be a prefix of w because α begins with a non-
terminal, or is empty. Write
w = xy and therefore, zβ = yα (C)
As a consequence of (B) we get:
(1,xy,S) ├* (1,y,zβ) (D)
Combine (C) and (D) to get:
(1,w,S) ├* (1,y,yα) (E)
Apply |y| transitions of type 3 to get
(1,y,yα) ├* (1,ε,α) (F)
Combine (E) and (F) to get the desired result:
(1,w,S) ├* (1,ε,α)
(Proof ⇐):
The proof going this direction is by induction on the number of type-2 steps in the derivation.
This restriction makes the entire proof simpler than the converse that we just proved.
We'll proceed to the induction step. Assume true for up to n steps and prove true for n+1 type-
2 steps. Write:
(1,w,S) ├* (1,y,Aβ) ├ (1,y,zβ) ├* (1,ε,α)
where the use of the rule
A⇒z
represents the final type-2 step and the last part of the chain is type-3 steps only. The string y
must be a suffix of w, and so writing
w = xy
we have:
(1,xy,S) ├* (1,y,Aβ)
and therefore,
We assert without proof that any NPDA can be transformed into an equivalent NPDA that has
the following form:
1. The NPDA has only one final state, which it enters if and only if the stack is empty;
2. With a ∈ ∑ ∪ {λ}, all transitions must have the form
δ(qi, a, A) = (qj, λ)
or
δ(qi, a, A) = (qj, BC)
When we write a grammar, we can use any variable names we choose. As in programming
languages, we like to use "meaningful" variable names. When we translate an NPDA into a
CFG, we will use variable names that encode information about both the state of the NPDA
and the stack contents. Variable names will have the form [qiAqj], where qi and qj are states
and A is a variable. The "meaning" of the variable [qiAqj] is that the NPDA can go from state
qi with Ax on the stack to state qj with x on the stack.
Each transition of the form δ (qi, a, A) = (qj, λ) results in a single grammar rule.
Each transition of the form δ (qi, a, A) = (qj, BC) results in a multitude of grammar rules, one
for each pair of states qx and qy in the NPDA.
Since the PDA accepts by empty stack, the final set F is irrelevant. The construction defines a
CFG G = (V, T, P, S) where V contains a start symbol S as well as a symbol [sY t] for every
combination of stack symbol Y and states s and t. Thus V = {S, [qZq], [qZr], [qXq], [qXr],
[rZq], [rZr], [rXq], [rXr]}. T = {0, 1}, the input alphabet of the PDA. S is the start symbol for
the grammar.
The intuitive meaning of variables like [qXq] is that it represents the language (set of strings)
that label paths from q to q that have the net effect of popping X off the stack (without going
deeper into the stack).
The productions P of G have two forms. First, for the start symbol S we add a production
S → [startState startStackSymbol q] for every state q in the PDA. The language
generated by S will correspond to the set of strings labeling paths from the start state to any
other state that have the net effect of emptying the stack (popping off the starting stack symbol).
Second, for each transition from state s to state t that reads a from the input and pops Y from
the stack while pushing nothing, we add the production
[sYt] → a
to the grammar (we have three transitions of this form in the PDA, all into state r). This
corresponds to the fact that there is a path from s to t labeled by a that has the net effect of
popping Y off the stack.
After this stage we have the following productions (with all non-terminals listed even if they
don’t have any productions):
S → [qZq] | [qZr]
[qZq] →
[qZr] →
[qXq] →
[qXr] → 1
[rZq] →
[rZr] → ε
[rXq] →
[rXr] → 1
These "push nothing" transitions are just a special case of the general rule:
If there is a transition from s to t that reads a from the input, pops Y from the stack, and
pushes k symbols Y1Y2···Yk onto the stack, add all productions of the form
[sYsk] → a[tY1s1][s1Y2s2]···[sk-1Yksk]
to the grammar (for all combinations of states s1, s2, …, sk).
This expresses the intuition that the PDA can go from s to sk with a net effect of popping Y by
first going from s to t while popping Y and pushing the Yi’s onto the stack, and then taking a
path that pops off each Yi in turn.
In the "push nothing" case, this results in RHS’s that are single terminals as above.
Let’s apply the general rule to the δ(q, 0, Z) = (q, XZ) transition. In this case, we are pushing
the string XZ of length 2 onto the stack, and need to add all productions of the form:
[qZs2] → 0[qXs1][s1Zs2]
where s1 and s2 can be any combination of q and/or r. In other words, we add the
productions
[qZq] → 0[qXq][qZq] | 0[qXr][rZq]
[qZr] → 0[qXq][qZr] | 0[qXr][rZr]
to the grammar.
Repeating this for the δ(q, 0, X) = (q, XX) transition gives us:
[qXq] → 0[qXq][qXq] | 0[qXr][rXq]
[qXr] → 0[qXq][qXr] | 0[qXr][rXr]
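Because the rule quantifies over all intermediate states, the productions can be generated mechanically. A sketch (Python; the transition encoding is our own):

from itertools import product

states = ["q", "r"]

def productions(s, a, Y, t, pushed):
    # All productions induced by the PDA transition δ(s, a, Y) = (t, pushed)
    k = len(pushed)
    if k == 0:                               # the "push nothing" case
        return ["[%s%s%s] -> %s" % (s, Y, t, a or "ε")]
    result = []
    for mids in product(states, repeat=k):   # choose s1, ..., sk
        body = "".join(
            "[%s%s%s]" % (p, X, n)
            for p, X, n in zip((t,) + mids[:-1], pushed, mids)
        )
        result.append("[%s%s%s] -> %s%s" % (s, Y, mids[-1], a or "ε", body))
    return result

# δ(q, 0, Z) = (q, XZ): reproduces the four [qZ..] productions above
for prod in productions("q", "0", "Z", "q", "XZ"):
    print(prod)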
Collecting all of these productions together, we get (again listing all variables even if they
have no productions):
S → [qZq] | [qZr]
[qZq] → 0[qXq][qZq] | 0[qXr][rZq]
[qZr] → 0[qXq][qZr] | 0[qXr][rZr]
[qXq] → 0[qXq][qXq] | 0[qXr][rXq]
[qXr] → 1 | 0[qXq][qXr] | 0[qXr][rXr]
[rZq] →
[rZr] → ε
[rXq] →
[rXr] → 1
To verify that the language generated by the grammar is the same as the language accepted by
the PDA (by empty-stack), we will look at an example string to gain further insight into why
the construction works.
Consider the string 0011 in the language. The PDA accepts the string by pushing two X’s on
the stack while reading 0’s and then popping them off while reading 1’s. To derive 0011 in
the grammar, we can mirror that behaviour:
S ⇒ [qZr] ⇒ 0[qXr][rZr] ⇒ 00[qXr][rXr][rZr] ⇒ 001[rXr][rZr] ⇒ 0011[rZr] ⇒ 0011
Consider all the productions again. If we ever generate a [rZq] or [rXq] symbol, then we can’t
remove it (since those symbols have no productions). Therefore we can remove those symbols
and the productions containing them from the grammar.
S → [qZq] | [qZr]
[qZq] → 0[qXq][qZq]
[qZr] → 0[qXq][qZr] | 0[qXr][rZr]
[qXq] → 0[qXq][qXq]
[qXr] → 1 | 0[qXq][qXr] | 0[qXr][rXr]
[rZr] → ε
[rXr] → 1
Now the only production for [qZq] produces another [qZq] symbol, so if we ever generate
the [qZq] variable we will never be able to get to a string of just terminals. Similarly, the only
production for [qXq] produces more [qXq]’s. Removing these two variables (and the
productions that use them) simplifies the grammar to:
S → [qZr]
[qZr] → 0[qXr][rZr]
[qXr] → 1 | 0[qXr][rXr]
[rZr] → ε
[rXr] → 1
Variable [rZr] generates only ε, so it is a no-op and can be deleted. However, unlike the
previous simplifications, we must keep modified versions (with [rZr] replaced by ε) of the
productions using [rZr]. Similarly, variable [rXr] generates only the terminal 1, so we can
replace all uses of it with 1.
S → [qZr]
[qZr] → 0[qXr]
[qXr] → 1 | 0[qXr]1
Now it is easy to see that all derivations of the grammar start with S ⇒ [qZr] ⇒ 0[qXr].
Furthermore, we can now see that variable [qXr] generates all sequences of k ≥ 0 zeros
followed by k + 1 ones. Therefore, S generates {0ⁿ1ⁿ | n ≥ 1}.
Assignment No. 7
Definition 3:
A Context-sensitive language is specified with a context-sensitive grammar (CSG), in which
every production has the form:
α → β, where |β| ≥ |α|
Example 7.1:
Let G=(V, T, P, S) be context-sensitive grammar whose production rules are
P = { S → aBb
aB → bBB
bB → aa
B → b}.
The derivation for w = aaabb is as follows:
S ⇒ aBb ⇒ bBBb ⇒ aaBb ⇒ abBBb ⇒ aaaBb ⇒ aaabb
Having got one instance of S, we may want to prepend more a’s to the beginning; if we want
to remember how many there were, we shall have to append something to the end as well at
the same time, and that cannot be a b or a c. We shall use a yet unknown symbol Q. The
following rule pre- and postpends:
1. S → abc | aSQ
If we apply this rule, for instance, three times, we get the sentential form
aaabcQQ
Now, to get aaabbbccc from this, each Q must be worth one b and one c, as was to be
expected, but we cannot just write
Q → bc
because that would allow b’s after the first c. The above rule would, however, be all right if it
were allowed to do replacement only between a b and a c; there, the newly inserted bc will do
no harm:
2. bQc → bbcc
Still, we cannot apply this rule since normally the Q’s are to the right of the c; this can be
remedied by allowing a Q to hop left over a c:
3. cQ → Qc
The complete grammar is therefore:
1. S → abc | aSQ
2. bQc → bbcc
3. cQ → Qc
A derivation tree for a²b²c² is given in the following Figure. The grammar is monotonic and
therefore of Type 1; it can be proved that there is no Type 2 grammar for the language.
LBA Definition
A linear bounded automaton is a 5-tuple M =
(Q, Σ, Γ, q0, δ), where:
• Q is a finite set of states.
• ∑ is an alphabet (input symbols).
• Γ is an alphabet (store symbols).
• q0 ∈ Q is the initial state.
• δ, the transition function, is from Q ×( Γ ∪ {<, >}) to Q×( Γ ∪ {<, >}) × A.
If ((q, a), (q', b, action)) ∈ δ, then when in state q with a at the current read position on the
tape, M may replace a with b on the tape, perform the specified action, and enter state q'.
M accepts w ∈ ∑* iff it starts with configuration (q0, <w >) and the action Y is taken.
State t looks for the leftmost a, changes this to an x, and moves into state u. If no symbol
from the input alphabet can be found, then the input string is accepted.
State u moves right past any a’s or x’s until it finds a b. It changes this b to an x, and
moves into state v.
State v moves right past any b’s or x’s until it finds a c. It changes this c to an x, and
moves into state w.
State w moves left past any a’s, b’s, c’s or x’s until it reaches the start boundary, and
moves into state t.
LBA Computation
Consider input aabbcc for the previous LBA M:
Unlike the other automata we have discussed, a Turing machine does not read "input."
Instead, there may be (and usually are) symbols on the tape before the Turing machine begins;
the Turing machine might read some, all, or none of these symbols. The initial tape may, if
desired, be thought of as "input."
xi … xj qm xk … xl
where the x’s are the symbols on the tape, qm is the current state, and the tape head is on the
square containing xk (the symbol immediately following qm). A move of a Turing machine can
therefore be represented as a pair of instantaneous descriptions, separated by the symbol "⊢".
For example, if
δ(q5, b) = (q8, c, R)
then a possible move might be
abbabq5babb ⊢ abbabcq8abb
(Notice that this definition assumes that the Turing machine starts with its tape head
positioned on the leftmost symbol.)
We said a Turing machine accepts its input if it halts in a final state. There are two ways this
could fail to happen:
1. The Turing machine could halt in a non-final state, or
2. The Turing machine could never stop (in which case we say it is in an infinite loop. )
If a Turing machine halts, the sequence of configurations leading to the halt state is called a
computation.
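The instantaneous-description notation translates directly into a small simulator. A sketch (Python; the two-rule machine, which just rewrites a’s to b’s and halts on the first blank, is our own toy example, not one of the machines below):

# δ(state, symbol) = (new state, written symbol, move); "#" is the blank.
delta = {
    ("q0", "a"): ("q0", "b", "R"),
    ("q0", "#"): ("h", "#", "R"),    # blank reached: move to halt state h
}

def run(tape, state="q0", head=0, blank="#"):
    tape = list(tape)
    while True:
        # print the instantaneous description x...x q x...x
        print("".join(tape[:head]) + state + "".join(tape[head:]))
        sym = tape[head] if head < len(tape) else blank
        if (state, sym) not in delta:    # no move defined: the machine halts
            break
        state, write, move = delta[(state, sym)]
        while head >= len(tape):         # extend the tape with blanks as needed
            tape.append(blank)
        tape[head] = write
        head += 1 if move == "R" else -1

run("aa")    # prints q0aa, bq0a, bbq0, bb#h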
Assignment No. 8
Example 8.2: Consider the language of palindrome over {a, b}. The Turing machine is given
as follows:
(q0, #abaa) ├ (q1, # a baa) ├ (q2, ##baa) ├*(q2, ##baa#) ├ (q3, ##baa)
├ (q4, ##ba) ├* (q4, ##ba) ├ (q1, ##ba) ├ (q5, ###a) ├ (q5, ###a#) ├ (q6, ###a) crash.
(q0, #aba) ├ (q1, # a ba) ├ (q2, ##ba) ├*(q2, ##ba#) ├ (q3, ##ba)
├ (q4, ##b) ├ (q4, ##b) ├ (q1, ##b) ├ (q5, ####) ├ (q6, ###) ├ (h, #####) accept.
q0x ⊢* qf y
where qf is a final state. A function f is Turing computable if there exists a Turing machine
that can perform the above task.
(q0, ##aaaa#) ├ (q1, ##aaaa#) ├ (q2, ##aaa##) ├ (q3, ##aaa##) ├ (q0, ##aa###)
├ (q1, ##aa###) ├ ( q2, ##a####) ├ (q3, ##a####) ├ (q0, #######) ├ (q1, #######)
├ (q4, #######) ├ (q5, ##Y###) ├ (h, ##Y###).
[Figure: transition diagram of the Turing machine (edge labels of the form read/write, move).]
Let’s trace the moves by the machine for the input string abab.
(q0, #abab) ├ (q1, # abab) ├ (q2, #Abab) ├*(q2, #Abab#) ├ (q3, #Abab)
├ (q4, #AbaB) ├* (q4, #AbaB) ├ (q1, #AbaB) ├ (q2, #ABaB)
├ (q2, #ABaB) ├ (q3, #ABaB) ├ (q4, #ABAB) ├ (q1, #ABAB)
├ (q5, #ABAB) ├ (q5, #AbAB)├ (q5, #abAB)
(First phase completed, center found.)
├ (q6, #abAB) ├ (q8, # AbAB) ├ (q8, #abAB) ├ (q9, #Ab#B)
[Figure: state diagram of the Turing machine with states q0 through q6 (edge labels of the form read/write, move).]
(q0, #aba) ├ (q1, #aba) ├ (q2, #Aba) ├ (q2, #Aba) ├ (q2, #Aba#)
├ (q3, #Aba##) ├ (q4, #Aba#a) ├ (q4, #Aba#a) ├ (q4, #Aba#a)
├ (q4, #Aba#a) ├ (q1, #Aba#a) ├ (q5, #ABa#) ├ (q5, #ABa#a)
├ (q6, #ABa#a) ├ (q6, #ABa#a#)
├ (q6, # ABa#ab) ├ (q4, #ABa#ab) ├ (q4, #ABa#ab)
├ (q4, #ABa#ab) ├ (q1, #ABa#ab) ├ (q2, #ABA#ab)
├ (q3, #ABA#ab) ├ (q3, #ABA#ab) ├ (q3, #ABA#ab#)
├ (q4, #ABA#aba) ├ (q4, #ABA#aba) ├ (q4, #ABA#aba)
├ (q4, #ABA#aba) ├ (q1, #ABA#aba) ├ (q7, #ABA#aba)
├ (q7, #ABa#aba) ├ (q7, #Aba#aba) ├ (q7, #aba#aba)
├ (h, #aba#aba). (Accepted).
Examining the preceding computation, we see that D halts with input R(D) if, and only if, D
does not halt with input R(D). This is obviously a contradiction. However, the machine D can
be constructed directly from a machine H that solves the halting problem. The assumption that
the halting problem is decidable produces the preceding contradiction. Therefore, we conclude
that the halting problem is undecidable.
#include <stdio.h>

int main(void)
{
    printf("hello world");
    return 0;
}
If the value of n that the program reads is 2, then it will eventually find combinations of
integers such as total = 12, x = 3, y = 4, and z = 5, for which xⁿ + yⁿ = zⁿ. Thus, for input 2, the
program does print hello world.
However, for any integer n > 2, the program will never find a triple of positive integers to
satisfy xⁿ + yⁿ = zⁿ, and thus will fail to print hello world. Interestingly, until a few years ago, it
was not known whether this program would print hello world for some large integer n. The
claim that it would not, i.e., that there are no positive integer solutions to the equation
xⁿ + yⁿ = zⁿ if n > 2, was made by Fermat 300 years ago, but no proof was found until quite
recently. This statement is often referred to as "Fermat's last theorem."
Let us define the "hello world" problem to be: determine whether a given C program, with a
given input, prints hello world as the first 11 characters that it prints. It would be remarkable
indeed if we could write a program that could examine any program P and input I for P, and
tell whether P, run with I as its input, would print hello world. We shall prove that no such
program exists.
What does H2 do when given itself as input? Recall that H2, given any program P as input, makes
output yes if P prints hello world when given itself as input. Also, H2 prints hello world if P,
given itself as input, does not print hello world as its first output.
Suppose that H2 makes the output yes. Then the H2 in the box is saying about its input H2
that H2, given itself as input, prints hello world as its first output. But we just supposed that
the first output H2 makes in this situation is yes rather than hello world.
Thus, it appears that the output of the box is hello world, since it must be one or the other. But
if H2, given itself as input, prints hello world first, then the output of the program H2 must be
yes. Whichever output we suppose H2 makes, we can argue that it makes the other output.
This situation is paradoxical, and we conclude that H2 cannot exist. As a result, we have
contradicted the assumption that H exists. That is, we have proved that no program H can tell
whether or not a given program P with input I prints hello world as its first output.