Skip to content

Automata and Formal Languages

1. Finite Automata

1.1 Deterministic Finite Automaton (DFA)

A DFA is a 5-tuple M=(Q,Σ,δ,q0,F)M = (Q, \Sigma, \delta, q_0, F) where:

  • QQ = finite set of states
  • Σ\Sigma = input alphabet
  • δ:Q×ΣQ\delta: Q \times \Sigma \to Q = transition function
  • q0Qq_0 \in Q = start state
  • FQF \subseteq Q = set of accepting states

Language of MM: L(M)={wΣ:M accepts w}L(M) = \{w \in \Sigma^* : M \text{ accepts } w\}

Example: DFA for strings ending in “ab” over Σ={a,b}\Sigma = \{a, b\}:

States: {q0, q1, q2}
q0 → (a) → q1, q0 → (b) → q0
q1 → (a) → q1, q1 → (b) → q2
q2 → (a) → q1, q2 → (b) → q0
Start: q0, Accept: q2

Extended transition function: δ:Q×ΣQ\delta^*: Q \times \Sigma^* \to Q

δ(q,ϵ)=q\delta^*(q, \epsilon) = q δ(q,wa)=δ(δ(q,w),a)\delta^*(q, wa) = \delta(\delta^*(q, w), a)

1.2 Nondeterministic Finite Automaton (NFA)

An NFA is a 5-tuple M=(Q,Σ,δ,q0,F)M = (Q, \Sigma, \delta, q_0, F) where:

  • δ:Q×(Σ{ϵ})2Q\delta: Q \times (\Sigma \cup \{\epsilon\}) \to 2^Q = transition function (returns a set of states)

Key difference from DFA: Multiple transitions per symbol, ϵ\epsilon-transitions allowed.

Example: NFA for strings containing “aba” over Σ={a,b}\Sigma = \{a, b\}:

States: {q0, q1, q2, q3}
q0 → (a) → {q0, q1}
q1 → (b) → q2
q2 → (a) → q3
Start: q0, Accept: q3

1.3 NFA to DFA Conversion (Subset Construction)

Convert NFA NN to equivalent DFA DD:

SUBSET_CONSTRUCTION(N):
D.states = 2^Q
D.start = ε-closure({q0})
mark D.start as unexplored
while unexplored states exist:
pick unexplored state S
for each symbol a in Σ:
T = ε-closure(∪ δ(q, a) for each q in S)
D.transition[S][a] = T
if T not yet in D.states:
add T as unexplored
mark S as explored
D.accept = {S : S ∩ F ≠ ∅}

Time: O(2n)O(2^n) states in worst case. Often much smaller in practice.

1.4 DFA Minimization

Minimize a DFA to the smallest equivalent DFA.

MINIMIZE(DFA):
// Remove unreachable states
remove states not reachable from start
// Partition into accepting and non-accepting
P = {F, Q \ F}
repeat:
P" = {}
for each group G in P:
split G into subgroups where
for all a in Σ: δ(p,a) and δ(q,a) are in same group of P
add subgroups to P'
until P' == P
return DFA with P' as states

Result: Unique minimal DFA (up to isomorphism). O(nlogn)O(n \log n) with Hopcroft’s algorithm.

1.5 DFA vs NFA Equivalence

Theorem: DFAs and NFAs recognize exactly the same class of languages: the regular languages.

DFANFARegular Expressions\text{DFA} \equiv \text{NFA} \equiv \text{Regular Expressions}

2. Regular Expressions

2.1 Definition

Regular expressions over alphabet Σ\Sigma:

  • \emptyset: Empty language.
  • ϵ\epsilon: Language {ϵ}\{\epsilon\}.
  • aa (for aΣa \in \Sigma): Language {a}\{a\}.
  • R1R2R_1 \cdot R_2 (concatenation): {uv:uL(R1),vL(R2)}\{uv : u \in L(R_1), v \in L(R_2)\}.
  • R1R2R_1 \cup R_2 (union/alternation): L(R1)L(R2)L(R_1) \cup L(R_2).
  • RR^* (Kleene star): {w1w2wk:k0,wiL(R)}\{w_1 w_2 \cdots w_k : k \geq 0, w_i \in L(R)\}.
  • R+=RRR^+ = R \cdot R^*: One or more repetitions.

2.2 RE to NFA Conversion (Thompson’s Construction)

RE FormConstruction
\emptysetSingle non-accepting state
ϵ\epsilonStart = accept (single state)
aaStart → aa → accept
R1R2R_1 \cup R_2New start with ϵ\epsilon to starts of R1R_1 and R2R_2; accepts merge via ϵ\epsilon
R1R2R_1 \cdot R_2Connect accept of R1R_1 to start of R2R_2 via ϵ\epsilon
RR^*New start/accept with ϵ\epsilon loops through RR

Properties:

  • RE has at most O(R)O(|R|) states in the NFA.
  • NFA may have ϵ\epsilon-transitions but no more than 2 outgoing transitions per state.

2.3 DFA to RE Conversion

DFA_TO_RE(DFA):
rename states q1, q2, ..., qn
add new start state qs and new accept state qf
for each pair (qi, qj), eliminate states one by one
update transition labels using state elimination formula:
// Eliminating state qk:
// For each path qi → qk → qj:
// new label = (label(qi,qk)) (label(qk,qk))* (label(qk,qj)) ∪ label(qi,qj)

2.4 Closure Properties of Regular Languages

Regular languages are closed under:

OperationProof Method
UnionRE: R1R2R_1 \cup R_2; NFA: parallel construction
IntersectionDFA: product construction
ComplementDFA: swap accept/non-accept states
ConcatenationRE: R1R2R_1 \cdot R_2
Kleene starRE: RR^*
ReversalRE: reverse expression; NFA: reverse arrows
DifferenceAB=ABA \setminus B = A \cap \overline{B}

3. Pumping Lemma for Regular Languages

3.1 Statement

If LL is a regular language, then there exists a constant pp (the pumping length) such that any string sLs \in L with sp|s| \geq p can be divided into s=xyzs = xyz satisfying:

  1. y>0|y| > 0 (y is non-empty)
  2. xyp|xy| \leq p
  3. xyizLxy^iz \in L for all i0i \geq 0

3.2 Using the Pumping Lemma (Proof by Contradiction)

To prove LL is not regular:

  1. Assume LL is regular. Let pp be the pumping length.
  2. Choose a string sLs \in L with sp|s| \geq p (strategically).
  3. Show that for any division s=xyzs = xyz with xyp|xy| \leq p, y>0|y| > 0, some xyizLxy^iz \notin L.
  4. Contradiction → LL is not regular.

Example: L={anbn:n0}L = \{a^n b^n : n \geq 0\}

Let s=apbps = a^p b^p. Since xyp|xy| \leq p, y=aky = a^k for some k1k \geq 1. Then xy0z=apkbpLxy^0z = a^{p-k}b^p \notin L. Therefore LL is not regular.

Example: L={0n1n:n0}L = \{0^n 1^n : n \geq 0\}

Same approach. Not regular.

Example: L={wwR:w{0,1}}L = \{ww^R : w \in \{0,1\}^*\}

Choose s=0p11p0ps = 0^p 1 1^p 0^p. Pumping yy (within first pp symbols) breaks the palindrome structure.

3.3 Common Non-Regular Languages

LanguageNot Regular Because
{anbn:n0}\{a^n b^n : n \geq 0\}Need to count aa‘s vs bb‘s
{anbncn:n0}\{a^n b^n c^n : n \geq 0\}Need to match three counts
{ww:wΣ}\{ww : w \in \Sigma^*\}Need unbounded memory
{ap:p is prime}\{a^p : p \text{ is prime}\}Primes are non-regular
{w{a,b}:#a(w)=#b(w)}\{w \in \{a,b\}^* : \#_a(w) = \#_b(w)\}Need to count

4. Context-Free Grammars

4.1 Definition

A CFG is a 4-tuple G=(V,Σ,R,S)G = (V, \Sigma, R, S) where:

  • VV = finite set of variables (nonterminals)
  • Σ\Sigma = finite set of terminals (alphabet)
  • RR = finite set of production rules of form AαA \to \alpha where AVA \in V, α(VΣ)\alpha \in (V \cup \Sigma)^*
  • SVS \in V = start variable

Language of GG: L(G)={wΣ:Sw}L(G) = \{w \in \Sigma^* : S \Rightarrow^* w\}

4.2 Examples

Balanced parentheses:

S → (S)
S → SS
S → ε

{anbn:n0}\{a^n b^n : n \geq 0\}:

S → aSb
S → ε

{anbncn:n0}\{a^n b^n c^n : n \geq 0\} is NOT context-free (proven by pumping lemma for CFLs).

4.3 Chomsky Normal Form (CNF)

Every production is of the form:

  • ABCA \to BC (two variables)
  • AaA \to a (one terminal)
  • SϵS \to \epsilon (only for start symbol)

Conversion:

  1. Eliminate ϵ\epsilon-productions (except SϵS \to \epsilon).
  2. Eliminate unit productions (ABA \to B).
  3. Eliminate useless symbols (unreachable and non-generating).
  4. Convert remaining productions to CNF.

4.4 Parse Trees

A parse tree for a string ww generated by a CFG:

  • Root: labeled SS
  • Internal nodes: labeled with variables
  • Leaves: labeled with terminals (left to right form ww)
  • Children of node AA: labeled with the right-hand side of a production AαA \to \alpha

4.5 Ambiguity

A CFG is ambiguous if some string has two or more different parse trees (equivalently, different leftmost derivations).

Example:

E → E + E
E → E * E
E → (E)
E → a

The string a+aaa + a * a has two parse trees: (a+a)a(a + a) * a and a+(aa)a + (a * a).

Unambiguous version (precedence):

E → E + T | T
T → T * F | F
F → (E) | a

Inherent ambiguity: Some CFLs have no unambiguous grammar (e.g., {aibjck:i=j or j=k}\{a^i b^j c^k : i = j \text{ or } j = k\}).

5. Pushdown Automata

5.1 Definition

A PDA is a 6-tuple M=(Q,Σ,Γ,δ,q0,F)M = (Q, \Sigma, \Gamma, \delta, q_0, F) where:

  • Γ\Gamma = stack alphabet
  • δ:Q×(Σ{ϵ})×(Γ{ϵ})2Q×(Γ{ϵ})\delta: Q \times (\Sigma \cup \{\epsilon\}) \times (\Gamma \cup \{\epsilon\}) \to 2^{Q \times (\Gamma \cup \{\epsilon\})}

Each transition: read input symbol (or ϵ\epsilon), pop stack symbol (or ϵ\epsilon), push symbol(s) (or ϵ\epsilon), change state.

Deterministic PDA (DPDA): For every configuration, at most one move is possible.

5.2 PDA for {anbn:n0}\{a^n b^n : n \geq 0\}

Q = {q0, q1, q2}, F = {q2}, Σ = {a,b}, Γ = {S, $}
δ(q0, a, $) = {(q0, a$)} // push a for each a read
δ(q0, a, a) = {(q0, aa)}
δ(q0, b, a) = {(q1, ε)} // pop a for each b read
δ(q1, b, a) = {(q1, ε)}
δ(q1, ε, $) = {(q2, $)} // accept when stack has only $

5.3 Equivalence

Theorem: CFGs and PDAs recognize the same class of languages: the context-free languages.

CFGPDACFL\text{CFG} \equiv \text{PDA} \equiv \text{CFL}

Key difference from regular languages: DPDAs are strictly weaker than nondeterministic PDAs.

Regular=DPDANPDA=CFL\text{Regular} = \text{DPDA} \subsetneq \text{NPDA} = \text{CFL}

5.4 Closure Properties of CFLs

OperationClosed?
UnionYes
ConcatenationYes
Kleene starYes
IntersectionNo*
ComplementNo
Intersection with regularYes

*Not closed under general intersection, but closed under intersection with a regular language (using product of PDA and DFA).

6. Pumping Lemma for Context-Free Languages

6.1 Statement

If LL is a CFL, then there exists pp such that any sLs \in L with sp|s| \geq p can be divided into s=uvxyzs = uvxyz satisfying:

  1. vy>0|vy| > 0 (v and y not both empty)
  2. vxyp|vxy| \leq p
  3. uvixyizLuv^ixy^iz \in L for all i0i \geq 0

6.2 Application

Example: L={anbncn:n0}L = \{a^n b^n c^n : n \geq 0\} is not context-free.

Let s=apbpcps = a^p b^p c^p. Since vxyp|vxy| \leq p, the substring vxyvxy cannot span all three letter types.

  • If vxyvxy contains only aa‘s and bb‘s: pumping breaks the aa-bb-cc balance.
  • If vxyvxy contains only bb‘s and cc‘s: same issue.
  • If vxyvxy contains only one letter type: pumping breaks all three counts.

6.3 CFLs vs Non-CFLs

Not CFL (require three matching counts or more):

LanguageWhy Not CFL
{anbncn:n0}\{a^n b^n c^n : n \geq 0\}Three matching counts
{aibjck:i>j>k}\{a^i b^j c^k : i > j > k\}Three ordered counts
{ww:w{a,b}}\{ww : w \in \{a,b\}^*\}Copy language (non-CFL)

7. Turing Machines

7.1 Definition

A Turing machine is a 7-tuple M=(Q,Σ,Γ,δ,q0,qaccept,qreject)M = (Q, \Sigma, \Gamma, \delta, q_0, q_{\text{accept}}, q_{\text{reject}}) where:

  • QQ = states
  • Σ\Sigma = input alphabet (Σ\sqcup \notin \Sigma, blank symbol)
  • Γ=Σ{}\Gamma = \Sigma \cup \{\sqcup\} = tape alphabet
  • δ:Q×ΓQ×Γ×{L,R}\delta: Q \times \Gamma \to Q \times \Gamma \times \{L, R\}
  • q0q_0 = start, qaccept,qrejectq_{\text{accept}}, q_{\text{reject}} = halting states

Operation: Read symbol under head, write symbol, move left/right, change state. Computation halts on accept or reject.

7.2 TM Variants

All are equivalent in power to the standard TM:

VariantDescription
Multi-tape TMMultiple tapes, each with its own head
Nondeterministic TMMultiple possible transitions
EnumeratorsPrint out strings of a language
Multi-dimensional tapeTape extended to 2D+ grid

Theorem: Multi-tape TMs, nondeterministic TMs, and standard TMs recognize the same class of languages.

7.3 Turing-Recognizable (Recursively Enumerable)

A language is Turing-recognizable (RE) if some TM accepts it (may loop on non-members).

7.4 Turing-Decidable (Recursive)

A language is decidable if some TM halts on all inputs (accepts members, rejects non-members).

DecidableREAll languages\text{Decidable} \subsetneq \text{RE} \subsetneq \text{All languages}

8. Church-Turing Thesis

8.1 Statement

Any algorithmic procedure can be simulated by a Turing machine.

This is a thesis (not a theorem) — it cannot be proven because “algorithmic procedure” is an informal notion.

8.2 Implications

  • TM defines the boundary of what is computable.
  • No more powerful computational model exists (in terms of computable functions).
  • Equivalent models: lambda calculus, μ\mu-recursive functions, Post systems, register machines.

9. Decidability

9.1 Decidable Languages

LanguageDecidable?Algorithm
ADFA={(B,w):B accepts w}A_{\text{DFA}} = \{(B, w) : B \text{ accepts } w\}YesSimulate DFA on ww
ACFG={(G,w):G generates w}A_{\text{CFG}} = \{(G, w) : G \text{ generates } w\}YesCYK algorithm, $O(n^3
EDFA={A:L(A)=}E_{\text{DFA}} = \{A : L(A) = \emptyset\}YesCheck reachable accepting states
EQDFA={A,B:L(A)=L(B)}EQ_{\text{DFA}} = \{A, B : L(A) = L(B)\}YesCheck L(AΔB)\overline{L(A \Delta B)} for non-emptiness
ATM={(M,w):M accepts w}A_{\text{TM}} = \{(M, w) : M \text{ accepts } w\}NoDiagonalization proof

9.2 The Halting Problem

ATM={(M,w):M accepts w}A_{\text{TM}} = \{(M, w) : M \text{ accepts } w\} is undecidable.

Proof (diagonalization):

Assume H is a decider for A_TM.
Define D(M):
run H(M, <M>)
if H accepts: reject
if H rejects: accept
Run D on D: D(D) accepts iff D(D) rejects.
Contradiction. Therefore H cannot exist.

9.3 Reductions for Undecidability

To prove LL is undecidable, reduce a known undecidable language to LL.

Common undecidable problems:

ProblemReduction From
ETM={M:L(M)=}E_{\text{TM}} = \{M : L(M) = \emptyset\}ATMA_{\text{TM}}
REGTM={M:L(M) is regular}REG_{\text{TM}} = \{M : L(M) \text{ is regular}\}ATMA_{\text{TM}}
EQTM={M1,M2:L(M1)=L(M2)}EQ_{\text{TM}} = \{M_1, M_2 : L(M_1) = L(M_2)\}ETME_{\text{TM}}
PCP (Post Correspondence Problem)ATMA_{\text{TM}}
Hilbert’s 10th Problem (Diophantine equations)ATMA_{\text{TM}}

9.4 Rice’s Theorem

Theorem: Any non-trivial property of the language of a Turing machine is undecidable.

A property is non-trivial if it holds for some TMs and not for others.

Examples of undecidable properties (by Rice’s theorem):

  • Does MM accept any string?
  • Is L(M)L(M) finite?
  • Is L(M)L(M) regular?
  • Does L(M)=ΣL(M) = \Sigma^*?

Not covered by Rice’s theorem: Properties about the machine’s behavior (not its language), such as “does MM run for more than 100 steps on input ww?“

10. Common Pitfalls

  1. Confusing NFA and DFA complexity. NFA simulation requires tracking multiple states simultaneously (O(2n)O(2^n) worst case for subset construction), but the NFA itself may be exponentially smaller than any equivalent DFA.

  2. Wrong string choice for pumping lemma proofs. The string must be long enough (p\geq p) and specifically chosen to make pumping impossible. A poor choice leads to failing the proof.

  3. Forgetting that CFLs are not closed under complement and intersection. Unlike regular languages, CFLs lose these closure properties. L\overline{L} of a CFL may not be context-free.

  4. Assuming decidability based on the existence of an algorithm. An algorithm that may not halt (e.g., simulating a TM) does not establish decidability. The TM must halt on all inputs.

  5. Misapplying Rice’s theorem. Rice’s theorem applies only to properties of the language recognized by a TM, not properties of the TM’s code or computation behavior.

  6. Confusing Turing-recognizable and decidable. Every decidable language is Turing-recognizable, but not vice versa. The halting problem is recognizable but not decidable.

  7. Assuming more tape heads make TMs more powerful. Multi-tape TMs are equivalent to single-tape TMs. No variant of the standard TM increases computational power (only efficiency).

Worked Examples

Example 1: Designing a Finite Automaton

Problem: Design a DFA over {0, 1} that accepts strings containing the substring “010”. Solution: States: q0 (start, scanning for first 0), q1 (saw 0, waiting for 1), q2 (saw 01, waiting for 0), q3 (accepting, saw 010). Transitions: q0 on 0 -> q1, q0 on 1 -> q0. q1 on 1 -> q2, q1 on 0 -> q1. q2 on 0 -> q3, q2 on 1 -> q0. q3 on 0 -> q3, q3 on 1 -> q3. The DFA has 4 states and transitions back to earlier states when partial matches are broken.

Example 2: Context-Free Grammar for a Language

Problem: Write a CFG that generates the language L = {a^n b^n c^n : n >= 1}. Solution: S -> aBC. B -> aBB (this ensures more a’s push B’s onto the middle). C -> cD. D -> cDD (this ensures more c’s match). B -> b (terminal). D -> d (terminal). Wait — this generates a^n b^n c^m which is wrong. The language a^n b^n c^n is not context-free (proven by the pumping lemma for CFLs). No CFG exists for this language. This is a common exam trick question.

Summary

  • DFA and NFA recognize the same regular languages, convertible via subset construction.
  • Regular expressions are equivalent to finite automata (Thompson’s construction).
  • Pumping lemma for regular languages proves non-regularity by contradiction.
  • CFGs generate context-free languages, equivalent to PDAs.
  • Ambiguity means multiple parse trees; some CFLs are inherently ambiguous.
  • Pumping lemma for CFLs proves non-context-freeness (e.g., anbncna^n b^n c^n).
  • Turing machines are the most powerful model; equivalent to lambda calculus and recursive functions.
  • Church-Turing thesis: TMs capture all algorithmic computation.
  • Halting problem is undecidable (diagonalization). Rice’s theorem generalizes undecidability.

Cross-References

TopicLink
CompilersView
Complexity TheoryView
Algorithm DesignView