CSC 341: February 19, 2007

From CSWiki

Jump to: navigation, search

Previous Lecture Next Lecture

Contents

Instructor's notes for the class session

Bar-Hillel's pumping lemma for context-free languages

Theorem 2.34: For any context-free language A, there is a positive integer p such that, for any string s \in A of length greater than or equal to p, there are strings u,v,x,y,z such that

  • s = uvxyz,
  • for every non-negative integer i, u v^i x y^i z \in A,
  • v and y are not both null strings, and
  • the sum of the lengths of v, x, and y is less than or equal to p.

In practice, this theorem is generally used in the context of a proof by contraction: To show that some particular language is not context-free, we assume that it is, use Theorem 2.34 to derive the apodosis, and then show that this leads to a contradiction.

Like the pumping lemma for regular languages, Bar-Hillel's pumping lemma can be regarded as a two-player game: Mr. Pro chooses p; then Ms. Con chooses s (which may depend on p), subject to the restrictions that s \in A and |s| \ge p; then Mr. Pro partitions s into substrings u,v,x,y,z, subject to the restrictions that v and y are not both null strings and |vxy| \le p; and finally Ms. Con chooses i. Mr. Pro wins if u v^i x y^i z \in A and Ms. Con wins if it is not. If there is a winning strategy in this game for Ms. Con, A cannot be context-free.

Closure properties of context-free languages

The set of context-free languages is closed under union, concatenation, and star. The proofs by construction of context-free grammars are easy.

The set of context-free languagues is not closed under intersection. Proof by counterexample: \{\texttt{a}^i \texttt{b}^j \texttt{c}^k \mid i = j, i, j, k \ge 0\} and \{\texttt{a}^i \texttt{b}^j \texttt{c}^k \mid j = k, i, j, k \ge 0\} are both context-free (provide grammars), but their intersection, as we see in Example 2.36, is not.

The set of context-free languages is not closed under complementation. Proof by contradiction: If it were, then (since A \cap B = \overline{\overline{A} \cup \overline{B}}), it would also be closed under intersection.

Deterministic pushdown automata

The construction that we used to prove that any language recognized by a nondeterministic finite automaton is also recognized by a deterministic one will not work for pushdown automata, because different transitions from the same state on the same input and stack character can have divergent effects on the stack, and there is no way to keep track of these differing stack contents in a deterministic pushdown automaton. We can still implement nondeterministic pushdown automata on real-world computers by doing a breadth-first search through the possible computations, storing alternative copies of the automaton (with different states and stack contents) in a queue, but there is no way to implement this simulation without a more powerful memory device than a single stack.

Deterministic pushdown automata cannot recognize intrinsically ambiguous languages. They can recognize all regular languages (because they can simulate deterministic finite automata), but there are some languages that are context-free, not regular, and not intrinsically ambiguous, but still cannot be recognized by any deterministic pushdown automaton; an example is \{\texttt{a}^m \texttt{b}^n \mid m, n \ge 1 \wedge (n = m \vee n = 2m)\}.

Study questions

  1. In the proof of the pumping lemma for context-free languages, what happens if the non-terminal that is repeated along the key branch of the parse tree is the start symbol? Doesn't "pumping down" simply replace the entire tree with one of its subtrees in that case?
  2. Prove that the languages \{\texttt{a}^m\texttt{b}^m\texttt{c}^n\texttt{d}^n \mid m, n \ge 0\} and \{\texttt{a}^m\texttt{b}^n\texttt{c}^n\texttt{d}^m \mid m, n \ge 0\} are context-free, while the language \{\texttt{a}^m\texttt{b}^n\texttt{c}^m\texttt{d}^n \mid m, n \ge 0\} is not. Give a plausible and intuitive explanation for the ability of a context-free grammar to generate either of the first two languages but not the third, and for the ability of a pushdown automaton to recognize either of the first two languages but not the third.
  3. Is the language \{\texttt{1}^m\texttt{+}\texttt{1}^n\texttt{=}\texttt{1}^{m + n} \mid m, n \ge 0\} context-free? Prove that your answer is correct.
  4. Is the language \{\texttt{1}^m\texttt{*}\texttt{1}^n\texttt{=}\texttt{1}^{mn} \mid m, n \ge 0\} context-free? Prove that your answer is correct.
  5. The reduplicate of a language L is \{ww \mid w \in L\}. Is the set of context-free languages closed under reduplication? Prove that your answer is correct.

Proposed Solutions



  1. The pushdown automaton M1 = (Q,Σ,Γ,σ,q0,F) recognizes the language \{\texttt{1}^m\texttt{+}\texttt{1}^n\texttt{=}\texttt{1}^{m + n} \mid m, n \ge 0\} where

     Q = {q0,q1,q3,q4}
     Σ = {1, + , = }
     \Gamma = \{\$, 1\}
     \sigma(q_0, \varepsilon, \varepsilon) = (q_1, \$), \sigma(q_1, 1, \varepsilon) = (q_1, 1), \sigma(q_1, +, \varepsilon) = (q_2, \varepsilon), \sigma(q_2, 1, \varepsilon) = (q_2, 1), \sigma(q_2, =, \varepsilon) = (q_3, \varepsilon),
     \sigma(q_3, 1, 1) = (q_3, \varepsilon), \sigma(q_3, \varepsilon, \$) = (q_4, \varepsilon)
     F = {q4}.
           It follows that the language is context-free.


  1. Doesn't Example 2.38 page 127 give a counterexample which proves that the set of context-free languages is not closed under reduplication?



Previous Lecture Next Lecture

Front Door

Personal tools