Context-free language: Difference between revisions
en>Byelf2007 m general cleanup |
en>Tedickey spam |
||
Line 1: | Line 1: | ||
{{one source|date=September 2012}} | |||
In [[formal language theory]], a '''context-free language''' (CFL) is a [[formal language|language]] generated by some [[context-free grammar]] (CFG). Different CF grammars can generate the same CF language, or conversely, a given CF language can be generated by different CF grammars. It is important to distinguish properties of the language (intrinsic properties) from properties of a particular grammar (extrinsic properties). | |||
The set of all context-free languages is identical to the set of languages accepted by [[pushdown automata]], which makes these languages amenable to parsing. Indeed, given a CFG, there is a direct way to produce a pushdown automaton for the grammar (and corresponding language), though going the other way (producing a grammar given an automaton) is not as direct. | |||
Context-free languages have many applications in [[programming languages]]; for example, the [[Dyck language|language of all properly matched parentheses]] is generated by the grammar <math>S\to SS ~|~ (S) ~|~ \varepsilon</math>. Also, most arithmetic expressions are generated by context-free grammars. | |||
==Examples== | |||
An archetypical context-free language is <math>L = \{a^nb^n:n\geq1\}</math>, the language of all non-empty even-length strings, the entire first halves of which are <math>a</math>'s, and the entire second halves of which are <math>b</math>'s. <math>L</math> is generated by the grammar <math>S\to aSb ~|~ ab</math>, and is accepted by the pushdown automaton <math>M=(\{q_0,q_1,q_f\}, \{a,b\}, \{a,z\}, \delta, q_0, z, \{q_f\})</math> where <math>\delta</math> is defined as follows: | |||
<center> | |||
<math>\delta(q_0, a, z) = (q_0, az)</math><br> | |||
<math>\delta(q_0, a, a) = (q_0, aa)</math><br> | |||
<math>\delta(q_0, b, a) = (q_1, \varepsilon)</math><br> | |||
<math>\delta(q_1, b, a) = (q_1, \varepsilon)</math> | |||
<math>\delta(\mathrm{state}_1, \mathrm{read}, \mathrm{pop}) = (\mathrm{state}_2, \mathrm{push})</math> | |||
</center> | |||
Unambiguous CFLs are a proper subset of all CFLs: there are [[Inherently ambiguous language|inherently ambiguous]] CFLs. An example of an inherently ambiguous CFL is the union of <math>\{a^n b^m c^m d^n | n, m > 0\}</math> with <math>\{a^n b^n c^m d^m | n, m > 0\}</math>. This set is context-free, since the union of two context-free languages is always context-free. But there is no way to unambiguously parse strings in the (non-context-free) subset <math>\{a^n b^n c^n d^n | n > 0\}</math> which is the intersection of these two languages.{{sfn|Hopcroft|Ullman|1979}} | |||
== Languages that are not context-free == | |||
The set <math>\{a^n b^n c^n d^n | n > 0\}</math> is a [[context-sensitive language]], but there does not exist a context-free grammar generating this language.{{sfn|Hopcroft|Ullman|1979}} So there exist [[context-sensitive language]]s which are not context-free. To prove that a given language is not context-free, one may employ the [[pumping lemma for context-free languages]] or a number of other methods, such as [[Ogden's lemma]] or [[Parikh's theorem]].<ref>[http://cs.stackexchange.com/questions/265/how-to-prove-that-a-language-is-not-context-free How to prove that a language is not context-free?]</ref> | |||
== Closure properties == | |||
Context-free languages are [[closure (mathematics)|closed]] under the following operations. That is, if ''L'' and ''P'' are context-free languages, the following languages are context-free as well: | |||
* the [[union (set theory)|union]] <math>L \cup P</math> of ''L'' and ''P'' | |||
* the reversal of ''L'' | |||
* the [[concatenation]] <math>L \cdot P</math> of ''L'' and ''P'' | |||
* the [[Kleene star]] <math>L^*</math> of ''L'' | |||
* the image <math>\varphi(L)</math> of ''L'' under a [[String operations#String homomorphism|homomorphism]] <math>\varphi</math> | |||
* the image <math>\varphi^{-1}(L)</math> of ''L'' under an [[String operations#String homomorphism|inverse homomorphism]] <math>\varphi^{-1}</math> | |||
* the [[cyclic shift]] of ''L'' (the language <math>\{vu : uv \in L \}</math>) | |||
Context-free languages are not closed under [[complement (complexity)|complement]], [[intersection (set theory)|intersection]], or [[Complement (set theory)#Relative complement|difference]]. However, if ''L'' is a context-free language and ''D'' is a regular language then both their intersection <math>L\cap D</math> and their difference <math>L\setminus D</math> are context-free languages. | |||
===Nonclosure under intersection and complement=== | |||
The context-free languages are not closed under intersection. This can be seen by taking the languages <math>A = \{a^n b^n c^m \mid m, n \geq 0 \}</math> and <math>B = \{a^m b^n c^n \mid m,n \geq 0\}</math>, which are both context-free.<ref>A context-free grammar for the language ''A'' is given by the following production rules, taking ''S'' as the start symbol: ''S'' → ''Sc'' | ''aTb'' | ''ε''; ''T'' → ''aTb'' | ''ε''. The grammar for B is analogous.</ref> Their intersection is <math>A \cap B = \{ a^n b^n c^n \mid n \geq 0\}</math>, which can be shown to be non-context-free by the [[pumping lemma for context-free languages]]. | |||
Context-free languages are also not closed under complementation, as for any languages A and B: <math>A \cap B = \overline{\overline{A} \cup \overline{B}} </math>. | |||
==Decidability properties== | |||
The following problems are [[Undecidable problem|undecidable]] for arbitrary [[context-free grammar]]s A and B: | |||
*Equivalence: Given two context-free grammars ''A'' and ''B'', is <math>L(A)=L(B)</math>? | |||
*Intersection Emptiness: Given two context-free grammars ''A'' and ''B'', is <math>L(A) \cap L(B) = \emptyset </math> ? However, the intersection of a context-free language and a ''regular'' language is context-free,<ref>{{harvtxt|Salomaa|1973}}, p. 59, Theorem 6.7</ref> and the variant of the problem where ''B'' is a regular grammar is decidable. | |||
*Containment: Given a context-free grammar ''A'', is <math>L(A) \subseteq L(B)</math> ? Again, the variant of the problem where ''B'' is a regular grammar is decidable. | |||
*Universality: Given a context-free grammar ''A'', is <math>L(A)=\Sigma^*</math> ? | |||
The following problems are ''decidable'' for arbitrary context-free languages: | |||
*Emptiness: Given a context-free grammar ''A'', is <math>L(A)=\emptyset</math> ? | |||
*Finiteness: Given a context-free grammar ''A'', is <math>L(A)</math> finite? | |||
*Membership: Given a context-free grammar ''G'', and a word <math>w</math>, does <math>w \in L(G)</math> ? Efficient polynomial-time algorithms for the membership problem are the [[CYK algorithm]] and [[Earley parser|Earley's Algorithm]]. | |||
== Parsing == | |||
Determining an instance of the membership problem; i.e. given a string <math>w</math>, determine whether <math>w \in L(G)</math> where <math>L</math> is the language generated by some grammar <math>G</math>; is also known as ''[[parsing]]''. | |||
Formally, the set of all context-free languages is identical to the set of languages accepted by [[pushdown automata]] (PDA). Parser algorithms for context-free languages include the [[CYK algorithm]] and the [[Earley parser|Earley's Algorithm]]. | |||
A special subclass of context-free languages are the [[deterministic context-free language]]s which are defined as the set of languages accepted by a [[deterministic pushdown automaton]] and can be parsed by a [[LR parser|LR(k) parser]].<ref>{{cite doi|10.1016/S0019-9958(65)90426-2}}</ref> | |||
See also [[parsing expression grammar]] as an alternative approach to grammar and parser. | |||
== See also == | |||
* [[Deterministic context-free language]] | |||
* [[Parsing]] | |||
== Notes == | |||
{{Reflist}} | |||
== References == | |||
{{refbegin}} | |||
* {{cite book| author = [[Seymour Ginsburg]] | title = The Mathematical Theory of Context-Free Languages | year = 1966 | publisher = McGraw-Hill, Inc. | location = New York, NY, USA}} | |||
*{{cite book | |||
| last1 = Hopcroft | |||
| first1 = John E. | |||
| last2 = Ullman | |||
| first2 = Jeffrey D. | |||
| title = Introduction to Automata Theory, Languages, and Computation | |||
| publisher = Addison-Wesley | |||
| edition = 1st | |||
| year = 1979 | |||
| ref = harv | |||
}} | |||
* {{cite book|author=Arto Salomaa|title = Formal Languages|publisher = ACM Monograph Series|year= 1973}} | |||
* {{cite book|author = [[Michael Sipser]] | year = 1997 | title = Introduction to the Theory of Computation | publisher = PWS Publishing | isbn = 0-534-94728-X}} Chapter 2: Context-Free Languages, pp. 91–122. | |||
* Jean-Michel Autebert, Jean Berstel, Luc Boasson, [http://www-igm.univ-mlv.fr/~berstel/Articles/1997CFLPDA.pdf Context-Free Languages and Push-Down Automata], in: G. Rozenberg, A. Salomaa (eds.), Handbook of Formal Languages, Vol. 1, Springer-Verlag, 1997, 111-174. | |||
{{refend}} | |||
{{Formal languages and grammars}} | |||
[[Category:Formal languages]] | |||
[[Category:Languages]] | |||
[[Category:Linguistics]] |
Revision as of 16:07, 18 January 2014
Template:One source In formal language theory, a context-free language (CFL) is a language generated by some context-free grammar (CFG). Different CF grammars can generate the same CF language, or conversely, a given CF language can be generated by different CF grammars. It is important to distinguish properties of the language (intrinsic properties) from properties of a particular grammar (extrinsic properties).
The set of all context-free languages is identical to the set of languages accepted by pushdown automata, which makes these languages amenable to parsing. Indeed, given a CFG, there is a direct way to produce a pushdown automaton for the grammar (and corresponding language), though going the other way (producing a grammar given an automaton) is not as direct.
Context-free languages have many applications in programming languages; for example, the language of all properly matched parentheses is generated by the grammar . Also, most arithmetic expressions are generated by context-free grammars.
Examples
An archetypical context-free language is , the language of all non-empty even-length strings, the entire first halves of which are 's, and the entire second halves of which are 's. is generated by the grammar , and is accepted by the pushdown automaton where is defined as follows:
Unambiguous CFLs are a proper subset of all CFLs: there are inherently ambiguous CFLs. An example of an inherently ambiguous CFL is the union of with . This set is context-free, since the union of two context-free languages is always context-free. But there is no way to unambiguously parse strings in the (non-context-free) subset which is the intersection of these two languages.Template:Sfn
Languages that are not context-free
The set is a context-sensitive language, but there does not exist a context-free grammar generating this language.Template:Sfn So there exist context-sensitive languages which are not context-free. To prove that a given language is not context-free, one may employ the pumping lemma for context-free languages or a number of other methods, such as Ogden's lemma or Parikh's theorem.[1]
Closure properties
Context-free languages are closed under the following operations. That is, if L and P are context-free languages, the following languages are context-free as well:
- the union of L and P
- the reversal of L
- the concatenation of L and P
- the Kleene star of L
- the image of L under a homomorphism
- the image of L under an inverse homomorphism
- the cyclic shift of L (the language )
Context-free languages are not closed under complement, intersection, or difference. However, if L is a context-free language and D is a regular language then both their intersection and their difference are context-free languages.
Nonclosure under intersection and complement
The context-free languages are not closed under intersection. This can be seen by taking the languages and , which are both context-free.[2] Their intersection is , which can be shown to be non-context-free by the pumping lemma for context-free languages.
Context-free languages are also not closed under complementation, as for any languages A and B: .
Decidability properties
The following problems are undecidable for arbitrary context-free grammars A and B:
- Equivalence: Given two context-free grammars A and B, is ?
- Intersection Emptiness: Given two context-free grammars A and B, is ? However, the intersection of a context-free language and a regular language is context-free,[3] and the variant of the problem where B is a regular grammar is decidable.
- Containment: Given a context-free grammar A, is ? Again, the variant of the problem where B is a regular grammar is decidable.
- Universality: Given a context-free grammar A, is ?
The following problems are decidable for arbitrary context-free languages:
- Emptiness: Given a context-free grammar A, is ?
- Finiteness: Given a context-free grammar A, is finite?
- Membership: Given a context-free grammar G, and a word , does ? Efficient polynomial-time algorithms for the membership problem are the CYK algorithm and Earley's Algorithm.
Parsing
Determining an instance of the membership problem; i.e. given a string , determine whether where is the language generated by some grammar ; is also known as parsing.
Formally, the set of all context-free languages is identical to the set of languages accepted by pushdown automata (PDA). Parser algorithms for context-free languages include the CYK algorithm and the Earley's Algorithm.
A special subclass of context-free languages are the deterministic context-free languages which are defined as the set of languages accepted by a deterministic pushdown automaton and can be parsed by a LR(k) parser.[4]
See also parsing expression grammar as an alternative approach to grammar and parser.
See also
Notes
43 year old Petroleum Engineer Harry from Deep River, usually spends time with hobbies and interests like renting movies, property developers in singapore new condominium and vehicle racing. Constantly enjoys going to destinations like Camino Real de Tierra Adentro.
References
- 20 year-old Real Estate Agent Rusty from Saint-Paul, has hobbies and interests which includes monopoly, property developers in singapore and poker. Will soon undertake a contiki trip that may include going to the Lower Valley of the Omo.
My blog: http://www.primaboinca.com/view_profile.php?userid=5889534 - 20 year-old Real Estate Agent Rusty from Saint-Paul, has hobbies and interests which includes monopoly, property developers in singapore and poker. Will soon undertake a contiki trip that may include going to the Lower Valley of the Omo.
My blog: http://www.primaboinca.com/view_profile.php?userid=5889534 - 20 year-old Real Estate Agent Rusty from Saint-Paul, has hobbies and interests which includes monopoly, property developers in singapore and poker. Will soon undertake a contiki trip that may include going to the Lower Valley of the Omo.
My blog: http://www.primaboinca.com/view_profile.php?userid=5889534 - 20 year-old Real Estate Agent Rusty from Saint-Paul, has hobbies and interests which includes monopoly, property developers in singapore and poker. Will soon undertake a contiki trip that may include going to the Lower Valley of the Omo.
My blog: http://www.primaboinca.com/view_profile.php?userid=5889534 Chapter 2: Context-Free Languages, pp. 91–122. - Jean-Michel Autebert, Jean Berstel, Luc Boasson, Context-Free Languages and Push-Down Automata, in: G. Rozenberg, A. Salomaa (eds.), Handbook of Formal Languages, Vol. 1, Springer-Verlag, 1997, 111-174.
Other Sports Official Alfonzo from Chase, has hobbies and interests for instance fast, property developers in new industrial launch singapore and aquariums. In recent times has visited Monasteries of Haghpat and Sanahin.
- ↑ How to prove that a language is not context-free?
- ↑ A context-free grammar for the language A is given by the following production rules, taking S as the start symbol: S → Sc | aTb | ε; T → aTb | ε. The grammar for B is analogous.
- ↑ Template:Harvtxt, p. 59, Theorem 6.7
- ↑ Template:Cite doi