BCMP network: Difference between revisions

In computational linguistics, a '''morphological dictionary''' is a linguistic resource that contains correspondences between the surface forms and the lexical forms of words. Surface forms are the word forms found in running text. The lexical form corresponding to a surface form is the [[Lemma (morphology)|lemma]] followed by grammatical information (for example the [[part of speech]], [[Grammatical gender|gender]] and [[Grammatical number|number]]). In English, ''give'', ''gives'', ''giving'', ''gave'' and ''given'' are surface forms of the verb ''give''; the lexical form would be "give", verb. There are two kinds of morphological dictionaries: aligned and non-aligned.
 
==Aligned morphological dictionaries==
 
In an aligned morphological dictionary, the correspondence between the surface form and the lexical form of a word is aligned at the character level, for example:
 
:(h,h) (o,o) (u,u) (s,s) (e,e) (s,<n>) (θ,<pl>)
 
Here θ is the empty symbol, <n> signifies "noun", and <pl> signifies "plural".
 
In each pair of the example, the left-hand side belongs to the surface form (input) and the right-hand side to the lexical form (output). This order is used in [[Morphology (linguistics)|morphological analysis]], where a lexical form is generated from a surface form. In morphological generation the order is reversed.
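
As a rough sketch of these two reading directions, the aligned entry above can be modelled in Python as a list of (surface symbol, lexical symbol) pairs. The names <code>THETA</code>, <code>ALIGNED_HOUSES</code>, <code>surface_form</code>, <code>lexical_form</code> and <code>reverse</code> are illustrative assumptions, not part of any particular toolkit.

```python
THETA = "θ"  # assumption: the empty symbol is modelled as the literal string "θ"

# Aligned entry for "houses": each pair is (surface symbol, lexical symbol).
ALIGNED_HOUSES = [
    ("h", "h"), ("o", "o"), ("u", "u"), ("s", "s"), ("e", "e"),
    ("s", "<n>"), (THETA, "<pl>"),
]

def surface_form(alignment):
    """Concatenate the left-hand symbols, dropping the empty symbol."""
    return "".join(left for left, _ in alignment if left != THETA)

def lexical_form(alignment):
    """Concatenate the right-hand symbols, dropping the empty symbol."""
    return "".join(right for _, right in alignment if right != THETA)

def reverse(alignment):
    """Swap input and output sides, as morphological generation would require."""
    return [(right, left) for left, right in alignment]

print(surface_form(ALIGNED_HOUSES))  # houses
print(lexical_form(ALIGNED_HOUSES))  # house<n><pl>
```

Reading the left-hand symbols gives the input of analysis, while calling <code>reverse</code> first gives the input of generation.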
 
Formally, if Σ is the alphabet of the input symbols and <math> \Gamma </math> is the alphabet of the output symbols, an aligned morphological dictionary is a subset <math> A \subseteq L^* </math>, where:
 
:<math> L = (( \Sigma \cup \{ \theta \} ) \times \Gamma) \cup (\Sigma \times ( \Gamma \cup \{ \theta \} )) </math>
 
is the alphabet of all possible alignments, including those with the empty symbol on one side. That is, an aligned morphological dictionary is a set of strings in <math>L^*</math>.
 
== Non-aligned morphological dictionaries ==
 
A non-aligned morphological dictionary is simply a set <math> U \subseteq \Sigma^* \times \Gamma^* </math> of pairs of input and output strings. A non-aligned morphological dictionary would represent the previous example as:
 
:(houses, house<n><pl>)
 
It is possible to convert a non-aligned dictionary into an aligned dictionary. Besides trivial alignments to the left or to the right, linguistically motivated alignments which align characters to their corresponding morphemes are possible.
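
A minimal sketch of the trivial left alignment, assuming that grammatical tags are written in angle brackets and count as single output symbols, could look as follows. The helper names <code>lexical_symbols</code> and <code>align_left</code> are hypothetical, and linguistically motivated alignments would need more than this positional pairing.

```python
import re
from itertools import zip_longest

THETA = "θ"  # assumption: the empty symbol is modelled as the literal string "θ"

def lexical_symbols(lexical):
    """Split a lexical form into symbols; a tag such as <n> counts as one symbol."""
    return re.findall(r"<[^>]+>|.", lexical)

def align_left(surface, lexical):
    """Pair the i-th input symbol with the i-th output symbol.

    The shorter side is padded at the end with the empty symbol.
    """
    return list(zip_longest(surface, lexical_symbols(lexical), fillvalue=THETA))

for pair in align_left("houses", "house<n><pl>"):
    print(pair)
```

For the pair (houses, house<n><pl>) this positional pairing happens to reproduce the aligned entry shown in the previous section, ending in (s,<n>) and (θ,<pl>).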
 
== Lexical ambiguities ==
 
Frequently more than one lexical form is associated with a surface form of a word. For example, "house" may be a noun in the singular, {{IPA|/haʊs/}}, or a verb in the present tense, {{IPA|/haʊz/}}. As a result, a function is needed that relates each input string to all of its corresponding output strings.
 
If we define the set <math> E \subset \Sigma^* </math> of input words as <math> E = \{ w : (w,w') \in U \} </math>, the correspondence function is <math> \tau : E \rightarrow 2^{\Gamma^{*}} </math>, defined as <math> \tau(w) = \{ w' : (w,w') \in U \} </math>.
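
The correspondence function can be sketched in Python as a mapping from each surface form to the set of its lexical forms. The sample dictionary <code>U</code> and the tags <code><sg></code>, <code><vblex></code> and <code><pres></code> are invented for illustration only; <code>build_tau</code> is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical non-aligned dictionary U: (surface form, lexical form) pairs.
U = {
    ("houses", "house<n><pl>"),
    ("house", "house<n><sg>"),
    ("house", "house<vblex><pres>"),
}

def build_tau(pairs):
    """Group lexical forms by surface form: tau(w) = {w' : (w, w') in U}."""
    tau = defaultdict(set)
    for surface, lexical in pairs:
        tau[surface].add(lexical)
    return dict(tau)

TAU = build_tau(U)
E = set(TAU)  # the set of input words that occur in U

print(sorted(TAU["house"]))  # ['house<n><sg>', 'house<vblex><pres>']
```

An ambiguous surface form such as "house" maps to a set with more than one element, while an unambiguous one maps to a singleton set.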
 
==List of online morphological dictionaries==
* [http://www.canoo.net Canoo.net – German]
* [http://www.babelpoint.org/english Babelpoint.org – English]
* [http://www.babelpoint.org/french Babelpoint.org – French]
* [http://www.babelpoint.org/german Babelpoint.org – German]
* [http://www.babelpoint.org/russian Babelpoint.org – Russian]
* [http://www.babelpoint.org/spanish Babelpoint.org – Spanish]
* [http://www.babelpoint.org/swedish Babelpoint.org – Swedish]
 
==References==
{{reflist}}
* Garrido-Alenda, A. and Forcada, M. L. (2002). "[http://www.dlsi.ua.es/~mlf/docum/garrido02j.pdf Comparing nondeterministic and quasideterministic finite-state transducers built from morphological dictionaries]". ''Procesamiento del Lenguaje Natural'', (XVIII Congreso de la Sociedad Española de Procesamiento del Lenguaje Natural, Valladolid, Spain, 11-13.09.2002)
 
[[Category:Computational linguistics]]
[[Category:Translation databases]]
[[Category:Morphology|dictionary]]

Latest revision as of 00:47, 25 December 2014
