Tagging and Morphological Disambiguation

Last reviewed: November 24, 2006

Kemal Oflazer and İlker Kuruöz describe the results of a text tagging project in their report "Tagging and Morphological Disambiguation of Turkish Text." The authors developed the Turkish text tagger for use within the PC-KIMMO environment and claim that the tagging process applies equally to other agglutinative languages such as Finnish. Tagging text facilitates parsing and enables linguists to build extensive databases of morphemes for the overall analysis of natural languages. Linguists tag each word with pertinent information, such as its part of speech, to record its usage as well as its lexical form. Because words in Turkish and other agglutinative languages are often morphologically ambiguous, a specialized tagging system can help linguistic analysts minimize error rates. The types of ambiguity in agglutinative languages differ from those in inflectional languages. On the one hand, morphotactic rules limit the potential parts of speech of a given lexical form. On the other hand, the same lexical form can have various surface or contextual meanings. If the tagging project also aims to cover idiomatic constructs, the ambiguities become even harder to resolve.

Prior research indicated that neither rule-based nor statistics-based tagging systems completely or reliably disambiguate agglutinative text. Rule-based tagging relies on construction rules to rule out potential errors, whereas statistics-based tagging relies on actual context. In the current study, a rule-based system was used to disambiguate the text, but the researchers included a broad set of parameters and variables to promote accuracy. They used 250 constraints, both general and specific, to test the strength of their tagger on the Turkish language. The use of constraints is the key difference between Oflazer and Kuruöz's tagger and previous efforts. Results showed that the tagger does help resolve morphological ambiguities, with accuracy of up to 99%. Moreover, the tagger was designed as a stand-alone application that allows, but does not rely on, user intervention. The authors acknowledge some weaknesses in their tagger, including its speed, its ability to analyze substantive texts, and its inability to process less common constructions in which word order is loose.
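The idea of constraint-based disambiguation can be illustrated with a minimal sketch. It is not the authors' tagger and does not use PC-KIMMO: the Analysis class, the feature labels, and the single constraint below are hypothetical, chosen only to show how a contextual rule can discard readings. The example uses the Turkish form "evin," which can be read as the genitive "of the house" or as the possessive "your house"; the assumed rule keeps the genitive reading when the next word is unambiguously a possessed noun, as in "evin kapısı" ("the door of the house").

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Analysis:
    """One candidate morphological reading of a word (hypothetical format)."""
    root: str
    pos: str
    features: frozenset


def keep_genitive_before_possessed(i, candidates):
    """Illustrative constraint, not one of the paper's 250.

    If every reading of the next word is a third-person possessed noun,
    keep only the genitive readings of word i. A constraint never
    removes the last remaining reading.
    """
    if i + 1 >= len(candidates):
        return candidates[i]
    following = candidates[i + 1]
    if following and all("POSS-3SG" in a.features for a in following):
        kept = [a for a in candidates[i] if "GEN" in a.features]
        return kept or candidates[i]
    return candidates[i]


def disambiguate(candidates, constraints):
    """Apply the constraints repeatedly until no reading is removed."""
    result = [list(readings) for readings in candidates]
    changed = True
    while changed:
        changed = False
        for rule in constraints:
            for i in range(len(result)):
                kept = rule(i, result)
                if kept and len(kept) < len(result[i]):
                    result[i] = kept
                    changed = True
    return result


# Toy input for "evin kapısı": two readings for "evin", one for "kapısı".
sentence = [
    [
        Analysis("ev", "NOUN", frozenset({"GEN"})),
        Analysis("ev", "NOUN", frozenset({"POSS-2SG"})),
    ],
    [Analysis("kapı", "NOUN", frozenset({"POSS-3SG"}))],
]

resolved = disambiguate(sentence, [keep_genitive_before_possessed])
for readings in resolved:
    print([(a.root, sorted(a.features)) for a in readings])
```

In this sketch a real system would supply many such rules, some general and some specific to particular words or constructions, and any word left with several readings after all rules have run would remain ambiguous.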

Commentary

Oflazer and Kuruöz suggest that their tagger would be greatly improved if tagging were done globally rather than incrementally. Natural language analysis must take into account real-world language use, which is frequently loose. Nevertheless, the preliminary results of the Oflazer and Kuruöz system seem promising. Combining a rule-based system with a statistics-based one minimizes general tagging errors while also targeting the morphological ambiguities in specific linguistic subsets. Potentially, the Oflazer and Kuruöz tagger could be designed as smart technology that logs correct and incorrect hits to generate new internal rules for tagging.
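One way to picture the combination suggested in this commentary is a statistical fallback that runs only on the readings the rules leave unresolved, together with a feedback log that updates the statistics. The sketch below is an assumption-laden illustration, not part of the Oflazer and Kuruöz system: the function names, the tag labels, and the counts are all invented for the example.

```python
from collections import Counter


def train_tag_counts(tagged_corpus):
    """Count how often each word was seen with each tag."""
    counts = {}
    for word, tag in tagged_corpus:
        counts.setdefault(word, Counter())[tag] += 1
    return counts


def choose_reading(word, remaining_tags, counts):
    """Pick among the tags that survived the rule-based stage.

    If the rules already left a single tag, return it unchanged.
    Otherwise fall back to the tag seen most often with this word.
    """
    if len(remaining_tags) == 1:
        return remaining_tags[0]
    seen = counts.get(word, Counter())
    return max(remaining_tags, key=lambda tag: seen[tag])


def record_feedback(counts, word, correct_tag):
    """Log a confirmed tag so that later choices reflect it."""
    counts.setdefault(word, Counter())[correct_tag] += 1


# Invented toy counts, used only to exercise the functions.
corpus = [("evin", "GEN"), ("evin", "GEN"), ("evin", "POSS-2SG")]
counts = train_tag_counts(corpus)

first_choice = choose_reading("evin", ["POSS-2SG", "GEN"], counts)

# Two logged corrections shift the balance toward the other reading.
record_feedback(counts, "evin", "POSS-2SG")
record_feedback(counts, "evin", "POSS-2SG")
second_choice = choose_reading("evin", ["POSS-2SG", "GEN"], counts)

print(first_choice, second_choice)
```

The design choice here is that the statistics never override the rules; they only break ties the rules cannot, which keeps the two sources of evidence easy to inspect separately.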


Always verify citation format against your institution’s current style guide requirements.