Optimizing the construction of the English morphological analyser

Permalink

http://hdl.handle.net/10138/317510

Citation

Hurskainen, A 2020 'Optimizing the construction of the English morphological analyser' Technical Reports on Language Technology, no. 54, University of Helsinki, Institute for Asian and African Studies, Helsinki.

Title: Optimizing the construction of the English morphological analyser
Author: Hurskainen, Arvi
Contributor: University of Helsinki, Department of Languages
Publisher: University of Helsinki, Institute for Asian and African Studies
Date: 2020
Number of pages: 17
Belongs to series: Technical Reports on Language Technology
URI: http://hdl.handle.net/10138/317510
Abstract: Constructing a morphological analyser for English is a fairly simple operation. The language has only a few morphological features, and they are easy to describe. However, listing all wordforms as separate entries in the lexicon is certainly not the optimal solution. When finite state transducers are used as the development environment, it is customary to list the word stems of the various POS categories in separate sub-lexicons, and the inflectional suffixes (and prefixes) in other sub-lexicons. Because a comprehensive analysis system tends to grow to dimensions that are uncomfortable, or sometimes impossible, to manage, there is strong motivation to condense the lexicon wherever possible. This report describes a method in which verbs, and the words derived from them, such as adjectives and nouns, are listed as underspecified entities. In a later phase these entities are processed into separate readings, so that the readings can be disambiguated on the basis of context. The method condenses the lexicon considerably, and the lexicon is easier to maintain when the stems are in one place.
Subject: 6121 Languages
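
The idea summarised in the abstract can be illustrated with a minimal sketch. It is not the report's implementation: the tag names, the suffix table, the stem list and the function `analyse` are all hypothetical, and orthographic alternations (for example "make" / "making") are ignored. It only shows one verb stem entry being expanded into several candidate readings, which a later, context-based disambiguation step would choose between.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    lemma: str
    tags: tuple


# Hypothetical suffix table: each suffix is stored once and expands into
# verbal, adjectival and nominal readings. Tag names are assumptions.
SUFFIX_READINGS = {
    "ing": [("V", "PROG"), ("A", "PCP1"), ("N", "GERUND")],
    "ed": [("V", "PAST"), ("V", "PCP2"), ("A", "PCP2")],
}

# Hypothetical stem sub-lexicon: each verb stem is listed in one place only.
VERB_STEMS = {"walk", "paint"}


def analyse(wordform):
    """Return every candidate reading for a wordform built on a listed verb stem."""
    readings = []
    for suffix, tag_sets in SUFFIX_READINGS.items():
        if wordform.endswith(suffix):
            stem = wordform[: -len(suffix)]
            if stem in VERB_STEMS:
                # Expand the single underspecified entry into separate readings.
                readings.extend(Reading(stem, tags) for tags in tag_sets)
    return readings


if __name__ == "__main__":
    for reading in analyse("painting"):
        print(reading.lemma, "+".join(reading.tags))
```

In this sketch, adding a new verb means adding one stem to `VERB_STEMS`; the derived adjective and noun readings follow from the shared suffix table rather than from separate lexicon entries.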


Files in this item

Files Size Format View
optimizing_the_ ... morphological_analyser.pdf 377.2Kb PDF View/Open
