Finite-state Relations Between Two Historically Closely Related Languages


Permalink

http://hdl.handle.net/10138/42176

Citation

Koskenniemi, K 2013, Finite-state Relations Between Two Historically Closely Related Languages. in Þ Eyþórsson, L Borin, D Haug & E Rögnvaldsson (eds), Proceedings of the workshop on computational historical linguistics at NODALIDA 2013. NEALT Proceedings Series, vol. 18, Northern European Association for Language Technology, Linköping, pp. 43-53, Workshop on Computational Historical Linguistics, NODALIDA 2013, Oslo, Norway, 22/05/2013. <http://www.ep.liu.se/ecp/087/ecp13087.pdf>

Title: Finite-state Relations Between Two Historically Closely Related Languages
Author: Koskenniemi, Kimmo
Other contributors: Eyþórsson, Þórhallur
Borin, Lars
Haug, Dag
Rögnvaldsson, Eirikur
Contributor organization: Department of Modern Languages 2010-2017
Publisher: Northern European Association for Language Technology
Date: 2013
Language: eng
Number of pages: 11
Belongs to series: Proceedings of the workshop on computational historical linguistics at NODALIDA 2013
Belongs to series: NEALT Proceedings Series
ISBN: 978-91-7519-587-2
ISSN: 1650-3686
Permanent link (URI): http://hdl.handle.net/10138/42176
Abstract: Regular correspondences between historically related languages can be modelled using finite-state transducers (FSTs). A new method is presented by demonstrating it with a bidirectional experiment between Finnish and Estonian. An artificial representation (resembling a proto-language) is established between two related languages. This representation, AFE (Aligned Finnish-Estonian), is based on the letter-by-letter alignment of the two languages and uses mechanically constructed morphophonemes which represent the corresponding characters. By describing the constraints of this AFE using two-level rules, one may construct useful mappings between the languages. In this way, the badly ambiguous FSTs from Finnish and Estonian to AFE can be composed into a practically unambiguous transducer from Finnish to Estonian. The inverse mapping from Estonian to Finnish is mildly ambiguous. The steps of the proposed method could be repeated as such with dialectal or older written texts. Choosing a set of model words, aligning them, recording the mechanical correspondences and designing rules for the constraints could be done with limited effort. For the purposes of indexing and searching, the mild ambiguity may be tolerable as such. The ambiguity can be further reduced by composing the resulting FST with a speller or morphological analyser of the standard language.
Subject: 6121 Languages
finite-state transducers
historical linguistics
HFST
two-level morphology
FOMA
Peer reviewed: Yes
Usage restriction: openAccess
Self-archived version: publishedVersion
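
The workflow summarised in the abstract (align model words letter by letter, record the mechanical correspondences as morphophonemes, constrain them, then reduce the remaining ambiguity with a word list) can be illustrated with a toy Python sketch. This is not the paper's implementation: it uses plain sets instead of finite-state transducers or two-level rules, and the five hand-aligned word pairs, the `Ø` zero symbol, the morphophoneme notation (the two characters written side by side), the single "zero only word-finally" constraint, and all function names are assumptions made for illustration.

```python
from itertools import product

ZERO = "Ø"  # assumed padding symbol for "no corresponding letter"

# Hand-aligned model words (Finnish, Estonian); both strings in a pair
# have the same length because the shorter one is padded with ZERO.
ALIGNED_PAIRS = [
    ("kala", "kala"),
    ("jalka", "jalgØ"),
    ("lintu", "lindØ"),
    ("musta", "mustØ"),
    ("kylmä", "külmØ"),
]


def to_afe(fin, est):
    """Build a morphophoneme sequence from one aligned pair.

    Identical letters stay as they are; differing letters are joined
    into a two-character symbol such as "kg" or "aØ".
    """
    if len(fin) != len(est):
        raise ValueError("aligned words must have equal length")
    return tuple(f if f == e else f + e for f, e in zip(fin, est))


def collect_correspondences(pairs):
    """Record which Estonian symbols each Finnish letter was seen with."""
    table = {}
    for fin, est in pairs:
        for f, e in zip(fin, est):
            table.setdefault(f, set()).add(e)
    return table


def estonian_candidates(fin_word, table, final_zero_only=True):
    """Enumerate Estonian candidates for a Finnish word.

    Without the constraint the mapping is highly ambiguous. The toy
    constraint (an assumption, standing in for two-level rules) allows
    a zero correspondence only in word-final position.
    """
    options = [sorted(table.get(ch, {ch})) for ch in fin_word]
    results = set()
    for combo in product(*options):
        if final_zero_only and ZERO in combo[:-1]:
            continue
        results.add("".join(combo).replace(ZERO, ""))
    return results


table = collect_correspondences(ALIGNED_PAIRS)
unconstrained = estonian_candidates("jalka", table, final_zero_only=False)
constrained = estonian_candidates("jalka", table)

# A tiny stand-in for a speller or morphological analyser of Estonian.
ESTONIAN_LEXICON = {"kala", "jalg", "lind", "must", "külm"}
filtered = constrained & ESTONIAN_LEXICON
```

In this sketch the unconstrained mapping of `jalka` yields eight candidates, the word-final constraint reduces them to four, and intersecting with the small word list leaves only `jalg`. A real system would express each of these steps as transducers and combine them by composition, for example with HFST or Foma as listed in the subject keywords.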


Files in this item

Files Size Format View
ecp1387004.pdf 149.3Kb PDF View/Open
