Projecting named entity recognizers from resource-rich to resource-poor languages without annotated or parallel corpora

dc.contributor Helsingin yliopisto, Matemaattis-luonnontieteellinen tiedekunta fi
dc.contributor University of Helsinki, Faculty of Science en
dc.contributor Helsingfors universitet, Matematisk-naturvetenskapliga fakulteten sv
dc.contributor.author Hou, Jue
dc.date.issued 2019
dc.identifier.uri URN:NBN:fi:hulib-202001211120
dc.identifier.uri http://hdl.handle.net/10138/310012
dc.description.abstract Named entity recognition (NER) is a challenging task in the field of NLP. Like other machine learning problems, it requires a large amount of data to train a workable model. This remains a problem for languages such as Finnish, for which annotated linguistic resources are scarce. In this thesis, I propose an approach to automatic annotation of Finnish text that uses a limited set of linguistic rules and data from a resource-rich language, English, as a reference. Preliminary results from training a BiLSTM-CRF model show that automatic annotation can produce annotated instances with high accuracy and that the resulting model can achieve good performance for Finnish. In addition to automatic annotation and NER model training, two related experiments are conducted and discussed at the end of the thesis to demonstrate a practical application of the Finnish NER model. en
dc.language.iso eng
dc.publisher Helsingin yliopisto fi
dc.publisher University of Helsinki en
dc.publisher Helsingfors universitet sv
dc.title Projecting named entity recognizers from resource-rich to resource-poor languages without annotated or parallel corpora en
dc.type.ontasot pro gradu -tutkielmat fi
dc.type.ontasot master's thesis en
dc.type.ontasot pro gradu-avhandlingar sv
dc.subject.discipline Tietojenkäsittelytiede (Computer Science) und
dct.identifier.urn URN:NBN:fi:hulib-202001211120

Files in this item

Files Size Format
Jue_Hou-Master_s_Thesis-v2.1.pdf 1.067Mb PDF
