HeLI, a Word-Based Backoff Method for Language Identification

dc.contributor.author Jauhiainen, Tommi Sakari
dc.contributor.author Linden, Bo Krister Johan
dc.contributor.author Jauhiainen, Heidi Annika
dc.date.accessioned 2017-01-30T22:55:02Z
dc.date.available 2017-01-30T22:55:02Z
dc.date.issued 2016
dc.identifier.citation Jauhiainen, T S, Linden, B K J & Jauhiainen, H A 2016, HeLI, a Word-Based Backoff Method for Language Identification. in Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects: VarDial3, Osaka, Japan, December 12 2016. pp. 153-162, Joint Workshop on Language Technology for Closely Related Languages, Varieties and Dialects, Osaka, Japan, 11/12/2016. <https://www.aclweb.org/anthology/W16-4820.pdf>
dc.identifier.citation conference
dc.identifier.other PURE: 77538221
dc.identifier.other PURE UUID: c628f295-0b6d-4109-b1ee-d231761e0b3b
dc.identifier.other ORCID: /0000-0003-2337-303X/work/29934307
dc.identifier.other ORCID: /0000-0002-8227-5627/work/29790944
dc.identifier.other ORCID: /0000-0002-6474-3570/work/34198884
dc.identifier.uri http://hdl.handle.net/10138/174332
dc.description.abstract In this paper we describe the Helsinki language identification method, HeLI, and the resources we created for and used in the 3rd edition of the Discriminating between Similar Languages (DSL) shared task, which was organized as part of the VarDial 2016 workshop. The shared task comprised a total of 8 tracks, of which we participated in 7. The shared task had a record number of participants, with 17 teams providing results for the closed track of test set A. Our system reached 2nd position in 4 tracks (A closed and open, B1 open and B2 open), and in this paper we focus on the methods and data used for those tracks. We describe our word-based back-off method in mathematical notation. We also describe how we selected the corpus we used in the open tracks. en
dc.format.extent 10
dc.language.iso eng
dc.relation.ispartof Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects
dc.relation.isversionof 978-4-87974-716-7
dc.rights cc_by
dc.rights.uri info:eu-repo/semantics/openAccess
dc.subject 6121 Languages
dc.subject 113 Computer and information sciences
dc.title HeLI, a Word-Based Backoff Method for Language Identification en
dc.type Conference contribution
dc.contributor.organization Department of Modern Languages 2010-2017
dc.contributor.organization Krister Linden / Research Group
dc.contributor.organization Language Technology
dc.description.reviewstatus Peer reviewed
dc.rights.accesslevel openAccess
dc.type.version publishedVersion
dc.identifier.url https://www.aclweb.org/anthology/W16-4820.pdf

Files in this item

Files Size Format
VarDial320.pdf 572.2Kb PDF