Predicting Prosodic Prominence from Text with Pre-trained Contextualized Word Representations

dc.contributor.author Talman, Aarne
dc.contributor.author Suni, Antti
dc.contributor.author Celikkanat, Hande
dc.contributor.author Kakouros, Sofoklis
dc.contributor.author Tiedemann, Jörg
dc.contributor.author Vainio, Martti
dc.contributor.editor Hartmann, Mareike
dc.contributor.editor Plank, Barbara
dc.date.accessioned 2020-02-18T11:35:03Z
dc.date.available 2020-02-18T11:35:03Z
dc.date.issued 2019-09-30
dc.identifier.citation Talman, A, Suni, A, Celikkanat, H, Kakouros, S, Tiedemann, J & Vainio, M 2019, Predicting Prosodic Prominence from Text with Pre-trained Contextualized Word Representations. in M Hartmann & B Plank (eds), 22nd Nordic Conference on Computational Linguistics (NoDaLiDa): Proceedings of the Conference. Linköping Electronic Conference Proceedings, no. 167, NEALT Proceedings Series, no. 42, Linköping University Electronic Press, Linköping, pp. 281–290, Nordic Conference on Computational Linguistics, Turku, Finland, 30/09/2019.
dc.identifier.citation conference
dc.identifier.other PURE: 132263108
dc.identifier.other PURE UUID: e7f78b80-371d-4925-a8b6-66b34de47073
dc.identifier.other ORCID: /0000-0003-2570-0196/work/70947223
dc.identifier.other ORCID: /0000-0001-8996-0793/work/70953390
dc.identifier.other ORCID: /0000-0003-3065-7989/work/70953439
dc.identifier.other ORCID: /0000-0002-3573-5993/work/70953456
dc.identifier.other ORCID: /0000-0003-2858-5867/work/70953502
dc.identifier.uri http://hdl.handle.net/10138/311873
dc.description.abstract In this paper, we introduce a new natural language processing dataset and benchmark for predicting prosodic prominence from written text. To our knowledge, this will be the largest publicly available dataset with prosodic labels. We describe the dataset construction and the resulting benchmark dataset in detail, and train a number of different models, ranging from feature-based classifiers to neural network systems, for the prediction of discretized prosodic prominence. We show that pre-trained contextualized word representations from BERT outperform the other models even with less than 10% of the training data. Finally, we discuss the dataset in light of the results and point to future research and plans for further improving both the dataset and methods of predicting prosodic prominence from text. The dataset and the code for the models are publicly available. en
dc.format.extent 10
dc.language.iso eng
dc.publisher Linköping University Electronic Press
dc.relation.ispartof 22nd Nordic Conference on Computational Linguistics (NoDaLiDa)
dc.relation.ispartofseries Linköping Electronic Conference Proceedings
dc.relation.ispartofseries NEALT Proceedings Series
dc.relation.isversionof 978-91-7929-995-8
dc.rights cc_by
dc.rights.uri info:eu-repo/semantics/openAccess
dc.subject 113 Computer and information sciences
dc.subject Natural language processing
dc.subject 6121 Languages
dc.title Predicting Prosodic Prominence from Text with Pre-trained Contextualized Word Representations en
dc.type Conference contribution
dc.contributor.organization Department of Digital Humanities
dc.contributor.organization Language Technology
dc.contributor.organization Phonetics
dc.contributor.organization Phonetics and Speech Synthesis
dc.contributor.organization Mind and Matter
dc.description.reviewstatus Peer reviewed
dc.relation.issn 1650-3686
dc.rights.accesslevel openAccess
dc.type.version publishedVersion

Files in this item

Files Size Format
W19_6129.pdf 572.0Kb PDF
