Analysing concatenation approaches to document-level NMT in two different domains

dc.contributor.author Scherrer, Yves
dc.contributor.author Tiedemann, Jörg
dc.contributor.author Loáiciga, Sharid
dc.date.accessioned 2019-11-11T12:34:02Z
dc.date.available 2019-11-11T12:34:02Z
dc.date.issued 2019-11-01
dc.identifier.citation Scherrer, Y, Tiedemann, J & Loáiciga, S 2019, Analysing concatenation approaches to document-level NMT in two different domains. in The Fourth Workshop on Discourse in Machine Translation: Proceedings of the Workshop. The Association for Computational Linguistics, Stroudsburg, pp. 51-61, Workshop on Discourse in Machine Translation, Hong Kong, China, 03/11/2019. https://doi.org/10.18653/v1/D19-6506
dc.identifier.citation workshop
dc.identifier.other PURE: 127932479
dc.identifier.other PURE UUID: 3ba01646-0650-4d9d-9e4b-b71b91298ab8
dc.identifier.other Bibtex: urn:d8d849eab2f99dd2ebbb3e2e838ad674
dc.identifier.other ORCID: /0000-0001-5247-5073/work/64666692
dc.identifier.other ORCID: /0000-0003-3065-7989/work/64666893
dc.identifier.uri http://hdl.handle.net/10138/306876
dc.description.abstract In this paper, we investigate how different aspects of discourse context affect the performance of recent neural MT systems. We describe two popular datasets covering news and movie subtitles and we provide a thorough analysis of the distribution of various document-level features in their domains. Furthermore, we train a set of context-aware MT models on both datasets and propose a comparative evaluation scheme that contrasts coherent context with artificially scrambled documents and absent context, arguing that the impact of discourse-aware MT models will become visible in this way. Our results show that the models are indeed affected by the manipulation of the test data, providing a different view on document-level translation quality than absolute sentence-level scores. en
dc.format.extent 11
dc.language.iso eng
dc.publisher The Association for Computational Linguistics
dc.relation.ispartof The Fourth Workshop on Discourse in Machine Translation
dc.relation.isversionof 978-1-950737-74-1
dc.rights cc_by
dc.rights.uri info:eu-repo/semantics/openAccess
dc.subject 113 Computer and information sciences
dc.subject 6121 Languages
dc.title Analysing concatenation approaches to document-level NMT in two different domains en
dc.type Conference contribution
dc.contributor.organization Department of Digital Humanities
dc.contributor.organization Language Technology
dc.contributor.organization Mind and Matter
dc.description.reviewstatus Peer reviewed
dc.relation.doi https://doi.org/10.18653/v1/D19-6506
dc.rights.accesslevel openAccess
dc.type.version publishedVersion
dc.relation.funder European Commission / Horizon 2020
dc.relation.funder European Research Council (ERC)
dc.relation.grantnumber
dc.relation.grantnumber 771113

Files in this item

Files Size Format
D19_6506.pdf 259.2Kb PDF
