An Evaluation of Language-Agnostic Inner-Attention-Based Representations in Machine Translation

Show simple item record

dc.contributor.author Raganato, Alessandro
dc.contributor.author Vázquez, Raúl
dc.contributor.author Creutz, Mathias
dc.contributor.author Tiedemann, Jörg
dc.contributor.editor Augenstein, Isabelle
dc.contributor.editor Gella, Spandana
dc.contributor.editor Ruder, Sebastian
dc.contributor.editor Kann, Katharina
dc.contributor.editor Can, Burcu
dc.contributor.editor Welbl, Johannes
dc.contributor.editor Conneau, Alexis
dc.contributor.editor Ren, Xiang
dc.contributor.editor Rei, Marek
dc.date.accessioned 2019-09-02T07:08:02Z
dc.date.available 2019-09-02T07:08:02Z
dc.date.issued 2019-08-01
dc.identifier.citation Raganato, A., Vázquez, R., Creutz, M. & Tiedemann, J. 2019, 'An Evaluation of Language-Agnostic Inner-Attention-Based Representations in Machine Translation', in I. Augenstein, S. Gella, S. Ruder, K. Kann, B. Can, J. Welbl, A. Conneau, X. Ren & M. Rei (eds), The 4th Workshop on Representation Learning for NLP (RepL4NLP-2019): Proceedings of the Workshop, The Association for Computational Linguistics, Stroudsburg, pp. 27-32, Workshop on Representation Learning for NLP, Florence, Italy, 02/08/2019. <https://www.aclweb.org/anthology/W19-4304>
dc.identifier.citation workshop
dc.identifier.other PURE: 126388991
dc.identifier.other PURE UUID: c14916d4-12e2-492d-b8ea-5ad6fe5c735c
dc.identifier.other Bibtex: urn:c90f69c2134aebef56f3a691852cce0f
dc.identifier.other ORCID: /0000-0003-3065-7989/work/61350409
dc.identifier.other ORCID: /0000-0003-1862-4172/work/61350423
dc.identifier.other WOS: 000521942000004
dc.identifier.uri http://hdl.handle.net/10138/305136
dc.description.abstract In this paper, we explore a multilingual translation model with a cross-lingually shared layer that can be used as a fixed-size sentence representation in different downstream tasks. We systematically study the impact of the size of the shared layer and the effect of including additional languages in the model. In contrast to related previous work, we demonstrate that translation performance does correlate with performance on trainable downstream tasks. In particular, we show that larger intermediate layers not only improve translation quality, especially for long sentences, but also push up the accuracy of trainable classification tasks. On the other hand, shorter representations lead to increased compression that is beneficial in non-trainable similarity tasks. We hypothesize that the training procedure on the downstream task enables the model to identify the encoded information that is useful for the specific task, whereas non-trainable benchmarks can be confused by other types of information that are also encoded in the representation of a sentence. en
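The shared layer the abstract describes is an inner-attention ("attention bridge") pooling over encoder states, and the number of attention heads fixes the size of the resulting sentence representation. The sketch below is a minimal PyTorch illustration of that idea, not the authors' released code; the module name InnerAttentionBridge and the defaults attn_dim=512 and k_heads=10 are assumptions made for the example.

```python
import torch
import torch.nn as nn


class InnerAttentionBridge(nn.Module):
    """Illustrative sketch: pools variable-length encoder states into k fixed
    attention heads, yielding a fixed-size, language-agnostic sentence
    representation shared across all language pairs."""

    def __init__(self, hidden_dim: int, attn_dim: int = 512, k_heads: int = 10):
        super().__init__()
        self.w1 = nn.Linear(hidden_dim, attn_dim, bias=False)
        self.w2 = nn.Linear(attn_dim, k_heads, bias=False)

    def forward(self, enc_states: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # enc_states: (batch, seq_len, hidden_dim); mask: (batch, seq_len), 1 for real tokens
        scores = self.w2(torch.tanh(self.w1(enc_states)))           # (batch, seq_len, k_heads)
        scores = scores.masked_fill(mask.unsqueeze(-1) == 0, -1e9)  # ignore padding positions
        attn = torch.softmax(scores, dim=1)                         # attention over sequence positions
        # k_heads weighted sums of encoder states -> fixed-size representation
        return torch.einsum("bsk,bsh->bkh", attn, enc_states)       # (batch, k_heads, hidden_dim)
```

Stacking or concatenating the k head outputs gives the fixed-size vector used by downstream classifiers and similarity benchmarks; varying k_heads corresponds to varying the "size of the shared layer" studied in the abstract.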
dc.format.extent 6
dc.language.iso eng
dc.publisher The Association for Computational Linguistics
dc.relation.ispartof The 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)
dc.relation.isversionof 978-1-950737-35-2
dc.rights cc_by
dc.rights.uri info:eu-repo/semantics/openAccess
dc.subject 6121 Languages
dc.subject 113 Computer and information sciences
dc.title An Evaluation of Language-Agnostic Inner-Attention-Based Representations in Machine Translation en
dc.type Conference contribution
dc.contributor.organization Department of Digital Humanities
dc.contributor.organization Language Technology
dc.description.reviewstatus Peer reviewed
dc.rights.accesslevel openAccess
dc.type.version publishedVersion
dc.relation.funder European Commission
dc.relation.funder Academy of Finland
dc.identifier.url https://www.aclweb.org/anthology/W19-4304
dc.relation.grantnumber 771113
dc.relation.grantnumber

Files in this item

File: W19_4304.pdf (371.9 kB, PDF)
