Multilingual NMT with a language-independent attention bridge





Citation

Vazquez Carrillo, J. R., Raganato, A., Tiedemann, J. & Creutz, M. 2019, Multilingual NMT with a language-independent attention bridge. In I. Augenstein, S. Gella, S. Ruder, K. Kann, B. Can, J. Welbl, A. Conneau, X. Ren & M. Rei (eds), The 4th Workshop on Representation Learning for NLP (RepL4NLP-2019): Proceedings of the Workshop, The Association for Computational Linguistics, Stroudsburg, pp. 33-39. Workshop on Representation Learning for NLP, Florence, Italy, 02/08/2019.

Title: Multilingual NMT with a language-independent attention bridge
Author: Vazquez Carrillo, Juan Raul; Raganato, Alessandro; Tiedemann, Jörg; Creutz, Mathias
Editor: Augenstein, Isabelle; Gella, Spandana; Ruder, Sebastian; Kann, Katharina; Can, Burcu; Welbl, Johannes; Conneau, Alexis; Ren, Xiang; Rei, Marek
Contributor: University of Helsinki, Department of Digital Humanities
University of Helsinki, Language Technology
Publisher: The Association for Computational Linguistics
Date: 2019
Language: eng
Number of pages: 7
Belongs to series: The 4th Workshop on Representation Learning for NLP (RepL4NLP-2019): Proceedings of the Workshop
ISBN: 978-1-950737-35-2
URI: http://hdl.handle.net/10138/304660
Abstract: In this paper, we propose a multilingual encoder-decoder architecture that obtains multilingual sentence representations by incorporating an intermediate attention bridge shared across all languages. That is, we train the model with language-specific encoders and decoders that are connected via self-attention with a shared layer that we call the attention bridge. This layer exploits the semantics from each language for performing translation and develops into a language-independent meaning representation that can efficiently be used for transfer learning. We present a new framework for the efficient development of multilingual NMT using this model and scheduled training. We have tested the approach systematically with a multi-parallel data set. We show that the model achieves substantial improvements over strong bilingual models and that it also works well for zero-shot translation, demonstrating its capacity for abstraction and transfer learning.
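
The shared layer described in the abstract can be viewed as a structured self-attention pooling: each of a fixed number of heads computes a distribution over the encoder states H and returns a fixed-size matrix M = softmax(W2 tanh(W1 H^T)) H, so every decoder sees the same number of context vectors regardless of source language or sentence length. Below is a minimal PyTorch-style sketch of such a bridge; the class name, default dimensions, and head count are illustrative assumptions, not the authors' released code.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class AttentionBridge(nn.Module):
        # Shared fixed-size self-attention layer placed between the
        # language-specific encoders and decoders (illustrative sketch;
        # hyperparameter defaults are assumptions, not the paper's values).
        def __init__(self, hidden_dim, attn_dim=512, num_heads=10):
            super().__init__()
            self.W1 = nn.Linear(hidden_dim, attn_dim, bias=False)
            self.W2 = nn.Linear(attn_dim, num_heads, bias=False)

        def forward(self, enc_states):
            # enc_states: (batch, src_len, hidden_dim), from any encoder.
            # A = softmax(W2 tanh(W1 H^T)): one attention distribution
            # over source positions per head.
            scores = self.W2(torch.tanh(self.W1(enc_states)))  # (batch, src_len, num_heads)
            attn = F.softmax(scores, dim=1)                    # normalize over source positions
            # M = A^T H: a fixed number of context vectors, independent
            # of sentence length and of the source language.
            return attn.transpose(1, 2) @ enc_states           # (batch, num_heads, hidden_dim)

Because every decoder attends over the same fixed set of bridge vectors rather than over raw encoder states, any encoder can be paired with any decoder, which is what enables the zero-shot translation directions mentioned in the abstract.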
Subject: 6121 Languages
113 Computer and information sciences
Natural language processing
Multilingual machine translation


Files in this item


W19_4305.pdf (375.2 KB, PDF)
