Granroth-Wilding, M & Toivonen, H 2019, Unsupervised Learning of Cross-Lingual Symbol Embeddings Without Parallel Data. in Second Annual Meeting of the Society for Computation in Linguistics (SCiL 2019)., 4, The Association for Computational Linguistics, pp. 19-28, Society for Computation in Linguistics, New York, New York, United States, 03/01/2019. https://doi.org/10.7275/wx64-ea83
Title: | Unsupervised Learning of Cross-Lingual Symbol Embeddings Without Parallel Data |
Author: | Granroth-Wilding, Mark; Toivonen, Hannu |
Contributor organization: | Department of Computer Science; Discovery Research Group/Prof. Hannu Toivonen; Helsinki Institute for Information Technology |
Publisher: | The Association for Computational Linguistics |
Date: | 2019-01-03 |
Language: | eng |
Number of pages: | 10 |
Belongs to series: | Second Annual Meeting of the Society for Computation in Linguistics (SCiL 2019) |
ISBN: | 978-1-5108-7753-5 |
DOI: | https://doi.org/10.7275/wx64-ea83 |
URI: | http://hdl.handle.net/10138/304870 |
Abstract: | We present a new method for unsupervised learning of multilingual symbol (e.g. character) embeddings, without any parallel data or prior knowledge about correspondences between languages. It is able to exploit similarities across languages between the distributions over symbols' contexts of use within their language, even in the absence of any symbols in common to the two languages. In experiments with an artificially corrupted text corpus, we show that the method can retrieve character correspondences obscured by noise. We then present encouraging results of applying the method to real linguistic data, including for low-resourced languages. The learned representations open the possibility of fully unsupervised comparative studies of text or speech corpora in low-resourced languages with no prior knowledge regarding their symbol sets. |
Subject: | 113 Computer and information sciences; 6121 Languages |
Peer reviewed: | Yes |
Rights: | unspecified |
Usage restriction: | openAccess |
Self-archived version: | publishedVersion |
Funder: | Academy of Finland |
Grant number: |
Files | Size |
---|---|
mgwht2019.pdf | 977.7Kb |