Normalizing Non-canonical Turkish Texts Using Machine Translation Approaches

Permalink

http://hdl.handle.net/10138/305143

Citation

Çolakoğlu, T, Sulubacak, U & Tantuğ, A C 2019, Normalizing Non-canonical Turkish Texts Using Machine Translation Approaches. In F Alva-Manchego, E Choi & D Khashabi (eds), The 57th Annual Meeting of the Association for Computational Linguistics: Proceedings of the Student Research Workshop. The Association for Computational Linguistics, Stroudsburg, pp. 267-272, 2019 ACL Student Research Workshop, Florence, Italy, 29/07/2019. <https://www.aclweb.org/anthology/P19-2037>

Title: Normalizing Non-canonical Turkish Texts Using Machine Translation Approaches
Author: Çolakoğlu, Talha; Sulubacak, Umut; Tantuğ, Ahmet Cüneyd
Other contributor: University of Helsinki, Language Technology; Alva-Manchego, Fernando; Choi, Eunsol; Khashabi, Daniel

Publisher: The Association for Computational Linguistics
Date: 2019-07-28
Language: eng
Number of pages: 6
Belongs to series: The 57th Annual Meeting of the Association for Computational Linguistics: Proceedings of the Student Research Workshop
ISBN: 978-1-950737-47-5
URI: http://hdl.handle.net/10138/305143
Abstract: With the growth of the social web, user-generated text data has reached unprecedented sizes. Non-canonical text normalization provides a way to exploit this as a practical source of training data for language processing systems. The state of the art in Turkish text normalization is composed of a token-level pipeline of modules, heavily dependent on external linguistic resources and manually defined rules. Instead, we propose a fully automated, context-aware machine translation approach with fewer stages of processing. Experiments with various implementations of our approach show that we are able to surpass the current best-performing system by a large margin.
Subject: 113 Computer and information sciences; Natural language processing; Text normalization; Machine translation; 6121 Languages; Computational linguistics; Turkish


Files in this item

File: P19_2037.pdf (208.5 Kb, PDF)
