Browsing by Subject "Language technology"

  • Koskenniemi, Kimmo Matti (Linköping University Electronic Press, 2017)
    NEALT Proceedings Series
    The paper presents two finite-state methods that can be used for aligning pairs of cognate words or sets of different allomorphs of stems. Both methods use weighted finite-state machines for choosing the best alternative. Individual letter or phoneme correspondences can be weighted according to various principles, e.g. using distinctive features. Comparing just two forms at a time is simple, so that method is easier to refine to include context conditions. Both methods are language-independent and could be tuned for and applied to several types of languages for producing gold standard data. The algorithms were implemented as short Python programs using the HFST finite-state library. The paper demonstrates that solving some non-trivial problems has become easier and more accessible to a wider range of scholars.
  • Alstola, Tero; Zaia, Shana; Sahala, Aleksi; Jauhiainen, Heidi; Svärd, Saana; Linden, Krister (2019)
  • Pollak, Senja; Boggia, Michele; Linden, Carl-Gustav; Leppänen, Leo; Zosa, Elaine; Toivonen, Hannu (The Association for Computational Linguistics, 2021)
  • Pirinen, Tommi; Silfverberg, Miikka; Linden, Krister (2012)
    In this paper we demonstrate a finite-state implementation of context-aware spell-checking that uses an N-gram-based part-of-speech (POS) tagger to rerank the suggestions from a simple edit-distance-based spell-checker. We demonstrate the benefits of context-aware spell-checking for English and Finnish and introduce the modifications necessary to make traditional N-gram models work for morphologically more complex languages, such as Finnish.
  • Koponen, Maarit; Sulubacak, Umut; Vitikainen, Kaisa; Tiedemann, Jörg (European Association for Machine Translation, 2020)
    This paper presents a user evaluation of machine translation and post-editing for TV subtitles. Based on a process study where 12 professional subtitlers translated and post-edited subtitles, we compare effort in terms of task time and number of keystrokes. We also discuss examples of specific subtitling features like condensation, and how these features may have affected the post-editing results. In addition to overall MT quality, segmentation and timing of the subtitles are found to be important issues to be addressed in future work.
  • Toivonen, Hannu; Boggia, Michele; Mind and Matter; Department of Computer Science; Helsinki Institute for Information Technology; Discovery Research Group/Prof. Hannu Toivonen; Language Technology (The Association for Computational Linguistics, 2021)
  • Öhman, Emily Sofi; Kajava, Kaisla S A (CEUR Workshop Proceedings, 2018)
    CEUR Workshop Proceedings
  • Rehm, Georg; Uszkoreit, Hans; Ananiadou, Sophia; Bel, Nuria; Bieleviciene, Audrone; Borin, Lars; Branco, Antonio; Budin, Gerhard; Calzolari, Nicoletta; Daelemans, Walter; Garabik, Radovan; Grobelnik, Marko; Garcia-Mateo, Carmen; van Genabith, Josef; Hajic, Jan; Hernaez, Inma; Judge, John; Koeva, Svetla; Krek, Simon; Krstev, Cvetana; Linden, Krister; Magnini, Bernardo; Mariani, Joseph; McNaught, John; Melero, Maite; Monachini, Monica; Moreno, Asuncion; Odijk, Jan; Ogrodniczuk, Maciej; Pezik, Piotr; Piperidis, Stelios; Przepiorkowski, Adam; Rognvaldsson, Eirikur; Rosner, Mike; Pedersen, Bolette Sandford; Skadina, Inguna; De Smedt, Koenraad; Tadic, Marko; Thompson, Paul; Tufis, Dan; Varadi, Tamas; Vasiljevs, Andrejs; Vider, Kadri; Zabarskaite, Jolanta (2016)
    This article provides an overview of the dissemination work carried out in META-NET from 2010 until 2015; we describe its impact at the regional, national and international levels, mainly with regard to politics and the funding situation for LT topics. The article documents the initiative's work throughout Europe in order to boost progress and innovation in our field.
  • Vázquez, Raúl; Aulamo, Mikko; Sulubacak, Umut; Tiedemann, Jörg (The Association for Computational Linguistics, 2020)
    This paper describes the University of Helsinki Language Technology group’s participation in the IWSLT 2020 offline speech translation task, addressing the translation of English audio into German text. In line with this year’s task objective, we train both cascade and end-to-end systems for spoken language translation. We opt for an end-to-end multitasking architecture with shared internal representations and a cascade approach that follows a standard procedure consisting of ASR, correction, and MT stages. We also describe the experiments that served as a basis for the submitted systems. Our experiments reveal that multitask training with shared internal representations is not only possible but also allows for knowledge transfer across modalities.