Browsing by Subject "lexicon"

Now showing items 1-9 of 9
  • Stolt, Suvi; Savini, Silvia; Guarini, Annalisa; Caselli, Maria Cristina; Matomäki, Jaakko; Lapinleimu, Helena; Haataja, Leena; Lehtonen, Liisa; Alessandroni, Rosina; Faldella, Giacomo; Sansavini, Alessandra (2017)
    This cross-linguistic study investigated whether the native language has any influence on lexical composition among Italian (N = 125) and Finnish (N = 116) very preterm (born at
  • Vehkavuori, Suvi-Maria; Stolt, Suvi (2019)
    Previous studies have shown that early lexical development is associated with later language development. It is less clear which language domains early receptive/expressive lexicons are associated with. This study analyses these associations. The study also investigates whether children with slow/typical/fast developing early receptive/expressive lexical skills differ in their language skills at three and a half years (42 months) and the predictive value of early receptive/expressive lexical skills for later language skills. The participants of this longitudinal study were 68 healthy, monolingual Finnish-speaking children whose language development was measured using the Finnish short-form version of the Communicative Development Inventories at 12, 15, 18 and 24 months. At 42 months, language skills of the participants were assessed using tests measuring lexical, phonological, morphological and general receptive/expressive language skills. Early receptive lexicon was associated with later morphological skills from 15 months onwards and with other language domains at 24 months. Early expressive lexicon was associated with later morphological skills from 15 months onwards but with other language domains from 18 months. A trend was found that children with different early lexical growth rates differed in their language skills at 42 months. The best models for predicting later receptive/expressive language skills included variables from both early receptive and expressive lexicons. These models worked well to explain receptive/expressive language skills at 42 months (63/78% of the variance). This study provides novel information on the specific associations between receptive and expressive lexicon growth and later language skills. For clinicians, measuring both receptive and expressive lexicons provides the most representative information on children's language development.
  • Yli-Jyrä, Anssi Mikael (Northern European Association for Language Technology, 2011)
    NEALT Proceedings Series
    A novel technique of adding positionwise flags to one-level finite state lexicons is presented. The proposed flags are kinds of morphophonemic markers and they constitute a flexible method for describing morphophonological processes with a formalism that is tightly coupled with lexical entries and rule-like regular expressions. The formalism is inspired by the techniques used in two-level rule compilation and it practically compiles all the rules in parallel, but in an efficient way. The technique handles morphophonological processes without a separate morphophonemic representation. The occurrences of the allomorphophonemes in latent phonological strings are tracked through a dynamic data structure into which the most prominent (i.e. the best ranked) flags are collected. The application of the technique is expected to give advantages when describing the morphology of Bantu languages and dialects.
  • Lindfors, And. Otto (Lund, 1824)
  • Koskenniemi, Kimmo Matti (The Association for Computational Linguistics, 2018)
    A practical method for interactive guessing of LEXC lexicon entries is presented. The method is based on describing groups of similarly inflected words using regular expressions. The patterns are compiled into a finite-state transducer (FST) which maps any word form into the possible LEXC lexicon entries which could generate it. The same FST can be used (1) for converting conventional headword lists into LEXC entries, (2) for interactive guessing of entries, (3) for corpus-assisted interactive guessing and (4) for guessing entries from corpora. A method of representing affixes as a table is presented, as well as how the tables can be converted into LEXC format for several different purposes including morphological analysis and entry guessing. The method has been implemented using the HFST finite-state transducer tools and its Python embedding plus a number of small Python scripts for conversions. The method is tested with a near-complete implementation of Finnish verbs. An experiment of generating Finnish verb entries out of corpus data is also described, as well as the creation of a full-scale analyzer for Finnish verbs using the conversion patterns.
  • Drobac, Senka; Linden, Krister; Pirinen, Tommi; Silfverberg, Miikka (European Language Resources Association (ELRA), 2014)
    Flag diacritics, which are special multi-character symbols executed at runtime, enable optimising finite-state networks by combining identical sub-graphs of their transition graphs. Traditionally, the feature has required linguists to devise the optimisations to the graph by hand alongside the morphological description. In this paper, we present a novel method for discovering flag positions in morphological lexicons automatically, based on the morpheme structure implicit in the language description. With this approach, we have achieved a significant decrease in the size of finite-state networks while maintaining reasonable application speed. The algorithm can be applied to any language description, and the biggest gains are expected in large and complex morphologies. The most noticeable reduction in size was obtained with a morphological transducer for Greenlandic, whose original size is on average about 15 times larger than other morphologies. With the presented hyper-minimization method, the transducer is reduced to 10.1% of the original size, with lookup speed decreased by only 9.5%.
  • Virtanen, Toni; Nuutinen, Mikko; Häkkinen, Jukka (2019)
    We have collected a large dataset of subjective image quality "*nesses," such as sharpness or colorfulness. The dataset comes from seven studies and contains 39,415 quotations from 146 observers who have evaluated 62 scenes either in print images or on display. We analyzed the subjective evaluations and formed a hierarchical image quality attribute lexicon for *nesses, which is visualized as an image quality wheel (IQ-Wheel). Similar wheel diagrams for attributes have become industry standards in other sensory experience fields such as flavor and fragrance sciences. The IQ-Wheel contains the frequency information of 68 attributes relating to image quality. Only 20% of the attributes were positive, which agrees with previous findings showing a preference for negative attributes in image quality evaluation. Our results also show that, excluding the physical attributes of paper gloss, observers use similar terminology when evaluating printed images and images viewed on a display. IQ-Wheel can be used to guide the selection of scenes and distortions when designing subjective experimental setups and creating image databases. (C) 2019 SPIE and IS&T
  • Kuvac, Jelena; Blazi, Antonija; Schults, Astra; Tulviste, Tiia; Stolt, Suvi (2021)
    Cross-linguistic studies can provide information about general and language-specific features of language development, but relatively few such studies are available in the literature. The main aim of the present study was to investigate, from a cross-linguistic perspective, the roles of the internal factor of gender and the external factors of birth order and parental education level in the development of language in 2-year-old children. We examined 351 children growing up in three European language contexts: Croatian (N = 104), Estonian (N = 141) and Finnish (N = 106). Information on lexical skills and word combination ability was collected using the short form of the MacArthur-Bates Communicative Development Inventories, and the influence of background factors on these aspects of language development was investigated. No significant differences were found in lexical skills or word combination ability among the three language groups. These aspects of language development varied significantly with gender, but not with external factors. Our findings suggest that internal factors may influence early language development more than external factors.
  • Tsurkka, Tiia (Helsingin yliopisto, 2018)
    My Master's thesis examines the translation of Christian non-fiction from the perspective of translating vocabulary. My material is my own 20-page translation of Leif Nummela's book Raamatun punainen lanka. The translation was made from Finnish into English, and it was produced simultaneously as a commission for Nummela and for this thesis. The theoretical basis of the thesis consists of Eugene Nida's concept of equivalence and Nida's theories of Bible translation. Equivalence has later also been used by Jean-Paul Vinay and Jean Darbelnet, whose theories I discuss as well. In addition, I use Katharina Reiss and Hans J. Vermeer's Skopos theory to define the purpose of the translation. Finally, I draw on Lawrence Venuti's concepts of domestication and foreignization, which Nida has also written about. My research question is: "What translation strategies can be found in my translation?" My assumption is that creativity is not permitted to any great extent in Christian literature. Bible translation has been studied for so long that the vocabulary has been deliberately shaped into its present form. The vocabulary of Christianity can be treated as a special-field vocabulary, and a professional translator should master it before undertaking to translate texts in this field. I include in this vocabulary both the vocabulary found in the Bible and the vocabulary generally established in Christianity. I analyse the translation in three parts. First, I analyse the translation of the Bible verses found in the source text using the concept of equivalence. The assumption is that the verses should simply be replaced mechanically with the corresponding verses of the target-language Bible. Next, I analyse text in which the author has referred to Bible verses but has modified and explained the message. In this part, the most important thing is that the vocabulary remains consistent with that of the Bible. Lastly, I analyse all the remaining text, which hardly deals with Bible stories at all. In this part, the translator has the opportunity to use a domesticating or foreignizing strategy, as long as the result does not contradict the message of the Bible. I also examine all the categories from the perspective of the Skopos of the translation. The most important Skopos for my text is that the translation accords with the teachings of the Bible; style and grammatical matters are secondary. My conclusion is that equivalence and Skopos are workable theories for translating Christian subject matter. Hardly any domestication or foreignization was found in my translation, but this may also mean that they could not have been used in the first place; their absence does not mean that the translation is wrong. For the translator, the most important thing is to find a balance between these two concepts.