Browsing by Subject "machine learning"

Now showing items 1-20 of 66
  • Koivisto, Maria (Helsingin yliopisto, 2020)
    Immunohistochemistry (IHC) is a widely used research tool for detecting antigens and can be used in medical and biochemical research. The co-localization of two separate proteins is sometimes crucial for analysis, requiring a double staining. This comes with a number of challenges since staining results depend on the pre-treatment of samples, host-species where the antibody was raised and spectral differentiation of the two proteins. In this study, the proteins GABAR-α2 and CAMKII were stained simultaneously to study the expression of the GABA receptor in hippocampal pyramidal cells. This was performed in PGC-1α transgenic mice, possibly expressing GABAR-α2 excessively compared to wildtype mice. Staining optimization was performed regarding primary and secondary antibody concentration, section thickness, antigen retrieval and detergent. Double staining was performed successfully and proteins of interest were visualized using a confocal microscope after which image analyses were performed using two different methods: 1) a traditional image analysis based on intensity and density of stained dots and 2) a novel convolutional neural network (CNN) machine learning approach. The traditional image analysis did not detect any differences in the stained brain slices, whereas the CNN model showed an accuracy of 72% in categorizing the images correctly as transgenic/wildtype brain slices. The results from the CNN model imply that GABAR-α2 is expressed differently in PGC-1α transgenic mice, which might impact other factors such as behaviour and learning. This protocol and the novel method of using CNN as an image analysis tool can be of future help when performing IHC analysis on brain neuronal studies.
  • Tolonen, Mikko; Lahti, Leo; Ilomäki, Niko (2015)
    This article analyses publication trends in the field of history in early modern Britain and North America in 1470–1800, based on English Short-Title Catalogue (ESTC) data. Its major contribution is to demonstrate the potential of digitized library catalogues as an essential scholastic tool and part of reproducible research. We also introduce a novel way of quantitatively analysing a particular trend in book production, namely the publishing of works in the field of history. The study is also our first experimental analysis of paper consumption in early modern book production, and demonstrates in practice the importance of open-science principles for library and information science. Three main research questions are addressed: 1) who wrote history; 2) where history was published; and 3) how publishing changed over time in early modern Britain and North America. In terms of our main findings we demonstrate that the average book size of history publications decreased over time, and that the octavo-sized book was the rising star in the eighteenth century, which is a true indication of expanding audiences. The article also compares different aspects of the most popular writers on history, such as Edmund Burke and David Hume. Although focusing on history, these findings may reflect more widespread publishing trends in the early modern era. We show how some of the key questions in this field can be addressed through the quantitative analysis of large-scale bibliographic data collections.
  • Huertas, Andres (Helsingin yliopisto, 2020)
    Investment funds are continuously looking for new technologies and ideas to enhance their results. Lately, with the success observed in other fields, wealth managers are taking a closer look at machine learning methods. Even if the use of ML is not entirely new in finance, leveraging new techniques has proved to be challenging and few funds succeed in doing so. The present work explores the usage of reinforcement learning algorithms for portfolio management in the stock market. The stochastic nature of stocks is well known, and aiming to predict the market is unrealistic; nevertheless, the question of how to use machine learning to find useful patterns in the data that enable small market edges remains open. Based on the ideas of reinforcement learning, a portfolio optimization approach is proposed. RL agents are trained to trade in a stock exchange, using portfolio returns as rewards for their RL optimization problem, thus seeking optimal resource allocation. For this purpose, a set of 68 stock tickers in the Frankfurt exchange market was selected, and two RL methods were applied, namely Advantage Actor-Critic (A2C) and Proximal Policy Optimization (PPO). Their performance was compared against three commonly traded ETFs (exchange-traded funds) to assess the algorithms' ability to generate returns compared to real-life investments. Both algorithms were able to achieve positive returns in a year of testing (5.4% and 9.3% for A2C and PPO respectively; a European ETF (VGK, Vanguard FTSE Europe Index Fund) reported 9.0% returns for the same period) as well as healthy risk-to-return ratios. The results do not aim to be financial advice or trading strategies, but rather explore the potential of RL for studying small to medium size stock portfolios.
  • Nygren, Saara (Helsingin yliopisto, 2020)
    A relational database management system’s configuration is essential when optimizing database performance. Finding the optimal knob configuration for the database requires tuning of multiple interdependent knobs. Over the past few years, relational database vendors have added machine learning models to their products, and Oracle announced the first autonomous (i.e., self-driving) database in 2017. This thesis clarifies the autonomous database concept and surveys the latest research on machine learning methods for relational database knob tuning. The study aimed to find solutions that can tune multiple database knobs and be applied to any relational database. The survey found three machine learning implementations that tune multiple knobs at a time. These are called OtterTune, CDBTune, and QTune. OtterTune uses traditional machine learning techniques, while CDBTune and QTune rely on deep reinforcement learning. These implementations are presented in this thesis, along with a discussion of the features they offer. The thesis also presents an autonomic system’s basic concepts, such as self-CHOP and the MAPE-K feedback loop, and a knowledge model to define the knowledge needed to implement them. These can be used in the autonomous database context, along with Intelligent Machine Design and the Five Levels of AI-Native Database, to present requirements for the autonomous database.
  • Mouchlis, Varnavas D.; Afantitis, Antreas; Serra, Angela; Fratello, Michele; Papadiamantis, Anastasios G.; Aidinis, Vassilis; Lynch, Iseult; Greco, Dario; Melagraki, Georgia (2021)
    De novo drug design is a computational approach that generates novel molecular structures from atomic building blocks with no a priori relationships. Conventional methods include structure-based and ligand-based design, which depend on the properties of the active site of a biological target or its known active binders, respectively. Artificial intelligence, including machine learning, is an emerging field that has positively impacted the drug discovery process. Deep reinforcement learning is a subdivision of machine learning that combines artificial neural networks with reinforcement-learning architectures. This method has successfully been employed to develop novel de novo drug design approaches using a variety of artificial networks including recurrent neural networks, convolutional neural networks, generative adversarial networks, and autoencoders. This review article summarizes advances in de novo drug design, from conventional growth algorithms to advanced machine-learning methodologies, and highlights hot topics for further development.
  • Kibble, Milla; Khan, Suleiman A.; Ammad-ud-din, Muhammad; Bollepalli, Sailalitha; Palviainen, Teemu; Kaprio, Jaakko; Pietiläinen, Kirsi H.; Ollikainen, Miina (2020)
    We combined clinical, cytokine, genomic, methylation and dietary data from 43 young adult monozygotic twin pairs (aged 22-36 years, 53% female), where 25 of the twin pairs were substantially weight discordant (delta body mass index > 3 kg/m²). These measurements were originally taken as part of the TwinFat study, a substudy of The Finnish Twin Cohort study. These five large multivariate datasets (comprising 42, 71, 1587, 1605 and 63 variables, respectively) were jointly analysed using an integrative machine learning method called group factor analysis (GFA) to offer new hypotheses into the multi-molecular-level interactions associated with the development of obesity. New potential links between cytokines and weight gain are identified, as well as associations between dietary, inflammatory and epigenetic factors. This encouraging case study aims to enthuse the research community to boldly attempt new machine learning approaches which have the potential to yield novel and unintuitive hypotheses. The source code of the GFA method is publicly available as the R package GFA.
  • Itkonen, Sami (Helsingin yliopisto, 2020)
    Multiword expressions are combinations of several words that are in some way fixed and/or idiomatic. This study examines Finnish verbal idioms using a word embedding method (word2vec). The data consist of Finnish-language books retrieved from Project Gutenberg. The work focuses mainly on idioms containing the Finnish word ‘silmä’ (‘eye’). Their idiomaticity is measured by compositionality (how well the meaning of the multiword expression corresponds to the combination of the meanings of its components), and their fixedness by a lexical substitution test. The corresponding tests are also carried out with the fastText algorithm, which takes the internal structure of words into account. In addition, a smallish labelled set of sentences was created from the Gutenberg corpus and classified with a neural-network-based classifier. The work also explores the effect of various features, such as grammatical case, on the meaning of an idiom. The results of the measurement methods are, on the whole, quite mixed. The performance of the fastText algorithm is generally slightly better than that of the baseline method; in addition, the quality of its word embeddings is better. The lexical substitution test gives the best results when only the nearest neighbour is taken into account. Grammatical case was found to be quite important for determining the meaning of an idiom. The weak measurement results may be due to many factors, such as the fact that the degree of semantic transparency of idioms can vary. The word embedding method also does not normally take into account that multiword expressions, too, can have several meanings (literal and idiomatic/figurative). The rich morphology of Finnish poses additional challenges for the method. In conclusion, the word embedding method is somewhat useful for studying Finnish idioms. The usefulness of the tested measurement methods when used alone is limited, but they might work better as part of a broader research framework.
  • Lehtonen, Leevi (Helsingin yliopisto, 2021)
    Quantum computing has enormous potential in machine learning, where problems can quickly scale to be intractable for classical computation. A Boltzmann machine is a well-known energy-based graphical model suitable for various machine learning tasks. Plenty of work has already been conducted on realizing Boltzmann machines in quantum computing, and the resulting approaches all have somewhat different characteristics. In this thesis, we conduct a survey of the state of the art in quantum Boltzmann machines and their training approaches. Primarily, we examine the variational quantum Boltzmann machine, a specific variant of the quantum Boltzmann machine suitable for near-term quantum hardware. Moreover, as the variational quantum Boltzmann machine relies heavily on variational quantum imaginary time evolution, we analyze variational quantum imaginary time evolution in depth. Compared to previous work, we evaluate the execution of variational quantum imaginary time evolution with a more comprehensive collection of hyperparameters. Furthermore, we train variational quantum Boltzmann machines using a toy problem of bars and stripes, representing a more multimodal probability distribution than the Bell states and the Greenberger-Horne-Zeilinger states considered in earlier studies.
  • Hämäläinen, Kreetta (Helsingin yliopisto, 2021)
    Personalized medicine tailors therapies for the patient based on predicted risk factors. Some tools used for making predictions on the safety and efficacy of drugs are genetics and metabolomics. This thesis focuses on identifying biomarkers for the activity level of the drug transporter organic anion transporting polypeptide 1B1 (OATP1B1) from data acquired from untargeted metabolite profiling. OATP1B1 transports various drugs, such as statins, from portal blood into the hepatocytes. OATP1B1 is a genetically polymorphic influx transporter, which is expressed in human hepatocytes. Statins are low-density lipoprotein cholesterol-lowering drugs, and decreased or poor OATP1B1 function has been shown to be associated with statin-induced myopathy. Based on genetic variability, individuals can be classified into those with normal, decreased or poor OATP1B1 function. These activity classes were employed to identify metabolomic biomarkers for OATP1B1. To find the most efficient way to predict the activity level and find the biomarkers that associate with the activity level, 5 different machine learning models were tested with a dataset that consisted of 356 fasting blood samples with 9152 metabolite features. The models included both a Random Forest regressor and a classifier, a Gradient Boosted Decision Tree regressor and classifier, and a Deep Neural Network regressor. Hindrances specific to this type of data were the collinearity between the features and the large number of features compared to the number of samples, which led to issues in determining the important features of the neural network model. To adjust for this, the features were clustered according to their Spearman’s rank-order correlation ranks. Feature importances were calculated using two methods. In the case of the neural network, the feature importances were calculated with permutation feature importance using mean squared error, while random forest and gradient boosted decision trees used Gini impurity.
The performance of each model was measured, and all classifiers had a poor ability to predict the decreased and poor function classes. All regressors performed very similarly to each other. The gradient boosted decision tree regressor performed the best by a slight margin, but the random forest regressor and the neural network regressor performed nearly as well. The best features from all three models were cross-referenced with the features found from y-aware PCA analysis. The y-aware PCA analysis indicated that the 14 best features cover 95% of the explained variance, so 14 features were picked from each model and cross-referenced with each other. Cross-referencing the highest scoring features reported by the best models found multiple features that showed up as important in many models. Taken together, machine learning methods provide powerful tools to identify potential biomarkers from untargeted metabolomics data.
  • Garcia Moreno-Esteva, Enrique; White, Sonia L. J.; Wood, Joanne M.; Black, Alex A. (2018)
    In this research, we aimed to investigate the visual-cognitive behaviours of a sample of 106 children in Year 3 (8.8 ± 0.3 years) while completing a mathematics bar-graph task. Eye movements were recorded while children completed the task and the patterns of eye movements were explored using machine learning approaches. Two different techniques of machine-learning were used (Bayesian and K-Means) to obtain separate model sequences or average scanpaths for those children who responded either correctly or incorrectly to the graph task. Application of these machine-learning approaches indicated distinct differences in the resulting scanpaths for children who completed the graph task correctly or incorrectly: children who responded correctly accessed information that was mostly categorised as critical, whereas children responding incorrectly did not. There was also evidence that the children who were correct accessed the graph information in a different, more logical order, compared to the children who were incorrect. The visual behaviours aligned with different aspects of graph comprehension, such as initial understanding and orienting to the graph, and later interpretation and use of relevant information on the graph. The findings are discussed in terms of the implications for early mathematics teaching and learning, particularly in the development of graph comprehension, as well as the application of machine learning techniques to investigations of other visual-cognitive behaviours.
  • Tanoli, Ziaurrehman; Vähä-Koskela, Markus; Aittokallio, Tero (2021)
    Introduction: Drug repurposing provides a cost-effective strategy to re-use approved drugs for new medical indications. Several machine learning (ML) and artificial intelligence (AI) approaches have been developed for systematic identification of drug repurposing leads based on big data resources, hence further accelerating and de-risking the drug development process by computational means. Areas covered: The authors focus on supervised ML and AI methods that make use of publicly available databases and information resources. While most of the example applications are in the field of anticancer drug therapies, the methods and resources reviewed are widely applicable also to other indications including COVID-19 treatment. A particular emphasis is placed on the use of comprehensive target activity profiles that enable a systematic repurposing process by extending the target profile of drugs to include potent off-targets with therapeutic potential for a new indication. Expert opinion: The scarcity of clinical patient data and the current focus on genetic aberrations as primary drug targets may limit the performance of anticancer drug repurposing approaches that rely solely on genomics-based information. Functional testing of cancer patient cells exposed to a large number of targeted therapies and their combinations provides an additional source of repurposing information for tissue-aware AI approaches.
  • Romppainen, Jonna (Helsingin yliopisto, 2020)
    Surface diffusion in metals can be simulated with the atomistic kinetic Monte Carlo (KMC) method, where the evolution of a system is modeled by successive atomic jumps. The parametrisation of the method requires calculating the energy barriers of the different jumps that can occur in the system, which poses a limitation to its use. A promising solution to this are machine learning methods, such as artificial neural networks, which can be trained to predict barriers based on a set of pre-calculated ones. In this work, an existing neural network based parametrisation scheme is enhanced by expanding the atomic environment of the jump to include more atoms. A set of surface diffusion jumps was selected and their barriers were calculated with the nudged elastic band method. Artificial neural networks were then trained on the calculated barriers. Finally, KMC simulations of nanotip flattening were run using barriers which were predicted by the neural networks. The simulations were compared to the KMC results obtained with the existing scheme. The additional atoms in the jump environment caused significant changes to the barriers, which cannot be described by the existing model. The trained networks also showed a good prediction accuracy. However, the KMC results were in some cases more realistic or as realistic as the previous results, but often worse. The quality of the results also depended strongly on the selection of training barriers. We suggest that, for example, active learning methods can be used in the future to select the training data optimally.
  • Mukhtar, Usama (Helsingin yliopisto, 2020)
    Sales forecasting is crucial for running any retail business efficiently. Profits are maximized if popular products are available to fulfill the demand. It is also important to minimize the loss caused by unsold stock. Fashion retailers face certain challenges which make sales forecasting difficult for their products. Some of these challenges are the short life cycle of products and the introduction of new products all around the year. The goal of this thesis is to study forecasting methods for fashion. We use the product attributes for products in a season to build a model that can forecast sales for all the products in the next season. Sales for different attributes are analysed for three years. Sales vary across the values of different variables, which indicates that a model fitted on product attributes may be used for forecasting sales. A series of experiments is conducted with multiple variants of the datasets. We implemented multiple machine learning models and compared them against each other. Empirical results are reported along with the baseline comparisons to answer the research questions. Results from the first experiment indicate that the machine learning models perform almost as well as the baseline model that uses mean values as predictions. The results may improve in the upcoming years when more data is available for training. The second experiment shows that models built for specific product groups are better than the generic models that are used to predict sales for all kinds of products. Since we observed a heavy tail in the data, a third experiment was conducted to use logarithmic sales for predictions, and the results do not improve much compared to the results from the previous methods. The conclusion of the thesis is that machine learning methods can be used for attribute-based sales forecasting in the fashion industry, but more data is needed, and modeling specific groups of products brings better results.
  • Jääskeläinen, Matias (Helsingin yliopisto, 2020)
    This thesis explores descriptors for atmospheric molecular clusters. Descriptors are needed for applying machine learning methods to molecular systems. A collection of descriptors is readily available in the DScribe library, developed at Aalto University for custom machine learning applications. The question of which descriptors to use is up to the user to decide. This study takes the first steps in integrating machine learning into the existing procedure of configurational sampling, which aims to find the optimal structure for any given molecular cluster of interest. The structure selection step forms a bottleneck in the configurational sampling procedure. A new structure selection method presented in this study uses k-means clustering to find structures that are similar to each other. The clustering results can be used to discard redundant structures more effectively than before, which leaves fewer structures to be calculated with more expensive computations. Altogether, that speeds up the configurational sampling procedure. To aid the selection of a suitable descriptor for this application, a comparison of four descriptors available in DScribe is made. A procedure for structure selection by representing atmospheric clusters with descriptors and labeling them into groups with k-means was implemented. The performance of the descriptors was compared with a custom score suitable for this application, and it was found that MBTR outperforms the other descriptors. This structure selection method will be utilized in the existing configurational sampling procedure for atmospheric molecular clusters, but it is not restricted to that application.
  • Iqbal, Sumaiya; Perez-Palma, Eduardo; Jespersen, Jakob B.; May, Patrick; Hoksza, David; Heyne, Henrike O.; Ahmed, Shehab S.; Rifat, Zaara T.; Rahman, M. Sohel; Lage, Kasper; Palotie, Aarno; Cottrell, Jeffrey R.; Wagner, Florence F.; Daly, Mark J.; Campbell, Arthur J.; Lal, Dennis (2020)
    Interpretation of the colossal number of genetic variants identified from sequencing applications is one of the major bottlenecks in clinical genetics, with the inference of the effect of amino acid-substituting missense variations on protein structure and function being especially challenging. Here we characterize the three-dimensional (3D) amino acid positions affected in pathogenic and population variants from 1,330 disease-associated genes using over 14,000 experimentally solved human protein structures. By measuring the statistical burden of variations (i.e., point mutations) from all genes on 40 3D protein features, accounting for the structural, chemical, and functional context of the variations' positions, we identify features that are generally associated with pathogenic and population missense variants. We then perform the same amino acid-level analysis individually for 24 protein functional classes, which reveals unique characteristics of the positions of the altered amino acids: We observe up to 46% divergence of the class-specific features from the general characteristics obtained by the analysis on all genes, which is consistent with the structural diversity of essential regions across different protein classes. We demonstrate that the function-specific 3D features of the variants match the readouts of mutagenesis experiments for BRCA1 and PTEN, and positively correlate with an independent set of clinically interpreted pathogenic and benign missense variants. Finally, we make our results available through a web server to foster accessibility and downstream research. Our findings represent a crucial step toward translational genetics, from highlighting the impact of mutations on protein structure to rationalizing the variants' pathogenicity in terms of the perturbed molecular mechanisms.
  • Amadae, S. M. (Faculty of Social Sciences, University of Helsinki, 2020)
    Computational Transformation of the Public Sphere is the organic product of what turned out to be an effective collaboration between MA students and their professor in the Global Politics and Communication program in the Faculty of Social Sciences at the University of Helsinki, in the Fall of 2019. The course, Philosophy of Politics and Communication, is a gateway course into this MA program. As I had been eager to conduct research on the impact of new digital technologies and artificial intelligence (AI) on democratic governance, I saw this course as an opportunity to not only share, but also further develop my knowledge of this topic.
  • An, Yu (Helsingin yliopisto, 2020)
    Maps of science, or cartography of scientific fields, provide insights into the state of scientific knowledge. Analogous to geographical maps, maps of science present the fields as positions and show the paths connecting each other, which can serve as an intuitive illustration for the history of science or a hint to spot potential opportunities for collaboration. In this work, I investigate the reproducibility of a method to generate such maps. The idea of the method is to derive representations for the given scientific fields with topic models and then perform hierarchical clustering on these, which in the end yields a tree of scientific fields as the map. The result is found to be unreproducible, as my result obtained on the arXiv data set (~130k articles from arXiv Computer Science) shows a structure inconsistent with the one in the reference study. To investigate the cause of the inconsistency, I derive a second set of maps using the same method and an adjusted data set, which is constructed by re-sampling the arXiv data set to a more balanced distribution. The findings show that the confounding factors in the data cannot account for the inconsistency; instead, it appears to be due to the stochastic nature of the unsupervised algorithm. I also improve the approach by using ensemble topic models to derive representations. It is found that the method to derive maps of science can be reproducible when it uses an ensemble topic model fused from a sufficient number of base models.
  • Kajava, Kaisla (Helsingin yliopisto, 2018)
    Sentiment analysis is a rapidly developing field of language technology whose aim is to automatically identify subjective features in text produced in natural language. Typically, sentiment analysis classifies text binarily into the classes ‘positive’ or ‘negative’. A multiclass emotion scale is obtained, however, by increasing the number of possible sentiment classes, which brings in more fine-grained emotions such as ‘angry’, ‘happy’ and ‘sad’. Text classification often uses supervised machine learning methods. This requires sufficient training data with which the classification algorithm can be taught to recognise the desired features in text. Because the training data needed for sentiment analysis are mostly in English, data sets in other languages are produced by translating the original data into different languages. It is nevertheless important to evaluate the usability of translated data in training machine learning algorithms. When text is translated from one language to another, the original sentiment information should be preserved unchanged so that the text can be reliably used for training algorithms. If sentiment information is well preserved in the translated text, cross-lingual sentiment data sets can be compiled with transfer learning methods, that is, by projecting the sentiment classes of the original-language sentences onto the translated sentences. This master's thesis evaluates to what extent binary and multiclass sentiment information in natural language remains the same when text is translated from one language to another. The research data consist of parallel sentences in the original language, English, and as translations in Finnish, French and Italian. The preservation of sentiment information is studied by first annotating the English sentences so that the result is both a binary and a multiclass data set in which each sentence has one sentiment class.
After this, the parallel sentences of each translated language are annotated by two separate annotators, which provides a point of comparison for the original English annotations. In addition, the study evaluates the usefulness of transfer learning methods by examining whether machine learning algorithms achieve similar results on translated data sets, compiled by projecting the annotations of the original data onto the translated sentences, as on the original English data sets. Sentiment classification uses naïve Bayes, maximum entropy, multilayer perceptron and support vector machine classifiers. The results show that sentiment information is well preserved when natural-language texts are translated. On this basis, it can be concluded that cross-lingual transfer learning is a sufficiently reliable way to train sentiment analysis algorithms. The classification results, in turn, show that algorithms trained with the transfer learning method achieve reliable results in binary classification, whereas stable multiclass classification requires a larger data set.
  • Davis, Keith III (Helsingin yliopisto, 2020)
    We study the use of data collected via electroencephalography (EEG) to classify stimuli presented to subjects using a variety of mathematical approaches. We report an experiment with three objectives: 1) To train individual classifiers that reliably infer the class labels of visual stimuli using EEG data collected from subjects; 2) To demonstrate brainsourcing, a technique to combine brain responses from a group of human contributors each performing a recognition task to determine classes of stimuli; 3) To explore collaborative filtering techniques applied to data produced by individual classifiers to predict subject responses for stimuli in which data is unavailable or otherwise missing. We reveal that all individual classifier models perform better than a random baseline, while a brainsourcing model using data from as few as four participants achieves performance superior to any individual classifier. We also show that matrix factorization applied to classifier outputs as a collaborative filtering approach achieves predictive results that perform better than random. Although the technique is fairly sensitive to the sparsity of the dataset, it nonetheless demonstrates a viable proof-of-concept and warrants further investigation.
  • Alcantara, Jose Carlos (Helsingin yliopisto, 2020)
    A recent machine learning technique called federated learning (Konecny, McMahan, et al., 2016) offers a new paradigm for distributed learning. It consists of performing machine learning on multiple edge devices and simultaneously optimizing a global model for all of them, without transmitting user data. The goal of this thesis was to demonstrate the benefits of applying federated learning to forecasting telecom key performance indicator (KPI) values from radio network cells. After performing experiments with different data sources' aggregations and comparing against a centralized learning model, the results revealed that a federated model can shorten the training time for modelling new radio cells. Moreover, the amount of data transferred to a central server is reduced drastically while keeping performance equivalent to a traditional centralized model. These experiments were performed with a multi-layer perceptron as the model architecture after comparing its performance against LSTM. Both input and output data were sequences of KPI values.