Browsing by Subject "neural networks"


Now showing items 1-10 of 10
  • Hämäläinen, Kreetta (Helsingin yliopisto, 2021)
    Personalized medicine tailors therapies to the patient based on predicted risk factors. Genetics and metabolomics are among the tools used for making predictions on the safety and efficacy of drugs. This thesis focuses on identifying biomarkers for the activity level of the drug transporter organic anion transporting polypeptide 1B1 (OATP1B1) from data acquired through untargeted metabolite profiling. OATP1B1 is a genetically polymorphic influx transporter expressed in human hepatocytes, where it transports various drugs, such as statins, from portal blood into the hepatocytes. Statins are low-density lipoprotein cholesterol-lowering drugs, and decreased or poor OATP1B1 function has been shown to be associated with statin-induced myopathy. Based on genetic variability, individuals can be classified into those with normal, decreased or poor OATP1B1 function. These activity classes were employed to identify metabolomic biomarkers for OATP1B1. To find the most efficient way to predict the activity level and to find the biomarkers associated with it, five machine learning models were tested on a dataset of 356 fasting blood samples with 9152 metabolite features: a Random Forest regressor and classifier, a Gradient Boosted Decision Tree regressor and classifier, and a Deep Neural Network regressor. Hindrances specific to this type of data were the collinearity between features and the large number of features compared to the number of samples, which led to difficulties in determining the important features of the neural network model. To adjust for this, the features were clustered according to their Spearman's rank-order correlations. Feature importances were calculated using two methods: for the neural network, permutation feature importance based on mean squared error; for the random forest and gradient boosted decision trees, Gini impurity.
The performance of each model was measured, and all classifiers had a poor ability to predict the decreased and poor function classes. All regressors performed very similarly to each other: the gradient boosted decision tree regressor performed best by a slight margin, with the random forest and neural network regressors close behind. The best features from all three models were cross-referenced with the features found by y-aware PCA analysis. Because the y-aware PCA analysis indicated that the 14 best features cover 95% of the explained variance, 14 features were picked from each model and cross-referenced with each other. Cross-referencing the highest-scoring features from the best models revealed several features that ranked as important across multiple models. Taken together, machine learning methods provide powerful tools to identify potential biomarkers from untargeted metabolomics data.
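The permutation-importance step described above can be sketched in a few lines: shuffle one feature at a time and record how much the mean squared error grows. This is an illustrative numpy-only sketch, not the thesis's implementation; the function name and toy model are hypothetical.

```python
import numpy as np

def permutation_importance_mse(predict, X, y, n_repeats=10, seed=0):
    """Importance of feature j = average rise in MSE after permuting
    column j, which breaks that feature's link to the target."""
    rng = np.random.default_rng(seed)
    base_mse = np.mean((predict(X) - y) ** 2)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        rises = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])
            rises.append(np.mean((predict(Xp) - y) ** 2) - base_mse)
        importances[j] = np.mean(rises)
    return importances

# toy check: the "model" uses only feature 0, so only it should matter
X = np.random.default_rng(1).normal(size=(200, 3))
y = 2.0 * X[:, 0]
imp = permutation_importance_mse(lambda Z: 2.0 * Z[:, 0], X, y)
```

With correlated features, as in the thesis's data, whole feature clusters would be permuted together rather than single columns.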
  • Mukhtar, Usama (Helsingin yliopisto, 2020)
    Sales forecasting is crucial for running any retail business efficiently. Profits are maximized if popular products are available to fulfill demand, and it is equally important to minimize the loss caused by unsold stock. Fashion retailers face particular challenges that make sales forecasting difficult, such as the short life cycle of products and the introduction of new products throughout the year. The goal of this thesis is to study forecasting methods for fashion. We use the product attributes for products in a season to build a model that can forecast sales for all the products in the next season. Sales for different attributes are analysed over three years; sales vary across attribute values, which indicates that a model fitted on product attributes may be used for forecasting sales. A series of experiments is conducted with multiple variants of the datasets. We implemented multiple machine learning models and compared them against each other, and report empirical results along with baseline comparisons to answer the research questions. Results from the first experiment indicate that the machine learning models perform almost as well as the baseline model that uses mean values as predictions; the results may improve in the coming years as more data becomes available for training. The second experiment shows that models built for specific product groups outperform generic models that predict sales for all kinds of products. Since we observed a heavy tail in the data, a third experiment used logarithmic sales for predictions, but the results do not improve much compared to the previous methods. The conclusion of the thesis is that machine learning methods can be used for attribute-based sales forecasting in the fashion industry, but more data is needed, and modeling specific groups of products brings better results.
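The logarithmic-sales experiment rests on a standard trick for heavy-tailed targets: fit the model on log1p-transformed sales and invert predictions with expm1. A minimal sketch of that transform (the toy data and the stand-in predictor are hypothetical, not the thesis's models):

```python
import numpy as np

sales = np.array([1, 2, 3, 5, 8, 400], dtype=float)  # heavy right tail

# Fit on the log scale, predict, then map back to units sold.
log_target = np.log1p(sales)
pred_log = log_target.mean()          # stand-in for any fitted regressor
pred_sales = np.expm1(pred_log)       # back-transform to the sales scale
```

The back-transformed prediction is far less sensitive to the tail than the raw mean, which is the usual motivation for the transform; on the thesis's data it did not improve results much.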
  • Agnelli, J. P.; Çöl, A.; Lassas, M.; Murthy, R.; Santacesaria, M.; Siltanen, S. (2020)
    Electrical impedance tomography (EIT) is an emerging non-invasive medical imaging modality. It is based on feeding electrical currents into the patient, measuring the resulting voltages at the skin, and recovering the internal conductivity distribution. The mathematical task of EIT image reconstruction is a nonlinear and ill-posed inverse problem. Therefore, any EIT image reconstruction method needs to be regularized, typically resulting in blurred images. One promising application is stroke-EIT, or classification of stroke into either ischemic or hemorrhagic. Ischemic stroke involves a blood clot, preventing blood flow to a part of the brain and causing a low-conductivity region. Hemorrhagic stroke means bleeding in the brain, causing a high-conductivity region. In both cases the symptoms are identical, so a cost-effective and portable classification device is needed. Typical EIT images are not optimal for stroke-EIT because of blurriness. This paper explores the possibilities of machine learning in improving the classification results. Two paradigms are compared: (a) learning from the EIT data, that is, Dirichlet-to-Neumann maps, and (b) extracting robust features from the data and learning from them. The features of choice are virtual hybrid edge detection (VHED) functions (Greenleaf et al 2018 Anal. PDE 11) that have a geometric interpretation and whose computation from EIT data does not involve calculating a full image of the conductivity. We report the measures of accuracy, sensitivity and specificity of the networks trained with EIT data and VHED functions separately. Computational evidence based on simulated noisy EIT data suggests that the regularized grey-box paradigm (b) leads to significantly better classification results than the black-box paradigm (a).
  • Höglund, Henrik (Svenska handelshögskolan, 2010)
    Economics and Society
    Detecting Earnings Management Using Neural Networks. Trying to balance between relevant and reliable accounting data, generally accepted accounting principles (GAAP) allow, to some extent, the company management to use their judgment and to make subjective assessments when preparing financial statements. The opportunistic use of this discretion in financial reporting is called earnings management. A considerable number of methods have been suggested for detecting accrual-based earnings management. A majority of these methods are based on linear regression. The problem with using linear regression is that a linear relationship between the dependent variable and the independent variables must be assumed. However, previous research has shown that the relationship between accruals and some of the explanatory variables, such as company performance, is non-linear. An alternative to linear regression, which can handle non-linear relationships, is neural networks. The type of neural network used in this study is the feed-forward back-propagation neural network. Three neural network-based models are compared with four commonly used linear regression-based earnings management detection models. All seven models are based on the earnings management detection model presented by Jones (1991). The performance of the models is assessed in three steps. First, a random data set of companies is used. Second, the discretionary accruals from the random data set are ranked according to six different variables. The discretionary accruals in the highest and lowest quartiles for these six variables are then compared. Third, a data set containing simulated earnings management is used. Both expense and revenue manipulation ranging between -5% and 5% of lagged total assets is simulated. Furthermore, two neural network-based models and two linear regression-based models are used with a data set containing financial statement data from 110 failed companies.
Overall, the results show that the linear regression-based models, except for the model using a piecewise linear approach, produce biased estimates of discretionary accruals. The neural network-based model with the original Jones model variables and the neural network-based model augmented with ROA as an independent variable, however, perform well in all three steps. Especially in the second step, where the highest and lowest quartiles of ranked discretionary accruals are examined, the neural network-based model augmented with ROA as an independent variable outperforms the other models.
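All seven models build on the Jones (1991) regression, in which total accruals (scaled by lagged total assets) are regressed on the inverse of lagged assets, the change in revenues, and property, plant and equipment; the residual is taken as the discretionary-accruals estimate. A minimal numpy sketch of that linear baseline on simulated data (the coefficients and noise level are hypothetical):

```python
import numpy as np

# Jones (1991): TA_t/A_{t-1} = a*(1/A_{t-1}) + b*(dREV_t/A_{t-1})
#                            + c*(PPE_t/A_{t-1}) + e_t
# The residual e_t is the discretionary-accruals estimate.
rng = np.random.default_rng(0)
n = 120
inv_assets = 1.0 / rng.uniform(50.0, 500.0, n)   # 1 / lagged total assets
d_rev = rng.normal(0.1, 0.05, n)                 # scaled revenue change
ppe = rng.uniform(0.2, 0.8, n)                   # scaled gross PPE
accruals = (0.5 * inv_assets + 0.3 * d_rev - 0.1 * ppe
            + rng.normal(0.0, 0.01, n))          # simulated total accruals

X = np.column_stack([inv_assets, d_rev, ppe])
coef, *_ = np.linalg.lstsq(X, accruals, rcond=None)
discretionary = accruals - X @ coef   # residuals = discretionary accruals
```

The neural network variants in the study replace this linear map with a feed-forward network so that non-linear relationships, e.g. with company performance, can be captured.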
  • Duong, Quoc Quan (Helsingin yliopisto, 2021)
    Discourse dynamics is one of the important fields in digital humanities research. Over time, the perspectives and concerns of society on particular topics or events may change. Based on changes in the popularity of a certain theme, different patterns form, increasing or decreasing the prominence of the theme in the news. Tracking these changes is a challenging task: in a large text collection, discourse themes are intertwined and uncategorized, which makes it hard to analyse them manually. The thesis tackles a novel task of automatic extraction of discourse trends from large text corpora. The main motivation for this work lies in the need in digital humanities to track discourse dynamics in diachronic corpora. Machine learning is a potential method to automate this task by learning patterns from the data. However, in many real use cases ground truth is not available, and annotating discourses on a corpus level is incredibly difficult and time-consuming. This study proposes a novel procedure to generate synthetic datasets for this task, a quantitative evaluation method, and a set of benchmarking models. Large-scale experiments are run using these synthetic datasets. The thesis demonstrates that a neural network model trained on such datasets can obtain meaningful results when applied to a real dataset, without any adjustments of the model.
  • Silva, Milton; Pratas, Diogo; Pinho, Armando J. (2020)
    Background: The increasing production of genomic data has led to an intensified need for models that can cope efficiently with the lossless compression of DNA sequences. Important applications include long-term storage and compression-based data analysis. In the literature, only a few recent articles propose the use of neural networks for DNA sequence compression. However, they fall short when compared with specific DNA compression tools, such as GeCo2. This limitation is due to the absence of models specifically designed for DNA sequences. In this work, we combine the power of neural networks with specific DNA models. For this purpose, we created GeCo3, a new genomic sequence compressor that uses neural networks for mixing multiple context and substitution-tolerant context models. Findings: We benchmark GeCo3 as a reference-free DNA compressor on 5 datasets, including a balanced and comprehensive dataset of DNA sequences, the Y-chromosome and human mitogenome, 2 compilations of archaeal and virus genomes, 4 whole genomes, and 2 collections of FASTQ data of a human virome and ancient DNA. GeCo3 achieves a solid improvement in compression over the previous version (GeCo2): 2.4%, 7.1%, 6.1%, 5.8%, and 6.0%, respectively. To test its performance as a reference-based DNA compressor, we benchmark GeCo3 on 4 datasets consisting of the pairwise compression of the chromosomes of the genomes of several primates. GeCo3 improves the compression by 12.4%, 11.7%, 10.8%, and 10.1% over the state of the art. The cost of this compression improvement is some additional computational time (1.7-3 times slower than GeCo2). The RAM use is constant, and the tool scales efficiently, independently of the sequence size. Overall, these values outperform the state of the art. Conclusions: GeCo3 is a genomic sequence compressor with a neural network mixing approach that provides additional gains over top specific genomic compressors.
The proposed mixing method is portable, requiring only the probabilities of the models as inputs, providing easy adaptation to other data compressors or compression-based data analysis tools. GeCo3 is released under GPLv3 and is available for free download at
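The mixing step takes only the per-symbol probabilities of the individual context models as inputs and combines them into a single prediction. A minimal sketch of such a convex mixture with softmax weights (in GeCo3 the weights come from a trained network; here they are fixed and hypothetical):

```python
import numpy as np

def mix(probs, weights):
    """Convex combination of per-symbol probability vectors from several
    context models, weighted by a softmax over the mixing weights."""
    w = np.exp(weights - np.max(weights))   # stable softmax
    w /= w.sum()
    return w @ probs        # result stays a valid probability distribution

probs = np.array([[0.70, 0.10, 0.10, 0.10],     # model 1: P(A,C,G,T)
                  [0.25, 0.25, 0.25, 0.25]])    # model 2: uniform
mixed = mix(probs, np.array([2.0, 0.0]))        # favours model 1
```

Because only model probabilities are consumed, the same mixer can sit on top of any set of predictors, which is what makes the method portable to other compressors.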
  • Barin Pacela, Vitória (Helsingin yliopisto, 2021)
    Independent Component Analysis (ICA) aims to separate the observed signals into their underlying independent components responsible for generating the observations. Most research in ICA has focused on continuous signals, while the methodology for binary and discrete signals is less developed. Yet, binary observations are equally present in various fields and applications, such as causal discovery, signal processing, and bioinformatics. In the last decade, Boolean OR and XOR mixtures have been shown to be identifiable by ICA, but such models suffer from limited expressivity, calling for new methods to solve the problem. In this thesis, "Independent Component Analysis for Binary Data", we estimate the mixing matrix of ICA from binary observations and an additionally observed auxiliary variable by employing a linear model inspired by the Identifiable Variational Autoencoder (iVAE), which exploits the non-stationarity of the data. The model is optimized with a gradient-based algorithm that uses second-order optimization with limited memory, resulting in a training time on the order of seconds for the particular cases studied. We investigate which conditions can lead to the reconstruction of the mixing matrix, concluding that the method is able to identify the mixing matrix when the number of observed variables is greater than the number of sources. In such cases, the linear binary iVAE can reconstruct the mixing matrix up to order and scale indeterminacies, which are considered in the evaluation with the Mean Cosine Similarity Score. Furthermore, the model can reconstruct the mixing matrix even under a limited sample size. Therefore, this work demonstrates the potential for applications in real-world data and also offers a possibility to study and formalize identifiability in future work.
In summary, the most important contributions of this thesis are the empirical study of the conditions that enable the mixing matrix reconstruction using the binary iVAE, and the empirical results on the performance and efficiency of the model. The latter was achieved through a new combination of existing methods, including modifications and simplifications of a linear binary iVAE model and the optimization of such a model under limited computational resources.
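An evaluation score of this kind must tolerate the order and scale (sign) indeterminacies of ICA. A minimal sketch of such a score, matching estimated columns to true ones over all permutations and taking absolute cosines to absorb sign flips (the function name is hypothetical, and the thesis's exact scoring may differ):

```python
import numpy as np
from itertools import permutations

def mean_cosine_similarity(A_true, A_est):
    """Score A_est against A_true up to column order and sign:
    best mean |cosine| over all matchings of estimated to true columns."""
    T = A_true / np.linalg.norm(A_true, axis=0)
    E = A_est / np.linalg.norm(A_est, axis=0)
    sims = np.abs(T.T @ E)              # |cos| absorbs sign/scale flips
    k = sims.shape[0]
    return max(np.mean([sims[i, p[i]] for i in range(k)])
               for p in permutations(range(k)))

A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
# columns swapped and one sign flipped: still a perfect recovery
A_hat = np.column_stack([-A[:, 1], A[:, 0]])
score = mean_cosine_similarity(A, A_hat)
```

The brute-force search over permutations is fine for the small source counts considered here; larger problems would use an assignment solver instead.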
  • Luotamo, Markku Ilkka Juhana; Metsämäki, Sari; Klami, Arto (2021)
    Semantic segmentation by convolutional neural networks (CNN) has advanced the state of the art in pixel-level classification of remote sensing images. However, processing large images typically requires analyzing the image in small patches, and hence, features that have a large spatial extent still cause challenges in tasks such as cloud masking. To support a wider scale of spatial features while simultaneously reducing computational requirements for large satellite images, we propose an architecture of two cascaded CNN model components successively processing undersampled and full-resolution images. The first component distinguishes between patches in the inner cloud area and patches at the cloud's boundary region. For the cloud-ambiguous edge patches requiring further segmentation, the framework then delegates computation to a fine-grained model component. We apply the architecture to a cloud detection data set of complete Sentinel-2 multispectral images, approximately annotated for minimal false negatives in a land-use application. On this specific task and data, we achieve a 16% relative improvement in pixel accuracy over a CNN baseline based on patching.
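The cascade's control flow can be sketched as follows: a coarse model scores each undersampled patch, confidently clear or confidently cloudy patches are masked directly, and only ambiguous edge patches are delegated to the fine-grained model. All names and thresholds below are hypothetical stand-ins, not the authors' implementation:

```python
import numpy as np

def cascade_mask(patches_lowres, patches_full, coarse, fine, thr=(0.1, 0.9)):
    """Stage 1 scores cheap low-resolution patches; only ambiguous
    (cloud-edge) patches reach the expensive full-resolution model."""
    masks = []
    for lo, full in zip(patches_lowres, patches_full):
        p = coarse(lo)                                # cloud probability
        if p <= thr[0]:
            masks.append(np.zeros(full.shape[:2]))    # confidently clear
        elif p >= thr[1]:
            masks.append(np.ones(full.shape[:2]))     # inner cloud area
        else:
            masks.append(fine(full))                  # fine segmentation
    return masks

# hypothetical stand-ins for the two CNN components
coarse = lambda lo: float(lo.mean())
fine = lambda full: np.full(full.shape[:2], 0.5)
masks = cascade_mask([np.zeros((4, 4)), np.full((4, 4), 0.5)],
                     [np.zeros((8, 8)), np.full((8, 8), 0.5)],
                     coarse, fine)
```

The saving comes from skipping the fine model on the (typically many) patches far from any cloud boundary.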
  • Enwald, Joel (Helsingin yliopisto, 2020)
    Mammography is used as an early detection system for breast cancer, which is one of the most common types of cancer, regardless of one's sex. Mammography uses specialised X-ray machines to look into the breast tissue for possible tumours. Due to the machine's set-up, as well as to reduce the radiation patients are exposed to, the number of X-ray measurements collected is very restricted. Reconstructing the tissue from this limited information is referred to as limited angle tomography. This is a complex mathematical problem that ordinarily leads to poor reconstruction results. The aim of this work is to investigate how well a neural network whose structure utilizes pre-existing models and the known geometry of the problem performs at this task. In this preliminary work, we demonstrate the results on simulated two-dimensional phantoms and discuss the extension of the results to three-dimensional patient data.
  • Elmnäinen, Johannes (Helsingin yliopisto, 2020)
    The Finnish Environment Institute (SYKE) has at least two missions which require surveying large land areas: finding invasive alien species and monitoring the state of Finnish lakes. Various methods to accomplish these tasks exist, but they traditionally rely on manual labor by experts or citizen activism, and as such do not scale well. This thesis explores the use of computer vision to dramatically improve the scaling of these tasks. Specifically, the aim is to fly a drone over selected areas and use a convolutional neural network architecture (U-net) to create segmentations of the images. The method performs well on select biomass estimation task classes due to large enough datasets and easy-to-distinguish core features of the classes. Furthermore, a qualitative study of datasets was performed, yielding an estimate for a lower bound on the number of examples needed for a useful dataset. ACM Computing Classification System (CCS): CCS → Computing methodologies → Machine learning → Machine learning approaches → Neural networks