Browsing by Subject "Data science"


Now showing items 1-6 of 6
  • Stocker, M.; Paasonen, P.; Fiebig, M.; Zaidan, M.A.; Hardisty, A. (2018)
    Interpreting observational data is a fundamental task in the sciences, specifically in earth and environmental science where observational data are increasingly acquired, curated, and published systematically by environmental research infrastructures. Typically subject to substantial processing, observational data are used by research communities, their research groups and individual scientists, who interpret such primary data for their meaning in the context of research investigations. The result of interpretation is information (meaningful secondary or derived data) about the observed environment. Research infrastructures and research communities are thus essential to evolving uninterpreted observational data to information. In digital form, the classical bearers of information are the commonly known “(elaborated) data products,” for instance maps. In such form, meaning is generally implicit, e.g., in map colour coding, and thus largely inaccessible to machines. The systematic acquisition, curation, possible publishing and further processing of information gained in observational data interpretation, as machine readable data and their machine readable meaning, is not common practice among environmental research infrastructures. For a use case in aerosol science, we elucidate these problems and present a Jupyter based prototype infrastructure that exploits a machine learning approach to interpretation and could support a research community in interpreting observational data and, more importantly, in curating and further using resulting information about a studied natural phenomenon. © 2018 The Author(s).
  • CENTER-TBI Collaborators; Gravesteijn, Benjamin Y.; Nieboer, Daan; Ercole, Ari; Palotie, Aarno; Piippo-Karjalainen, Anna; Pirinen, Matti; Posti, Jussi P.; Raj, Rahul; Ripatti, Samuli; Tenovuo, Olli; Takala, Riikka (2020)
    Objective: We aimed to explore the added value of common machine learning (ML) algorithms for prediction of outcome for moderate and severe traumatic brain injury. Study Design and Setting: We performed logistic regression (LR), lasso regression, and ridge regression with key baseline predictors in the IMPACT-II database (15 studies, n = 11,022). ML algorithms included support vector machines, random forests, gradient boosting machines, and artificial neural networks and were trained using the same predictors. To assess generalizability of predictions, we performed internal, internal-external, and external validation on the recent CENTER-TBI study (patients with Glasgow Coma Scale Results: In the IMPACT-II database, 3,332/11,022 (30%) died and 5,233 (48%) had unfavorable outcome (Glasgow Outcome Scale less than 4). In the CENTER-TBI study, 348/1,554 (29%) died and 651 (54%) had unfavorable outcome. Discrimination and calibration varied widely between the studies and less so between the studied algorithms. The mean area under the curve was 0.82 for mortality and 0.77 for unfavorable outcomes in the CENTER-TBI study. Conclusion: ML algorithms may not outperform traditional regression approaches in a low-dimensional setting for outcome prediction after moderate or severe traumatic brain injury. Similar to regression-based prediction models, ML algorithms should be rigorously validated to ensure applicability to new populations. (C) 2020 The Authors. Published by Elsevier Inc.
  • Sipilä, Reetta; Kalso, Eija; Lötsch, Jörn (2020)
    Background: Persistent pain in breast cancer survivors is common. Psychological and sleep-related factors modulate perception, interpretation and coping with pain and may contribute to the clinical phenotype. The present analysis pursued the hypothesis that breast cancer survivors form subgroups, based on psychological and sleep-related parameters that are relevant to the impact of pain on the patients' lives. Methods: We analysed 337 women treated for breast cancer, in whom psychological and sleep-related parameters as well as parameters related to pain intensity and interference had been acquired. Data were analysed by using supervised and unsupervised machine-learning techniques (i) to detect patient subgroups based on the pattern of psychological or sleep-related parameters, (ii) to interpret the detected cluster structure and (iii) to relate this data structure to pain interference and impact on life. Results: Artificial intelligence-based detection of data structure, implemented as self-organizing neuronal maps, identified two different clusters of patients. A smaller cluster (11.5% of the patients) had comparatively lower resilience, more depressive symptoms and lower extraversion than the other patients. In these patients, life-satisfaction, mood, and life in general were comparatively more impeded by persistent pain. Conclusions: The results support the initial hypothesis that psychological and sleep-related parameter patterns are meaningful for subgrouping patients with respect to how persistent pain after breast cancer treatments interferes with their lives. This indicates that management of pain should address more complex features than just pain intensity. Artificial intelligence is a useful tool in the identification of subgroups of patients based on psychological factors. (C) 2020 The Authors. Published by Elsevier Ltd.
  • Loetsch, Joern; Sipilä, Reetta; Tasmuth, Tiina; Kringel, Dario; Estlander, Ann-Mari; Meretoja, Tuomo; Kalso, Eija; Ultsch, Alfred (2018)
    Background: Prevention of persistent pain following breast cancer surgery, via early identification of patients at high risk, is a clinical need. Supervised machine-learning was used to identify parameters that predict persistence of significant pain. Methods: Over 500 demographic, clinical and psychological parameters were acquired up to 6 months after surgery from 1,000 women (aged 28-75 years) who were treated for breast cancer. Pain was assessed using an 11-point numerical rating scale before surgery and at months 1, 6, 12, 24, and 36. The ratings at months 12, 24, and 36 were used to allocate patients to either "persisting pain" or "non-persisting pain" groups. Unsupervised machine learning was applied to map the parameters to these diagnoses. Results: A symbolic rule-based classifier tool was created that comprised 21 single or aggregated parameters, including demographic features, psychological and pain-related parameters, forming a questionnaire with "yes/no" items (decision rules). If at least 10 of the 21 rules applied, persisting pain was predicted at a cross-validated accuracy of 86% and a negative predictive value of approximately 95%. Conclusions: The present machine-learned analysis showed that, even with a large set of parameters acquired from a large cohort, early identification of these patients is only partly successful. This indicates that more parameters are needed for accurate prediction of persisting pain. However, with the current parameters it is possible, with a certainty of almost 95%, to exclude the possibility of persistent pain developing in a woman being treated for breast cancer.
  • Toivonen, Hannu; Boggia, Michele; Mind and Matter; Department of Computer Science; Helsinki Institute for Information Technology; Discovery Research Group/Prof. Hannu Toivonen; Language Technology (The Association for Computational Linguistics, 2021)
  • Ijäs, Timo (Helsingin yliopisto, 2021)
    The topic of this thesis is spatial analytics in competitive gaming and e-sports. The way in which players analyze spatial aspects of gameplay has not been well documented. I study how game, genre and skill level affect the use of spatial analysis in competitive gaming. My aim is also to identify the benefits and challenges of spatial analytics, as well as the need for new spatial analytical tools. Four games of different popular competitive gaming genres were chosen for the study. An online survey was conducted which resulted in a cross-sectional dataset of 2453 responses. It was analyzed using ordinal logistic regression and histogram-based gradient boosting in a cross-validating manner. Open-field answers were summarized using state-of-the-art deep learning methods and analyzed with inductive content analysis. Additionally, experts of each game were interviewed. The results show that the use and understanding of spatial analysis is largely not game- or genre-dependent. Players grow spatial skills along with their skill level and start using more complex spatial analytical methods more frequently as their skill level rises. It is exceedingly rare that expert players do not analyze spatial aspects of their gameplay. There is a need for different kinds of spatial analytics tools in all competitive games, and the benefits of advanced tools to a player and the community can be large. However, the tools need to be highly contextualized, fine-tuned for each game specifically, and tailored to the players’ needs. Creating new tools for spatial analytics would benefit competitive gaming as a whole. The inclusion of more detailed spatial analytical tools can lead to a new era of competitive gaming. E-sports is a rapidly growing phenomenon, and the analytics that support its growth should follow.