Browsing by Subject "data science"

  • Kringel, Dario; Malkusch, Sebastian; Kalso, Eija; Lötsch, Jörn (2021)
    The genetic background of pain is becoming increasingly well understood, which opens up possibilities for predicting the individual risk of persistent pain and for using tailored therapies adapted to the variant pattern of the patient's pain-relevant genes. The individual variant pattern of pain-relevant genes is accessible via next-generation sequencing, although the analysis of all "pain genes" would be expensive. Here, we report on the development of a cost-effective next-generation sequencing-based pain-genotyping assay comprising the development of a customized AmpliSeq™ panel and bioinformatics approaches that condense the genetic information of pain by identifying the most representative genes. The panel includes 29 key genes that have been shown to cover 70% of the biological functions exerted by a list of 540 so-called "pain genes" derived from transgenic mice experiments. These were supplemented by 43 additional genes that had been independently proposed as relevant for persistent pain. The functional genomics covered by the resulting 72 genes is particularly represented by mitogen-activated protein kinase of extracellular signal-regulated kinase and cytokine production and secretion. The present genotyping assay was established in 61 subjects of Caucasian ethnicity and investigates the functional role of the selected genes in the context of the known genetic architecture of pain without seeking functional associations for pain. The assay identified a total of 691 genetic variants, many of which have been reported as clinically relevant for pain or in another context. The assay is applicable for small to large-scale experimental setups at contemporary genotyping costs.
  • Kringel, Dario; Kaunisto, Mari A.; Lippmann, Catharina; Kalso, Eija; Lötsch, Jörn (2018)
    Background: Many gene variants modulate the individual perception of pain and possibly also its persistence. The limited selection of single functional variants is increasingly being replaced by analyses of the full coding and regulatory sequences of pain-relevant genes accessible by means of next-generation sequencing (NGS). Methods: An NGS panel was created for a set of 77 human genes selected following different lines of evidence supporting their role in persisting pain. To address the role of these candidate genes, we established a sequencing assay based on a custom AmpliSeq™ panel to assess the exomic sequences in 72 subjects of Caucasian ethnicity. To identify the systems biology of the genes, the biological functions associated with these genes were assessed by means of a computational over-representation analysis. Results: Sequencing generated a median of 2.85 × 10^6 reads per run with a mean depth close to 200 reads, a mean read length of 205 called bases and an average chip loading of 71%. A total of 3,185 genetic variants were called. A computational functional genomics analysis indicated that the proposed NGS gene panel covers biological processes identified previously as characterizing the functional genomics of persisting pain. Conclusion: Results of the NGS assay suggested that the produced nucleotide sequences are comparable to those obtained with the classical Sanger sequencing technique. The assay is applicable for small to large-scale experimental setups to target the accessing of information about any nucleotide within the addressed genes in a study cohort.
  • Lötsch, J.; Sipilä, R.; Dimova, Rozita; Kalso, E. (2018)
    Background: Prevention of persistent pain after breast cancer surgery, via early identification of patients at high risk, is a clinical need. Psychological factors are among the most consistently proposed predictive parameters for the development of persistent pain. However, repeated use of long psychological questionnaires in this context may be exhausting for a patient and inconvenient in everyday clinical practice. Methods: Supervised machine learning was used to create a short form of questionnaires that would provide the same predictive performance of pain persistence as the full questionnaires in a cohort of 1000 women followed up for 3 years after breast cancer surgery. Machine-learned predictors were first trained with the full-item set of Beck's Depression Inventory (BDI), Spielberger's State-Trait Anxiety Inventory (STAI), and the State-Trait Anger Expression Inventory (STAXI-2). Subsequently, features were selected from the questionnaires to create predictors having a reduced set of items. Results: A combined seven-item set, comprising 10% of the original psychological questions from STAI and BDI, provided the same predictive performance parameters as the full questionnaires for the development of persistent postsurgical pain. The seven-item version offers a shorter and at least as accurate identification of women in whom pain persistence is unlikely (almost 95% negative predictive value). Conclusions: Using a data-driven machine-learning approach, a short list of seven items from BDI and STAI is proposed as a basis for a predictive tool for the persistence of pain after breast cancer surgery.
  • Lötsch, Jörn; Mustonen, Laura; Harno, Hanna; Kalso, Eija (2022)
    Background: Persistent postsurgical neuropathic pain (PPSNP) can occur after intraoperative damage to somatosensory nerves, with a prevalence of 29-57% in breast cancer surgery. Proteomics is an active research field in neuropathic pain and the first results support its utility for establishing diagnoses or finding therapy strategies. Methods: 57 women (30 non-PPSNP/27 PPSNP) who had experienced a surgeon-verified intercostobrachial nerve injury during breast cancer surgery were examined for patterns in 74 serum proteomic markers that allowed discrimination between subgroups with or without PPSNP. Serum samples were obtained both before and after surgery. Results: Unsupervised data analyses, including principal component analysis and self-organizing maps of artificial neurons, revealed patterns that supported a data structure consistent with pain-related subgroup (non-PPSNP vs. PPSNP) separation. Subsequent supervised machine learning-based analyses revealed 19 proteins (CD244, SIRT2, CCL28, CXCL9, CCL20, CCL3, IL.10RA, MCP.1, TRAIL, CCL25, IL10, uPA, CCL4, DNER, STAMPB, CCL23, CST5, CCL11, FGF.23) that were informative for subgroup separation. In cross-validated training and testing of six different machine-learned algorithms, subgroup assignment was significantly better than chance, whereas this was not possible when training the algorithms with randomly permuted data or with the protein markers not selected. In particular, sirtuin 2 emerged as a key protein, presenting both before and after breast cancer treatments in the PPSNP compared with the non-PPSNP subgroup. Conclusions: The identified proteins play important roles in immune processes such as cell migration, chemotaxis, and cytokine signaling. They also have considerable overlap with currently known targets of approved or investigational drugs. Taken together, several lines of unsupervised and supervised analyses pointed to structures in serum proteomics data, obtained before and after breast cancer surgery, that relate to neuroinflammatory processes associated with the development of neuropathic pain after an intraoperative nerve lesion.
  • Zhou, Fang; Qu, Qiang; Toivonen, Hannu (2017)
    Networks often contain implicit structure. We introduce novel problems and methods that look for structure in networks, by grouping nodes into supernodes and edges into superedges, and then make this structure visible to the user in a smaller generalised network. This task of finding generalisations of nodes and edges is formulated as 'network summarisation'. We propose models and algorithms for networks that have weights on edges, on nodes or on both, and study three new variants of the network summarisation problem. In edge-based weighted network summarisation, the summarised network should preserve edge weights as well as possible. A wider class of settings is considered in path-based weighted network summarisation, where the resulting summarised network should preserve longer range connectivities between nodes. Node-based weighted network summarisation in turn allows weights also on nodes and summarisation aims to preserve more information related to high weight nodes. We study theoretical properties of these problems and show them to be NP-hard. We propose a range of heuristic generalisation algorithms with different trade-offs between complexity and quality of the result. Comprehensive experiments on real data show that weighted networks can be summarised efficiently with relatively little error.
  • Laaksonen, Salla-Maaria; Haapoja, Jesse Juhani; Kinnunen, Teemu; Nelimarkka, Matti; Pöyhtäri, Reeta (2020)
    Hate speech has been identified as a pressing problem in society and several automated approaches have been designed to detect and prevent it. This paper reports and reflects upon an action research setting consisting of multi-organizational collaboration conducted during Finnish municipal elections in 2017, wherein a technical infrastructure was designed to automatically monitor candidates' social media updates for hate speech. The setting allowed us to engage in a 2-fold investigation. First, the collaboration offered a unique view for exploring how hate speech emerges as a technical problem. The project developed an adequately well-working algorithmic solution using supervised machine learning. We tested the performance of various feature extraction and machine learning methods and ended up using a combination of Bag-of-Words feature extraction with Support-Vector Machines. However, an automated approach required heavy simplification, such as using rudimentary scales for classifying hate speech and a reliance on word-based approaches, while in reality hate speech is a linguistic and social phenomenon with various tones and forms. Second, the action-research-oriented setting allowed us to observe affective responses, such as the hopes, dreams, and fears related to machine learning technology. Based on participatory observations, project artifacts and documents, interviews with project participants, and online reactions to the detection project, we identified participants' aspirations for effective automation as well as the level of neutrality and objectivity introduced by an algorithmic system. However, the participants expressed more critical views toward the system after the monitoring process. Our findings highlight how the powerful expectations related to technology can easily end up dominating a project dealing with a contested, topical social issue. We conclude by discussing the problematic aspects of datafying hate and suggesting some practical implications for hate speech recognition.
  • Harju, Anu A.; Huhtamäki, Jukka (2021)
    In March 2019, the first ever act of terrorist violence in New Zealand was live-streamed on social media, making many social media users unwitting witnesses to the massacre on their devices. The Christchurch mosque attacks revealed a particular digital and emotional vulnerability embedded in the digital media infrastructure. The last words of the first victim soon transmorphed into #hellobrother that, as a digital artefact, participated in shaping the emotional landscape. Combining real-time digital media ethnography on Twitter with data science and computational tools, this multi-method study has two aims: first and foremost, to develop and apply new methodology for the study of unexpected, mediated events as they unfold in real time; second, to explore post-death digital artefacts through the concept of digital afterlife that we approach through two complementary perspectives, data afterlife (the technological) and data as afterlife (the emotional). Adopting a relational perspective, we further develop the concept, and highlight the constitutive role of data in the emotional dimension of digital afterlife arising from its capacity to enter affective arrangements. The methodological contributions include development of a conceptual and technological framework for conducting data science as ethnography and the introduction of Tweetboard, a novel artefact for investigating digital afterlife.
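
Illustrative note on the Zhou, Qu and Toivonen (2017) entry above: the abstract describes grouping nodes into supernodes so that the summarised network preserves edge weights as well as possible. The following is a minimal sketch of that idea, not the authors' algorithm. It assumes (these choices are not stated in the abstract) that a superedge weight is the mean of the original weights over all node pairs it covers, with missing edges counted as weight 0, and that the summarisation error is the Euclidean distance between the original and reconstructed weights. The function names `summarise` and `reconstruction_error` and the toy graph are hypothetical.

```python
from itertools import combinations, combinations_with_replacement
from math import sqrt


def summarise(weights, groups):
    """Compute superedge weights for a given grouping of nodes.

    weights: dict mapping frozenset({u, v}) -> weight of an undirected edge.
    groups:  list of disjoint node sets (the supernodes).
    Returns a dict mapping (i, j) with i <= j to the mean weight over all
    node pairs covered by that superedge (assumption: missing edges count as 0).
    """
    super_w = {}
    for i, j in combinations_with_replacement(range(len(groups)), 2):
        if i == j:
            # Pairs of distinct nodes inside one supernode.
            pairs = list(combinations(sorted(groups[i]), 2))
        else:
            # All pairs with one node in each supernode.
            pairs = [(u, v) for u in groups[i] for v in groups[j]]
        if pairs:
            total = sum(weights.get(frozenset(p), 0.0) for p in pairs)
            super_w[(i, j)] = total / len(pairs)
    return super_w


def reconstruction_error(weights, groups, super_w):
    """Euclidean distance between original and summarised pair weights."""
    node_group = {n: i for i, g in enumerate(groups) for n in g}
    total = 0.0
    for u, v in combinations(sorted(node_group), 2):
        i, j = sorted((node_group[u], node_group[v]))
        diff = weights.get(frozenset((u, v)), 0.0) - super_w[(i, j)]
        total += diff * diff
    return sqrt(total)


# Hypothetical toy graph: four nodes, four weighted edges.
toy_weights = {
    frozenset(("a", "c")): 1.0,
    frozenset(("a", "d")): 1.0,
    frozenset(("b", "c")): 1.0,
    frozenset(("b", "d")): 0.5,
}
toy_groups = [{"a", "b"}, {"c", "d"}]

toy_super = summarise(toy_weights, toy_groups)
toy_error = reconstruction_error(toy_weights, toy_groups, toy_super)
print(toy_super)
print(toy_error)
```

A heuristic of the kind the abstract mentions could, under these assumptions, repeatedly try merging pairs of supernodes and keep the merge that increases `reconstruction_error` the least; that search loop is left out here to keep the sketch short.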