Browsing by Subject "data integration"

Now showing items 1-13 of 13
  • Ruiz-Benito, Paloma; Vacchiano, Giorgio; Lines, Emily R.; Reyer, Christopher P.O.; Ratcliffe, Sophia; Morin, Xavier; Hartig, Florian; Mäkelä, Annikki; Yousefpour, Rasoul; Chaves, Jimena E.; Palacios-Orueta, Alicia; Benito-Garzón, Marta; Morales-Molino, Cesar; Camarero, J. Julio; Jump, Alistair S.; Kattge, Jens; Lehtonen, Aleksi; Ibrom, Andreas; Owen, Harry J.F.; Zavala, Miguel A. (2020)
    Climate change is expected to cause major changes in forest ecosystems during the 21st century and beyond. To assess forest impacts from climate change, the existing empirical information must be structured, harmonised and assimilated into a form suitable to develop and test state-of-the-art forest and ecosystem models. The combination of empirical data collected at large spatial and long temporal scales with suitable modelling approaches is key to understanding forest dynamics under climate change. To facilitate data and model integration, we identified major climate change impacts observed on European forest functioning and summarised the data available for monitoring and predicting such impacts. Our analysis of c. 120 forest-related databases (including information from remote sensing, vegetation inventories, dendroecology, palaeoecology, eddy-flux sites, common garden experiments and genetic techniques) and 50 databases of environmental drivers highlights a substantial degree of data availability and accessibility. However, some critical variables relevant to predicting European forest responses to climate change are only available at relatively short time frames (up to 10-20 years), including intra-specific trait variability, defoliation patterns, tree mortality and recruitment. Moreover, we identified data gaps or lack of data integration particularly in variables related to local adaptation and phenotypic plasticity, dispersal capabilities and physiological responses. Overall, we conclude that forest data availability across Europe is improving, but further efforts are needed to integrate, harmonise and interpret these data (i.e. making data useable for non-experts). Continuation of existing monitoring and network schemes, together with the establishment of new networks to address data gaps, is crucial to rigorously predict climate change impacts on European forests.
  • Hietala, Reija; Ijäs, Asko; Pikner, Tarmo; Kull, Anne; Printsmann, Anu; Kuusik, Maila; Fagerholm, Nora; Vihervaara, Petteri; Nordström, Paulina; Kostamo, Kirsi (Springer Nature, 2021)
    Journal of Coastal Conservation 25 (2021), 47
    The Maritime Spatial Planning (MSP) Directive was ratified (2014/89/EU) alongside the Strategy of the European Union (EU) on the Blue Economy to contribute to the effective management of maritime activities and resources and incorporate the principal elements of Integrated Coastal Zone Management (ICZM) (2002/413/EC) into planning at the land-sea interface. There is a need to develop the ICZM approach throughout Europe to realise the potential for both socio-economic and environmental targets set by the EU and national legislation. In this study, we co-developed different approaches for land-sea interactions in four case areas in Estonia and Finland based on the defined characteristics and key interests derived from local or regional challenges by integrating spatial data on human activities and ecology. Furthermore, four ICZM drafts were co-evaluated by stakeholders and the public using online map-based assessment tools (public participatory GIS). The ICZM approaches of the Estonian cases ranged from the diversification of land use to the enhancement of community-based entrepreneurship. The Finnish cases aimed to define the trends for sustainable marine and coastal tourism and introduce the ecosystem service concept in land use planning. During the project activities, we found that increased communication and exchange of local and regional views and values on the prevailing land-sea interactions were important for the entire process. Thereafter, the ICZM plans were applied to the MSP processes nationally, and they support the sustainable development of coastal areas in Estonia and Finland.
  • Spjuth, Ola; Karlsson, Andreas; Clements, Mark; Humphreys, Keith; Ivansson, Emma; Dowling, Jim; Eklund, Martin; Jauhiainen, Alexandra; Czene, Kamila; Gronberg, Henrik; Sparen, Par; Wiklund, Fredrik; Cheddad, Abbas; Palsdottir, Porgerodur; Rantalainen, Mattias; Abrahamsson, Linda; Laure, Erwin; Litton, Jan-Eric; Palmgren, Juni (2017)
    Objective: We provide an e-Science perspective on the workflow from risk factor discovery and classification of disease to evaluation of personalized intervention programs. As case studies, we use personalized prostate and breast cancer screenings. Materials and Methods: We describe an e-Science initiative in Sweden, e-Science for Cancer Prevention and Control (eCPC), which supports biomarker discovery and offers decision support for personalized intervention strategies. The generic eCPC contribution is a workflow with 4 nodes applied iteratively, and the concept of e-Science signifies systematic use of tools from the mathematical, statistical, data, and computer sciences. Results: The eCPC workflow is illustrated through 2 case studies. For prostate cancer, an in-house personalized screening tool, the Stockholm-3 model (S3M), is presented as an alternative to prostate-specific antigen testing alone. S3M is evaluated in a trial setting and plans for rollout in the population are discussed. For breast cancer, new biomarkers based on breast density and molecular profiles are developed and the US multicenter Women Informed to Screen Depending on Measures (WISDOM) trial is referred to for evaluation. While current eCPC data management uses a traditional data warehouse model, we discuss eCPC-developed features of a coherent data integration platform. Discussion and Conclusion: E-Science tools are a key part of an evidence-based process for personalized medicine. This paper provides a structured workflow from data and models to evaluation of new personalized intervention strategies. The importance of multidisciplinary collaboration is emphasized. Importantly, the generic concepts of the suggested eCPC workflow are transferrable to other disease domains, although each disease will require tailored solutions.
  • Finotello, Francesca; Calura, Enrica; Risso, Davide; Hautaniemi, Sampsa; Romualdi, Chiara (2020)
  • Mäkinen, Jussi; Vanhatalo, Jarno (2018)
    Aim: Our aim was to develop a method to analyse spatiotemporal distributions of Arctic marine mammals (AMMs) using heterogeneous open source data, such as scientific papers and open repositories. Another aim was to quantitatively estimate the effects of environmental covariates on AMMs’ distributions and to analyse whether their distributions have shifted along with environmental changes. Location: Arctic shelf area, the Kara Sea. Methods: Our literature search focused on survey data regarding polar bears (Ursus maritimus), Atlantic walruses (Odobenus rosmarus rosmarus) and ringed seals (Phoca hispida). We mapped the data on a grid and built a hierarchical Poisson point process model to analyse species’ densities. The heterogeneous data lacked information on survey intensity and we could model only the relative density of each species. We explained relative densities with environmental covariates and random effects reflecting excess spatiotemporal variation and the unknown, varying sampling effort. The relative density of polar bears was explained also by the relative density of seals. Results: The most important covariates explaining AMMs’ relative densities were ice concentration and distance to the coast, and regarding polar bears, also the relative density of seals. The results suggest that due to the decrease in the average ice concentration, the relative densities of polar bears and walruses slightly decreased or stayed constant during the 17-year-long study period, whereas seals shifted their distribution from the Eastern to the Western Kara Sea. Main conclusions: Point process modelling is a robust methodology to estimate distributions from heterogeneous observations, providing spatially explicit information about ecosystems and thus supporting conservation efforts in the Arctic. In a simple trophic system, a distribution model of a top predator benefits from utilizing prey species’ distributions compared to a solely environmental model. The decreasing ice cover seems to have led to changes in AMMs’ distributions in the marginal Arctic region.
  • Pavel, Alisa; del Giudice, Giusy; Federico, Antonio; Di Lieto, Antonio; Kinaret, Pia A. S.; Serra, Angela; Greco, Dario (2021)
    The COVID-19 disease led to an unprecedented health emergency, still ongoing worldwide. Given the lack of a vaccine or a clear therapeutic strategy to counteract the infection as well as its secondary effects, there is currently a pressing need to generate new insights into the SARS-CoV-2 induced host response. Biomedical data can help to investigate new aspects of the COVID-19 pathogenesis, but source heterogeneity represents a major drawback and limitation. In this work, we applied data integration methods to develop a Unified Knowledge Space (UKS) and used it to identify a new set of genes associated with SARS-CoV-2 host response, both in vitro and in vivo. Functional analysis of these genes reveals possible long-term systemic effects of the infection, such as vascular remodelling and fibrosis. Finally, we identified a set of potentially relevant drugs targeting proteins involved in multiple steps of the host response to the virus.
  • Latvala, Pekka; Huuhko, Kim; Kokkonen, Matti (Copernicus Publications, 2022)
    Abstracts of the International Cartographic Association Series
  • Haak, Bastiaan W.; Argelaguet, R.; Kinsella, C.M.; Kullberg, R.F.J.; Lankelma, J.M.; Deijs, M.; Klein, M.; Jebbink, M.F.; Hugenholtz, F.; Kostidis, S.; Giera, M.; Hakvoort, T.B.M.; De Jonge, W.J.; Schultz, M.J.; Gool, T.V.; Van Der Poll, T.; De Vos, W.M.; Van Der Hoek, L.M.; Wiersinga, W. Joost (2021)
    Bacterial microbiota play a critical role in mediating local and systemic immunity, and shifts in these microbial communities have been linked to impaired outcomes in critical illness. Emerging data indicate that other intestinal organisms, including bacteriophages, viruses of eukaryotes, fungi, and protozoa, are closely interlinked with the bacterial microbiota and their host, yet their collective role during antibiotic perturbation and critical illness remains to be elucidated. We employed multi-omics factor analysis (MOFA) to systematically integrate the bacterial (16S rRNA), fungal (intergenic transcribed spacer 1 rRNA), and viral (virus discovery next generation sequencing) components of the intestinal microbiota of 33 critically ill patients with and without sepsis and 13 healthy volunteers. In addition, we quantified the absolute abundances of bacteria and fungi using 16S and 18S rRNA PCRs and characterized the short-chain fatty acids (SCFAs) butyrate, acetate, and propionate using nuclear magnetic resonance spectroscopy. We observed that a loss of the anaerobic intestinal environment is directly correlated with an overgrowth of aerobic pathobionts and their corresponding bacteriophages as well as an absolute enrichment of opportunistic yeasts capable of causing invasive disease. We also observed a strong depletion of SCFAs in both disease states, which was associated with an increased absolute abundance of fungi with respect to bacteria. Therefore, these findings illustrate the complexity of transkingdom changes following disruption of the intestinal bacterial microbiome. IMPORTANCE: While numerous studies have characterized antibiotic-induced disruptions of the bacterial microbiome, few studies describe how these disruptions impact the composition of other kingdoms such as viruses, fungi, and protozoa. To address this knowledge gap, we employed MOFA to systematically integrate viral, fungal, and bacterial sequence data from critically ill patients (with and without sepsis) and healthy volunteers, both prior to and following exposure to broad-spectrum antibiotics. In doing so, we show that modulation of the bacterial component of the microbiome has implications extending beyond this kingdom alone, enabling the overgrowth of potentially invasive fungi and viruses. While numerous preclinical studies have described similar findings in vitro, we confirm these observations in humans using an integrative analytic approach. These findings underscore the potential value of multi-omics data integration tools in interrogating how different components of the microbiota contribute to disease states. In addition, our findings suggest that there is value in further studying potential adjunctive therapies using anaerobic bacteria or SCFAs to reduce fungal expansion after antibiotic exposure, which could ultimately lead to improved outcomes in the intensive care unit (ICU).
  • Ovaska, Kristian; Laakso, Marko Kalevi; Haapa-Paananen, Saija; Louhimo, Riku; Chen, Ping; Aittomäki, Janne Viljami Juhanpoika; Valo, Erkka Antero; Nunez-Fontarnau, Javier; Rantanen, Ville; Karinen, Sirkku Helena; Nousiainen, Kari Juhani; Lahesmaa-Korpinen, Anna-Maria Kristiina; Miettinen, Minna Emilia; Saarinen, Lilli Annika; Kohonen, Pekka; Wu, Jianmin; Westermarck, Jukka; Hautaniemi, Sampsa (2010)
  • Cazaly, Emma; Saad, Joseph; Wang, Wenyu; Heckman, Caroline; Ollikainen, Miina; Tang, Jing (2019)
    Epigenetic research involves examining the mitotically heritable processes that regulate gene expression, independent of changes in the DNA sequence. Recent technical advances such as whole-genome bisulfite sequencing and affordable epigenomic array-based technologies allow researchers to measure epigenetic profiles of large cohorts at a genome-wide level, generating comprehensive high-dimensional datasets that may contain important information for disease development and treatment opportunities. The epigenomic profile for a certain disease is often a result of the complex interplay between multiple genetic and environmental factors, which makes visualizing and interpreting these data an enormous challenge. Furthermore, due to the dynamic nature of the epigenome, it is critical to determine causal relationships from the many correlated associations. In this review we provide an overview of recent data analysis approaches to integrate various omics layers to understand epigenetic mechanisms of complex diseases, such as obesity and cancer. We discuss the following topics: (i) advantages and limitations of major epigenetic profiling techniques, (ii) resources for standardization, annotation and harmonization of epigenetic data, and (iii) statistical methods and machine learning methods for establishing data-driven hypotheses of key regulatory mechanisms. Finally, we discuss the future directions for data integration that shall facilitate the discovery of epigenetic-based biomarkers and therapies.
  • Jaiswal, Alok; Gautam, Prson; Pietilä, Elina A; Timonen, Sanna; Nordström, Nora; Akimov, Yevhen; Sipari, Nina; Tanoli, Ziaurrehman; Fleischer, Thomas; Lehti, Kaisa; Wennerberg, Krister; Aittokallio, Tero (2021)
    Molecular and functional profiling of cancer cell lines is subject to laboratory-specific experimental practices and data analysis protocols. The current challenge therefore is how to make an integrated use of the omics profiles of cancer cell lines for reliable biological discoveries. Here, we carried out a systematic analysis of nine types of data modalities using meta-analysis of 53 omics studies across 12 research laboratories for 2,018 cell lines. To account for a relatively low consistency observed for certain data modalities, we developed a robust data integration approach that identifies reproducible signals shared among multiple data modalities and studies. We demonstrated the power of the integrative analyses by identifying a novel driver gene, ECHDC1, with tumor suppressive role validated both in breast cancer cells and patient tumors. The multi-modal meta-analysis approach also identified synthetic lethal partners of cancer drivers, including a co-dependency of PTEN deficient endometrial cancer cells on RNA helicases.
  • Serra, Angela; Fratello, Michele; Cattelani, Luca; Liampa, Irene; Melagraki, Georgia; Kohonen, Pekka; Nymark, Penny; Federico, Antonio; Kinaret, Pia Anneli Sofia; Jagiello, Karolina; Ha, My Kieu; Choi, Jang-Sik; Sanabria, Natasha; Gulumian, Mary; Puzyn, Tomasz; Yoon, Tae-Hyun; Sarimveis, Haralambos; Grafström, Roland; Afantitis, Antreas; Greco, Dario (2020)
    Transcriptomics data are relevant to address a number of challenges in Toxicogenomics (TGx). After careful planning of exposure conditions and data preprocessing, the TGx data can be used in predictive toxicology, where more advanced modelling techniques are applied. The large volume of molecular profiles produced by omics-based technologies allows the development and application of artificial intelligence (AI) methods in TGx. Indeed, the publicly available omics datasets are constantly increasing together with a plethora of different methods that are made available to facilitate their analysis, interpretation and the generation of accurate and stable predictive models. In this review, we present the state-of-the-art of data modelling applied to transcriptomics data in TGx. We show how the benchmark dose (BMD) analysis can be applied to TGx data. We review read across and adverse outcome pathways (AOP) modelling methodologies. We discuss how network-based approaches can be successfully employed to clarify the mechanism of action (MOA) or specific biomarkers of exposure. We also describe the main AI methodologies applied to TGx data to create predictive classification and regression models and we address current challenges. Finally, we present a short description of deep learning (DL) and data integration methodologies applied in these contexts. Modelling of TGx data represents a valuable tool for more accurate chemical safety assessment. This review is the third part of a three-article series on Transcriptomics in Toxicogenomics.
  • Nutrient Network; Aakala, Tuomas; Makela, Annikki (2020)
    Plant traits (the morphological, anatomical, physiological, biochemical and phenological characteristics of plants) determine how plants respond to environmental factors, affect other trophic levels, and influence ecosystem properties and their benefits and detriments to people. Plant trait data thus represent the basis for a vast area of research spanning from evolutionary biology, community and functional ecology, to biodiversity conservation, ecosystem and landscape management, restoration, biogeography and earth system modelling. Since its foundation in 2007, the TRY database of plant traits has grown continuously. It now provides unprecedented data coverage under an open access data policy and is the main plant trait database used by the research community worldwide. Increasingly, the TRY database also supports new frontiers of trait-based plant research, including the identification of data gaps and the subsequent mobilization or measurement of new data. To support this development, in this article we evaluate the extent of the trait data compiled in TRY and analyse emerging patterns of data coverage and representativeness. Best species coverage is achieved for categorical traits, with almost complete coverage for 'plant growth form'. However, most traits relevant for ecology and vegetation modelling are characterized by continuous intraspecific variation and trait-environmental relationships. These traits have to be measured on individual plants in their respective environment. Despite unprecedented data coverage, we observe a humbling lack of completeness and representativeness of these continuous traits in many aspects. We, therefore, conclude that reducing data gaps and biases in the TRY database remains a key challenge and requires a coordinated approach to data mobilization and trait measurements. This can only be achieved in collaboration with other initiatives.