Data-driven characterization of molecular phenotypes across heterogeneous sample collections

Mehtonen, J, Polonen, P, Häyrynen, S, Dufva, O, Lin, J, Liuksiala, T, Granberg, K, Lohi, O, Hautamäki, V, Nykter, M & Heinäniemi, M 2019, 'Data-driven characterization of molecular phenotypes across heterogeneous sample collections', Nucleic Acids Research, vol. 47, no. 13, 76.

Title: Data-driven characterization of molecular phenotypes across heterogeneous sample collections
Author: Mehtonen, Juha; Polonen, Petri; Häyrynen, Sergei; Dufva, Olli; Lin, Jake; Liuksiala, Thomas; Granberg, Kirsi; Lohi, Olli; Hautamäki, Ville; Nykter, Matti; Heinäniemi, Merja
Contributor organization: Immunobiology Research Program
Hematologian yksikkö
HUS Comprehensive Cancer Center
Department of Oncology
University of Helsinki
Date: 2019-07-26
Language: eng
Number of pages: 12
Belongs to series: Nucleic Acids Research
ISSN: 0305-1048
Abstract: Existing large gene expression data repositories hold enormous potential to elucidate disease mechanisms, characterize changes in cellular pathways, and to stratify patients based on molecular profiles. To achieve this goal, integrative resources and tools are needed that allow comparison of results across datasets and data types. We propose an intuitive approach for data-driven stratifications of molecular profiles and benchmark our methodology using the dimensionality reduction algorithm t-distributed stochastic neighbor embedding (t-SNE) with multi-study and multi-platform data on hematological malignancies. Our approach enables assessing the contribution of biological versus technical variation to sample clustering, direct incorporation of additional datasets to the same low dimensional representation, comparison of molecular disease subtypes identified from separate t-SNE representations, and characterization of the obtained clusters based on pathway databases and additional data. In this manner, we performed an integrative analysis across multi-omics acute myeloid leukemia studies. Our approach indicated new molecular subtypes with differential survival and drug responsiveness among samples lacking fusion genes, including a novel myelodysplastic syndrome-like cluster and a cluster characterized with CEBPA mutations and differential activity of the S-adenosylmethionine-dependent DNA methylation pathway. In summary, integration across multiple studies can help to identify novel molecular disease subtypes and generate insight into disease biology.
Subject: 3122 Cancers
1184 Genetics, developmental biology, physiology
1182 Biochemistry, cell and molecular biology
Peer reviewed: Yes
Rights: cc_by
Usage restriction: openAccess
Self-archived version: publishedVersion

Files in this item

Files Size Format
gkz281.pdf 4.905Mb PDF
