Browsing by Subject "WIDE ASSOCIATION"

Now showing items 1-20 of 30
  • Sarviaho, R.; Hakosalo, O.; Tiira, K.; Sulkama, S.; Niskanen, J. E.; Hytonen, M. K.; Sillanpää, M. J.; Lohi, H. (2020)
    The complex phenotypic and genetic nature of anxieties hampers progress in unravelling their molecular etiologies. Dogs present extensive natural variation in fear and anxiety behaviour and could advance the understanding of the molecular background of behaviour due to their unique breeding history and genetic architecture. As dogs live as part of human families under constant care and monitoring, information on their behaviour and experiences is easily available. Here we have studied the genetic background of fearfulness in the Great Dane breed. Dogs were scored and categorised into cases and controls based on the results of the validated owner-completed behavioural survey. A genome-wide association study in a cohort of 124 dogs, with and without socialisation as a covariate, revealed a genome-wide significant locus on chromosome 11. Whole exome sequencing and whole genome sequencing revealed extensive regions of opposite homozygosity in the same locus on chromosome 11 between the cases and controls with interesting neuronal candidate genes such as MAPK9/JNK2, a known hippocampal regulator of anxiety. Further characterisation of the identified locus will pave the way for molecular understanding of fear in dogs and may provide a natural animal model for human anxieties.
  • Kibble, Milla; Khan, Suleiman A.; Ammad-ud-din, Muhammad; Bollepalli, Sailalitha; Palviainen, Teemu; Kaprio, Jaakko; Pietiläinen, Kirsi H.; Ollikainen, Miina (2020)
    We combined clinical, cytokine, genomic, methylation and dietary data from 43 young adult monozygotic twin pairs (aged 22-36 years, 53% female), where 25 of the twin pairs were substantially weight discordant (delta body mass index > 3 kg/m²). These measurements were originally taken as part of the TwinFat study, a substudy of The Finnish Twin Cohort study. These five large multivariate datasets (comprising 42, 71, 1587, 1605 and 63 variables, respectively) were jointly analysed using an integrative machine learning method called group factor analysis (GFA) to generate new hypotheses about the multi-molecular-level interactions associated with the development of obesity. New potential links between cytokines and weight gain are identified, as well as associations between dietary, inflammatory and epigenetic factors. This encouraging case study aims to enthuse the research community to boldly attempt new machine learning approaches which have the potential to yield novel and unintuitive hypotheses. The source code of the GFA method is publicly available as the R package GFA.
  • Westra, Harm-Jan; Arends, Danny; Esko, Tonu; Peters, Marjolein J.; Schurmann, Claudia; Schramm, Katharina; Kettunen, Johannes; Yaghootkar, Hanieh; Fairfax, Benjamin P.; Andiappan, Anand Kumar; Li, Yang; Fu, Jingyuan; Karjalainen, Juha; Platteel, Mathieu; Visschedijk, Marijn; Weersma, Rinse K.; Kasela, Silva; Milani, Lili; Tserel, Liina; Peterson, Part; Reinmaa, Eva; Hofman, Albert; Uitterlinden, Andre G.; Rivadeneira, Fernando; Homuth, Georg; Petersmann, Astrid; Lorbeer, Roberto; Prokisch, Holger; Meitinger, Thomas; Herder, Christian; Roden, Michael; Grallert, Harald; Ripatti, Samuli; Perola, Markus; Wood, Andrew R.; Melzer, David; Ferrucci, Luigi; Singleton, Andrew B.; Hernandez, Dena G.; Knight, Julian C.; Melchiotti, Rossella; Lee, Bernett; Poidinger, Michael; Zolezzi, Francesca; Larbi, Anis; Wang, De Yun; van den Berg, Leonard H.; Veldink, Jan H.; Rotzschke, Olaf; Makino, Seiko; Salomaa, Veikko; Strauch, Konstantin; Voelker, Uwe; van Meurs, Joyce B. J.; Metspalu, Andres; Wijmenga, Cisca; Jansen, Ritsert C.; Franke, Lude (2015)
    The functional consequences of trait associated SNPs are often investigated using expression quantitative trait locus (eQTL) mapping. While trait-associated variants may operate in a cell-type specific manner, eQTL datasets for such cell-types may not always be available. We performed a genome-environment interaction (GxE) meta-analysis on data from 5,683 samples to infer the cell type specificity of whole blood cis-eQTLs. We demonstrate that this method is able to predict neutrophil and lymphocyte specific cis-eQTLs and replicate these predictions in independent cell-type specific datasets. Finally, we show that SNPs associated with Crohn's disease preferentially affect gene expression within neutrophils, including the archetypal NOD2 locus.
  • Guo, Michael H.; Nandakumar, Satish K.; Ulirsch, Jacob C.; Zekavat, Seyedeh M.; Buenrostro, Jason D.; Natarajan, Pradeep; Salem, Rany M.; Chiarle, Roberto; Mitt, Mario; Kals, Mart; Pärn, Kalle; Fischer, Krista; Milani, Lili; Magi, Reedik; Palta, Priit; Gabriel, Stacey B.; Metspalu, Andres; Lander, Eric S.; Kathiresan, Sekar; Hirschhorn, Joel N.; Esko, Tonu; Sankaran, Vijay G. (2017)
    Genetic variants affecting hematopoiesis can influence commonly measured blood cell traits. To identify factors that affect hematopoiesis, we performed association studies for blood cell traits in the population-based Estonian Biobank using high-coverage whole-genome sequencing (WGS) in 2,284 samples and SNP genotyping in an additional 14,904 samples. Using up to 7,134 samples with available phenotype data, our analyses identified 17 associations across 14 blood cell traits. Integration of WGS-based fine-mapping and complementary epigenomic datasets provided evidence for causal mechanisms at several loci, including at a previously undiscovered basophil count-associated locus near the master hematopoietic transcription factor CEBPA. The fine-mapped variant at this basophil count association near CEBPA overlapped an enhancer active in common myeloid progenitors and influenced its activity. In situ perturbation of this enhancer by CRISPR/Cas9 mutagenesis in hematopoietic stem and progenitor cells demonstrated that it is necessary for and specifically regulates CEBPA expression during basophil differentiation. We additionally identified basophil count-associated variation at another more pleiotropic myeloid enhancer near GATA2, highlighting regulatory mechanisms for ordered expression of master hematopoietic regulators during lineage specification. Our study illustrates how population-based genetic studies can provide key insights into poorly understood cell differentiation processes of considerable physiologic relevance.
  • NHLBI TOPMED Lipids Working Grp (2018)
    Lipoprotein(a), Lp(a), is a modified low-density lipoprotein particle that contains apolipoprotein(a), encoded by LPA, and is a highly heritable, causal risk factor for cardiovascular diseases that varies in concentrations across ancestries. Here, we use deep-coverage whole genome sequencing in 8392 individuals of European and African ancestry to discover and interpret both single-nucleotide variants and copy number (CN) variation associated with Lp(a). We observe that Europeans and Africans each have several unique genetic determinants. The common variant rs12740374 associated with Lp(a) cholesterol is an eQTL for SORT1 and independent of LDL cholesterol. Observed associations of aggregates of rare non-coding variants are largely explained by LPA structural variation, namely the LPA kringle IV 2 (KIV2)-CN. Finally, we find that LPA risk genotypes confer greater relative risk for incident atherosclerotic cardiovascular diseases compared to directly measured Lp(a), and are significantly associated with measures of subclinical atherosclerosis in African Americans.
  • NHLBI TOPMED Lipids Working Grp (2018)
    Large-scale deep-coverage whole-genome sequencing (WGS) is now feasible and offers potential advantages for locus discovery. We perform WGS in 16,324 participants from four ancestries at mean depth >29X and analyze genotypes with four quantitative traits: plasma total cholesterol, low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol, and triglycerides. Common variant association yields known loci except for a few variants previously poorly imputed. Rare coding variant association yields known Mendelian dyslipidemia genes but rare non-coding variant association detects no signals. A high 2M-SNP LDL-C polygenic score (top 5th percentile) confers similar effect size to a monogenic mutation (~30 mg/dl higher for each); however, among those with severe hypercholesterolemia, 23% have a high polygenic score and only 2% carry a monogenic mutation. At these sample sizes and for these phenotypes, the incremental value of WGS for discovery is limited but WGS permits simultaneous assessment of monogenic and polygenic models for severe hypercholesterolemia.
  • Psychiat Genomics Consortium; Lönnqvist, Jouko; Paunio, Tiina (2018)
    Genetic correlation is a key population parameter that describes the shared genetic architecture of complex traits and diseases. It can be estimated by current state-of-the-art methods, i.e., linkage disequilibrium score regression (LDSC) and genomic restricted maximum likelihood (GREML). The massively reduced computing burden of LDSC compared to GREML makes it an attractive tool, although the accuracy (i.e., magnitude of standard errors) of LDSC estimates has not been thoroughly studied. In simulation, we show that the accuracy of GREML is generally higher than that of LDSC. When there is genetic heterogeneity between the actual sample and reference data from which LD scores are estimated, the accuracy of LDSC decreases further. In real data analyses estimating the genetic correlation between schizophrenia (SCZ) and body mass index, we show that GREML estimates based on ~150,000 individuals give a higher accuracy than LDSC estimates based on ~400,000 individuals (from combined meta-data). A GREML genomic partitioning analysis reveals that the genetic correlation between SCZ and height is significantly negative for regulatory regions, which a whole-genome or LDSC approach has less power to detect. We conclude that LDSC estimates should be carefully interpreted as there can be uncertainty about homogeneity among combined meta-datasets. We suggest that any interesting findings from massive LDSC analysis for a large number of complex traits should be followed up, where possible, with more detailed analyses with GREML methods, even if sample sizes are smaller.
  • Salmela, Elina; Renvall, Hanna; Kujala, Jan; Hakosalo, Osmo; Illman, Mia; Vihla, Minna; Leinonen, Eira; Salmelin, Riitta; Kere, Juha (2016)
    Several functional and morphological brain measures are partly under genetic control. The identification of direct links between neuroimaging signals and corresponding genetic factors can reveal cellular-level mechanisms behind the measured macroscopic signals and contribute to the use of imaging signals as probes of genetic function. To uncover possible genetic determinants of the most prominent brain signal oscillation, the parieto-occipital 10-Hz alpha rhythm, we measured spontaneous brain activity with magnetoencephalography in 210 healthy siblings while the subjects were resting, with eyes closed and open. The reactivity of the alpha rhythm was quantified from the difference spectra between the two conditions. We focused on three measures: peak frequency, peak amplitude and the width of the main spectral peak. In accordance with earlier electroencephalography studies, spectral peak amplitude was highly heritable (h² > 0.75). Variance component-based analysis of 28,000 single-nucleotide polymorphism markers revealed linkage for both the width and the amplitude of the spectral peak. The strongest linkage was detected for the width of the spectral peak over the left parieto-occipital cortex on chromosome 10 (LOD=2.814, nominal P
  • Yang, Yaohua; Wu, Lang; Shu, Xiang; Lu, Yingchang; Shu, Xiao-Ou; Cai, Qiuyin; Beeghly-Fadiel, Alicia; Li, Bingshan; Ye, Fei; Berchuck, Andrew; Anton-Culver, Hoda; Banerjee, Susana; Benitez, Javier; Bjorge, Line; Brenton, James D.; Butzow, Ralf; Campbell, Ian G.; Chang-Claude, Jenny; Chen, Kexin; Cook, Linda S.; Cramer, Daniel W.; defazio, Anna; Dennis, Joe; Doherty, Jennifer A.; Doerk, Thilo; Eccles, Diana M.; Edwards, Digna Velez; Fasching, Peter A.; Fortner, Renee T.; Gayther, Simon A.; Giles, Graham G.; Glasspool, Rosalind M.; Goode, Ellen L.; Goodman, Marc T.; Gronwald, Jacek; Harris, Holly R.; Heitz, Florian; Hildebrandt, Michelle A.; Hogdall, Estrid; Hogdall, Claus K.; Huntsman, David G.; Kar, Siddhartha P.; Karlan, Beth Y.; Kelemen, Linda E.; Kiemeney, Lambertus A.; Kjaer, Susanne K.; Koushik, Anita; Lambrechts, Diether; Le, Nhu D.; Levine, Douglas A.; Massuger, Leon F.; Matsuo, Keitaro; May, Taymaa; McNeish, Iain A.; Menon, Usha; Modugno, Francesmary; Monteiro, Alvaro N.; Moorman, Patricia G.; Moysich, Kirsten B.; Ness, Roberta B.; Nevanlinna, Heli; Olsson, Hakan; Onland-Moret, N. Charlotte; Park, Sue K.; Paul, James; Pearce, Celeste L.; Pejovic, Tanja; Phelan, Catherine M.; Pike, Malcolm C.; Ramus, Susan J.; Riboli, Elio; Rodriguez-Antona, Cristina; Romieu, Isabelle; Sandler, Dale P.; Schildkraut, Joellen M.; Setiawan, Veronica W.; Shan, Kang; Siddiqui, Nadeem; Sieh, Weiva; Stampfer, Meir J.; Sutphen, Rebecca; Swerdlow, Anthony J.; Szafron, Lukasz M.; Teo, Soo Hwang; Tworoger, Shelley S.; Tyrer, Jonathan P.; Webb, Penelope M.; Wentzensen, Nicolas; White, Emily; Willett, Walter C.; Wolk, Alicja; Woo, Yin Ling; Wu, Anna H.; Yan, Li; Yannoukakos, Drakoulis; Chenevix-Trench, Georgia; Sellers, Thomas A.; Pharoah, Paul D. P.; Zheng, Wei; Long, Jirong (2019)
    DNA methylation is instrumental for gene regulation. Global changes in the epigenetic landscape have been recognized as a hallmark of cancer. However, the role of DNA methylation in epithelial ovarian cancer (EOC) remains unclear. In this study, high-density genetic and DNA methylation data in white blood cells from the Framingham Heart Study (N = 1,595) were used to build genetic models to predict DNA methylation levels. These prediction models were then applied to the summary statistics of a genome-wide association study (GWAS) of ovarian cancer including 22,406 EOC cases and 40,941 controls to investigate genetically predicted DNA methylation levels in association with EOC risk. Among 62,938 CpG sites investigated, genetically predicted methylation levels at 89 CpG were significantly associated with EOC risk at a Bonferroni-corrected threshold of P < 7.94 × 10^-7. Of them, 87 were located at GWAS-identified EOC susceptibility regions and two resided in a genomic region not previously reported to be associated with EOC risk. Integrative analyses of genetic, methylation, and gene expression data identified consistent directions of associations across 12 CpG, five genes, and EOC risk, suggesting that methylation at these 12 CpG may influence EOC risk by regulating expression of these five genes, namely MAPT, HOXB3, ABHD8, ARHGAP27, and SKAP1. We identified novel DNA methylation markers associated with EOC risk and propose that methylation at multiple CpG may affect EOC risk via regulation of gene expression. Significance: Identification of novel DNA methylation markers associated with EOC risk suggests that methylation at multiple CpG may affect EOC risk through regulation of gene expression.
  • Valimaki, Niko; Kuisma, Heli; Pasanen, Annukka; Heikinheimo, Oskari; Sjoberg, Jari; Butzow, Ralf; Sarvilinna, Nanna; Heinonen, Hanna-Riikka; Tolvanen, Jaana; Bramante, Simona; Tanskanen, Tomas; Auvinen, Juha; Uimari, Outi; Alkodsi, Amjad; Lehtonen, Rainer; Kaasinen, Eevi; Palin, Kimmo; Aaltonen, Lauri A. (2018)
    Uterine leiomyomas (ULs) are benign tumors that are a major burden to women's health. A genome-wide association study on 15,453 UL cases and 392,628 controls was performed, followed by replication of the genomic risk in six cohorts. Effects of the risk alleles were evaluated in view of molecular and clinical characteristics. 22 loci displayed a genome-wide significant association. The likely predisposition genes could be grouped into two biological processes. Genes involved in genome stability were represented by TERT, TERC, OBFC1 - highlighting the role of telomere maintenance - TP53 and ATM. Genes involved in genitourinary development, WNT4, WT1, SALL1, MED12, ESR1, GREB1, FOXO1, DMRT1 and uterine stem cell marker antigen CD44, formed another strong subgroup. The combined risk contributed by the 22 loci was associated with MED12 mutation-positive tumors. The findings link genes for uterine development and genetic stability to leiomyomagenesis, and in part explain the more frequent occurrence of UL in women of African origin.
  • Palada, Vinko; Kaunisto, Mari A.; Kalso, Eija (2018)
    Purpose of review: The review describes recent advances in genetics and genomics of postoperative pain, the association between genetic variants and the efficacy of analgesics, and the role of pharmacogenomics in the selection of appropriate analgesic treatments for postoperative pain. Recent findings: Recent genetic studies have reported associations of genetic variants in catechol-O-methyltransferase (COMT), brain-derived neurotrophic factor (BDNF), voltage-gated channel alpha subunit 11 (SCN11A) and μ-opioid receptor (OPRM1) genes with postoperative pain. The recent pharmacogenetics studies revealed an association of the organic cation transporter 1 (OCT1) and ATP-binding cassette C3 (ABCC3) polymorphisms with morphine-related adverse effects, an effect of polymorphisms in cytochrome P450 gene CYP2D6 on the analgesic efficacy of tramadol and no effect of CYP2C8 and CYP2C9 variants on efficacy of piroxicam. Summary: Genetic variants associate with inter-individual variability in drug responses and they can affect pain sensitivity and intensity of postoperative pain. Despite the recent progress in genetics and genomics of postoperative pain, it is still not possible to precisely predict the patients who are genetically predisposed to have severe postoperative pain or who develop chronic postoperative pain.
  • Ritari, J.; Hyvärinen, K.; Koskela, S.; Itälä-Remes, M.; Niittyvuopio, R.; Nihtinen, A.; Salmenniemi, U.; Putkonen, M.; Volin, L.; Kwan, T.; Pastinen, T.; Partanen, J. (2019)
    Allogeneic haematopoietic stem cell transplantation currently represents the primary potentially curative treatment for cancers of the blood and bone marrow. While relapse occurs in approximately 30% of patients, few risk-modifying genetic variants have been identified. The present study evaluates the predictive potential of patient genetics on relapse risk in a genome-wide manner. We studied 151 graft recipients with HLA-matched sibling donors by sequencing the whole-exome, active immunoregulatory regions, and the full MHC region. To assess the predictive capability and contributions of SNPs and INDELs, we employed machine learning and a feature selection approach in a cross-validation framework to discover the most informative variants while controlling against overfitting. Our results show that germline genetic polymorphisms in patients entail a significant contribution to relapse risk, as judged by the predictive performance of the model (AUC = 0.72 [95% CI: 0.63-0.81]). Furthermore, the top contributing variants were predictive in two independent replication cohorts (n = 258 and n = 125) from the same population. The results can help elucidate relapse mechanisms and suggest novel therapeutic targets. A computational genomic model could provide a step toward individualized prognostic risk assessment, particularly when accompanied by other data modalities.
  • Psychiat Genomics Consortium; 23andMe Res Team; Psychosis Endopheno-types Int Cons; Wellcome Trust Case Control Consor; Lee, Phil H.; Anttila, Verneri; Won, Hyejung; Kaprio, Jaakko; Keski-Rahkonen, Anna; Churchhouse, Claire; Rehnström, Karola; Raevuori, Anu; Palotie, Aarno; Daly, Mark J.; Neale, Benjamin M. (2019)
    Genetic influences on psychiatric disorders transcend diagnostic boundaries, suggesting substantial pleiotropy of contributing loci. However, the nature and mechanisms of these pleiotropic effects remain unclear. We performed analyses of 232,964 cases and 494,162 controls from genome-wide studies of anorexia nervosa, attention-deficit/hyperactivity disorder, autism spectrum disorder, bipolar disorder, major depression, obsessive-compulsive disorder, schizophrenia, and Tourette syndrome. Genetic correlation analyses revealed a meaningful structure within the eight disorders, identifying three groups of inter-related disorders. Meta-analysis across these eight disorders detected 109 loci associated with at least two psychiatric disorders, including 23 loci with pleiotropic effects on four or more disorders and 11 loci with antagonistic effects on multiple disorders. The pleiotropic loci are located within genes that show heightened expression in the brain throughout the lifespan, beginning prenatally in the second trimester, and play prominent roles in neurodevelopmental processes. These findings have important implications for psychiatric nosology, drug development, and risk prediction.
  • Massinen, Satu; Wang, Jingwen; Laivuori, Krista; Bieder, Andrea; Paez, Isabel Tapia; Jiao, Hong; Kere, Juha (2016)
    Background: The DYX5 locus for developmental dyslexia was mapped to chromosome 3 by linkage study of a large Finnish family, and later, roundabout guidance receptor 1 (ROBO1) was implicated as a candidate gene at DYX5 with suppressed expression from the segregating rare haplotype. A functional magnetoencephalographic study of several family members revealed abnormal auditory processing of interaural interaction, supporting a defect in midline crossing of auditory pathways. In the current study, we have characterized genetic variation in the broad ROBO1 gene region in the DYX5-linked family, aiming to identify variants that would increase our understanding of the altered expression of ROBO1. Methods: We have used a whole genome sequencing strategy on a pooled sample of 19 individuals in combination with two individually sequenced genomes. The discovered genetic variants were annotated and filtered. Subsequently, the most interesting variants were functionally tested using relevant methods, including electrophoretic mobility shift assay (EMSA), luciferase assay, and gene knockdown by lentiviral small hairpin RNA (shRNA) in lymphoblasts. Results: We found one novel intronic single nucleotide variant (SNV) and three novel intergenic SNVs in the broad region of ROBO1 that were specific to the dyslexia susceptibility haplotype. Functional testing by EMSA did not support the binding of transcription factors to three of the SNVs, but one of the SNVs was bound by the LIM homeobox 2 (LHX2) protein, with increased binding affinity for the non-reference allele. Knockdown of LHX2 in lymphoblast cell lines extracted from subjects from the DYX5-linked family showed decreasing expression of ROBO1, supporting the idea that LHX2 regulates ROBO1 also in human. Conclusions: The discovered variants may explain the segregation of dyslexia in this family, but the effect appears subtle in the experimental settings. Their impact on the developing human brain remains suggestive based on the association and subtle experimental support.
  • eQTLGen Consortium; Timmers, Paul R. H. J.; Kettunen, J.; Perola, M.; Ripatti, S. (2019)
    We use a genome-wide association of 1 million parental lifespans of genotyped subjects and data on mortality risk factors to validate previously unreplicated findings near CDKN2B-AS1, ATXN2/BRAP, FURIN/FES, ZW10, PSORS1C3, and 13q21.31, and identify and replicate novel findings near ABO, ZC3HC1, and IGF2R. We also validate previous findings near 5q33.3/EBF1 and FOXO3, whilst finding contradictory evidence at other loci. Gene set and cell-specific analyses show that expression in foetal brain cells and adult dorsolateral prefrontal cortex is enriched for lifespan variation, as are gene pathways involving lipid proteins and homeostasis, vesicle-mediated transport, and synaptic function. Individual genetic variants that increase dementia, cardiovascular disease, and lung cancer - but not other cancers - explain the most variance. Resulting polygenic scores show a mean lifespan difference of around five years of life across the deciles.
  • Laulajainen-Hongisto, Anu; Lyly, Annina; Hanif, Tanzeela; Dhaygude, Kishor; Kankainen, Matti; Renkonen, Risto; Donner, Kati; Mattila, Pirkko; Jartti, Tuomas; Bousquet, Jean; Kauppi, Paula; Toppila-Salmi, Sanna (2020)
    Genome-wide association studies (GWASs) have revealed several airway disease-associated risk loci. Their role in the onset of asthma, allergic rhinitis (AR) or chronic rhinosinusitis (CRS), however, is not yet fully understood. The aim of this review is to evaluate the airway relevance of loci and genes identified in GWASs. GWASs were searched from databases, and a list of loci associating significantly (p < 10^-8) with asthma, AR and CRS was created. This yielded a total of 267 significantly asthma/AR-associated loci from 31 GWASs. No significant CRS-associated loci were found in this search. A total of 170 protein coding genes were connected to these loci. Of these, 76/170 (44%) showed bronchial epithelial protein expression in stained microscopic figures of Human Protein Atlas (HPA), and 61/170 (36%) had a literature report of having airway epithelial function. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) annotation analyses were performed, and 19 functional protein categories were found as significantly (p < 0.05) enriched among these genes. These were related to cytokine production, cell activation and adaptive immune response, and all were strongly connected in network analysis. We also identified 15 protein pathways that were significantly (p < 0.05) enriched in these genes, related to T-helper cell differentiation, virus infection, JAK-STAT signaling pathway, and asthma. A third of GWAS-level risk loci genes of asthma or AR seemed to have airway epithelial functions according to our database and literature searches. In addition, many of the risk loci genes were immunity related. Some risk loci genes also related to metabolism, neuro-musculoskeletal or other functions. Functions overlapped and formed a strong network in our pathway analyses and merit future studies of biomarkers and therapeutics.
  • Vaysse, Amaury; Ratnakumar, Abhirami; Derrien, Thomas; Axelsson, Erik; Pielberg, Gerli Rosengren; Sigurdsson, Snaevar; Fall, Tove; Seppala, Eija H.; Hansen, Mark S. T.; Lawley, Cindy T.; Karlsson, Elinor K.; Bannasch, Danika; Vila, Carles; Lohi, Hannes; Galibert, Francis; Fredholm, Merete; Haggstrom, Jens; Hedhammar, Ake; Andre, Catherine; Lindblad-Toh, Kerstin; Hitte, Christophe; Webster, Matthew T.; LUPA Consortium (2011)
  • Li, Zitong; Kemppainen, Petri; Rastas, Pasi; Merilä, Juha (2018)
    Genomewide association studies (GWAS) aim to identify genetic markers strongly associated with quantitative traits by utilizing linkage disequilibrium (LD) between candidate genes and markers. However, because of LD between nearby genetic markers, the standard GWAS approaches typically detect a number of correlated SNPs covering long genomic regions, making corrections for multiple testing overly conservative. Additionally, the high dimensionality of modern GWAS data poses considerable challenges for GWAS procedures such as permutation tests, which are computationally intensive. We propose a cluster-based GWAS approach that first divides the genome into many large nonoverlapping windows and uses linkage disequilibrium network analysis in combination with principal component (PC) analysis as dimensionality reduction tools to summarize the SNP data to independent PCs within clusters of loci connected by high LD. We then introduce single- and multilocus models that can efficiently conduct the association tests on such high-dimensional data. The methods can be adapted to different model structures and used to analyse samples collected from the wild or from biparental F2 populations, which are commonly used in ecological genetics mapping studies. We demonstrate the performance of our approaches with two publicly available data sets from a plant (Arabidopsis thaliana) and a fish (Pungitius pungitius), as well as with simulated data.
  • Cazaly, Emma; Saad, Joseph; Wang, Wenyu; Heckman, Caroline; Ollikainen, Miina; Tang, Jing (2019)
    Epigenetic research involves examining the mitotically heritable processes that regulate gene expression, independent of changes in the DNA sequence. Recent technical advances, such as whole-genome bisulfite sequencing and affordable epigenomic array-based technologies, allow researchers to measure epigenetic profiles of large cohorts at a genome-wide level, generating comprehensive high-dimensional datasets that may contain important information for disease development and treatment opportunities. The epigenomic profile for a certain disease is often a result of the complex interplay between multiple genetic and environmental factors, which makes these data enormously challenging to visualize and interpret. Furthermore, due to the dynamic nature of the epigenome, it is critical to determine causal relationships from the many correlated associations. In this review we provide an overview of recent data analysis approaches to integrate various omics layers to understand epigenetic mechanisms of complex diseases, such as obesity and cancer. We discuss the following topics: (i) advantages and limitations of major epigenetic profiling techniques, (ii) resources for standardization, annotation and harmonization of epigenetic data, and (iii) statistical methods and machine learning methods for establishing data-driven hypotheses of key regulatory mechanisms. Finally, we discuss the future directions for data integration that shall facilitate the discovery of epigenetic-based biomarkers and therapies.
  • NHGRI Ctr Common; Abel, Haley J.; Larson, David E.; Regier, Allison A.; Hall, Ira M.; Daly, Mark J.; Palotie, Aarno; Ripatti, Samuli; Salomaa, Veikko; Taskinen, Marja-Riitta (2020)
    Structural variants in more than 17,000 human genomes are mapped and characterized using whole-genome sequencing, showing how this type of variation contributes to rare deleterious coding and noncoding alleles. A key goal of whole-genome sequencing for studies of human genetics is to interrogate all forms of variation, including single-nucleotide variants, small insertion or deletion (indel) variants and structural variants. However, tools and resources for the study of structural variants have lagged behind those for smaller variants. Here we used a scalable pipeline (1) to map and characterize structural variants in 17,795 deeply sequenced human genomes. We publicly release site-frequency data to create the largest, to our knowledge, whole-genome-sequencing-based structural variant resource so far. On average, individuals carry 2.9 rare structural variants that alter coding regions; these variants affect the dosage or structure of 4.2 genes and account for 4.0-11.2% of rare high-impact coding alleles. Using a computational model, we estimate that structural variants account for 17.2% of rare alleles genome-wide, with predicted deleterious effects that are equivalent to loss-of-function coding alleles; approximately 90% of such structural variants are noncoding deletions (mean 19.1 per genome). We report 158,991 ultra-rare structural variants and show that 2% of individuals carry ultra-rare megabase-scale structural variants, nearly half of which are balanced or complex rearrangements. Finally, we infer the dosage sensitivity of genes and noncoding elements, and reveal trends that relate to element class and conservation. This work will help to guide the analysis and interpretation of structural variants in the era of whole-genome sequencing.
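
Note on the Bonferroni threshold quoted in the Yang et al. (2019) abstract above: the stated cut-off of P < 7.94 x 10^-7 for 62,938 CpG sites matches a family-wise error rate divided by the number of tests. The following is a minimal sketch of that arithmetic, not the authors' code. The significance level of 0.05 is an assumption (the abstract does not state it), and the function name `bonferroni_threshold` is hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test P-value cut-off under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


# Number of CpG sites tested, as given in the abstract.
N_CPG_SITES = 62_938
# Assumed family-wise significance level; not stated in the abstract.
ALPHA = 0.05

threshold = bonferroni_threshold(ALPHA, N_CPG_SITES)
print(f"Per-site threshold: {threshold:.2e}")
```

Under the assumed level of 0.05 the result rounds to the value reported in the abstract, which suggests, but does not prove, that this was the level used.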