Browsing by Subject "next-generation sequencing"

Now showing items 1-20 of 22
  • Mascher, Martin; Muehlbauer, Gary J.; Rokhsar, Daniel S.; Chapman, Jarrod; Schmutz, Jeremy; Barry, Kerrie; Munoz-Amatriain, Maria; Close, Timothy J.; Wise, Roger P.; Schulman, Alan H.; Himmelbach, Axel; Mayer, Klaus F. X.; Scholz, Uwe; Poland, Jesse A.; Stein, Nils; Waugh, Robbie (2013)
  • BEEHIVE Collaboration; Wymant, Chris; Blanquart, Francois; Golubchik, Tanya; Gall, Astrid; Bakker, Margreet; Bezemer, Daniela; Croucher, Nicholas J.; Hall, Matthew; Hillebregt, Mariska; Ong, Swee Hoe; Ratmann, Oliver; Albert, Jan; Bannert, Norbert; Fellay, Jacques; Fransen, Katrien; Gourlay, Annabelle; Grabowski, M. Kate; Gunsenheimer-Bartmeyer, Barbara; Gunthard, Huldrych F.; Kivelä, Pia; Kouyos, Roger; Laeyendecker, Oliver; Liitsola, Kirsi; Meyer, Laurence; Porter, Kholoud; Ristola, Matti; van Sighem, Ard; Berkhout, Ben; Cornelissen, Marion; Kellam, Paul; Reiss, Peter; Fraser, Christophe (2018)
    Studying the evolution of viruses and their molecular epidemiology relies on accurate viral sequence data, so that small differences between similar viruses can be meaningfully interpreted. Despite its higher throughput and more detailed minority variant data, next-generation sequencing has yet to be widely adopted for HIV. The difficulty of accurately reconstructing the consensus sequence of a quasispecies from reads (short fragments of DNA) in the presence of large between- and within-host diversity, including frequent indels, may have presented a barrier. In particular, mapping (aligning) reads to a reference sequence leads to biased loss of information; this bias can distort epidemiological and evolutionary conclusions. De novo assembly avoids this bias by aligning the reads to themselves, producing a set of sequences called contigs. However, contigs provide only a partial summary of the reads, misassembly may result in their having an incorrect structure, and no information is available at parts of the genome where contigs could not be assembled. To address these problems we developed the tool shiver to pre-process reads for quality and contamination, then map them to a reference tailored to the sample using corrected contigs supplemented with the user's choice of existing reference sequences. Run with two commands per sample, it can easily be used for large heterogeneous data sets. We used shiver to reconstruct the consensus sequence and minority variant information from paired-end short-read whole-genome data produced with the Illumina platform, for sixty-five existing publicly available samples and fifty new samples. We show the systematic superiority of mapping to shiver's constructed reference compared with mapping the same reads to the closest of 3,249 real references: median values of 13 bases called differently and more accurately, 0 bases called differently and less accurately, and 205 bases of missing sequence recovered. We also successfully applied shiver to whole-genome samples of Hepatitis C Virus and Respiratory Syncytial Virus. shiver is publicly available from
  • Benedek, P.; Jiao, H.; Duvefelt, K.; Skoog, T.; Linde, M.; Kiviluoma, P.; Kere, J.; Eriksson, M.; Angelin, B. (2021)
    Aim: To investigate whether genotyping could be used as a cost-effective screening step, preceding next-generation sequencing (NGS), in molecular diagnosis of familial hypercholesterolaemia (FH) in Swedish patients. Methods and results: Three hundred patients of Swedish origin with clinical suspicion of heterozygous FH were analysed using a specific array genotyping panel embedding 112 FH-causing mutations in the LDLR, APOB and PCSK9 genes. The mutations had been selected from previous reports on FH patients in Scandinavia and Finland. Mutation-negative cases were further analysed by NGS. In 181 patients with probable or definite FH using the Dutch lipid clinics network (DLCN) criteria (score >= 6), a causative mutation was identified in 116 (64%). Of these, 94 (81%) were detected by genotyping. Ten mutations accounted for more than 50% of the positive cases, with APOB c.10580G>A being the most common. Mutations in LDLR predominated, with (c.2311+1_2312-1)(2514)del (FH Helsinki) and c.259T>G having the highest frequency. Two novel LDLR mutations were identified. In patients with DLCN score < 6, mutation detection rate was significantly higher at younger age. Conclusion: A limited number of mutations explain a major fraction of FH cases in Sweden. Combination of selective genotyping and NGS facilitates the clinical challenge of cost-effective genetic screening in suspected FH. The frequency of APOB c.10580G>A was higher than previously reported in Sweden. The lack of demonstrable mutations in the LDLR, APOB and PCSK9 genes in approximately one third of patients with probable FH strongly suggests that additional genetic mechanisms are to be found in phenotypic FH.
  • Koivusaari, Pirjo; Tejesvi, Mysore V.; Tolkkinen, Mikko; Markkola, Annamari; Mykrä, Heikki; Pirttilä, Anna Maria (MDPI, 2019)
    Frontiers in Microbiology 10:651
    Biomass production and decomposition are key processes in ecology, where plants are primarily responsible for production and microbes act in decomposition. Trees harbor foliar microfungi living on and inside leaf tissues, epiphytes and endophytes, respectively. Early researchers hypothesized that all fungal endophytes are parasites or latent saprophytes, which slowly colonize the leaf tissues for decomposition. While this has been proven for some strains in the terrestrial environment, it is not known whether foliar microfungi of terrestrial origin can survive or perform decomposition in the aquatic environment. On the other hand, aquatic hyphomycetes, fungi which decompose organic material in stream environments, have been suggested to have a plant-associated life phase. Our aim was to study how much the fungal communities of leaves and litter submerged in streams overlap. Ergosterol content on litter, which is an estimator of fungal biomass, was 5–14 times higher in submerged litter than in senescent leaves, indicating active fungal colonization. Leaves generally harbored a different microbiome before submergence in streams than after it. The Chao1 richness was significantly higher (93.7 vs. 60.7, p = 0.004) and there were more observed operational taxonomic units (OTUs) (78.3 vs. 47.4, p = 0.004) in senescent leaves than in stream-immersed litter. There were more Leotiomycetes (9%, p = 0.014) in the litter. We identified a group of 35 fungi (65%) with both plant- and water-associated lifestyles. Of these, eight taxa had no previous references to water, such as lichenicolous fungi. Six OTUs were classified within Glomeromycota, known as obligate root symbionts with no previous records from leaves. Five members of Basidiomycota, which are rare in aquatic environments, were identified in the stream-immersed litter only. Overall, our study demonstrates that foliar microfungi contribute to fungal diversity in submerged litter.
  • Youssef, Omar; Sarhadi, Virinder; Ehsan, Homa; Böhling, Tom; Carpelan-Holmström, Monika; Koskensalo, Selja; Puolakkainen, Pauli; Kokkola, Arto; Knuutila, Sakari (2017)
    AIM: To study cancer hotspot mutations by next-generation sequencing (NGS) in stool DNA from patients with different gastrointestinal tract (GIT) neoplasms. METHODS: Stool samples were collected from 87 Finnish patients diagnosed with various gastric and colorectal neoplasms, including benign tumors, and from 14 healthy controls. DNA was isolated from stools by using the PSP® Spin Stool DNA Plus Kit. For each sample, 20 ng of DNA was used to construct sequencing libraries using the Ion AmpliSeq Cancer Hotspot Panel v2 or Ion AmpliSeq Colon and Lung Cancer panel v2. Sequencing was performed on Ion PGM. Torrent Suite Software v.5.2.2 was used for variant calling and data analysis. RESULTS: NGS was successful in assaying 72 GIT samples and 13 healthy controls, with success rates of the assay being 78% for stomach neoplasia and 87% for colorectal tumors. In stool specimens from patients with gastric neoplasia, five hotspot mutations were found in APC, CDKN2A and EGFR genes, in addition to seven novel mutations. From colorectal patients, 20 mutations were detected in AKT1, APC, ERBB2, FBXW7, KIT, KRAS, NRAS, SMARCB1, SMO, STK11 and TP53. Healthy controls did not exhibit any hotspot mutations, except for two novel ones. APC and TP53 were the most frequently mutated genes in colorectal neoplasms, with five mutations, followed by KRAS with two mutations. APC was the most commonly mutated gene in stools of patients with premalignant/benign GIT lesions. CONCLUSION: Our results show that in addition to colorectal neoplasms, mutations can also be assayed from stool specimens of patients with gastric neoplasms.
  • Almusa, Henrikki (Helsingfors universitet, 2013)
    The next-generation sequencing (NGS) platforms create a large amount of sequence in a short amount of time, when compared to first-generation sequencers. An overview of the NGS platforms is provided, with a more in-depth look into the Illumina Genome Analyzer II, as it was used to create the data for the thesis. There were two main aims in this thesis. First, to create a pipeline which can be used to analyse genomic sequencing. Second, to use the pipeline to compare whole human exome capture methods from two manufacturers, Roche Nimblegen and Agilent. The pipeline is described in detail in material and methods. All the inputs for the pipeline are described and examples shown. In the pipeline the given sequences are first aligned against the reference genome. Then various separate analyses are performed to retrieve variants and coverage of the sequencing. Supplementary results include paired-end anomalies, larger insertion and deletion polymorphisms and assembly of non-aligned sequences. The two capture methods are also described and changes to the manufacturers' recommended protocols are listed. Finally, the section has the options and various inputs used in the pipeline runs of the exome data. The result of the pipeline is a basic level of analysis of the sequencing as well as various graphs showing the quality of the run. All the output files intended for the user are described. By using the results of the pipeline, the user can do more in-depth analysis as required by the project. When comparing the two exome capture methods, the Nimblegen capture was shown to be more efficient in capturing the CCDS exome. While the Agilent capture kit provided better one-fold coverage over the exome, higher-fold coverage (over 10-fold), which is required for reliable variant calling in next-generation sequencing, was better reached using the Nimblegen capture kit. Also, significantly fewer false positive paired-end anomalies were observed in the library created by using the Nimblegen capture.
  • Heikkilä, Nelli; Vanhanen, Reetta; Yohannes, Dawit A.; Kleino, Iivari; Mattila, Ilkka P.; Saramäki, Jari; Arstila, T. Petteri (2020)
    A highly diverse repertoire of T cell antigen receptors (TCR) is created in the thymus by recombination of gene segments and the insertion or deletion of nucleotides at the junctions. Using next-generation TCR sequencing we define here the features of recombination and selection in the human TCR alpha and TCR beta locus, and show that a strikingly high proportion of the repertoire is shared by unrelated individuals. The thymic TCR alpha nucleotide repertoire was more diverse than TCR beta, with 4.1 × 10^6 vs. 0.81 × 10^6 unique clonotypes, and contained nonproductive clonotypes at a higher frequency (69.2% vs. 21.2%). The convergence of distinct nucleotide clonotypes to the same amino acid sequences was higher in TCR alpha than in TCR beta repertoire (1.45 vs. 1.06 nucleotide sequences per amino acid sequence in thymus). The gene segment usage was biased, and generally all individuals favored the same genes in both TCR alpha and TCR beta loci. Despite the high diversity, a large fraction of the repertoire was found in more than one donor. The shared fraction was bigger in TCR alpha than TCR beta repertoire, and more common in in-frame sequences than in nonproductive sequences. Thus, both biases in rearrangement and thymic selection are likely to contribute to the generation of shared repertoire in humans.
  • Salonen, I. S.; Chronopoulou, P.-M.; Leskinen, E.; Koho, K. A. (2019)
    Metabarcoding is a method that combines high-throughput DNA sequencing and DNA-based identification. Previously, this method has been successfully used to target spatial variation of eukaryote communities in marine sediments; however, the temporal changes in these communities remain understudied. Here, we follow the temporal changes of the eukaryote communities in Baltic Sea surface sediments collected from two coastal localities during three seasons of two consecutive years. Our study reveals that the structure of the sediment eukaryotic ecosystem was primarily driven by annual and seasonal changes in prevailing environmental conditions, whereas spatial variation was a less significant factor in explaining the variance in eukaryotic communities over time. Therefore, our data suggest that shifts in regional climate regime or large-scale changes in the environment are the overriding factors in shaping the coastal eukaryotic sediment ecosystems rather than small-scale changes in local environmental conditions or heterogeneity in ecosystem structure. More studies targeting temporal changes are needed to further understand the long-term trends in ecosystem stability and response to climate change. Furthermore, this work contributes to the recent efforts in developing metabarcoding applications for environmental biomonitoring, providing a comprehensive option for traditional monitoring approaches.
  • Sulo, Päivi (Helsingin yliopisto, 2019)
    Retrotransposons are genetic elements with the ability to make a copy of themselves and insert the copy into a new location in a genome. Most of the retrotransposons in the human genome are not transposition competent and the remaining copies are prevented from moving by epigenetics. However, some tumors experience abnormal retrotransposon activity resulting in many copies of retrotransposons inserted into new locations. Retrotransposons can be detected from sequenced genome data by bioinformatic tools. One of them is TraFiC, a tool designed to detect somatic retrotransposon insertions from tumor samples. In this Master’s thesis, I test TraFiC with 201 colorectal cancer tumors and one colorectal adenoma and develop tools to further analyze retrotransposon insertions. These tools are TraID, a pipeline to detect transductions, insertions with flanking sequence from source elements, and InSeqR, a pipeline to recreate the inserted sequence from known insertion sites. TraFiC detected 4744 somatic insertions and TraID detected 346 somatic transductions from the tumor samples. 80 % of the detected insertions were identified as true somatic insertions based on visual examination of a subset of the calls. 87 % of insertions detected by TraFiC and 82 % of the insertions detected by TraID had their insertion breakpoints and the sequence flanking them recreated by InSeqR. The detected insertions with their sequence form a reliable and comprehensive call set that can be used to create new knowledge of somatic retrotransposon insertions in colorectal cancer.
  • Uusitalo, Elina; Hammais, Anna; Palonen, Elina; Brandt, Annika; Makela, Ville-Veikko; Kallionpaa, Roope; Jouhilahti, Eeva-Mari; Poyhonen, Minna; Soini, Juhani; Peltonen, Juha; Peltonen, Sirkku (2014)
  • Lindell, Rony (Helsingfors universitet, 2016)
    Next-generation sequencing has evolved during the past 10 years to become the go-to method for genome-wide analysis projects. Based on parallelizable PCR methods adopted from the traditional Sanger sequencing, NGS platforms can produce massive amounts of genetic information in a single run and read an entire DNA molecule within a day. The immense amount of nucleotide sequence data produced by a single sample has brought us to an era of algorithmic optimization for analysis and figuring out parallelization schema. For cohort projects, cloud-based systems are generally used due to vast computing power requirements. Anduril is an integration and parallelization framework well suited for NGS analysis, as is shown in this study. After a brief review of the gold standard methods of NGS analysis, we describe the incorporation of the main tools into the new sequencing bundle for Anduril. Tools for alignment (BWA, Bowtie), recalibration (GATK, Picard-tools) and variant calling (GATK, Samtools, VarScan) are the main focus. The Best Practice of Broad Institute, creators of The Genome Analysis Toolkit (GATK), has been a big inspiration in the creation of our sequencing pipeline. The evolution of sequencing bundle tools into a pipeline is discussed through three separate project examples. First, a small group of 8 chronic myeloid leukemia patient samples were analysed after implementation of the main tools of the pipeline. The results were consistent with previous results, but no novel relevant mutations were found. Second, exome sequencing data from 180 breast cancers with controls available in TCGA (The Cancer Genome Atlas) were processed for use in various projects in our lab. The example showed the power of Anduril in large cohort analysis projects, enabling automatic parallelization and intelligent workflow management. Third, we analysed exome data from 330 TCGA ovarian cancers with controls and created a prototypical set of database components for creation of a database of annotated variants for use in analytical queries. Compared to other integration frameworks (e.g. GATK, Crossbow and Hadoop), Anduril is a robust contender for the programming-oriented scientist. As cloud computing is increasingly becoming a requirement in large genome-wide analysis projects, Anduril provides an effective generalizable framework for adding tools, creating pipelines and executing entire workflows on multi-nodal computing servers. As technology advances and available computational resources grow, fast multi-processor analysis can be incorporated into health care more and more for detection of disease-causing genes, polymorphisms that alter medication kinetics and cancer-driving mutations in an everyday setting.
  • Järvinen, Maija (Helsingfors universitet, 2010)
    The growing interest in sequencing with higher throughput in the last decade has led to the development of new sequencing applications. This thesis concentrates on optimizing DNA library preparation for the Illumina Genome Analyzer II sequencer. The library preparation steps that were optimized include fragmentation, PCR purification and quantification. DNA fragmentation was performed with focused sonication in different concentrations and durations. Two column-based PCR purification methods, a gel matrix method and a magnetic bead-based method were compared. Quantitative PCR and gel electrophoresis in a chip were compared for DNA quantification. The magnetic bead purification was found to be the most efficient and flexible purification method. The fragmentation protocol was changed to produce longer fragments to be compatible with longer sequencing reads. Quantitative PCR correlates better with the cluster number and should thus be considered the default quantification method for sequencing. As a result of this study, more data have been acquired from sequencing at lower cost, and troubleshooting has become easier as qualification steps have been added to the protocol. New sequencing instruments and applications will create a demand for further optimizations in future.
  • Batista, Romina; Olsson, Urban; Andermann, Tobias; Aleixo, Alexandre; Ribas, Camila Cherem; Antonelli, Alexandre (2020)
    To elucidate the relationships and spatial range evolution across the world of the bird genus Turdus (Aves), we produced a large genomic dataset comprising ca 2 million nucleotides for ca 100 samples representing 53 species, including over 2000 loci. We estimated time-calibrated maximum-likelihood and multispecies coalescent phylogenies and carried out biogeographic analyses. Our results indicate that there have been considerably fewer trans-oceanic dispersals within the genus Turdus than previously suggested, such that the Palaearctic clade did not originate in America and the African clade was not involved in the colonization of the Americas. Instead, our findings suggest that dispersal from the Western Palaearctic via the Antilles to the Neotropics might have occurred in a single event, giving rise to the rich Neotropical diversity of Turdus observed today, with no reverse dispersals to the Palaearctic or Africa. Our large multilocus dataset, combined with dense species-level sampling and analysed under probabilistic methods, brings important insights into historical biogeography and systematics, even in a scenario of fast and spatially complex diversification.
  • Tuupanen, Sari; Gall, Kimberly; Sistonen, Johanna; Saarinen, Inka; Kämpjärvi, Kati; Wells, Kirsty; Merkkiniemi, Katja; von Nandelstadh, Pernilla; Sarantaus, Laura; Kansakoski, Johanna; Martenson, Emma; Vastinsalo, Hanna; Schleit, Jennifer; Sankila, Eeva-Marja; Kere, Annakarin; Junnila, Heidi; Siivonen, Pauli; Andreevskaya, Margarita; Kytola, Ville; Muona, Mikko; Salmenpera, Pertteli; Myllykangas, Samuel; Koskenvuo, Juha; Alastalo, Tero-Pekka (2022)
    Purpose: Comprehensive genetic testing for inherited retinal dystrophy (IRD) is challenged by difficult-to-sequence genomic regions, which are often mutational hotspots, such as RPGR ORF15. The purpose of this study was to evaluate the diagnostic contribution of RPGR variants in an unselected IRD patient cohort referred for testing in a clinical diagnostic laboratory. Methods: A total of 5201 consecutive patients were analyzed with a clinically validated next-generation sequencing (NGS)-based assay, including the difficult-to-sequence RPGR ORF15 region. Copy number variant (CNV) detection from NGS data was included. Variant interpretation was performed per the American College of Medical Genetics and Genomics guidelines. Results: A confirmed molecular diagnosis in RPGR was found in 4.5% of patients, 24.0% of whom were females. Variants in ORF15 accounted for 74% of the diagnoses; 29% of the diagnostic variants were in the most difficult-to-sequence central region of ORF15 (c.2470-3230). Truncating variants made up the majority (91%) of the diagnostic variants. CNVs explained 2% of the diagnostic cases, of which 80% were one- or two-exon deletions outside of ORF15. Conclusions: Our findings indicate that high-throughput, clinically validated NGS-based testing covering the difficult-to-sequence region of ORF15, in combination with high-resolution CNV detection, can help to maximize the diagnostic yield for patients with IRD. Translational Relevance: These results demonstrate an accurate and scalable method for the detection of RPGR-related variants, including the difficult-to-sequence ORF15 hotspot, which is relevant given current and emerging therapeutic opportunities.
  • Kasurinen, Jutta; Spruit, Cindy M.; Wicklund, Anu; Pajunen, Maria I.; Skurnik, Mikael (2021)
    Bacteriophage vB_EcoM_fHy-Eco03 (fHy-Eco03 for short) was isolated from a sewage sample based on its ability to infect an Escherichia coli clinical blood culture isolate. Altogether, 32 genes encoding hypothetical proteins of unknown function (HPUFs) were identified from the genomic sequence of fHy-Eco03. The HPUFs were screened for toxic properties (toxHPUFs) with a novel, Next Generation Sequencing (NGS)-based approach. This approach identifies toxHPUF-encoding genes through comparison of gene-specific read coverages in DNA from pooled ligation mixtures before electroporation and pooled transformants after electroporation. The performance and reliability of the NGS screening assay was compared with a plating efficiency-based method, and both methods identified the fHy-Eco03 gene g05 product as toxic. While the outcomes of the two screenings were highly similar, the NGS screening assay outperformed the plating efficiency assay in both reliability and efficiency. The NGS screening assay can be used as a high throughput method in the search for new phage-inspired antimicrobial molecules.
  • MYO-SEQ Consortium; Toepf, Ana; Lähdetie, Jaana; Strang-Karlsson, Sonja; Wallgren-Pettersson, Carina (2020)
    Purpose: Several hundred genetic muscle diseases have been described, all of which are rare. Their clinical and genetic heterogeneity means that a genetic diagnosis is challenging. We established an international consortium, MYO-SEQ, to aid the work-ups of muscle disease patients and to better understand disease etiology. Methods: Exome sequencing was applied to 1001 undiagnosed patients recruited from more than 40 neuromuscular disease referral centers; standardized phenotypic information was collected for each patient. Exomes were examined for variants in 429 genes associated with muscle conditions. Results: We identified suspected pathogenic variants in 52% of patients across 87 genes. We detected 401 novel variants, 116 of which were recurrent. Variants in CAPN3, DYSF, ANO5, DMD, RYR1, TTN, COL6A2, and SGCA collectively accounted for over half of the solved cases, while variants in newer disease genes, such as BVES and POGLUT1, were also found. The remaining well-characterized unsolved patients (48%) need further investigation. Conclusion: Using our unique infrastructure, we developed a pathway to expedite muscle disease diagnoses. Our data suggest that exome sequencing should be used for pathogenic variant detection in patients with suspected genetic muscle diseases, focusing first on the most common disease genes described here, and subsequently on rarer and newly characterized disease genes.
  • Fitak, Robert R.; Mohandesan, Elmira; Corander, Jukka; Burger, Pamela A. (2016)
    The single-humped dromedary (Camelus dromedarius) is the most numerous and widespread of domestic camel species and is a significant source of meat, milk, wool, transportation and sport for millions of people. Dromedaries are particularly well adapted to hot, desert conditions and harbour a variety of biological and physiological characteristics with evolutionary, economic and medical importance. To understand the genetic basis of these traits, an extensive resource of genomic variation is required. In this study, we assembled, at 65× coverage, a 2.06 Gb draft genome of a female dromedary whose ancestry can be traced to an isolated population from the Canary Islands. We annotated 21 167 protein-coding genes and estimated ~33.7% of the genome to be repetitive. A comparison with the recently published draft genome of an Arabian dromedary resulted in 1.91 Gb of aligned sequence with a divergence of 0.095%. An evaluation of our genome with the reference revealed that our assembly contains more error-free bases (91.2%) and fewer scaffolding errors. We identified ~1.4 million single-nucleotide polymorphisms with a mean density of 0.71 × 10^-3 per base. An analysis of demographic history indicated that changes in effective population size corresponded with recent glacial epochs. Our de novo assembly provides a useful resource of genomic variation for future studies of the camel's adaptations to arid environments and economically important traits. Furthermore, these results suggest that draft genome assemblies constructed with only two differently sized sequencing libraries can be comparable to those sequenced using additional library sizes, highlighting that additional resources might be better placed in technologies alternative to short-read sequencing to physically anchor scaffolds to genome maps.
  • Avela, Kristiina; Salonen-Kajander, Riitta; Laitinen, Arja; Ramsden, Simon; Barton, Stephanie; Rudanko, Sirkka-Liisa (2019)
    Purpose: To study the genetic aetiology and phenotypes of retinal degeneration (RD) in Finnish children born during 1993-2009. Methods: Children with retinal degeneration (N = 68) were investigated during 2012-2014 with a targeted gene analysis or a next-generation sequencing (NGS) based gene panel. Also, a full clinical ophthalmological examination was performed. Results: The cohort covered 44% (68/153) of the Finnish children with inherited RD born 1993-2009. X-linked retinoschisis, retinitis pigmentosa, Leber congenital amaurosis and cone-rod dystrophy were the most common clinical diagnoses in the study group. Pathogenic mutations were found in 17 retinal genes. The molecular genetic aetiology was identified in 77% of the patients (in 77% of the families) analysed by the NGS method. Several founder mutations were detected, including three novel founder mutations: c.148delG in TULP1, c.2314C>R (p.Gln772Ter) in RPGRIP1 and c.533G>A (p.Trp178Ter) in TYR. We also confirmed the previous tentative finding of c.2944+1delG in GUCY2D being the most frequent cause of Leber congenital amaurosis (LCA) in Finland. Conclusions: Globally, RD is genetically heterogeneous, with over 260 disease genes reported so far. This was shown not to be the case in Finland, where RD is caused by a small group of genes, due to several founder mutations that are enriched in the population. We found that X-chromosomal retinoschisis constitutes the major group in the Finnish paediatric RD population and is almost exclusively caused by two founder mutations. All in all, the genetic aetiology of 77% of families was identified, which is higher than previously reported from other populations, likely due to the specific genomic constitution of the Finns.
  • Lonardi, Stefano; Muñoz-Amatriaín, María; Liang, Qihua; Shu, Shengqiang; Wanamaker, Steve I.; Lo, Sassoum; Tanskanen, Jaakko; Schulman, Alan H.; Zhu, Tingting; Luo, Ming-Cheng; Alhakami, Hind; Ounit, Rachid; Hasan, Abid Md.; Verdier, Jerome; Roberts, Philip A.; Santos, Jansen R.P.; Ndeve, Arsenio; Doležel, Jaroslav; Vrána, Jan; Hokin, Samuel A.; Farmer, Andrew D.; Cannon, Steven B.; Close, Timothy J. (2019)
    Cowpea (Vigna unguiculata [L.] Walp.) is a major crop for worldwide food and nutritional security, especially in sub-Saharan Africa, that is resilient to hot and drought-prone environments. An assembly of the single-haplotype inbred genome of cowpea IT97K-499-35 was developed by exploiting the synergies between single-molecule real-time sequencing, optical and genetic mapping, and an assembly reconciliation algorithm. A total of 519 Mb is included in the assembled sequences. Nearly half of the assembled sequence is composed of repetitive elements, which are enriched within recombination-poor pericentromeric regions. A comparative analysis of these elements suggests that genome size differences between Vigna species are mainly attributable to changes in the amount of Gypsy retrotransposons. Conversely, genes are more abundant in more distal, high-recombination regions of the chromosomes; there appears to be more duplication of genes within the NBS-LRR and the SAUR-like auxin superfamilies compared with other warm-season legumes that have been sequenced. A surprising outcome is the identification of an inversion of 4.2 Mb among landraces and cultivars, which includes a gene that has been associated in other plants with interactions with the parasitic weed Striga gesnerioides. The genome sequence facilitated the identification of a putative syntelog for multiple organ gigantism in legumes. A revised numbering system has been adopted for cowpea chromosomes based on synteny with common bean (Phaseolus vulgaris). An estimate of nuclear genome size of 640.6 Mbp based on cytometry is presented.