Browsing by Subject "SEQUENCE"

  • Duggan, Ana T.; Perdomo, Maria F.; Piombino-Mascali, Dario; Marciniak, Stephanie; Poinar, Debi; Emery, Matthew V.; Buchmann, Jan P.; Duchene, Sebastian; Jankauskas, Rimantas; Humphreys, Margaret; Golding, G. Brian; Southon, John; Devault, Alison; Rouillard, Jean-Marie; Sahl, Jason W.; Dutour, Olivier; Hedman, Klaus; Sajantila, Antti; Smith, Geoffrey L.; Holmes, Edward C.; Poinar, Hendrik N. (2016)
    Smallpox holds a unique position in the history of medicine. It was the first disease for which a vaccine was developed and remains the only human disease eradicated by vaccination. Although there have been claims of smallpox in Egypt, India, and China dating back millennia [1-4], the timescale of emergence of the causative agent, variola virus (VARV), and how it evolved in the context of increasingly widespread immunization, have proven controversial [4-9]. In particular, some molecular-clock-based studies have suggested that key events in VARV evolution only occurred during the last two centuries [4-6] and hence in apparent conflict with anecdotal historical reports, although it is difficult to distinguish smallpox from other pustular rashes by description alone. To address these issues, we captured, sequenced, and reconstructed a draft genome of an ancient strain of VARV, sampled from a Lithuanian child mummy dating between 1643 and 1665 and close to the time of several documented European epidemics [1, 2, 10]. When compared to vaccinia virus, this archival strain contained the same pattern of gene degradation as 20th century VARVs, indicating that such loss of gene function had occurred before ca. 1650. Strikingly, the mummy sequence fell basal to all currently sequenced strains of VARV on phylogenetic trees. Molecular-clock analyses revealed a strong clock-like structure and that the timescale of smallpox evolution is more recent than often supposed, with the diversification of major viral lineages only occurring within the 18th and 19th centuries, concomitant with the development of modern vaccination.
  • Kjaerbolling, Inge; Vesth, Tammi; Frisvad, Jens C.; Nybo, Jane L.; Theobald, Sebastian; Kildgaard, Sara; Petersen, Thomas Isbrandt; Kuo, Alan; Sato, Atsushi; Lyhne, Ellen K.; Kogle, Martin E.; Wiebenga, Ad; Kun, Roland S.; Lubbers, Ronnie J. M.; Makela, Miia R.; Barry, Kerrie; Chovatia, Mansi; Clum, Alicia; Daum, Chris; Haridas, Sajeet; He, Guifen; LaButti, Kurt; Lipzen, Anna; Mondo, Stephen; Pangilinan, Jasmyn; Riley, Robert; Salamov, Asaf; Simmons, Blake A.; Magnuson, Jon K.; Henrissat, Bernard; Mortensen, Uffe H.; Larsen, Thomas O.; de Vries, Ronald P.; Grigoriev, Igor V.; Machida, Masayuki; Baker, Scott E.; Andersen, Mikael R. (2020)
    Section Flavi encompasses both harmful and beneficial Aspergillus species, such as Aspergillus oryzae, used in food fermentation and enzyme production, and Aspergillus flavus, food spoiler and mycotoxin producer. Here, we sequence 19 genomes spanning section Flavi and compare 31 fungal genomes including 23 Flavi species. We reassess their phylogenetic relationships and show that the closest relative of A. oryzae is not A. flavus, but A. minisclerotigenes or A. aflatoxiformans and identify high genome diversity, especially in sub-telomeric regions. We predict abundant CAZymes (598 per species) and prolific secondary metabolite gene clusters (73 per species) in section Flavi. However, the observed phenotypes (growth characteristics, polysaccharide degradation) do not necessarily correlate with inferences made from the predicted CAZyme content. Our work, including genomic analyses, phenotypic assays, and identification of secondary metabolites, highlights the genetic and metabolic diversity within section Flavi.
  • Clark, Christine; Palta, Priit; Joyce, Christopher J.; Scott, Carol; Grundberg, Elin; Deloukas, Panos; Palotie, Aarno; Coffey, Alison J. (2012)
  • Horesh, Gal; Blackwell, Grace A.; Tonkin-Hill, Gerry; Corander, Jukka; Heinz, Eva; Thomson, Nicholas R. (2021)
    Escherichia coli is a highly diverse organism that includes a range of commensal and pathogenic variants found across a range of niches and worldwide. In addition to causing severe intestinal and extraintestinal disease, E. coli is considered a priority pathogen due to high levels of observed drug resistance. The diversity in the E. coli population is driven by high genome plasticity and a very large gene pool. All these have made E. coli one of the most well-studied organisms, as well as a commonly used laboratory strain. Today, there are thousands of sequenced E. coli genomes stored in public databases. While data is widely available, accessing the information in order to perform analyses can still be a challenge. Collecting relevant available data requires accessing different sources, where data may be stored in a range of formats, and often requires further manipulation and processing to apply various analyses and extract useful information. In this study, we collated and intensely curated a collection of over 10 000 E. coli and Shigella genomes to provide a single, uniform, high-quality dataset. Shigella were included as they are considered specialized pathovars of E. coli. We provide these data in a number of easily accessible formats that can be used as the foundation for future studies addressing the biological differences between E. coli lineages and the distribution and flow of genes in the E. coli population at a high resolution. The analysis we present emphasizes our lack of understanding of the true diversity of the E. coli species, and the biased nature of our current understanding of the genetic diversity of such a key pathogen.
  • Purps, Josephine; Siegert, Sabine; Willuweit, Sascha; Nagy, Marion; Alves, Cintia; Salazar, Renato; Angustia, Sheila M. T.; Santos, Lorna H.; Anslinger, Katja; Bayer, Birgit; Ayub, Qasim; Wei, Wei; Xue, Yali; Tyler-Smith, Chris; Bafalluy, Miriam Baeta; Martinez-Jarreta, Begona; Egyed, Balazs; Balitzki, Beate; Tschumi, Sibylle; Ballard, David; Court, Denise Syndercombe; Barrantes, Xinia; Bassler, Gerhard; Wiest, Tina; Berger, Burkhard; Niederstaetter, Harald; Parson, Walther; Davis, Carey; Budowle, Bruce; Burri, Helen; Borer, Urs; Koller, Christoph; Carvalho, Elizeu F.; Domingues, Patricia M.; Chamoun, Wafaa Takash; Coble, Michael D.; Hill, Carolyn R.; Corach, Daniel; Caputo, Mariela; D'Amato, Maria E.; Davison, Sean; Decorte, Ronny; Larmuseau, Maarten H. D.; Ottoni, Claudio; Rickards, Olga; Lu, Di; Jiang, Chengtao; Dobosz, Tadeusz; Jonkisz, Anna; Frank, William E.; Furac, Ivana; Gehrig, Christian; Castella, Vincent; Grskovic, Branka; Haas, Cordula; Wobst, Jana; Hadzic, Gavrilo; Drobnic, Katja; Honda, Katsuya; Hou, Yiping; Zhou, Di; Li, Yan; Hu, Shengping; Chen, Shenglan; Immel, Uta-Dorothee; Lessig, Rudiger; Jakovski, Zlatko; Ilievska, Tanja; Klann, Anja E.; Garcia, Cristina Cano; de Knijff, Peter; Kraaijenbrink, Thirsa; Kondili, Aikaterini; Miniati, Penelope; Vouropoulou, Maria; Kovacevic, Lejla; Marjanovic, Damir; Lindner, Iris; Mansour, Issam; Al-Azem, Mouayyad; El Andari, Ansar; Marino, Miguel; Furfuro, Sandra; Locarno, Laura; Martin, Pablo; Luque, Gracia M.; Alonso, Antonio; Miranda, Luis Souto; Moreira, Helena; Mizuno, Natsuko; Iwashima, Yasuki; Moura Neto, Rodrigo S.; Nogueira, Tatiana L. S.; Silva, Rosane; Nastainczyk-Wulf, Marina; Edelmann, Jeanett; Kohl, Michael; Nie, Shengjie; Wang, Xianping; Cheng, Baowen; Nunez, Carolina; Martinez de Pancorbo, Marian; Olofsson, Jill K.; Morling, Niels; Onofri, Valerio; Tagliabracci, Adriano; Pamjav, Horolma; Volgyi, Antonia; Barany, Gusztav; Pawlowski, Ryszard; Maciejewska, Agnieszka; Pelotti, Susi; Pepinski, Witold; Abreu-Glowacka, Monica; Phillips, Christopher; Cardenas, Jorge; Rey-Gonzalez, Danel; Salas, Antonio; Brisighelli, Francesca; Capelli, Cristian; Toscanini, Ulises; Piccinini, Andrea; Piglionica, Marilidia; Baldassarra, Stefania L.; Ploski, Rafal; Konarzewska, Magdalena; Jastrzebska, Emila; Robino, Carlo; Sajantila, Antti; Palo, Jukka U.; Guevara, Evelyn; Salvador, Jazelyn; Corazon De Ungria, Maria; Russell Rodriguez, Jae Joseph; Schmidt, Ulrike; Schlauderer, Nicola; Saukko, Pekka; Schneider, Peter M.; Sirker, Miriam; Shin, Kyoung-Jin; Oh, Yu Na; Skitsa, Iulia; Ampati, Alexandra; Smith, Tobi-Gail; de Calvit, Lina Solis; Stenzl, Vlastimil; Capal, Thomas; Tillmar, Andreas; Nilsson, Helena; Turrina, Stefania; De Leo, Domenico; Verzeletti, Andrea; Cortellini, Venusia; Wetton, Jon H.; Gwynne, Gareth M.; Jobling, Mark A.; Whittle, Martin R.; Sumita, Denilce R.; Wolanska-Nowak, Paulina; Yong, Rita Y. Y.; Krawczak, Michael; Nothnagel, Michael; Roewer, Lutz (2014)
    In a worldwide collaborative effort, 19,630 Y-chromosomes were sampled from 129 different populations in 51 countries. These chromosomes were typed for 23 short-tandem repeat (STR) loci (DYS19, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS385ab, DYS437, DYS438, DYS439, DYS448, DYS456, DYS458, DYS635, GATAH4, DYS481, DYS533, DYS549, DYS570, DYS576, and DYS643) using the PowerPlex Y23 System (PPY23, Promega Corporation, Madison, WI). Locus-specific allelic spectra of these markers were determined and a consistently high level of allelic diversity was observed. A considerable number of null, duplicate and off-ladder alleles were revealed. Standard single-locus and haplotype-based parameters were calculated and compared between subsets of Y-STR markers established for forensic casework. The PPY23 marker set provides substantially stronger discriminatory power than other available kits but at the same time reveals the same general patterns of population structure as other marker sets. A strong correlation was observed between the number of Y-STRs included in a marker set and some of the forensic parameters under study. Interestingly, a weak but consistent trend toward smaller genetic distances resulting from larger numbers of markers became apparent.
  • Varadharajan, Srinidhi; Rastas, Pasi; Löytynoja, Ari; Matschiner, Michael; Calboli, Federico C. F.; Guo, Baocheng; Nederbragt, Alexander J.; Jakobsen, Kjetill S.; Merilä, Juha (2019)
    The Gasterosteidae fish family hosts several species that are important models for eco-evolutionary, genetic, and genomic research. In particular, a wealth of genetic and genomic data has been generated for the three-spined stickleback (Gasterosteus aculeatus), the "ecology's supermodel," whereas the genomic resources for the nine-spined stickleback (Pungitius pungitius) have remained relatively scarce. Here, we report a high-quality chromosome-level genome assembly of P. pungitius consisting of 5,303 contigs (N50 = 1.2Mbp) with a total size of 521 Mbp. These contigs were mapped to 21 linkage groups using a high-density linkage map, yielding a final assembly with 98.5% BUSCO completeness. A total of 25,062 protein-coding genes were annotated, and about 23% of the assembly was found to consist of repetitive elements. A comprehensive analysis of repetitive elements uncovered centromere-specific tandem repeats and provided insights into the evolution of retrotransposons. A multigene phylogenetic analysis inferred a divergence time of about 26 million years ago (Ma) between nine- and three-spined sticklebacks, which is far older than the commonly assumed estimate of 13 Ma. Compared with the three-spined stickleback, we identified an additional duplication of several genes in the hemoglobin cluster. Sequencing data from populations adapted to different environments indicated potential copy number variations in hemoglobin genes. Furthermore, genome-wide synteny comparisons between three- and nine-spined sticklebacks identified chromosomal rearrangements underlying the karyotypic differences between the two species. The high-quality chromosome-scale assembly of the nine-spined stickleback genome obtained with long-read sequencing technology provides a crucial resource for comparative and population genomic investigations of stickleback fishes and teleosts.
  • Pratas, Diogo; Toppinen, Mari; Pyöriä, Lari; Hedman, Klaus; Sajantila, Antti; Perdomo, Maria F. (2020)
    Background: Advances in sequencing technologies have enabled the characterization of multiple microbial and host genomes, opening new frontiers of knowledge while kindling novel applications and research perspectives. Among these is the investigation of the viral communities residing in the human body and their impact on health and disease. To this end, the study of samples from multiple tissues is critical, yet, the complexity of such analysis calls for a dedicated pipeline. We provide an automatic and efficient pipeline for identification, assembly, and analysis of viral genomes that combines the DNA sequence data from multiple organs. TRACESPipe relies on cooperation among 3 modalities: compression-based prediction, sequence alignment, and de novo assembly. The pipeline is ultra-fast and provides, additionally, secure transmission and storage of sensitive data. Findings: TRACESPipe performed outstandingly when tested on synthetic and ex vivo datasets, identifying and reconstructing all the viral genomes, including those with high levels of single-nucleotide polymorphisms. It also detected minimal levels of genomic variation between different organs. Conclusions: TRACESPipe's unique ability to simultaneously process and analyze samples from different sources enables the evaluation of within-host variability. This opens up the possibility to investigate viral tissue tropism, evolution, fitness, and disease associations. Moreover, additional features such as DNA damage estimation and mitochondrial DNA reconstruction and analysis, as well as exogenous-source controls, expand the utility of this pipeline to other fields such as forensics and ancient DNA studies. TRACESPipe is released under GPLv3 and is available for free download at
  • Li, Sai; Rissanen, Ilona; Zeltina, Antra; Hepojoki, Jussi; Raghwani, Jayna; Harlos, Karl; Pybus, Oliver G.; Huiskonen, Juha T.; Bowden, Thomas A. (2016)
    Hantaviruses, a geographically diverse group of zoonotic pathogens, initiate cell infection through the concerted action of Gn and Gc viral surface glycoproteins. Here, we describe the high-resolution crystal structure of the antigenic ectodomain of Gn from Puumala hantavirus (PUUV), a causative agent of hemorrhagic fever with renal syndrome. Fitting of PUUV Gn into an electron cryomicroscopy reconstruction of intact Gn-Gc spike complexes from the closely related but non-pathogenic Tula hantavirus localized Gn tetramers to the membrane-distal surface of the virion. The accuracy of the fitting was corroborated by epitope mapping and genetic analysis of available PUUV sequences. Interestingly, Gn exhibits greater non-synonymous sequence diversity than the less accessible Gc, supporting a role of the host humoral immune response in exerting selective pressure on the virus surface. The fold of PUUV Gn is likely to be widely conserved across hantaviruses.
  • Valori, Miko; Jansson, Lilja; Kiviharju, Anna; Ellonen, Pekka; Rajala, Hanna; Awad, Shady; Mustjoki, Satu; Tienari, Pentti J. (2017)
    Somatic mutations have a central role in cancer but their role in other diseases such as autoimmune disorders is poorly understood. Earlier work has provided indirect evidence of rare somatic mutations in autoreactive T-lymphocytes in multiple sclerosis (MS) patients but such mutations have not been identified thus far. We analysed somatic mutations in blood in 16 patients with relapsing MS and 4 with other neurological autoimmune disease. To facilitate the detection of somatic mutations, CD4+, CD8+, CD19+ and CD4-/CD8-/CD19- cell subpopulations were separated. We performed next-generation DNA sequencing targeting 986 immune related genes. Somatic mutations were called by comparing the sequence data of each cell subpopulation to other subpopulations of the same patient and validated by amplicon sequencing. We found non-synonymous somatic mutations in 12 (60%) patients (10 MS, 1 myasthenia gravis, 1 narcolepsy). There were 27 mutations, all different and mostly novel (67%). They were discovered at subpopulation-wise allelic fractions of 0.2%-4.6% (median 0.95%). Multiple mutations were found in 8 patients. The mutations were enriched in CD8+ cells (85% of mutations). In follow-up after a median time of 2.3 years, 96% of the mutations were still detectable. These results unravel a novel class of persistent somatic mutations, many of which were in genes that may play a role in autoimmunity (ATM, BTK, CD46, CD180, CLIP2, HMMR, IKEF3, ITGB3, KIR3DL2, MAPK10, CD56/NCAM1, RBM6, RORA, RPM and STAT3). Whether some of this class of mutations plays a role in disease is currently unclear, but these results define an interesting hitherto unknown research target for future studies. (C) 2016 The Authors. Published by Elsevier Inc.
  • Acosta, Nidia Obscura; Mäkinen, Veli; Tomescu, Alexandru I. (2018)
    Background: Reconstructing the genome of a species from short fragments is one of the oldest bioinformatics problems. Metagenomic assembly is a variant of the problem asking to reconstruct the circular genomes of all bacterial species present in a sequencing sample. This problem can be naturally formulated as finding a collection of circular walks of a directed graph G that together cover all nodes, or edges, of G. Approach: We address this problem with the "safe and complete" framework of Tomescu and Medvedev (Research in computational Molecular biology-20th annual conference, RECOMB 9649: 152-163, 2016). An algorithm is called safe if it returns only those walks (also called safe) that appear as subwalk in all metagenomic assembly solutions for G. A safe algorithm is called complete if it returns all safe walks of G. Results: We give graph-theoretic characterizations of the safe walks of G, and a safe and complete algorithm finding all safe walks of G. In the node-covering case, our algorithm runs in time O(m^2 + n^3), and in the edge-covering case it runs in time O(m^2 n); n and m denote the number of nodes and edges, respectively, of G. This algorithm constitutes the first theoretical tight upper bound on what can be safely assembled from metagenomic reads using this problem formulation.
  • Mukherjee, Kingshuk; Alipanahi, Bahar; Kahveci, Tamer; Salmela, Leena; Boucher, Christina (2019)
    Motivation: Optical maps are high-resolution restriction maps (Rmaps) that give a unique numeric representation to a genome. Used in concert with sequence reads, they provide a useful tool for genome assembly and for discovering structural variations and rearrangements. Although they have been a regular feature of modern genome assembly projects, optical maps have been mainly used in a post-processing step and not in the genome assembly process itself. Several methods have been proposed for pairwise alignment of single molecule optical maps, called Rmaps, or for aligning optical maps to assembled reads. However, the problem of aligning an Rmap to a graph representing the sequence data of the same genome has not been studied before. Such an alignment provides a mapping between two sets of data, optical maps and sequence data, which will facilitate the usage of optical maps in the sequence assembly step itself. Results: We define the problem of aligning an Rmap to a de Bruijn graph and present the first algorithm for solving this problem, which is based on a seed-and-extend approach. We demonstrate that our method is capable of aligning 73% of Rmaps generated from the Escherichia coli genome to the de Bruijn graph constructed from short reads generated from the same genome. We validate the alignments and show that our method achieves an accuracy of 99.6%. We also show that our method scales to larger genomes. In particular, we show that 76% of Rmaps can be aligned to the de Bruijn graph in the case of human data.
  • Kant, Ravi; Palva, Airi; von Ossowski, Ingemar (2017)
    As an ecological niche, the mammalian intestine provides the ideal habitat for a variety of bacterial microorganisms. Purportedly, some commensal genera and species offer a beneficial mix of metabolic, protective, and structural processes that help sustain the natural digestive health of the host. Among these gut inhabitants is the Gram-positive lactic acid bacterium Lactobacillus ruminis, a strict anaerobe with both pili and flagella on its cell surface, but also known for being autochthonous (indigenous) to the intestinal environment. Given that the molecular basis of gut autochthony for this species is largely unexplored and unknown, we undertook a study at the genome level to pinpoint some of the adaptive traits behind its colonization behavior. In our pan-genomic probe of L. ruminis, the genomes of nine different strains isolated from human, bovine, porcine, and equine host guts were compiled and compared for in silico analysis. For this, we conducted a geno-phenotypic assessment of protein-coding genes, with an emphasis on those products involved with cell-surface morphology and anaerobic fermentation and respiration. We also categorized and examined the core and accessory genes that define the L. ruminis species and its strains. Here, we made an attempt to identify those genes having ecologically relevant phenotypes that might support or bring about intestinal indigenousness.
  • Cairo, Massimo; Medvedev, Paul; Acosta, Nidia Obscura; Rizzi, Romeo; Tomescu, Alexandru I. (2019)
    In this article, we consider the following problem. Given a directed graph G, output all walks of G that are sub-walks of all closed edge-covering walks of G. This problem was first considered by Tomescu and Medvedev (RECOMB 2016), who characterized these walks through the notion of omnitig. Omnitigs were shown to be relevant for the genome assembly problem from bioinformatics, where a genome sequence must be assembled from a set of reads from a sequencing experiment. Tomescu and Medvedev (RECOMB 2016) also proposed an algorithm for listing all maximal omnitigs, by launching an exhaustive visit from every edge. In this article, we prove new insights about the structure of omnitigs and solve several open questions about them. We combine these to achieve an O(nm)-time algorithm for outputting all the maximal omnitigs of a graph (with n nodes and m edges). This is also optimal, as we show families of graphs whose total omnitig length is Omega(nm). We implement this algorithm and show that it is 9-12 times faster in practice than the one of Tomescu and Medvedev (RECOMB 2016).
  • Lamnidis, Thiseas C.; Majander, Kerttu; Jeong, Choongwon; Salmela, Elina; Wessman, Anna; Moiseyev, Vyacheslav; Khartanovich, Valery; Balanovsky, Oleg; Ongyerth, Matthias; Weihmann, Antje; Sajantila, Antti; Kelso, Janet; Pääbo, Svante; Onkamo, Päivi; Haak, Wolfgang; Krause, Johannes; Schiffels, Stephan (2018)
    European population history has been shaped by migrations of people, and their subsequent admixture. Recently, ancient DNA has brought new insights into European migration events linked to the advent of agriculture, and possibly to the spread of Indo-European languages. However, little is known about the ancient population history of north-eastern Europe, in particular about populations speaking Uralic languages, such as Finns and Saami. Here we analyse ancient genomic data from 11 individuals from Finland and north-western Russia. We show that the genetic makeup of northern Europe was shaped by migrations from Siberia that began at least 3500 years ago. This Siberian ancestry was subsequently admixed into many modern populations in the region, particularly into populations speaking Uralic languages today. Additionally, we show that ancestors of modern Saami inhabited a larger territory during the Iron Age, which adds to the historical and linguistic information about the population history of Finland.
  • Wysok, Beata; Wojtacka, Joanna; Hänninen, Marja-Liisa; Kivistö, Rauni (2020)
    Campylobacteriosis is one of the most common causes of bacterial gastroenteritis. However, the clinical course of the illness varies in symptoms and severity. The aim of this study was to characterize Campylobacter jejuni (34 isolates) and C. coli (9 isolates) from persons with diarrheal and non-diarrheal stools at the time of examination and fecal sampling, in Poland by using whole-genome sequencing (WGS). Multilocus sequence typing (MLST) analysis revealed a high diversity with a total of 20 sequence types (STs) among 26 Campylobacter isolates from diarrheic and 13 STs among 17 isolates from non-diarrheic persons. ST-50 and ST-257 were most common in both groups. The phenotypic resistance rate was 74.4% for ciprofloxacin, 67.4% for sulfamethoxazole/trimethoprim, 58.1% for amoxicillin, 48.8% for tetracycline, and 46.5% for ceftriaxone. Only single isolates were resistant to erythromycin, gentamicin, and amoxicillin/clavulanic acid. Overall genotypic resistance toward amoxicillin, fluoroquinolones, tetracyclines, and aminoglycosides was predicted to occur in 93.1, 67.4, 48.8, and 11.6% of the isolates, respectively. None of the isolates showed the presence of the erm(B) gene or mutation in 23S rRNA. Neither was variation found in the important target region in L4 and L22 ribosomal proteins. In regard to the CmeABC efflux pump, a set of variable mutations affecting the regulatory region was noted. All Campylobacter isolates possessed genes associated with adhesion (cadF, jlpA, porA, and pebA) and invasion (ciaB, pldA, and flaC). The type IV secretion system (T4SS) was found in isolates from both diarrheic (15.4%, CI 95%: 6.1-33.5%) and non-diarrheic (23.5%, CI 95%: 9.6-47.3%) persons. The rates of the presence of cytolethal distending toxin cdtABC gene cluster and type VI secretion system (T6SS) were higher in Campylobacter isolates obtained from persons with diarrhea (96.2%, CI 95%: 81.7-99.3% and 26.9%, CI 95%: 13.7-46.1%) compared to isolates from non-diarrheic persons (76.5%, CI 95%: 52.7-90.4% and 11.8%, CI 95%: 3.3-34.3%). The lack of statistically significant differences between two groups in tested virulence factors suggests that individual susceptibility of the host might play more determining role in the disease outcome than characteristics of the infecting strain.
  • Cervantes, Sandra; Vuosku, Jaana; Pyhajarvi, Tanja (2021)
    Despite the ecological and economic importance of conifers, their genomic resources are limited, mainly due to the large size and complexity of their genomes. Additionally, the available genomic resources lack complete structural and functional annotation. Transcriptomic resources have been commonly used to compensate for these deficiencies, though for most conifer species they are limited to a small number of tissues, or capture only a fraction of the genes present in the genome. Here we provide an atlas of gene expression patterns for the conifer Pinus sylvestris across five tissues: embryo, megagametophyte, needle, phloem and vegetative bud. We used a wide range of tissues and focused our analyses on the expression profiles of genes at tissue level. We provide comprehensive information on the per-tissue normalized expression level, indication of tissue preferential upregulation and tissue-specificity of expression. We identified a total of 48,001 tissue preferentially upregulated and tissue specifically expressed genes, of which 28% have annotation in the Swiss-Prot database. Even though most of the putative genes identified do not have functional information in current biological databases, the tissue-specific patterns discovered provide valuable information about their potential functions for further studies, for example in the areas of plant physiology, population genetics and genomics in general. As we provide information on tissue specificity at both diploid and haploid life stages, our data will also contribute to the understanding of evolutionary rates of different tissue types and ploidy levels.
  • Kivikoski, Mikko; Rastas, Pasi; Löytynoja, Ari; Merila, Juha (2021)
    We describe an integrative approach to improve contiguity and haploidy of a reference genome assembly and demonstrate its impact with practical examples. With two novel features of Lep-Anchor software and a combination of dense linkage maps, overlap detection and bridging long reads, we generated an improved assembly of the nine-spined stickleback (Pungitius pungitius) reference genome. We were able to remove a significant number of haplotypic contigs, detect more genetic variation and improve the contiguity of the genome, especially that of the X chromosome. However, improved scaffolding cannot correct for mosaicism of erroneously assembled contigs, demonstrated by a de novo assembly of a 1.6-Mbp inversion. Qualitatively similar gains were obtained with the genome of three-spined stickleback (Gasterosteus aculeatus). Since the utility of genome-wide sequencing data in biological research depends heavily on the quality of the reference genome, the improved and fully automated approach described here should be helpful in refining reference genome assemblies.
  • Hayat, Amir; Hussain, Shabir; Bilal, Muhammad; Kausar, Mehran; Almuzzaini, Bader; Abbas, Safdar; Tanveer, Adeena; Khan, Amjad; Siddiqi, Saima; Foo, Jia Nee; Ahmad, Farooq; Khan, Feroz; Khan, Bushra; Anees, Mariam; Mäkitie, Outi; Alfadhel, Majid; Ahmad, Wasim; Umair, Muhammad (2020)
  • Kankainen, Matti; Ojala, Teija; Holm, Liisa (2012)
  • Savriama, Yoland; Valtonen, Mia; Kammonen, Juhana I.; Rastas, Pasi; Smolander, Olli-Pekka; Lyyski, Annina; Häkkinen, Teemu J.; Corfe, Ian J.; Gerber, Sylvain; Salazar-Ciudad, Isaac; Paulin, Lars; Holm, Liisa; Löytynoja, Ari; Auvinen, Petri; Jernvall, Jukka (2018)
    An increasing number of mammalian species have been shown to have a history of hybridization and introgression based on genetic analyses. Only relatively few fossils, however, preserve genetic material, and morphology must be used to identify the species and determine whether morphologically intermediate fossils could represent hybrids. Because dental and cranial fossils are typically the key body parts studied in mammalian palaeontology, here we bracket the potential for phenotypically extreme hybridizations by examining uniquely preserved cranio-dental material of a captive hybrid between grey and ringed seals. We analysed how distinct these species are genetically and morphologically, how easy it is to identify the hybrids using morphology and whether comparable hybridizations happen in the wild. We show that the genetic distance between these species is more than twice the modern human–Neanderthal distance, but still within that of morphologically similar species pairs known to hybridize. By contrast, morphological and developmental analyses show grey and ringed seals to be highly disparate, and that the hybrid is a predictable intermediate. Genetic analyses of the parent populations reveal introgression in the wild, suggesting that grey–ringed seal hybridization is not limited to captivity. Taken together, we postulate that there is considerable potential for mammalian hybridization between phenotypically disparate taxa.
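
Note on a metric used in this listing: the Varadharajan et al. (2019) entry reports assembly contiguity as a contig N50 (N50 = 1.2 Mbp). As a reading aid, the sketch below shows the usual definition of N50: the length L such that contigs of length L or longer together make up at least half of the total assembly. This is a minimal illustration, not the authors' code; the function name `n50` and the toy contig lengths are hypothetical and do not come from any of the listed papers.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length of the contig at which the running total,
    taken from the longest contig downward, first reaches half
    of the summed length of all contigs.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contig first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (arbitrary units), for illustration only.
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_contigs))
```

Because N50 depends only on the length distribution, it says nothing about whether contigs are assembled correctly, which is why the abstracts above also cite measures such as BUSCO completeness and linkage-map anchoring.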