Browsing by Subject "READ ALIGNMENT"

Now showing items 1-20 of 41
  • Arredondo-Alonso, Sergio; Pöntinen, Anna K.; Cleon, Francois; Gladstone, Rebecca A.; Schurch, Anita C.; Johnsen, Pal J.; Samuelsen, Orjan; Corander, Jukka (2021)
    Background: Bacterial whole-genome sequencing based on short-read technologies often results in a draft assembly formed by contiguous sequences. The introduction of long-read sequencing technologies permits those contiguous sequences to be unambiguously bridged into complete genomes. However, the elevated costs associated with long-read sequencing frequently limit the number of bacterial isolates that can be long-read sequenced. Here we evaluated the recently released 96 barcoding kit from Oxford Nanopore Technologies (ONT) to generate complete genomes on a high-throughput basis. In addition, we propose an isolate selection strategy that optimizes a representative selection of isolates for long-read sequencing considering as input large-scale bacterial collections. Results: Despite an uneven distribution of long reads per barcode, near-complete chromosomal sequences (assembly contiguity = 0.89) were generated for 96 Escherichia coli isolates with associated short-read sequencing data. The assembly contiguity of the plasmid replicons was even higher (0.98), which indicated the suitability of the multiplexing strategy for studies focused on resolving plasmid sequences. We benchmarked hybrid and ONT-only assemblies and showed that the combination of ONT sequencing data with short-read sequencing data is still highly desirable (i) to perform an unbiased selection of isolates for long-read sequencing, (ii) to achieve an optimal genome accuracy and completeness, and (iii) to include small plasmids underrepresented in the ONT library. Conclusions: The proposed long-read isolate selection ensures the completion of bacterial genomes that span the genome diversity inherent in large collections of bacterial isolates. We show the potential of using this multiplexing approach to close bacterial genomes on a high-throughput basis.
  • Pratas, Diogo; Toppinen, Mari; Pyöriä, Lari; Hedman, Klaus; Sajantila, Antti; Perdomo, Maria F. (2020)
    Background: Advances in sequencing technologies have enabled the characterization of multiple microbial and host genomes, opening new frontiers of knowledge while kindling novel applications and research perspectives. Among these is the investigation of the viral communities residing in the human body and their impact on health and disease. To this end, the study of samples from multiple tissues is critical, yet, the complexity of such analysis calls for a dedicated pipeline. We provide an automatic and efficient pipeline for identification, assembly, and analysis of viral genomes that combines the DNA sequence data from multiple organs. TRACESPipe relies on cooperation among 3 modalities: compression-based prediction, sequence alignment, and de novo assembly. The pipeline is ultra-fast and provides, additionally, secure transmission and storage of sensitive data. Findings: TRACESPipe performed outstandingly when tested on synthetic and ex vivo datasets, identifying and reconstructing all the viral genomes, including those with high levels of single-nucleotide polymorphisms. It also detected minimal levels of genomic variation between different organs. Conclusions: TRACESPipe's unique ability to simultaneously process and analyze samples from different sources enables the evaluation of within-host variability. This opens up the possibility to investigate viral tissue tropism, evolution, fitness, and disease associations. Moreover, additional features such as DNA damage estimation and mitochondrial DNA reconstruction and analysis, as well as exogenous-source controls, expand the utility of this pipeline to other fields such as forensics and ancient DNA studies. TRACESPipe is released under GPLv3 and is available for free download at https://github.com/viromelab/tracespipe.
  • Dinu, Liviu P.; Ionescu, Radu Tudor; Tomescu, Alexandru I. (2014)
  • Jones, W; Gong, BS; Novoradovskaya, N; Li, D; Kusko, R; Richmond, TA; Johann, DJ; Bisgin, H; Sahraeian, SME; Bushel, PR; Pirooznia, M; Wilkins, K; Chierici, M; Bao, WJ; Basehore, LS; Lucas, AB; Burgess, D; Butler, DJ; Cawley, S; Chang, CJ; Chen, GC; Chen, T; Chen, YC; Craig, DJ; Del Pozo, A; Foox, J; Francescatto, M; Fu, YT; Furlanello, C; Giorda, K; Grist, KP; Guan, MJ; Hao, YY; Happe, S; Hariani, G; Haseley, N; Jasper, J; Jurman, G; Kreil, DP; Labaj, P; Lai, K; Li, JY; Li, QZ; Li, YL; Li, ZG; Liu, ZC; Lopez, MS; Miclaus, K; Miller, R; Mittal, VK; Mohiyuddin, M; Pabon-Pena, C; Parsons, BL; Qiu, FJ; Scherer, A; Shi, TL; Stiegelmeyer, S; Suo, C; Tom, N; Wang, D; Wen, ZN; Wu, LH; Xiao, WZ; Xu, C; Yu, Y; Zhang, JY; Zhang, YF; Zhang, ZH; Zheng, YT; Mason, CE; Willey, JC; Tong, WD; Shi, LM; Xu, J (2021)
    Background: Oncopanel genomic testing, which identifies important somatic variants, is increasingly common in medical practice and especially in clinical trials. Currently, there is a paucity of reliable genomic reference samples having a suitably large number of pre-identified variants for properly assessing oncopanel assay analytical quality and performance. The FDA-led Sequencing and Quality Control Phase 2 (SEQC2) consortium analyze ten diverse cancer cell lines individually and their pool, termed Sample A, to develop a reference sample with suitably large numbers of coding positions with known (variant) positives and negatives for properly evaluating oncopanel analytical performance. Results: In reference Sample A, we identify more than 40,000 variants down to 1% allele frequency with more than 25,000 variants having less than 20% allele frequency with 1653 variants in COSMIC-related genes. This is 5-100x more than existing commercially available samples. We also identify an unprecedented number of negative positions in coding regions, allowing statistical rigor in assessing limit-of-detection, sensitivity, and precision. Over 300 loci are randomly selected and independently verified via droplet digital PCR with 100% concordance. Agilent normal reference Sample B can be admixed with Sample A to create new samples with a similar number of known variants at much lower allele frequency than what exists in Sample A natively, including known variants having allele frequency of 0.02%, a range suitable for assessing liquid biopsy panels. Conclusion: These new reference samples and their admixtures provide superior capability for performing oncopanel quality control, analytical accuracy, and validation for small to large oncopanels and liquid biopsy assays.
  • Larson, Eric D.; Magno, Jose Pedrito M.; Steritz, Matthew J.; Llanes, Erasmo Gonzalo d; Cardwell, Jonathan; Pedro, Melquiadesa; Roberts, Tori Bootpetch; Einarsdottir, Elisabet; Rosanes, Rose Anne Q.; Greenlee, Christopher; Santos, Rachel Ann P.; Yousaf, Ayesha; Streubel, Sven-Olrik; Santos, Aileen Trinidad R.; Ruiz, Amanda G.; Mae Lagrana-Villagracia, Sheryl; Ray, Dylan; Yarza, Talitha Karisse L.; Scholes, Melissa A.; Anderson, Catherine B.; Acharya, Anushree; Gubbels, Samuel P.; Bamshad, Michael J.; Cass, Stephen P.; Lee, Nanette R.; Shaikh, Rehan S.; Nickerson, Deborah A.; Mohlke, Karen L.; Prager, Jeremy D.; Cruz, Teresa Luisa G.; Yoon, Patricia J.; Abes, Generoso T.; Schwartz, David A.; Chan, Abner L.; Wine, Todd M.; Maria Cutiongco-de la Paz, Eva; Friedman, Norman; Kechris, Katerina; Kere, Juha; Leal, Suzanne M.; Yang, Ivana; Patel, Janak A.; Tantoco, Ma Leah C.; Riazuddin, Saima; Chan, Kenny H.; Mattila, Petri S.; Reyes-Quintos, Maria Rina T.; Ahmed, Zubair M.; Jenkins, Herman A.; Chonmaitree, Tasnee; Hafren, Lena; Chiong, Charlotte M.; Santos-Cortez, Regie Lyn P. (2019)
    A genetic basis for otitis media is established; however, the role of rare variants in disease etiology is largely unknown. Previously, a duplication variant within A2ML1 was identified as a significant risk factor for otitis media in an indigenous Filipino population and in US children. In this report exome and Sanger sequencing were performed using DNA samples from the indigenous Filipino population, Filipino cochlear implantees, US probands, Finnish, and Pakistani families with otitis media. Sixteen novel, damaging A2ML1 variants identified in otitis media patients were rare or low-frequency in population-matched controls. In the indigenous population, both gingivitis and A2ML1 variants including the known duplication variant and the novel splice variant c.4061+1G>C were independently associated with otitis media. Sequencing of salivary RNA samples from indigenous Filipinos demonstrated lower A2ML1 expression according to the carriage of A2ML1 variants. Sequencing of additional salivary RNA samples from US patients with otitis media revealed differentially expressed genes that are highly correlated with A2ML1 expression levels. In particular, RND3 is upregulated in both A2ML1 variant carriers and high-A2ML1 expressors. These findings support a role for A2ML1 in keratinocyte differentiation within the middle ear as part of otitis media pathology and the potential application of ROCK inhibition in otitis media.
  • Patro, Rob; Salmela, Leena (2021)
    DNA and RNA sequencing is a core technology in biological and medical research. The high throughput of these technologies and the consistent development of new experimental assays and biotechnologies demand the continuous development of methods to analyze the resulting data. The RECOMB Satellite Workshop on Massively Parallel Sequencing brings together leading researchers in computational genomics to discuss emerging frontiers in algorithm development for massively parallel sequencing data. The 10th meeting in this series, RECOMB-Seq 2020, was scheduled to be held in Padua, Italy, but due to the ongoing COVID-19 pandemic, the meeting was carried out virtually instead. The online workshop featured keynote talks by Paola Bonizzoni and Zamin Iqbal, two highlight talks, ten regular talks, and three short talks. Seven of the works presented in the workshop are featured in this edition of iScience, and many of the talks are available online in the RECOMB-Seq 2020 YouTube channel.
  • Fang, Bohao; Momigliano, Paolo; Kahilainen, Kimmo K.; Merilä, Juha (2022)
    The European whitefish (Coregonus lavaretus) species complex is a classic example of recent adaptive radiation. Here, we examine a whitefish population introduced to northern Finnish Lake Tsahkal in the late 1960s, where three divergent morphs (viz. littoral, pelagic, and profundal feeders) were found 10 generations after. Using demographic modeling based on genomic data, we show that whitefish morphs evolved during a phase of strict isolation, refuting a rapid sympatric divergence scenario. The lake is now an artificial hybrid zone between morphs originated in allopatry. Despite their current syntopy, clear genetic differentiation remains between two of the three morphs. Using admixture mapping, we identify five SNPs associated with gonad weight variation, a proxy for sexual maturity and spawning time. We suggest that ecological adaptations in spawning time evolved in allopatry are currently maintaining partial reproductive isolation in the absence of other barriers to gene flow.
  • Virtanen, Jenni; Smura, Teemu; Aaltonen, Kirsi; Moisander-Jylhä, Anna-Maria; Knuuttila, Anna; Vapalahti, Olli; Sironen, Tarja (2019)
    Aleutian mink disease virus (AMDV) is the causative agent of Aleutian disease (AD), which affects mink of all genotypes and also infects other mustelids such as ferrets, martens and badgers. Previous studies have investigated diversity in Finnish AMDV strains, but these studies have been restricted to small parts of the virus genome, and mostly from newly infected farms and free-ranging mustelids. Here, we investigated the diversity and evolution of Finnish AMDV strains by sequencing the complete coding sequences of 31 strains from mink originating from farms differing in their virus history, as well as from free-ranging mink. The data set was supplemented with partial genomes obtained from 26 strains. The sequences demonstrate that the Finnish AMDV strains have considerable diversity, and that the virus has been introduced to Finland in multiple events. Frequent recombination events were observed, as well as variation in the evolutionary rate in different parts of the genome and between different branches of the phylogenetic tree. Mink in the wild carry viruses with high intra-host diversity and are occasionally even co-infected by two different strains, suggesting that free-ranging mink tolerate chronic infections for extended periods of time. These findings highlight the need for further sampling to understand the mechanisms playing a role in the evolution and pathogenesis of AMDV.
  • Cairns, Johannes; Jokela, Roosa; Hultman, Jenni; Tamminen, Manu; Virta, Marko; Hiltunen, Teppo (2018)
    Experimental microbial ecology and evolution have yielded foundational insights into ecological and evolutionary processes using simple microcosm setups and phenotypic assays with one- or two-species model systems. The fields are now increasingly incorporating more complex systems and exploration of the molecular basis of observations. For this purpose, simplified, manageable and well-defined multispecies model systems are required that can be easily investigated using culturing and high-throughput sequencing approaches, bridging the gap between simpler and more complex synthetic or natural systems. Here we address this need by constructing a completely synthetic 33 bacterial strain community that can be cultured in simple laboratory conditions. We provide whole-genome data for all the strains as well as metadata about genomic features and phenotypic traits that allow resolving individual strains by amplicon sequencing and facilitate a variety of envisioned mechanistic studies. We further show that a large proportion of the strains exhibit coexistence in co-culture over serial transfer for 48 days in the absence of any experimental manipulation to maintain diversity. The constructed bacterial community can be a valuable resource in future experimental work.
  • Trotta, Luca; Norberg, Anna; Taskinen, Mervi; Beziat, Vivien; Degerman, Sofie; Wartiovaara-Kautto, Ulla; Välimaa, Hannamari; Jahnukainen, Kirsi; Casanova, Jean-Laurent; Seppänen, Mikko; Saarela, Janna; Koskenvuo, Minna; Martelius, Timi (2018)
    Background: The telomere biology disorders (TBDs) include a range of multisystem diseases characterized by mucocutaneous symptoms and bone marrow failure. In dyskeratosis congenita (DKC), the clinical features of TBDs stem from the depletion of crucial stem cell populations in highly proliferative tissues, resulting from abnormal telomerase function. Due to the wide spectrum of clinical presentations and lack of a conclusive laboratory test it may be challenging to reach a clinical diagnosis, especially if patients lack the pathognomonic clinical features of TBDs. Methods: Clinical sequencing was performed on a cohort of patients presenting with variable immune phenotypes lacking molecular diagnoses. Hypothesis-free whole-exome sequencing (WES) was selected in the absence of compelling diagnostic hints in patients with variable immunological and haematological conditions. Results: In four patients belonging to three families, we have detected five novel variants in known TBD-causing genes (DKC1, TERT and RTEL1). In addition to the molecular findings, they all presented shortened blood cell telomeres. These findings are consistent with the displayed TBD phenotypes, pointing towards the molecular diagnosis and subsequent clinical follow-up of the patients. Conclusions: Our results strongly support the utility of WES-based approaches for routine genetic diagnostics of TBD patients with heterogeneous or atypical clinical presentation who otherwise might remain undiagnosed.
  • Katainen, Riku; Donner, Iikki; Cajuso, Tatiana; Kaasinen, Eevi; Palin, Kimmo; Mäkinen, Veli; Aaltonen, Lauri A.; Pitkänen, Esa (2018)
    Next-generation sequencing (NGS) is routinely applied in life sciences and clinical practice, but interpretation of the massive quantities of genomic data produced has become a critical challenge. The genome-wide mutation analyses enabled by NGS have had a revolutionary impact in revealing the predisposing and driving DNA alterations behind a multitude of disorders. The workflow to identify causative mutations from NGS data, for example in cancer and rare diseases, commonly involves phases such as quality filtering, case-control comparison, genome annotation, and visual validation, which require multiple processing steps and usage of various tools and scripts. To this end, we have introduced an interactive and user-friendly multi-platform-compatible software, BasePlayer, which allows scientists, regardless of bioinformatics training, to carry out variant analysis in disease genetics settings. A genome-wide scan of regulatory regions for mutation clusters can be carried out with a desktop computer in ~10 min with a dataset of 3 million somatic variants in 200 whole-genome-sequenced (WGS) cancers.
  • Veltsos, Paris; Ridout, Kate E.; Toups, Melissa A.; Gonzalez-Martinez, Santiago C.; Muyle, Aline; Emery, Olivier; Rastas, Pasi; Hudzieczek, Vojtech; Hobza, Roman; Vyskot, Boris; Marais, Gabriel A. B.; Filatov, Dmitry A.; Pannell, John R. (2019)
    Suppressed recombination allows divergence between homologous sex chromosomes and the functionality of their genes. Here, we reveal patterns of the earliest stages of sex-chromosome evolution in the diploid dioecious herb Mercurialis annua on the basis of cytological analysis, de novo genome assembly and annotation, genetic mapping, exome resequencing of natural populations, and transcriptome analysis. The genome assembly contained 34,105 expressed genes, of which 10,076 were assigned to linkage groups. Genetic mapping and exome resequencing of individuals across the species range both identified the largest linkage group, LG1, as the sex chromosome. Although the sex chromosomes of M. annua are karyotypically homomorphic, we estimate that about one-third of the Y chromosome, containing 568 transcripts and spanning 22.3 cM in the corresponding female map, has ceased recombining. Nevertheless, we found limited evidence for Y-chromosome degeneration in terms of gene loss and pseudogenization, and most X- and Y-linked genes appear to have diverged in the period subsequent to speciation between M. annua and its sister species M. huetii, which shares the same sex-determining region. Taken together, our results suggest that the M. annua Y chromosome has at least two evolutionary strata: a small old stratum shared with M. huetii, and a more recent larger stratum that is probably unique to M. annua and that stopped recombining ~1 MYA. Patterns of gene expression within the nonrecombining region are consistent with the idea that sexually antagonistic selection may have played a role in favoring suppressed recombination.
  • Parnanen, Katariina M. M.; Hultman, Jenni; Markkanen, Melina; Satokari, Reetta; Rautava, Samuli; Lamendella, Regina; Wright, Justin; McLimans, Christopher J.; Kelleher, Shannon L.; Virta, Marko P. (2022)
    Background: Infants are at a high risk of acquiring fatal infections, and their treatment relies on functioning antibiotics. Antibiotic resistance genes (ARGs) are present in high numbers in antibiotic-naive infants' gut microbiomes, and infant mortality caused by resistant infections is high. The role of antibiotics in shaping the infant resistome has been studied, but there is limited knowledge on other factors that affect the antibiotic resistance burden of the infant gut. Objectives: Our objectives were to determine the impact of early exposure to formula on the ARG load in neonates and infants born either preterm or full term. Our hypotheses were that diet causes a selective pressure that influences the microbial community of the infant gut, and formula exposure would increase the abundance of taxa that carry ARGs. Methods: Cross-sectionally sampled gut metagenomes of 46 neonates were used to build a generalized linear model to determine the impact of diet on ARG loads in neonates. The model was cross-validated using neonate metagenomes gathered from public databases using our custom statistical pipeline for cross-validation. Results: Formula-fed neonates had higher relative abundances of opportunistic pathogens such as Staphylococcus aureus, Staphylococcus epidermidis, Klebsiella pneumoniae, Klebsiella oxytoca, and Clostridioides difficile. The relative abundance of ARGs carried by gut bacteria was 69% higher in the formula-receiving group (fold change, 1.69; 95% CI: 1.12-2.55; P = 0.013; n = 180) compared to exclusively human milk-fed infants. The formula-fed infants also had significantly less typical infant bacteria, such as Bifidobacteria, that have potential health benefits. Conclusions: The novel finding that formula exposure is correlated with a higher neonatal ARG burden lays the foundation that clinicians should consider feeding mode in addition to antibiotic use during the first months of life to minimize the proliferation of antibiotic-resistant gut bacteria in infants.
  • Galarza, Juan A.; Dhaygude, Kishor; Ghaedi, Behnaz; Suisto, Kaisa; Valkonen, Janne; Mappes, Johanna (2019)
    Insect metamorphosis is one of the most recognized processes delimiting transitions between phenotypes. It has been traditionally postulated as an adaptive process decoupling traits between life stages, allowing evolutionary independence of pre- and post-metamorphic phenotypes. However, the degree of autonomy between these life stages varies depending on the species and has not been studied in detail over multiple traits simultaneously. Here, we reared full-sib larvae of the warningly coloured wood tiger moth (Arctia plantaginis) in different temperatures and examined their responses for phenotypic (melanization change, number of moults), gene expression (RNA-seq and qPCR of candidate genes for melanization and flight performance) and life-history traits (pupal weight, and larval and pupal ages). In the emerging adults, we examined their phenotypes (melanization and size) and compared them at three condition proxies: heat absorption (ability to engage flight), flight metabolism (ability to sustain flight) and overall flight performance. We found that some larval responses, as evidenced by gene expression and change in melanization, did not have an effect on the adult (i.e. size and wing melanization), whereas other adult traits such as heat absorption, body melanization and flight performance were found to be impacted by rearing temperature. Adults reared at high temperature showed higher resting metabolic rate, lower body melanization, faster heating rate, lower body temperature at take-off and inferior flight performance than cold-reared adults. Thus our results did not unambiguously support the environment-matching hypothesis. Our results illustrate the importance of assessing multiple traits across life stages as these may only be partly decoupled by metamorphosis. This article is part of the theme issue 'The evolution of complete metamorphosis'.
  • Reza Ghanavi, Hamid; Twort, Victoria Gwendoline; Duplouy, Anne (2021)
    Models estimate that up to 80% of all butterfly and moth species host vertically transmitted endosymbiotic microorganisms, which can affect the host fitness, metabolism, reproduction, population dynamics, and genetic diversity, among others. The supporting empirical data are however currently highly biased towards the generally more colourful butterflies, and include less information about moths. Additionally, studies of symbiotic partners of Lepidoptera predominantly focus on the common bacterium Wolbachia pipientis, while infections by other inherited microbial partners have more rarely been investigated. Here, we mine the whole genome sequence data of 47 species of Erebidae moths, with the aims to both inform on the diversity of symbionts potentially associated with this Lepidoptera group, and discuss the potential of metagenomic approaches to inform on host associated microbiome diversity. Based on the result of Kraken2 and MetaPhlAn2 analyses, we found clear evidence of the presence of Wolbachia in four species. Our result also suggests the presence of three other bacterial symbionts (Burkholderia spp., Sodalis spp. and Arsenophonus spp.) in three other moth species. Additionally, we recovered genomic material from bracovirus in about half of our samples. The detection of the latter, usually found in mutualistic association to braconid parasitoid wasps, may inform on host-parasite interactions that take place in the natural habitat of the Erebidae moths, suggesting either contamination with material from species of the host community network, or horizontal transfer of members of the microbiome between interacting species.
  • Karkman, Antti; Parnanen, Katariina; Larsson, D. G. Joakim (2019)
    Discharge of treated sewage leads to release of antibiotic resistant bacteria, resistance genes and antibiotic residues to the environment. However, it is unclear whether increased abundance of antibiotic resistance genes in sewage and sewage-impacted environments is due to on-site selection pressure by residual antibiotics, or is simply a result of fecal contamination with resistant bacteria. Here we analyze relative resistance gene abundance and accompanying extent of fecal pollution in publicly available metagenomic data, using crAssphage sequences as a marker of human fecal contamination (crAssphage is a bacteriophage that is exceptionally abundant in, and specific to, human feces). We find that the presence of resistance genes can largely be explained by fecal pollution, with no clear signs of selection in the environment, with the exception of environments polluted by very high levels of antibiotics from manufacturing, where selection is evident. Our results demonstrate the necessity to take into account fecal pollution levels to avoid making erroneous assumptions regarding environmental selection of antibiotic resistance.
  • Norri, Tuukka; Cazaux, Bastien; Dönges, Saska; Valenzuela, Daniel; Mäkinen, Veli (2021)
    Motivation: Variant calling workflows that utilize a single reference sequence are the de facto standard elementary genomic analysis routine for resequencing projects. Various ways to enhance the reference with pangenomic information have been proposed, but scalability combined with seamless integration to existing workflows remains a challenge. Results: We present PanVC with founder sequences, a scalable and accurate variant calling workflow based on a multiple alignment of reference sequences. Scalability is achieved by removing duplicate parts up to a limit into a founder multiple alignment, that is then indexed using a hybrid scheme that exploits general purpose read aligners. Our implemented workflow uses GATK or BCFtools for variant calling, but the various steps of our workflow (e.g. vcf2multialign tool, founder reconstruction) can be of independent interest as a basis for creating novel pangenome analysis workflows beyond variant calling.
  • UWCMG (2018)
    Non-secretor status due to homozygosity for the common FUT2 variant c.461G>A (p.Trp154*) is associated with either risk for autoimmune diseases or protection against viral diarrhea and HIV. We determined the role of FUT2 in otitis media susceptibility by obtaining DNA samples from 609 multi-ethnic families and simplex case subjects with otitis media. Exome and Sanger sequencing, linkage analysis, and Fisher exact and transmission disequilibrium tests (TDT) were performed. The common FUT2 c.604C>T (p.Arg202*) variant co-segregates with otitis media in a Filipino pedigree (LOD = 4.0). Additionally, a rare variant, c.412C>T (p.Arg138Cys), is associated with recurrent/chronic otitis media in European-American children (p = 1.2 × 10^-5) and US trios (TDT p = 0.01). The c.461G>A (p.Trp154*) variant was also overtransmitted in US trios (TDT p = 0.01) and was associated with shifts in middle ear microbiota composition (PERMANOVA p < 10^-7) and increased biodiversity. When all missense and nonsense variants identified in multi-ethnic US trios with CADD > 20 were combined, FUT2 variants were over-transmitted in trios (TDT p = 0.001). Fut2 is transiently upregulated in mouse middle ear after inoculation with non-typeable Haemophilus influenzae. Four FUT2 variants, namely p.Ala104Val, p.Arg138Cys, p.Trp154*, and p.Arg202*, reduced A antigen in mutant-transfected COS-7 cells, while the nonsense variants also reduced FUT2 protein levels. Common and rare FUT2 variants confer susceptibility to otitis media, likely by modifying the middle ear microbiome through regulation of A antigen levels in epithelial cells. Our families demonstrate marked intra-familial genetic heterogeneity, suggesting that multiple combinations of common and rare variants plus environmental factors influence the individual otitis media phenotype as a complex trait.
  • Duru, Ilhan Cem; Andreevskaya, Margarita; Laine, Pia; Rode, Tone Mari; Ylinen, Anne; Lovdal, Trond; Bar, Nadav; Crauwels, Peter; Riedel, Christian U.; Bucur, Florentina Ionela; Nicolau, Anca Ioana; Auvinen, Petri (2020)
    Background: High pressure processing (HPP; i.e. 100-600 MPa pressure depending on product) is a non-thermal preservation technique adopted by the food industry to significantly reduce foodborne pathogens, including Listeria monocytogenes, in food. However, susceptibility towards pressure differs among diverse strains of L. monocytogenes and it is unclear if this is due to their intrinsic characteristics related to genomic content. Here, we tested the barotolerance of 10 different L. monocytogenes strains, from food and food processing environments and widely used reference strains including a clinical isolate, to pressure treatments with 400 and 600 MPa. Genome sequencing and genome comparison of the tested L. monocytogenes strains were performed to investigate the relation between genomic profile and pressure tolerance. Results: None of the tested strains were tolerant to 600 MPa. A reduction of more than 5 log10 was observed for all strains after 1 min of 600 MPa pressure treatment. L. monocytogenes strain RO15 showed no significant reduction in viable cell counts after 400 MPa for 1 min and was therefore defined as barotolerant. Genome analysis of the so far unsequenced L. monocytogenes strains RO15, 2HF33, MB5, AB199, AB120, C7, and RO4 allowed us to compare the gene content of all strains tested. This revealed that the three most pressure tolerant strains had more than one CRISPR system with self-targeting spacers. Furthermore, several anti-CRISPR genes were detected in these strains. Pan-genome analysis showed that 10 prophage genes were significantly associated with the three most barotolerant strains. Conclusions: L. monocytogenes strain RO15 was the most pressure tolerant among the selected strains. Genome comparison suggests that there might be a relationship between prophages and pressure tolerance in L. monocytogenes.
  • Vatanen, Tommi; Plichta, Damian R.; Somani, Juhi; Muench, Philipp C.; Arthur, Timothy D.; Hall, Andrew Brantley; Rudolf, Sabine; Oakeley, Edward J.; Ke, Xiaobo; Young, Rachel A.; Haiser, Henry J.; Kolde, Raivo; Yassour, Moran; Luopajärvi, Kristiina; Siljander, Heli; Virtanen, Suvi M.; Ilonen, Jorma; Uibo, Raivo; Tillmann, Vallo; Mokurov, Sergei; Dorshakova, Natalya; Porter, Jeffrey A.; McHardy, Alice C.; Lahdesmaki, Harri; Vlamakis, Hera; Huttenhower, Curtis; Knip, Mikael; Xavier, Ramnik J. (2019)
    The human gut microbiome matures towards the adult composition during the first years of life and is implicated in early immune development. Here, we investigate the effects of microbial genomic diversity on gut microbiome development using integrated early childhood data sets collected in the DIABIMMUNE study in Finland, Estonia and Russian Karelia. We show that gut microbial diversity is associated with household location and linear growth of children. Single nucleotide polymorphism- and metagenomic assembly-based strain tracking revealed large and highly dynamic microbial pangenomes, especially in the genus Bacteroides, in which we identified evidence of variability deriving from Bacteroides-targeting bacteriophages. Our analyses revealed functional consequences of strain diversity; only 10% of Finnish infants harboured Bifidobacterium longum subsp. infantis, a subspecies specialized in human milk metabolism, whereas Russian infants commonly maintained a probiotic Bifidobacterium bifidum strain in infancy. Groups of bacteria contributing to diverse, characterized metabolic pathways converged to highly subject-specific configurations over the first two years of life. This longitudinal study extends the current view of early gut microbial community assembly based on strain-level genomic variation.