Browsing by Subject "READ ALIGNMENT"

Now showing items 21-30 of 30
  • Gauthier, Jérémy; Pajkovic, Mila; Neuenschwander, Samuel; Kaila, Lauri; Schmid, Sarah; Orlando, Ludovic; Alvarez, Nadir (2020)
    Erosion of biodiversity generated by anthropogenic activities has been studied for decades in many areas at species level, using taxa monitoring. In contrast, genetic erosion within species has rarely been tracked, and is often studied by inferring past population dynamics from contemporaneous estimators. An alternative to such inferences is the direct examination of past genes, by analysing museum collection specimens. While providing direct access to genetic variation over time, historical DNA is usually not optimally preserved, and it is necessary to apply genotyping methods based on hybridization-capture to unravel past genetic variation. In this study, we apply such a method (HyRAD) to large time series of two butterfly species in Finland, and present a new bioinformatic pipeline, PopHyRAD, that standardizes and optimizes the analysis of HyRAD data at the within-species level. In the localities for which the data retrieved have sufficient power to accurately examine genetic dynamics through time, we show that genetic erosion has increased across the last 100 years, as revealed by signatures of allele extinctions and heterozygosity decreases, despite local variations. In one of the two butterflies (Erebia embla), isolation by distance also increased through time, revealing the effect of greater habitat fragmentation over time.
  • Cairns, Johannes; Jokela, Roosa; Becks, Lutz; Mustonen, Ville; Hiltunen, Teppo (2020)
    By exposing an experimental 34-species bacterial community to different levels of pulse antibiotic disturbance with or without immigration, the authors identify a highly repeatable community response, the magnitude of which increases with increasing antibiotic levels. In an era of pervasive anthropogenic ecological disturbances, there is a pressing need to understand the factors that constitute community response and resilience. A detailed understanding of disturbance response needs to go beyond associations and incorporate features of disturbances, species traits, rapid evolution and dispersal. Multispecies microbial communities that experience antibiotic perturbation represent a key system with important medical dimensions. However, previous microbiome studies on this theme have relied on high-throughput sequencing data from uncultured species without the ability to explicitly account for the role of species traits and immigration. Here, we serially passage a 34-species defined bacterial community through different levels of pulse antibiotic disturbance, manipulating the presence or absence of species immigration. To understand the ecological community response measured using amplicon sequencing, we combine initial trait data measured for each species separately and metagenome sequencing data revealing adaptive mutations during the experiment. We found that the ecological community response was highly repeatable within the experimental treatments, which could be attributed in part to key species traits (antibiotic susceptibility and growth rate). Increasing antibiotic levels were also coupled with an increasing probability of species extinction, making species immigration critical for community resilience. Moreover, we detected signals of antibiotic-resistance evolution occurring within species at the same time scale, leaving evolutionary changes in communities despite recovery at the species compositional level. Together, these observations reveal a disturbance response that presents as classic species sorting, but is nevertheless accompanied by rapid within-species evolution.
  • Eronen-Rasimus, Eeva Liisa; Hultman, Jenni; Hai, T; Pessi, Igor S; Collins, Eric; Wright, S; Laine, Pia; Viitamäki, Sirja; Lyra, Christina; Thomas, David Neville; Golyshin, Peter; Luhtanen, Anne-Mari; Kuosa, Harri; Kaartokallio, Hermanni (2021)
    Poly-3-hydroxyalkanoic acids (PHAs) are bacterial storage polymers commonly used in bioplastic production. Halophilic bacteria are industrially interesting organisms, as their salinity tolerance and psychrophilic nature lower sterility requirements and subsequent production costs. We investigated PHA synthesis in two bacterial strains, Halomonas sp. 363 and Paracoccus sp. 392, isolated from Southern Ocean sea ice and elucidated the related PHA biopolymer accumulation and composition with various approaches, such as transcriptomics, microscopy, and chromatography. We show that both bacterial strains produce PHAs at 4 °C when the availability of nitrogen and/or oxygen limited growth. The genome of Halomonas sp. 363 carries three phaC synthase genes and transcribes genes along three PHA pathways (I to III), whereas Paracoccus sp. 392 carries only one phaC gene and transcribes genes along one pathway (I). Thus, Halomonas sp. 363 has a versatile repertoire of phaC genes and pathways enabling production of both short- and medium-chain-length PHA products. IMPORTANCE: Plastic pollution is one of the most topical threats to the health of the oceans and seas. One recognized way to alleviate the problem is to use degradable bioplastic materials in high-risk applications. PHA is a promising bioplastic material as it is nontoxic and fully produced and degraded by bacteria. Sea ice is an interesting environment for prospecting novel PHA-producing organisms, since traits advantageous to lower production costs, such as tolerance for high salinities and low temperatures, are common. We show that two sea-ice bacteria, Halomonas sp. 363 and Paracoccus sp. 392, are able to produce various types of PHA from inexpensive carbon sources. Halomonas sp. 363 is an especially interesting PHA-producing organism, since it has three different synthesis pathways to produce both short- and medium-chain-length PHAs.
  • Burny, Claire; Nolte, Viola; Nouhaud, Pierre; Dolezal, Marlies; Schloetterer, Christian (2020)
    Evolve and resequencing (E&R) studies investigate the genomic responses of adaptation during experimental evolution. Because replicate populations evolve in the same controlled environment, consistent responses to selection across replicates are frequently used to identify reliable candidate regions that underlie adaptation to a new environment. However, recent work demonstrated that selection signatures can be restricted to one or a few replicate(s) only. These selection signatures frequently have weak statistical support, and given the difficulties of functional validation, additional evidence is needed before considering them as candidates for functional analysis. Here, we introduce an experimental procedure to validate candidate loci with weak or replicate-specific selection signature(s). Crossing an evolved population from a primary E&R experiment to the ancestral founder population reduces the frequency of candidate alleles that have reached a high frequency. We hypothesize that genuine selection targets will experience a repeatable frequency increase after the mixing with the ancestral founders if they are exposed to the same environment (secondary E&R experiment). Using this approach, we successfully validate two overlapping selection targets, which showed a mutually exclusive selection signature in a primary E&R experiment of Drosophila simulans adapting to a novel temperature regime. We conclude that secondary E&R experiments provide a reliable confirmation of selection signatures that either are not replicated or show only a low statistical significance in a primary E&R experiment unless epistatic interactions predominate. Such experiments are particularly helpful to prioritize candidate loci for time-consuming functional follow-up investigations.
  • Icay, Katherine; Chen, Ping; Cervera Taboada, Alejandra; Rantanen, Ville; Lehtonen, Rainer; Hautaniemi, Sampsa (2016)
    Background: Large-scale sequencing experiments are complex and require a wide spectrum of computational tools to extract and interpret relevant biological information. This is especially true in projects where individual processing and integrated analysis of both small RNA and complementary RNA data is needed. Such studies would benefit from a computational workflow that is easy to implement and standardizes the processing and analysis of both sequenced data types. Results: We developed SePIA (Sequence Processing, Integration, and Analysis), a comprehensive small RNA and RNA workflow. It provides ready execution for over 20 commonly known RNA-seq tools on top of an established workflow engine and provides dynamic pipeline architecture to manage, individually analyze, and integrate both small RNA and RNA data. Implementation with Docker makes SePIA portable and easy to run. We demonstrate the workflow's extensive utility with two case studies involving three breast cancer datasets. SePIA is straightforward to configure and organizes results into a perusable HTML report. Furthermore, the underlying pipeline engine supports computational resource management for optimal performance. Conclusion: SePIA is an open-source workflow introducing standardized processing and analysis of RNA and small RNA data. SePIA's modular design enables robust customization to a given experiment while maintaining overall workflow structure.
  • Parducci, Laura; Alsos, Inger Greve; Unneberg, Per; Pedersen, Mikkel W.; Han, Lu; Lammers, Youri; Salonen, J. Sakari; Väliranta, Minna M.; Slotte, Tanja; Wohlfarth, Barbara (2019)
    The lake sediments of Hässeldala Port in south-east Sweden provide an archive of local and regional environmental conditions ca. 14.5-9.5 ka BP (thousand years before present) and allow testing DNA sequencing techniques to reconstruct past vegetation changes. We combined shotgun sequencing with plant micro- and macrofossil analyses to investigate sediments dating to the Allerød (14.1-12.7 ka BP), Younger Dryas (12.7-11.7 ka BP), and Preboreal (
  • Aska, Elli-Mari; Dermadi, Denis; Kauppi, Liisa (2020)
    DNA mismatch repair (MMR) corrects replication errors and is recruited by the histone mark H3K36me3, enriched in exons of transcriptionally active genes. To dissect in vivo the mutational landscape shaped by these processes, we employed single-cell exome sequencing on T cells of wild-type and MMR-deficient (Mlh1(-/-)) mice. Within active genes, we uncovered a spatial bias in MMR efficiency: 3' exons, often H3K36me3-enriched, acquire significantly fewer MMR-dependent mutations compared with 5' exons. Huwe1 and Mcm7 genes, both active during lymphocyte development, stood out as mutational hotspots in MMR-deficient cells, demonstrating their intrinsic vulnerability to replication error in this cell type. Both genes are H3K36me3-enriched, which can explain MMR-mediated elimination of replication errors in wild-type cells. Thus, H3K36me3 can boost MMR in transcriptionally active regions, both locally and globally. This offers an attractive concept of thrifty MMR targeting, where critical genes in each cell type enjoy preferential shielding against de novo mutations.
  • MAGIC; Alonso, Lorena; Piron, Anthony; Moran, Ignasi; Groop, Leif; Torrents, David (2021)
    Genome-wide association studies (GWASs) identified hundreds of signals associated with type 2 diabetes (T2D). To gain insight into their underlying molecular mechanisms, we have created the translational human pancreatic islet genotype tissue-expression resource (TIGER), aggregating >500 human islet genomic datasets from five cohorts in the Horizon 2020 consortium T2DSystems. We impute genotypes using four reference panels and meta-analyze cohorts to improve the coverage of expression quantitative trait loci (eQTL) and develop a method to combine allele-specific expression across samples (cASE). We identify >1 million islet eQTLs, 53 of which colocalize with T2D signals. Among them, a low-frequency allele that reduces T2D risk by half increases CCND2 expression. We identify eight cASE colocalizations, among which we found a T2D-associated SLC30A8 variant. We make all data available through the TIGER portal (, which represents a comprehensive human islet genomic data resource to elucidate how genetic variation affects islet function and translates into therapeutic insight and precision medicine for T2D.
  • Seo, Seung Bum; Zeng, Xiangpei; King, Jonathan L.; Larue, Bobby L.; Assidi, Mourad; Al-Qahtani, Mohamed H.; Sajantila, Antti; Budowle, Bruce (2015)
    Background: Massively parallel sequencing (MPS) technologies have the capacity to sequence targeted regions or whole genomes of multiple nucleic acid samples with high coverage by sequencing millions of DNA fragments simultaneously. Compared with Sanger sequencing, MPS also can reduce labor and cost on a per nucleotide basis and indeed on a per sample basis. In this study, whole genomes of human mitochondria (mtGenome) were sequenced on the Personal Genome Machine (PGM™) (Life Technologies, San Francisco, CA), the output data were assessed, and the results were compared with data previously generated on the MiSeq™ (Illumina, San Diego, CA). The objectives of this paper were to determine the feasibility, accuracy, and reliability of sequence data obtained from the PGM. Results: Twenty-four samples were multiplexed (in groups of six) and sequenced on the 314 chip, which has a throughput of at least 10 megabases. The depth of coverage pattern was similar among all 24 samples; however, the coverage across the genome varied. For strand bias, the average ratio of coverage between the forward and reverse strands at each nucleotide position indicated that two-thirds of the positions of the genome had ratios that were greater than 0.5. A few sites had more extreme strand bias. Another observation was that 156 positions had a false deletion rate greater than 0.15 in one or more individuals. There were 31-98 (SNP) mtGenome variants observed per sample for the 24 samples analyzed. The total 1237 (SNP) variants were concordant between the results from the PGM and MiSeq. The quality scores for haplogroup assignment for all 24 samples ranged from 88.8% to 100%. Conclusions: In this study, mtDNA sequence data generated from the PGM were analyzed and the output evaluated. Depth of coverage variation and strand bias were identified but generally were infrequent and did not impact reliability of variant calls. Multiplexing of samples was demonstrated, which can improve throughput and reduce cost per sample analyzed. Overall, the results of this study, based on orthogonal concordance testing and phylogenetic scrutiny, supported that whole mtGenome sequence data with high accuracy can be obtained using the PGM platform.
  • Varhaug, Kristin N.; Nido, Gonzalo S.; de Coo, Irenaeus; Isohanni, Pirjo; Suomalainen, Anu; Tzoulis, Charalampos; Knappskog, Per; Bindoff, Laurence A. (2020)
    Objective: The aim of this study was to evaluate if urinary sediment cells offered a robust alternative to muscle biopsy for the diagnosis of single mtDNA deletions. Methods: Eleven adult patients with progressive external ophthalmoplegia and a known single mtDNA deletion were investigated. Urinary sediment cells were used to isolate DNA, which was then subjected to long-range polymerase chain reaction. Where available, the patient's muscle DNA was studied in parallel. Breakpoint and thus deletion size were identified using both Sanger sequencing and next generation sequencing. The level of heteroplasmy was determined using quantitative polymerase chain reaction. Results: We identified the deletion in urine in 9 of 11 cases giving a sensitivity of 80%. Breakpoints and deletion size were readily detectable in DNA extracted from urine. Mean heteroplasmy level in urine was 38% ± 26 (range 8-84%), and 57% ± 28 (range 12-94%) in muscle. While the heteroplasmy level in urinary sediment cells differed from that in muscle, we did find a statistically significant correlation between these two levels (R = 0.714, P = 0.031; Pearson correlation). Interpretation: Our findings suggest that urine can be used to screen patients suspected clinically of having a single mtDNA deletion. Based on our data, the use of urine could considerably reduce the need for muscle biopsy in this patient group.