Bayesian identification of bacterial strains from sequencing data

Sankar, A, Malone, B M, Bayliss, S C, Pascoe, B, Méric, G, Hitchings, M D, Sheppard, S K, Feil, E J, Corander, J I & Honkela, A J H 2016, 'Bayesian identification of bacterial strains from sequencing data', Microbial Genomics, vol. 2, no. 8.

Title: Bayesian identification of bacterial strains from sequencing data
Author: Sankar, Aravind; Malone, Brandon Michael; Bayliss, Sion C.; Pascoe, Ben; Méric, Guillaume; Hitchings, Matthew D.; Sheppard, Samuel K.; Feil, Edward J.; Corander, Jukka Ilmari; Honkela, Antti Juho Henrikki
Contributor organization: Helsinki Institute for Information Technology
Department of Computer Science
Complex Systems Computation research group / Petri Myllymäki
The Finnish Center of Excellence in Computational Inference Research (COIN)
Department of Mathematics and Statistics
Jukka Corander / Principal Investigator
Probabilistic Mechanistic Models for Genomics research group / Antti Honkela
Biostatistics Helsinki
Date: 2016-08-25
Language: eng
Number of pages: 9
Belongs to series: Microbial Genomics
ISSN: 2057-5858
Abstract: Rapidly assaying the diversity of a bacterial species present in a sample obtained from a hospital patient or an environmental source has become possible after recent technological advances in DNA sequencing. For several applications it is important to accurately identify the presence and estimate relative abundances of the target organisms from short sequence reads obtained from a sample. This task is particularly challenging when the set of interest includes very closely related organisms, such as different strains of pathogenic bacteria, which can vary considerably in terms of virulence, resistance and spread. Using advanced Bayesian statistical modelling and computation techniques we introduce a novel pipeline for bacterial identification that is shown to outperform the currently leading pipeline for this purpose. Our approach enables fast and accurate sequence-based identification of bacterial strains while using only modest computational resources. Hence it provides a useful tool for a wide spectrum of applications, including rapid clinical diagnostics to distinguish among closely related strains causing nosocomial infections. The software implementation is available at
Subject: 113 Computer and information sciences
1183 Plant biology, microbiology, virology
Peer reviewed: Yes
Rights: cc_by_nc
Usage restriction: openAccess
Self-archived version: publishedVersion

Files in this item

Files Size Format
mgen000075.pdf 450.2Kb PDF
