A high-throughput multiplexing and selection strategy to complete bacterial genomes

Title: A high-throughput multiplexing and selection strategy to complete bacterial genomes
Author: Arredondo-Alonso, Sergio; Pöntinen, Anna K.; Cleon, Francois; Gladstone, Rebecca A.; Schurch, Anita C.; Johnsen, Pal J.; Samuelsen, Orjan; Corander, Jukka
Contributor organization: Department of Mathematics and Statistics; Helsinki Institute for Information Technology; Biostatistics Helsinki; Jukka Corander / Principal Investigator
Date: 2021-12
Language: eng
Number of pages: 13
Belongs to series: GigaScience
ISSN: 2047-217X
DOI: https://doi.org/10.1093/gigascience/giab079
URI: http://hdl.handle.net/10138/340198
Abstract: Background: Bacterial whole-genome sequencing based on short-read technologies often results in a draft assembly formed by contiguous sequences. The introduction of long-read sequencing technologies permits those contiguous sequences to be unambiguously bridged into complete genomes. However, the high cost of long-read sequencing frequently limits the number of bacterial isolates that can be sequenced this way. Here we evaluated the recently released 96 barcoding kit from Oxford Nanopore Technologies (ONT) to generate complete genomes on a high-throughput basis. In addition, we propose an isolate selection strategy that takes large-scale bacterial collections as input and optimizes a representative selection of isolates for long-read sequencing. Results: Despite an uneven distribution of long reads per barcode, near-complete chromosomal sequences (assembly contiguity = 0.89) were generated for 96 Escherichia coli isolates with associated short-read sequencing data. The assembly contiguity of the plasmid replicons was even higher (0.98), which indicated the suitability of the multiplexing strategy for studies focused on resolving plasmid sequences. We benchmarked hybrid and ONT-only assemblies and showed that combining ONT sequencing data with short-read sequencing data is still highly desirable (i) to perform an unbiased selection of isolates for long-read sequencing, (ii) to achieve optimal genome accuracy and completeness, and (iii) to include small plasmids underrepresented in the ONT library. Conclusions: The proposed long-read isolate selection ensures the completion of bacterial genomes that span the genome diversity inherent in large collections of bacterial isolates. We show the potential of using this multiplexing approach to close bacterial genomes on a high-throughput basis.
Subject: 1181 Ecology, evolutionary biology; 11832 Microbiology and virology
Peer reviewed: Yes
Rights: cc_by
Usage restriction: openAccess
Self-archived version: publishedVersion

