Kohdista: An efficient method to index and query possible Rmap alignments



Permalink

http://hdl.handle.net/10138/312706

Citation

Muggli, M D, Puglisi, S J & Boucher, C 2019, 'Kohdista: An efficient method to index and query possible Rmap alignments', Algorithms for Molecular Biology, vol. 14, no. 1, 25. https://doi.org/10.1186/s13015-019-0160-9

Title: Kohdista: An efficient method to index and query possible Rmap alignments
Author: Muggli, M.D.; Puglisi, S.J.; Boucher, C.
Contributor: University of Helsinki, Department of Computer Science
Date: 2019
Language: eng
Belongs to series: Algorithms for Molecular Biology
ISSN: 1748-7188
URI: http://hdl.handle.net/10138/312706
Abstract: Background: Genome-wide optical maps are ordered high-resolution restriction maps that give the positions of restriction cut sites corresponding to one or more restriction enzymes. These genome-wide optical maps are assembled with an overlap-layout-consensus approach from raw optical map data, which are referred to as Rmaps. Due to the high error rate of Rmap data, finding the overlap between Rmaps remains challenging. Results: We present Kohdista, an index-based algorithm for finding pairwise alignments between single-molecule maps (Rmaps). The novelty of our approach is the formulation of the alignment problem as automaton path matching, and the application of modern index-based data structures. In particular, we combine the Generalized Compressed Suffix Array (GCSA) index with the wavelet tree in order to build Kohdista. We validate Kohdista on simulated E. coli data, showing that the approach successfully finds alignments between Rmaps simulated from overlapping genomic regions. Conclusion: We demonstrate that Kohdista is the only method capable of finding a significant number of high-quality pairwise Rmap alignments for large eukaryote organisms in reasonable time. © 2019 The Author(s).
Subject: FM-index; Graph algorithms; Index based data structures; Optical mapping; 113 Computer and information sciences

Files in this item

Files Size Format View
Muggli_et_al.pdf 1.496Mb PDF View/Open
