Succinct dynamic de Bruijn graphs

Permalink

http://hdl.handle.net/10138/346271

Citation

Alipanahi, B, Kuhnle, A, Puglisi, S J, Salmela, L & Boucher, C 2021, 'Succinct dynamic de Bruijn graphs', Bioinformatics, vol. 37, no. 14, pp. 1946-1952. https://doi.org/10.1093/bioinformatics/btaa546

Title: Succinct dynamic de Bruijn graphs
Author: Alipanahi, Bahar; Kuhnle, Alan; Puglisi, Simon J.; Salmela, Leena; Boucher, Christina
Contributor organization: Department of Computer Science; Helsinki Institute for Information Technology; Algorithmic Bioinformatics; Bioinformatics
Date: 2021-07-15
Language: eng
Number of pages: 7
Belongs to series: Bioinformatics
ISSN: 1367-4803
DOI: https://doi.org/10.1093/bioinformatics/btaa546
URI: http://hdl.handle.net/10138/346271
Abstract: Motivation: The de Bruijn graph is one of the fundamental data structures for analysis of high-throughput sequencing data. In order to be applicable to population-scale studies, it is essential to build and store the graph in a space- and time-efficient manner. In addition, due to the ever-changing nature of population studies, it has become essential to update the graph after construction, e.g. to add and remove nodes and edges. Although there has been substantial effort on making the construction and storage of the graph efficient, there is a limited amount of work on building the graph in an efficient and mutable manner. Hence, most space-efficient data structures require complete reconstruction of the graph in order to add or remove edges or nodes. Results: In this article, we present DynamicBOSS, a succinct representation of the de Bruijn graph that allows for an unlimited number of additions and deletions of nodes and edges. We compare our method with other competing methods and demonstrate that DynamicBOSS is the only method that supports both addition and deletion and is applicable to very large samples (e.g. greater than 15 billion k-mers). Competing dynamic methods either cannot be constructed on large-scale datasets (e.g. FDBG) or cannot support both addition and deletion (e.g. BiFrost).
Subject: 1182 Biochemistry, cell and molecular biology; 113 Computer and information sciences; 111 Mathematics
Peer reviewed: Yes
Usage restriction: openAccess
Self-archived version: acceptedVersion


Files in this item

File Size Format
main.pdf 894.8 Kb PDF