Bit-parallel sequence-to-graph alignment

Show simple item record

dc.contributor University of Helsinki, Genome-scale Algorithmics research group / Veli Mäkinen en
dc.contributor.author Rautiainen, Mikko
dc.contributor.author Mäkinen, Veli
dc.contributor.author Marschall, Tobias
dc.date.accessioned 2019-12-12T11:11:01Z
dc.date.available 2019-12-12T11:11:01Z
dc.date.issued 2019-10-01
dc.identifier.citation Rautiainen , M , Mäkinen , V & Marschall , T 2019 , ' Bit-parallel sequence-to-graph alignment ' , Bioinformatics , vol. 35 , no. 19 , pp. 3599-3607 . https://doi.org/10.1093/bioinformatics/btz162 en
dc.identifier.issn 1367-4803
dc.identifier.other PURE: 128884132
dc.identifier.other PURE UUID: 8d72c7be-0656-4794-9e4a-92f720d4fa87
dc.identifier.other WOS: 000499322300008
dc.identifier.other ORCID: /0000-0003-4454-1493/work/66032984
dc.identifier.uri http://hdl.handle.net/10138/308160
dc.description.abstract Motivation: Graphs are commonly used to represent sets of sequences. Either edges or nodes can be labeled by sequences, so that each path in the graph spells a concatenated sequence. Examples include graphs to represent genome assemblies, such as string graphs and de Bruijn graphs, and graphs to represent a pan-genome and hence the genetic variation present in a population. Being able to align sequencing reads to such graphs is a key step for many analyses and its applications include genome assembly, read error correction and variant calling with respect to a variation graph. Results: We generalize two linear sequence-to-sequence algorithms to graphs: the Shift-And algorithm for exact matching and Myers' bitvector algorithm for semi-global alignment. These linear algorithms are both based on processing w sequence characters with a constant number of operations, where w is the word size of the machine (commonly 64), and achieve a speedup of up to w over naive algorithms. For a graph with vertical bar V vertical bar nodes and vertical bar E vertical bar edges and a sequence of length m, our bitvector-based graph alignment algorithm reaches a worst case runtime of O(vertical bar V vertical bar+(sic)m/w(sic)vertical bar E vertical bar logw) for acyclic graphs and O(vertical bar V vertical bar+m vertical bar E vertical bar logw) for arbitrary cyclic graphs. We apply it to five different types of graphs and observe a speedup between 3-fold and 20-fold compared with a previous (asymptotically optimal) alignment algorithm. en
dc.format.extent 9
dc.language.iso eng
dc.relation.ispartof Bioinformatics
dc.rights en
dc.subject DE-BRUIJN GRAPHS en
dc.subject APPROXIMATE en
dc.subject ALGORITHM en
dc.subject ACCURATE en
dc.subject 1182 Biochemistry, cell and molecular biology en
dc.subject 113 Computer and information sciences en
dc.subject 111 Mathematics en
dc.title Bit-parallel sequence-to-graph alignment en
dc.type Article
dc.description.version Peer reviewed
dc.identifier.doi https://doi.org/10.1093/bioinformatics/btz162
dc.type.uri info:eu-repo/semantics/other
dc.type.uri info:eu-repo/semantics/publishedVersion
dc.contributor.pbl
dc.contributor.pbl
dc.contributor.pbl
dc.contributor.pbl
dc.contributor.pbl

Files in this item

Total number of downloads: Loading...

Files Size Format View
btz162.pdf 977.7Kb PDF View/Open

This item appears in the following Collection(s)

Show simple item record