Hardness of Covering Alignment : Phase Transition in Post-Sequence Genomics

Show full item record



Permalink

http://hdl.handle.net/10138/298635

Citation

Rizzi , R , Cairo , M , Mäkinen , V , Tomescu , A I & Valenzuela , D 2019 , ' Hardness of Covering Alignment : Phase Transition in Post-Sequence Genomics ' , IEEE/ACM Transactions on Computational Biology and Bioinformatics , vol. 16 , no. 1 , pp. 23-30 . https://doi.org/10.1109/TCBB.2018.2831691

Title: Hardness of Covering Alignment : Phase Transition in Post-Sequence Genomics
Author: Rizzi, Romeo; Cairo, Massimo; Mäkinen, Veli; Tomescu, Alexandru I.; Valenzuela, Daniel
Contributor: University of Helsinki, Genome-scale Algorithmics research group / Veli Mäkinen
University of Helsinki, Department of Computer Science
University of Helsinki, Helsinki Institute for Information Technology
Date: 2019-02
Language: eng
Number of pages: 8
Belongs to series: IEEE/ACM Transactions on Computational Biology and Bioinformatics
ISSN: 1545-5963
URI: http://hdl.handle.net/10138/298635
Abstract: Covering alignment problems arise from recent developments in genomics; so called pan-genome graphs are replacing reference genomes, and advances in haplotyping enable full content of diploid genomes to be used as basis of sequence analysis. In this paper, we show that the computational complexity will change for natural extensions of alignments to pan-genome representations and to diploid genomes. More broadly, our approach can also be seen as a minimal extension of sequence alignment to labelled directed acyclic graphs (labeled DAGs). Namely, we show that finding a covering alignment of two labeled DAGs is NP-hard even on binary alphabets. A covering alignment asks for two paths R-1 (red) and G(1) (green) in DAG D-1 and two paths R-2 (red) and G(2) (green) in DAG D-2 that cover the nodes of the graphs and maximize the sum of the global alignment scores: asosp(R-1), sp(R-2)) + asosp(G(1)), sp(G(2))), where sp(P) is the concatenation of labels on the path P. Pair-wise alignment of haplotype sequences forming a diploid chromosome can be converted to a two-path coverable labelled DAG, and then the covering alignment models the similarity of two diploids over arbitrary recombinations. We also give a reduction to the other direction, to show that such a recombination-oblivious diploid alignment is NP-hard on alphabets of size 3.
Subject: 113 Computer and information sciences
Alignment
edit distance
directed acyclic graph
diploid genome
pan-genome
NP-hard problem
COMPLEXITY
Rights:


Files in this item

Total number of downloads: Loading...

Files Size Format View
main.pdf 545.6Kb PDF View/Open

This item appears in the following Collection(s)

Show full item record