Evaluating approaches to find exon chains based on long reads

Permalink

http://hdl.handle.net/10138/217898

Citation

Kuosmanen, A, Norri, T & Mäkinen, V 2018, 'Evaluating approaches to find exon chains based on long reads', Briefings in Bioinformatics, vol. 19, no. 3, pp. 404-414. https://doi.org/10.1093/bib/bbw137

Title: Evaluating approaches to find exon chains based on long reads
Author: Kuosmanen, Anna; Norri, Tuukka; Mäkinen, Veli
Contributor organization: Genome-scale Algorithmics research group / Veli Mäkinen; Department of Computer Science; Helsinki Institute for Information Technology; Bioinformatics; Algorithmic Bioinformatics
Date: 2018-05
Language: eng
Number of pages: 11
Belongs to series: Briefings in Bioinformatics
ISSN: 1467-5463
DOI: https://doi.org/10.1093/bib/bbw137
URI: http://hdl.handle.net/10138/217898
Abstract: Transcript prediction can be modeled as a graph problem where exons are modeled as nodes and reads spanning two or more exons are modeled as exon chains. Pacific Biosciences third-generation sequencing technology produces significantly longer reads than earlier second-generation sequencing technologies, which gives valuable information about longer exon chains in a graph. However, with the high error rates of third-generation sequencing, aligning long reads correctly around the splice sites is a challenging task. Incorrect alignments lead to spurious nodes and arcs in the graph, which in turn lead to incorrect transcript predictions. We survey several approaches to find the exon chains corresponding to long reads in a splicing graph, and experimentally study the performance of these methods using simulated data to allow for sensitivity/precision analysis. Our experiments show that short reads from second-generation sequencing can be used to significantly improve exon chain correctness either by error-correcting the long reads before splicing graph creation, or by using them to create a splicing graph on which the long-read alignments are then projected. We also study the memory and time consumption of various modules, and show that accurate exon chains lead to significantly increased transcript prediction accuracy. Availability: The simulated data and in-house scripts used for this article are available at http://www.cs.helsinki.fi/group/gsa/exon-chains/exon-chains-bib.tar.bz2.
Subject: 113 Computer and information sciences; alternative splicing; transcript prediction; RNA sequencing; split-read alignment; RNA-SEQ DATA; MESSENGER-RNA; TRANSCRIPTOME; QUANTIFICATION
Peer reviewed: Yes
Rights: cc_by_nc
Usage restriction: openAccess
Self-archived version: publishedVersion
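
The graph model described in the abstract (exons as nodes, reads spanning two or more exons as exon chains) can be illustrated with a minimal Python sketch. This is not the authors' code: the exon coordinates, the read alignments, and the function names exon_chain and build_splicing_graph are hypothetical, and the exons are assumed to be known in advance, which sidesteps the alignment problem the article actually studies.

```python
from collections import defaultdict

# Hypothetical toy exons: name -> (start, end), half-open coordinates.
EXONS = {"e1": (100, 200), "e2": (300, 400), "e3": (500, 600)}


def exon_chain(blocks, exons):
    """Map the aligned blocks of one spliced read to the ordered exons they overlap."""
    ordered = sorted(exons.items(), key=lambda kv: kv[1])
    chain = []
    for b_start, b_end in blocks:
        for name, (e_start, e_end) in ordered:
            overlaps = b_start < e_end and e_start < b_end
            # Skip an exon that was just appended by the previous block.
            if overlaps and (not chain or chain[-1] != name):
                chain.append(name)
    return chain


def build_splicing_graph(chains):
    """Arcs join consecutive exons of a chain; the weight counts supporting reads."""
    arcs = defaultdict(int)
    for chain in chains:
        for u, v in zip(chain, chain[1:]):
            arcs[(u, v)] += 1
    return dict(arcs)


# Each read is a list of aligned blocks (start, end) on the genome (assumed data).
reads = [
    [(150, 200), (300, 400), (500, 560)],  # covers e1, e2, e3
    [(120, 200), (500, 590)],              # skips e2
]
chains = [exon_chain(r, EXONS) for r in reads]
graph = build_splicing_graph(chains)
```

In this toy setting a block boundary that is off by a few bases still maps to the right exon. With real long reads, misaligned splice sites would instead create spurious nodes and arcs, which is the failure mode the abstract describes.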


Files in this item

File Size Format
bbw137.pdf 852.6Kb PDF
