FEELnc: a tool for long non-coding RNA annotation and its application to the dog transcriptome

Wucher, V, Legeai, F, Hedan, B, Rizk, G, Lagoutte, L, Leeb, T, Jagannathan, V, Cadieu, E, David, A, Lohi, H, Cirera, S, Fredholm, M, Botherel, N, Leegwater, P A J, Le Beguec, C, Fieten, H, Johnson, J, Alfoldi, J, Andre, C, Lindblad-Toh, K, Hitte, C & Derrien, T 2017, 'FEELnc: a tool for long non-coding RNA annotation and its application to the dog transcriptome', Nucleic Acids Research, vol. 45, no. 8, e57. https://doi.org/10.1093/nar/gkw1306

Title: FEELnc: a tool for long non-coding RNA annotation and its application to the dog transcriptome
Author: Wucher, Valentin; Legeai, Fabrice; Hedan, Benoit; Rizk, Guillaume; Lagoutte, Laetitia; Leeb, Tosso; Jagannathan, Vidhya; Cadieu, Edouard; David, Audrey; Lohi, Hannes; Cirera, Susanna; Fredholm, Merete; Botherel, Nadine; Leegwater, Peter A. J.; Le Beguec, Celine; Fieten, Hille; Johnson, Jeremy; Alfoldi, Jessica; Andre, Catherine; Lindblad-Toh, Kerstin; Hitte, Christophe; Derrien, Thomas
Contributor organization: University of Helsinki; Research Programs Unit; Hannes Tapani Lohi / Principal Investigator; Veterinary Biosciences; Veterinary Genetics; Research Programme for Molecular Neurology
Date: 2017-05-05
Language: eng
Number of pages: 12
Belongs to series: Nucleic Acids Research
ISSN: 0305-1048
DOI: https://doi.org/10.1093/nar/gkw1306
URI: http://hdl.handle.net/10138/185913
Abstract: Whole transcriptome sequencing (RNA-seq) has become a standard for cataloguing and monitoring RNA populations. One of the main bottlenecks, however, is to correctly identify the different classes of RNAs among the plethora of reconstructed transcripts, particularly those that will be translated (mRNAs) from the class of long non-coding RNAs (lncRNAs). Here, we present FEELnc (FlExible Extraction of LncRNAs), an alignment-free program that accurately annotates lncRNAs based on a Random Forest model trained with general features such as multi k-mer frequencies and relaxed open reading frames. Benchmarking versus five state-of-the-art tools shows that FEELnc achieves similar or better classification performance on GENCODE and NONCODE data sets. The program also provides specific modules that enable the user to fine-tune classification accuracy, to formalize the annotation of lncRNA classes and to identify lncRNAs even in the absence of a training set of non-coding RNAs. We used FEELnc on a real data set comprising 20 canine RNA-seq samples produced by the European LUPA consortium to substantially expand the canine genome annotation to include 10 374 novel lncRNAs and 58 640 mRNA transcripts. FEELnc moves beyond conventional coding potential classifiers by providing a standardized and complete solution for annotating lncRNAs and is freely available at https://github.com/tderrien/FEELnc.
Subject: 1182 Biochemistry, cell and molecular biology
Peer reviewed: Yes
Rights: cc_by_nc
Usage restriction: openAccess
Self-archived version: publishedVersion
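
The abstract names two general feature families used by the classifier: multi k-mer frequencies and relaxed open reading frames. The sketch below illustrates what such features can look like for a single transcript sequence. It is not FEELnc's code: the function names (`kmer_frequencies`, `longest_orf_length`, `feature_vector`), the choice of k = 1, 2, 3, and the simplified ORF definitions (forward strand only, "relaxed" meaning an ORF may run to the end of the transcript without a stop codon) are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def kmer_frequencies(seq, k):
    """Relative frequency of every DNA k-mer, counted over overlapping windows."""
    seq = seq.upper()
    windows = (seq[i:i + k] for i in range(len(seq) - k + 1))
    # Skip windows containing ambiguous bases such as N.
    counts = Counter(w for w in windows if set(w) <= set("ACGT"))
    total = sum(counts.values())
    return {
        "".join(p): (counts["".join(p)] / total if total else 0.0)
        for p in product("ACGT", repeat=k)
    }


def longest_orf_length(seq, require_stop=True):
    """Length (nt) of the longest forward-strand ORF that starts at ATG.

    With require_stop=False (the illustrative "relaxed" mode), an ORF that
    reaches the end of the sequence without a stop codon is also counted.
    """
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)  # include the stop codon
                start = None
        if start is not None and not require_stop:
            # Open-ended ORF: count whole codons up to the sequence end.
            best = max(best, ((len(seq) - start) // 3) * 3)
    return best


def feature_vector(seq, ks=(1, 2, 3)):
    """Concatenate k-mer frequencies for several k with two ORF coverage values."""
    features = []
    for k in ks:
        freqs = kmer_frequencies(seq, k)
        features.extend(freqs[key] for key in sorted(freqs))
    length = max(len(seq), 1)
    features.append(longest_orf_length(seq, require_stop=True) / length)
    features.append(longest_orf_length(seq, require_stop=False) / length)
    return features
```

Vectors of this kind, computed for known mRNAs and known lncRNAs, are the sort of input a Random Forest classifier could be trained on; the actual features, ORF types and training procedure are described in the paper and the FEELnc repository.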

Files in this item: gkw1306.pdf (1.751 MB, PDF)
