Data structures based on k-mers for querying large collections of sequencing data sets

Permanent link

http://hdl.handle.net/10138/327976

Citation

Marchet, C, Boucher, C, Puglisi, S J, Medvedev, P, Salson, M & Chikhi, R 2021, 'Data structures based on k-mers for querying large collections of sequencing data sets', Genome Research, vol. 31, no. 1. https://doi.org/10.1101/gr.260604.119

Title: Data structures based on k-mers for querying large collections of sequencing data sets
Author: Marchet, Camille; Boucher, Christina; Puglisi, Simon J.; Medvedev, Paul; Salson, Mikael; Chikhi, Rayan
Author's organization: Department of Computer Science
Helsinki Institute for Information Technology
Algorithmic Bioinformatics
Bioinformatics
Date: 2021-01
Language: eng
Number of pages: 12
Belongs to series: Genome Research
ISSN: 1088-9051
DOI: https://doi.org/10.1101/gr.260604.119
URI: http://hdl.handle.net/10138/327976
Abstract: High-throughput sequencing data sets are usually deposited in public repositories (e.g., the European Nucleotide Archive) to ensure reproducibility. As the amount of data has reached petabyte scale, repositories do not allow one to perform online sequence searches, yet such a feature would be highly useful to investigators. Toward this goal, in the last few years several computational approaches have been introduced to index and query large collections of data sets. Here, we propose an accessible survey of these approaches, which are generally based on representing data sets as sets of k-mers. We review their properties, introduce a classification, and present their general intuition. We summarize their performance and highlight their current strengths and limitations.
Keywords: DE-BRUIJN GRAPHS
ALIGNMENT-FREE
SEARCH
QUANTIFICATION
DATABASES
THOUSANDS
READS
1182 Biochemistry, cell and molecular biology
1184 Genetics, developmental biology, physiology
Peer reviewed: Yes
Rights: cc_by
Access rights: openAccess
Self-archived version: publishedVersion


Files

File(s) Size Format
Genome_Res._2021_Marchet_1_12.pdf 489.3KB PDF
