TY -
T1 - Data structures based on k-mers for querying large collections of sequencing data sets
SN - /
UR - http://hdl.handle.net/10138/327976
T3 -
A1 - Marchet, Camille; Boucher, Christina; Puglisi, Simon J.; Medvedev, Paul; Salson, Mikael; Chikhi, Rayan
A2 -
PB -
Y1 - 2021
LA - eng
AB - High-throughput sequencing data sets are usually deposited in public repositories (e.g., the European Nucleotide Archive) to ensure reproducibility. As the amount of data has reached petabyte scale, repositories do not allow one to perform online sequence searches, yet, such a feature would be highly useful to investigators. Toward this goal, in the last few years several computational approaches have been introduced to index and query large collections of data sets. Here, we propose an accessib...
VO -
IS -
SP -
OP -
KW - DE-BRUIJN GRAPHS; ALIGNMENT-FREE; SEARCH; QUANTIFICATION; DATABASES; THOUSANDS; READS; 1182 Biochemistry, cell and molecular biology; 1184 Genetics, developmental biology, physiology
N1 -
PP -
ER -