Browsing by Subject "Barcode"

Now showing items 1-3 of 3
  • Somervuo, Panu; Koskinen, Patrik; Mei, Peng; Holm, Liisa; Auvinen, Petri; Paulin, Lars (2018)
    Background: Current high-throughput sequencing platforms provide the capacity to sequence multiple samples in parallel. Different samples are labeled by attaching a short sample-specific nucleotide sequence, a barcode, to each DNA molecule prior to pooling them into a mix containing a number of libraries to be sequenced simultaneously. After sequencing, the samples are binned by identifying the barcode sequence within each sequence read. In order to tolerate sequencing errors, barcodes should be sufficiently far apart from each other in sequence space. An additional constraint, due to both nucleotide usage and basecalling accuracy, is that the proportions of the different nucleotides should be balanced at each barcode position. The number of samples to be mixed in each sequencing run may vary, which raises the problem of how to select the best subset of the barcodes available at a sequencing core facility for each run. There are plenty of tools available for de novo barcode design, but they are not suitable for subset selection. Results: We have developed a tool which can be used for three different tasks: 1) selecting an optimal barcode set from a larger set of candidates, 2) checking the compatibility of a user-defined set of barcodes, e.g. whether two or more libraries with existing barcodes can be combined in a single sequencing pool, and 3) augmenting an existing set of barcodes. In our approach the selection process is formulated as a minimization problem. We define the cost function and a set of constraints and use integer programming to solve the resulting combinatorial problem. Based on the desired number of barcodes to be selected and the set of candidate sequences given by the user, the necessary constraints are automatically generated and the optimal solution can be found. The method is implemented in the C programming language, and a web interface is available at http://ekhidna2.biocenter.helsinki.fi/barcosel.
Conclusions: The increasing capacity of sequencing platforms raises the challenge of mixing barcodes. Our method allows the user to select a given number of barcodes from a larger existing barcode set so that sequencing errors are tolerated and the nucleotide balance is optimized. The tool is easy to access via a web browser.
  • Somervuo, Panu; Koskinen, Patrik; Mei, Peng; Holm, Liisa; Auvinen, Petri; Paulin, Lars (BioMed Central, 2018)
    Background: Current high-throughput sequencing platforms provide the capacity to sequence multiple samples in parallel. Different samples are labeled by attaching a short sample-specific nucleotide sequence, a barcode, to each DNA molecule prior to pooling them into a mix containing a number of libraries to be sequenced simultaneously. After sequencing, the samples are binned by identifying the barcode sequence within each sequence read. In order to tolerate sequencing errors, barcodes should be sufficiently far apart from each other in sequence space. An additional constraint, due to both nucleotide usage and basecalling accuracy, is that the proportions of the different nucleotides should be balanced at each barcode position. The number of samples to be mixed in each sequencing run may vary, which raises the problem of how to select the best subset of the barcodes available at a sequencing core facility for each run. There are plenty of tools available for de novo barcode design, but they are not suitable for subset selection. Results: We have developed a tool which can be used for three different tasks: 1) selecting an optimal barcode set from a larger set of candidates, 2) checking the compatibility of a user-defined set of barcodes, e.g. whether two or more libraries with existing barcodes can be combined in a single sequencing pool, and 3) augmenting an existing set of barcodes. In our approach the selection process is formulated as a minimization problem. We define the cost function and a set of constraints and use integer programming to solve the resulting combinatorial problem. Based on the desired number of barcodes to be selected and the set of candidate sequences given by the user, the necessary constraints are automatically generated and the optimal solution can be found. The method is implemented in the C programming language, and a web interface is available at http://ekhidna2.biocenter.helsinki.fi/barcosel.
Conclusions: The increasing capacity of sequencing platforms raises the challenge of mixing barcodes. Our method allows the user to select a given number of barcodes from a larger existing barcode set so that sequencing errors are tolerated and the nucleotide balance is optimized. The tool is easy to access via a web browser.
  • Liimatainen, Kare; Niskanen, Tuula; Dima, Bálint; Ammirati, Joseph F.; Kirk, Paul M.; Kytövuori, Ilkka (2020)
    So far approximately 144,000 species of fungi have been named, but sequences for the majority of them do not exist in the public databases. The quality and coverage of public barcode databases is therefore a bottleneck that hinders the study of fungi. Cortinarius is the largest genus of Agaricales, with thousands of species worldwide. The most diverse subgenus in Cortinarius is Telamonia, and its species have been considered among the most taxonomically challenging in the Agaricales. Its high diversity, combined with convergent, similar-looking taxa, has earned it a reputation as an impossible group to study. In this study a total of 746 specimens, including 482 type specimens representing 184 species, were sequenced. A significant number of old types were also successfully sequenced: 105 type specimens were over 50 years old and 18 type specimens were over 100 years old. Altogether, 20 epi- or neotypes are proposed for older names that have recently been in common use. Our study doubles the number of reliable DNA barcodes of species of C. subgenus Telamonia in the public sequence databases. This is also the first extensive phylogenetic study of the subgenus. A majority of the sections and species are shown in a phylogenetic context for the first time. Our study shows that nomenclatural problems, even in difficult groups like C. subgenus Telamonia, can be solved, and that identification of species based on ITS barcodes consequently becomes an easy task even for non-experts.
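
Illustrative note on the barcode subset selection task described in the Somervuo et al. (2018) records above: the sketch below shows, in Python, what "sufficiently far apart in sequence space" and "nucleotide balance at each position" can mean in practice. It is not the authors' method. The function names, the use of Hamming distance, and the cost function (the spread between the most and least frequent base at each position) are all assumptions made for illustration, and an exhaustive search stands in for the integer programming formulation, so it is only practical for small candidate sets.

```python
from collections import Counter
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_distance(barcodes):
    """Smallest pairwise Hamming distance in a set of at least two barcodes."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


def imbalance(barcodes):
    """Assumed cost: per-position spread between the most and least
    frequent base (A, C, G, T), summed over all positions."""
    total = 0
    for column in zip(*barcodes):
        counts = Counter(column)
        per_base = [counts.get(base, 0) for base in "ACGT"]
        total += max(per_base) - min(per_base)
    return total


def select_barcodes(candidates, k, min_dist):
    """Exhaustively search all k-subsets (k >= 2) of the candidates.

    Returns (subset, cost) for the subset with the lowest imbalance among
    those whose pairwise distance is at least min_dist, or (None, None)
    if no subset satisfies the distance constraint.
    """
    best, best_cost = None, None
    for subset in combinations(candidates, k):
        if min_distance(subset) < min_dist:
            continue  # too close: a sequencing error could confuse two samples
        cost = imbalance(subset)
        if best_cost is None or cost < best_cost:
            best, best_cost = subset, cost
    return best, best_cost


if __name__ == "__main__":
    pool = ["AAAA", "CCCC", "GGGG", "TTTT", "AAAC"]
    print(select_barcodes(pool, k=4, min_dist=3))
```

The same two helper functions, `min_distance` and `imbalance`, could also be applied directly to a user-defined set to check whether existing libraries are compatible in one pool, which corresponds loosely to the second task listed in the abstract.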