Genome-based natural product biosynthetic gene cluster discovery : from sequencing to mining

Title: Genome-based natural product biosynthetic gene cluster discovery : from sequencing to mining
Author: Wang, Hao
Contributor: University of Helsinki, Faculty of Agriculture and Forestry, Department of Food and Environmental Sciences, Division of Microbiology and Biotechnology
Publisher: Helsingin yliopisto
Date: 2014-03-14
Language: en
Thesis level: Doctoral dissertation (article-based)
Abstract: Natural products are small molecules produced by a range of living organisms. They may be toxic or have pharmaceutical applications as antibiotics and as anticancer, antiparasitic and antifungal agents. Natural products such as microcystins are commonly synthesized by nonribosomal peptide synthetases (NRPSs) and polyketide synthases (PKSs). Ribosomal pathways in cyanobacteria are also known for the synthesis of bacteriocins, lantibiotics, cyanobactins and microviridins. Genes encoding the biosynthetic enzymes of these systems are often found together and form gene clusters. The filamentous cyanobacterium Anabaena sp. strain 90, a hepatotoxin producer isolated from a bloom in a Finnish lake, was selected for genome sequencing in order to explore its full capacity for bioactive compound production. The 5.3-Mb Anabaena sp. 90 genome displays a multi-chromosomal composition with five circular replicons: two chromosomes and three plasmids. A total of four nonribosomal biosynthetic gene clusters, which are responsible for the production of anabaenopeptilides, anabaenopeptins, microcystins and the novel glycolipopeptides hassallidins, were identified in chromosome I. Genome annotation revealed that the Anabaena sp. 90 genome also harbors an anacyclamide-encoding cyanobactin gene cluster and seven putative bacteriocin gene clusters, which belong to the ribosomal pathways. These biosynthetic gene clusters amount to a total of ~250 kb, or 5% of the genome. Analysis of the Anabaena sp. 90 genome suggested that cyanobacteria might produce bacteriocins. Thorough genome mining at the phylum level was conducted to discover cyanobacterial bacteriocin biosynthetic pathways. The results demonstrated the common presence of bacteriocin gene clusters in cyanobacteria. A total of 145 bacteriocin gene clusters were discovered, the majority of which were previously unknown.
Based on their gene organization and domain composition, these gene clusters were classified into seven groups. This classification is supported by phylogenetic analysis, which also indicates independent evolutionary trajectories of the gene clusters in different groups. By scrutinizing the surrounding regions of these gene clusters, a total of 290 putative precursors were located. They showed diverse structures and very little sequence conservation of the core peptide. To explore the distribution of NRPSs and PKSs, a comprehensive genome-mining study examined a total of 2,699 genomes and demonstrated their widespread occurrence across the three domains of life, with 3,339 gene clusters discovered in 991 organisms. The majority of these gene clusters were found in bacteria, in which a high correlation between genome size and the capacity of NRPS and PKS biosynthetic pathways was observed. Currently, PKSs are classified into three types. Type I PKSs and NRPSs are known to share a modular scheme with a multidomain structure. Surprisingly, a large number (8,906) of enzymes containing a single NRPS or type I PKS functional domain were found. These monodomain enzymes have a genetic organization similar to that of type II PKSs, which are nonmodular enzymes. The common occurrence of nonmodular NRPSs and type I PKSs differs substantially from current knowledge. Furthermore, a total of 314 gene clusters composed mostly of monodomain enzymes were found. In addition, sequence analysis suggested that the evolution of NRPS machineries was a combination of common descent and horizontal gene transfer.
Natural products are bioactive compounds produced by living organisms. They have diverse chemical structures and broad biological activities, which lend themselves to pharmaceutical applications such as drug lead candidates. Nonribosomal peptides and polyketides are the most commonly utilized natural products.
Recently, ribosomally synthesized natural products, such as bacteriocins, lantibiotics and cyanobactins, were also found to have interesting activities and have emerged as potential sources of novel medical agents. Given the advancement of DNA sequencing techniques and the exponential growth of genomic data, more than three thousand biosynthetic pathways of nonribosomal peptides, polyketides and bacteriocins were discovered in this study by complete genome sequencing and systematic genome mining. The majority of these pathways have unknown end-products, which highlights the power of genome mining in discovering novel biosynthetic machinery for secondary metabolites. Genome sequencing revealed that 5% of the Anabaena sp. 90 genome is dedicated to the production of bioactive peptides. Genome mining demonstrated the widespread occurrence of bacteriocin gene clusters in cyanobacteria, which were shown to be a rich source of natural products biosynthesized by both nonribosomal and ribosomal pathways. Furthermore, a comprehensive genomic survey of nonribosomal peptide and polyketide biosynthetic pathways demonstrated their widespread distribution across the three domains of life. This atlas showed that Proteobacteria, Actinobacteria, Firmicutes and Cyanobacteria among bacteria, and the phylum Ascomycota among fungi, contained higher numbers of these gene clusters and may produce a vast array of nonribosomal peptides and polyketides. The common occurrence of non-canonical, nonmodular peptide synthetase and type I polyketide synthase enzymes was also revealed. The knowledge gained in this study provides a solid basis for the exploration of natural product biosynthetic capacity, for example to aid drug discovery.
Subject: microbiology
Rights: This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.

Files in this item

genomeba.pdf (633.4 Kb, PDF)
