Keywords: Secondary metabolites; Natural products; Terpenes; Terpene cyclase; Gas chromatography-mass spectrometry; Germacradienol; Geosmin; Farnesyl pyrophosphate; Isoprene; Genome mining; Heterologous overexpression
Identification, cloning, overexpression and characterization of putative terpene cyclase from Stackebrandtia nassauensis
Abstract
The goal of this study was to establish, at least in part, that a putative terpene synthase gene (snas_1127), identified by genome mining in the soil-dwelling, Gram-positive, filamentous actinobacterium Stackebrandtia nassauensis DSM 44728, encodes a functional germacradienol synthase. The snas_1127 gene of S. nassauensis was amplified by PCR, cloned, and heterologously expressed in Escherichia coli as an N-terminally His6-tagged protein. Incubation of the recombinant protein, SNAS_1127 (hereafter SNGS), with farnesyl diphosphate (FPP, also called farnesyl pyrophosphate) in the presence of Mg2+ gave the sesquiterpene germacradienol, a precursor of many industrially important terpene metabolites. Bioinformatics analysis revealed that the target protein contains two conserved domains, a C-terminal domain and an N-terminal domain. It is already established that the C-terminal domain catalyzes the conversion of FPP to germacradienol, which then serves as the substrate for the N-terminal domain and is converted to geosmin. Manual comparison of the SNGS sequence with previously characterized germacradienol/geosmin synthases showed a point mutation in the NSE triad of the N-terminal domain (274E instead of the usual 274D). This mutation might explain why SNGS catalyzed only the conversion of FPP to germacradien-4-ol and why the further conversion did not proceed as expected; no other identifiable products were observed by GC-MS. Because SNGS stops at germacradienol, this precursor of many industrially useful metabolites can be synthesized in a controlled manner without the risk of further conversion by the N-terminal domain.
Terpenes are the largest and structurally most diverse group of natural products. They are produced mainly by plants, fungi, and animals, while comparatively few are known from bacteria. Terpene synthases (TPSs) catalyze some of the most complex cyclization reactions in nature, i.e., the cyclization of linear isoprenoid precursors into cyclic terpenes. These terpenes have a broad range of applications, including pharmaceuticals, perfumes and fragrances, agrochemicals, and jet fuels.
The present study aimed to identify a novel putative terpene synthase in the genome sequence of Stackebrandtia nassauensis DSM 44728 by a genome-mining approach. The identified snas_1127 (sngs) gene was then cloned and overexpressed, and its product was functionally characterized. The findings of this study are as follows:
The putative TPS gene (sngs) was identified in the S. nassauensis DSM 44728 genome using the FASTA sequence of a previously characterized pentalenene synthase (protein ID: ADO85594.1) from Streptomyces exfoliatus UC5319 as a BLASTp query. The sngs gene has an open reading frame of 2310 bp encoding a protein (SNGS) of 769 amino acids with a theoretical molecular weight of 86.26 kDa and a pI of 5.24. The protein sequence of the putative TPS showed 27% identity with the query sequence. It was then aligned with previously characterized TPSs using the ClustalW2 program. SNGS was found to contain all the characteristic conserved motifs of a TPS catalytic site, i.e., an aspartate-rich motif [92-DDHFLE-97] and an NSE triad [NDLFSYQRE] in the N-terminal domain, and [459-DDLYPTV-465] and [602-NDLFSYQKE-611] in the C-terminal domain. Phylogenetic analysis placed SNGS in the sesquiterpene synthase family; like the other members of its clade, it catalyzes the bioconversion of farnesyl diphosphate into germacradienol/geosmin. The sngs gene was not part of any biosynthetic gene cluster or operon. ProtParam analysis shed light on the physicochemical properties of SNGS: its grand average of hydropathicity (GRAVY) was -0.36, indicating that SNGS is a hydrophilic protein that is likely to be obtained in the soluble fraction. TMHMM and CELLO results supported the ProtParam output and predicted that SNGS is a cytosolic protein. The aliphatic index of 80.96 suggested that the protein is thermally stable and rich in aliphatic amino acid residues, whereas the instability index (47.04, above the ProtParam threshold of 40) suggested that the protein may be unstable.
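The sequence checks described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the toy sequence is invented, the motif patterns are simply the literal motifs quoted in the text (with R/K allowed at one position of the NSE triad), and the hydropathy values are the standard Kyte-Doolittle scale on which GRAVY is based.

```python
import re

# Kyte-Doolittle hydropathy values (standard published scale).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Motifs quoted in the text; the R/K alternation is an illustrative choice.
PATTERNS = {
    "aspartate-rich (N-terminal)": "DDHFLE",
    "aspartate-rich (C-terminal)": "DDLYPTV",
    "NSE triad": "NDLFSYQ[RK]E",
}


def orf_length_bp(n_residues, include_stop=True):
    """Length in bp of an ORF encoding n_residues amino acids."""
    codons = n_residues + (1 if include_stop else 0)
    return 3 * codons


def find_motifs(seq, patterns):
    """Return 1-based inclusive (start, end) positions for each pattern."""
    return {
        name: [(m.start() + 1, m.end()) for m in re.finditer(pat, seq)]
        for name, pat in patterns.items()
    }


def gravy(seq):
    """Grand average of hydropathicity: mean Kyte-Doolittle value."""
    return sum(KD[aa] for aa in seq) / len(seq)


# Hypothetical toy sequence with the motifs embedded (NOT the SNGS sequence).
TOY = ("MAAA" + "DDHFLE" + "GGG" + "NDLFSYQRE" + "AAA"
       + "DDLYPTV" + "GG" + "NDLFSYQKE" + "K")

if __name__ == "__main__":
    print(orf_length_bp(769))          # 769 codons plus a stop codon -> 2310
    print(find_motifs(TOY, PATTERNS))  # motif positions in the toy sequence
    print(round(gravy(TOY), 2))        # GRAVY of the toy sequence
```

The first check confirms that the reported ORF length is consistent with the reported protein length: 769 codons plus one stop codon give 2310 bp. Running `find_motifs` and `gravy` on the real SNGS sequence (UniProt ID: D3QB20) would be the natural next step.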
A homology model of SNGS (UniProt ID: D3QB20) was constructed with the I-TASSER server, using the crystal structure of germacradienol/geosmin synthase from Streptomyces coelicolor (PDB ID: 5DZ2, chain A) as a template; the N-terminal and C-terminal domains showed 68.8% and 39.8% identity with SNGS, respectively. The C-score of the SNGS model was -1.25, signifying that the model was generated with acceptable confidence (I-TASSER C-scores typically range from -5 to 2, and values above -1.5 generally indicate a correct global topology). The TM-score and RMSD values for SNGS were 0.56 ± 0.15 and 0.00 Å, which indicated that the constructed model is accurate and of high quality. The correctness and consistency of the backbone conformation were evaluated with a Ramachandran plot generated by PROCHECK, which showed that 97.7% of the amino acid residues of SNGS lay in the most favored, additionally allowed, and generously allowed regions. Further, the ERRAT plot gave an overall quality factor of 88.375 for the SNGS model, which supported the accuracy of the model.
For molecular cloning, S. nassauensis DSM 44728 was grown in N-Z amine broth, and its genomic DNA was extracted by the proteinase K-lysozyme lysis method. Gene-specific primers were designed with Geneious R7 software to amplify the sngs gene from the extracted genomic DNA. Because the primers had a high GC content, Q5® High-Fidelity DNA Polymerase was used for amplification. A PCR product covering the entire 2310 bp coding region was obtained at 70°C without any non-specific bands. The amplicon and the expression vector pET28a(+) were digested with EcoRI and HindIII and gel purified. The digested, gel-purified sngs insert and pET28a(+) vector, which carried cohesive ends, were ligated at a molar ratio of 5:1 (insert:vector) to form the recombinant construct (pETsngs). To propagate the construct, it was transformed into the cloning host E. coli DH10B. Positive transformants were identified by colony PCR, and the isolated recombinant plasmids were verified by a band shift assay on an agarose gel, in which the plasmids were digested with EcoRI and HindIII to confirm the size of the insert (sngs). The integrity of the recombinant construct was confirmed by DNA sequencing, which showed that the positive recombinant carried the desired gene with 100% identity to the sequence in the NCBI database.
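The 5:1 insert:vector molar ratio translates into DNA masses through the relative fragment lengths. The sketch below shows the arithmetic under stated assumptions: the pET28a(+) backbone is taken as roughly 5369 bp (a commonly quoted size, not given in the text), the insert as the 2310 bp coding region, and the 50 ng of vector is a hypothetical amount chosen only for illustration.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, insert_to_vector_ratio):
    """Mass of insert (ng) needed for a given insert:vector molar ratio.

    Moles of DNA are proportional to mass / length, so the insert mass
    scales with the insert/vector length ratio times the molar ratio.
    """
    return vector_ng * (insert_bp / vector_bp) * insert_to_vector_ratio


VECTOR_BP = 5369   # assumed approximate size of pET28a(+), not from the text
INSERT_BP = 2310   # sngs coding region, from the text
VECTOR_NG = 50.0   # hypothetical amount of digested vector

if __name__ == "__main__":
    ng = insert_mass_ng(VECTOR_NG, VECTOR_BP, INSERT_BP, 5)
    print(f"insert needed: {ng:.1f} ng")
```

With these assumed numbers, roughly 108 ng of insert would be combined with 50 ng of vector. The digested fragments differ slightly in length from the undigested values, which this sketch ignores.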
The recombinant pETsngs was then transformed into the expression host E. coli BL21(DE3) to overexpress the sngs gene under the control of the T7 promoter. SNGS expression was first tested on a small scale in control cells (containing the empty pET28a(+) vector), uninduced cells, and induced cells (both containing pETsngs), with induction at 0.01 mM, 0.05 mM, and 0.1 mM IPTG at 18°C for 24 hours. On SDS-PAGE, no SNGS expression band was seen in control cells, a thick expression band was observed in induced cells, and a small amount of leaky expression was seen in uninduced cells. The best induction was obtained with 0.05 mM IPTG at 18°C after 24 hours, so these conditions were selected for large-scale overexpression. SNGS was expressed as a soluble protein carrying the hexahistidine tag. Its molecular weight was calculated to be around 90 kDa (i.e., 86.26 kDa theoretical molecular weight of SNGS + 3.83 kDa of His6 tag). The overexpressed His6-tagged SNGS was purified by immobilized metal affinity chromatography on Ni-NTA agarose resin using a fast protein liquid chromatography (FPLC) system. The eluted His6-tagged SNGS appeared as a single peak on the Prime View interface of the FPLC system. The expression and purity of SNGS were assessed on an SDS-PAGE gel, where a single band corresponding to 90 kDa was observed. The eluted fractions were then quantified by the Bradford assay. From one liter of culture, 7.28 mg of highly purified SNGS protein was obtained.
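The figures above can be cross-checked with simple arithmetic. The sketch below adds the tag mass to the theoretical protein mass and converts the reported mass yield into a molar yield; the function names are illustrative, and the molar yield is a derived quantity that is not reported in the text.

```python
def tagged_mass_kda(protein_kda, tag_kda):
    """Expected mass of the tagged protein in kDa."""
    return protein_kda + tag_kda


def yield_nmol(mass_mg, mw_kda):
    """Convert a protein mass (mg) to nmol; mg / kDa gives micromoles."""
    return mass_mg / mw_kda * 1000.0


if __name__ == "__main__":
    mw = tagged_mass_kda(86.26, 3.83)  # values from the text
    print(f"tagged SNGS: {mw:.2f} kDa")
    print(f"yield: {yield_nmol(7.28, mw):.1f} nmol per liter of culture")
```

The summed mass of 90.09 kDa agrees with the band observed at about 90 kDa, and 7.28 mg per liter corresponds to roughly 81 nmol of tagged protein per liter of culture.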
SNGS activity was characterized by incubating the purified protein with farnesyl diphosphate (FPP) in assay buffer containing 50 mM Tris-Cl (pH 8.2), 20% v/v glycerol, 2 mM MgCl2, and 0.2 mM β-mercaptoethanol at 30°C for 4 hours. The reaction was overlaid with n-pentane to prevent the escape of the volatile product. The product was extracted with pentane-DCM in a ratio of 5:1 and analyzed by gas chromatography-mass spectrometry (GC-MS). Chromatography was performed on a 5MS silica-based capillary column containing 5% phenyl arylene crosslinked with 95% dimethylpolysiloxane, with helium as the carrier gas. The products were ionized in EI positive ion mode and detected on the MS detector at 280°C within the m/z range of 50-300.
The products were then identified by comparing the observed peak pattern with the reference mass spectral library of the National Institute of Standards and Technology, US Government (Tandem Mass Spectral Library, NIST MS Search 2.0). GC-MS analysis of the products formed on incubation of FPP with micromolar recombinant S. nassauensis DSM 44728 germacradienol synthase showed several peaks, with one prominent, reproducible peak at a retention time of 10.36 min and a molecular ion of m/z = 222. Its GC retention time and mass spectrum were identical to those of (-)-germacradien-4-ol, with a base peak at m/z 59 and other fragments forming the signature peak pattern in the NIST MS library. This indicated that the product of FPP is a sesquiterpene with 100% similarity to germacradien-4-ol.
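The molecular ion at m/z 222 can be checked against the expected elemental composition. The sketch below assumes the product has the formula C15H26O, the usual formula for a sesquiterpene alcohol; the formula itself is not stated in the text. It computes the nominal and monoisotopic masses from standard isotope masses with a small, illustrative formula parser.

```python
import re

# Integer (nominal) and monoisotopic masses of the most abundant isotopes.
NOMINAL = {"C": 12, "H": 1, "O": 16}
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}


def parse_formula(formula):
    """Parse a simple formula such as 'C15H26O' into element counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def formula_mass(formula, table):
    """Sum the per-element masses from the given mass table."""
    return sum(table[el] * n for el, n in parse_formula(formula).items())


if __name__ == "__main__":
    formula = "C15H26O"  # assumed formula of the sesquiterpene alcohol
    print(formula_mass(formula, NOMINAL))                 # nominal mass
    print(round(formula_mass(formula, MONOISOTOPIC), 4))  # monoisotopic mass
```

The nominal mass of C15H26O is 222, which matches the observed molecular ion. Agreement at unit resolution supports the assignment but does not distinguish between isomeric sesquiterpene alcohols, which is why the library match and retention time are also needed.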
In conclusion, this work partially established that a putative TPS gene identified in S. nassauensis DSM 44728 by genome mining encodes a functional germacradienol synthase. Bioinformatics analysis revealed that the target protein contains two conserved domains, a C-terminal domain and an N-terminal domain. It is already established that the C-terminal domain catalyzes the conversion of FPP to germacradienol, and that the germacradienol thus formed serves as the substrate for the N-terminal domain and is converted to geosmin. Manual comparison of the SNGS sequence with previously characterized germacradienol/geosmin synthases showed a point mutation in the NSE triad of the N-terminal domain (274E instead of the usual 274D). This mutation might explain why SNGS catalyzed only the conversion of FPP to germacradien-4-ol and why the further conversion did not proceed as expected; no other identifiable products were observed by GC-MS. Because SNGS stops at germacradienol, this precursor of many industrially useful metabolites can be synthesized in a controlled manner without the risk of further conversion by the N-terminal domain. Thus, this enzyme can also serve as a potential candidate for the initial step in the bioresolution of economically important terpene metabolites such as geosmin (as petrichor in perfumery) and bisabolene (as biodiesel).
Pranav Bhaskar, Dipti Sareen (2020). Bioinformatics approach to understand nature's unified mechanism of stereo-divergent synthesis of isoprenoid skeletons. World Journal of Microbiology & Biotechnology; 36, 142. https://doi.org/10.1007/s11274-020-02918-y [Impact factor = 4.253]
Pranav Bhaskar (2023). Identification of the functional germacradienol synthase homolog from Stackebrandtia nassauensis. [In preparation].