Large-Scale Characterization of Chain Elongation Enzymes
Just two 5-carbon diphosphates, dimethylallyl diphosphate (DMAPP) and isopentenyl diphosphate (IPP), are the starting units for the tens of thousands of isoprenoid compounds found in nature. Isoprenoids are critical components of many cellular processes and structures; for example, they are essential to membrane fluidity (cholesterol), photosynthesis (accessory pigments such as fucoxanthin), and signal transduction (Ras prenylation). The chemistry behind the formation of isoprenoid structures is likewise remarkable. Enzymes known as trans polyprenyl transferases (E-PTSs) catalyze formation of C(5)n carbocations via concomitant Mg2+-stabilized dissociation of a pyrophosphate anion. The resulting carbocation is susceptible to addition of a nucleophile, such as C4 of IPP, which generates an intermediate that deprotonates to form the trans elongated product. A different class of enzymes forms cis products, although these are much less common in nature.
E-PTSs form a subgroup of the Type I Isoprenoid Synthase (IS) Superfamily, one of the functionally diverse enzyme superfamilies under investigation by EFI Bridging Projects. The IS Superfamily was targeted specifically because the simplicity of limited substrates, coupled with the complexity of thousands of possible products, represents a novel challenge for functional prediction. Although the IS Superfamily contains many distinct subgroups, including enzymes that catalyze complex rearrangements and cyclizations, the E-PTSs were the ideal group with which to begin developing and testing a strategy for functional prediction in this superfamily.
The E-PTS subgroup is large (~6000 sequences in total), so the Superfamily/Genome Core selected a sample set of enzymes providing broad coverage for experimental characterization and computational prediction. Of the 248 targets cloned and expressed, 79 were purified in the EFI Protein Core, and pure samples were distributed to the Structure Core for experimental structure determination and to the IS Bridging Project for assay. In parallel, the Computation Core began developing methods to predict product chain-length specificity via modeling and docking. The Computation Core started with a retrospective analysis of E-PTSs of known function and structure and quickly realized that simple estimation of cavity size was not an adequate predictor of product length. However, from the available liganded structures they surmised that the orientation and location of the diphosphate head group of DMAPP and the active site Mg2+ cations were so consistent that they could be considered “covalent” for the purposes of docking. Using modified algorithms for covalent docking from Prime and allowing active site side chains to respond conformationally to the growing polyprenyl chain resulted in a much more robust approach. Prenyl units were iteratively built off the stationary diphosphate, and their conformations were sampled to find the lowest Lennard-Jones energy. In the relatively small training set (n = 10), this strategy predicted the correct chain length in 80% of cases and was within one isoprenoid unit in another 10%.
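The iterative build-and-sample idea described above can be illustrated with a deliberately simplified toy. The sketch below is not the Prime-based covalent docking protocol. It assumes a hypothetical 2D pocket made of wall atoms, prenyl units reduced to single beads, reduced Lennard-Jones units (epsilon = sigma = 1), rigid walls, and a greedy choice of the lowest-energy placement at each step. All function names and parameters (build_cavity, grow_chain, bond, half_width) are invented for illustration.

```python
import math

def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """12-6 Lennard-Jones pair energy at separation r (reduced units)."""
    if r < 1e-9:
        return math.inf  # overlapping positions count as an infinite clash
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def build_cavity(length, half_width=2.0, spacing=0.5):
    """Hypothetical 2D pocket: two walls along x and an end cap at x = length."""
    atoms = []
    n_wall = int(round((length + 1.0) / spacing))
    for i in range(n_wall + 1):
        x = -1.0 + i * spacing
        atoms.append((x, half_width))
        atoms.append((x, -half_width))
    n_cap = int(round(2 * half_width / spacing))
    for j in range(1, n_cap):  # corners are already covered by the walls
        atoms.append((length, -half_width + j * spacing))
    return atoms

def bead_energy(pos, atoms):
    """Total LJ energy between one chain bead and every pocket atom."""
    return sum(lennard_jones(math.dist(pos, a)) for a in atoms)

def grow_chain(atoms, bond=1.5, max_units=50):
    """Grow beads from a fixed anchor at the origin (the stationary head group).

    At each step a few placements one bond length away are sampled, the
    lowest-energy one is kept, and growth stops once even the best
    placement is repulsive (energy > 0).
    """
    angles = [math.radians(a) for a in range(-60, 61, 15)]
    chain = []
    x, y = 0.0, 0.0
    while len(chain) < max_units:
        candidates = [(x + bond * math.cos(t), y + bond * math.sin(t))
                      for t in angles]
        best = min(candidates, key=lambda p: bead_energy(p, atoms))
        if bead_energy(best, atoms) > 0.0:
            break  # every sampled conformation clashes with the pocket
        chain.append(best)
        x, y = best
    return chain

# Assumed example pockets: a shallow one and a deep one.
short_pocket = grow_chain(build_cavity(length=3.0))
long_pocket = grow_chain(build_cavity(length=9.0))
print(len(short_pocket), len(long_pocket))
```

In this toy the deeper pocket accommodates more units before every sampled placement clashes. Real predictions also depend on the conformational response of active site side chains, which the rigid walls here ignore.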
With this initial success, the Computation Core went on to predict the specificities for the experimental target set. To benchmark the method, they first examined 34 enzymes whose products had already been determined in the IS Bridging Project. Apo structures for 10 of these targets were solved in the Structure Core, and homology models were generated for the remaining 24. Using these templates, the Computation Core correctly predicted chain length for 53% of the cases and was within a single C5 unit for another 30%. Blind predictions were then made for the remaining 40 targets, for which models could be generated but assays had not been completed before the computational analysis; subsequent assays showed that 65% were predicted correctly and another 30% were within one isoprene unit. It is important to note that many E-PTSs, in particular those with short- and medium-chain specificity, have been shown to produce a range of polyprenyl products instead of a single definitive product. As such, these results represent substantial progress, especially considering that the computational predictions are markedly better than the annotations currently available in even the best databases (e.g. TrEMBL), which are the only predictions generally available to the community. The information provided by this study is of immediate utility to those with an interest in isoprenoid elongation, and the approach may also be applicable to large-scale prediction for other groups of enzymes.
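The two figures reported for each benchmark (exactly correct, and off by a single C5 unit) can be tallied as in the minimal sketch below. The chain lengths in the example are made up for illustration and are not data from the study. The function name score_predictions and the single-product simplification are assumptions; enzymes that produce a range of products would need a looser comparison.

```python
def score_predictions(predicted, observed, unit=5):
    """Return (fraction exact, fraction off by exactly one unit).

    Chain lengths are given in carbons, so one isoprene unit is 5 carbons.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    exact = 0
    within_one = 0
    for p, o in zip(predicted, observed):
        diff = abs(p - o)
        if diff == 0:
            exact += 1
        elif diff == unit:
            within_one += 1
    n = len(predicted)
    return exact / n, within_one / n

# Hypothetical chain lengths (in carbons) for five imaginary enzymes.
predicted = [15, 20, 20, 40, 10]
observed = [15, 20, 15, 50, 10]
exact_frac, near_frac = score_predictions(predicted, observed)
print(exact_frac, near_frac)
```

Keeping the two categories separate, rather than reporting a single "within one unit" number, mirrors how the benchmark results are stated in the text.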