Discovering Novel Biology
The Protein Structure Initiative (PSI) has solved thousands of x-ray crystal structures to catalogue the protein structure universe. Although this effort has considerably increased our understanding of the structural landscape, the in vitro activity and in vivo function remain unknown for most of the proteins characterized by PSI centers. As the EFI develops strategies for large-scale functional assignment, “deorphanization” of these structures is an obvious way to both hone EFI methodology and capitalize on the new wealth of structural information.
“Orphaned” PSI structures were targeted by the EFI for computational prediction and experimental characterization. This effort included an EN superfamily member deposited in the PDB with the code 2PMQ. The protein was annotated as a “mandelate racemase/muconate lactonizing enzyme,” but work done by the EFI and during the EFI’s predecessor program, P01 GM071790, made it clear that this annotation was incorrect. Furthermore, 2PMQ was not sufficiently similar to any protein of known function to confidently transfer an annotation or even provide functional clues. Challenging situations such as this are common in the post-genomic era and accentuate the need for sophisticated computational approaches to assign function.
In the case of 2PMQ, the EFI Computation Core applied a strategy called “pathway docking,” first described in a retrospective analysis of E. coli glycolysis (Kalyanaraman and Jacobson 2010). This approach successively docks a virtual ligand library into structures or homology models of every enzyme in a pathway. The power of this approach is that the product of an upstream enzyme is the substrate for the next enzyme downstream, and so on along the pathway. The overall docking results are therefore enriched with self-consistent metabolite substructures. The Computation Core used this methodology on 2PMQ and the enzymes/binding proteins encoded by neighboring genes, including a periplasmic binding protein and a dioxygenase, with the assumption that these constitute a pathway. By examining in silico docking results for 87,098 potential ligands, a betaine (a permethylated quaternary amine, figure below) was predicted to be the substrate for 2PMQ, with proline betaine and 4‑hydroxy proline betaine ranked first and second. Instrumental to this conclusion was the evaluation of poses for a non-enzyme, the periplasmic binding protein of an ABC transporter, that appeared to form a pi‑cation “cage” for a quaternary ammonium.
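The self-consistency idea behind pathway docking can be illustrated with a toy sketch. This is not the Computation Core's implementation: the ligand names, the docking scores, the `pathway_rank` function, and the substrate-to-product table are all hypothetical, and summing per-protein ranks is a simplification chosen only for illustration. The sketch follows each candidate through the pathway and favors candidates whose successive forms rank well at every step.

```python
from typing import Dict, List, Optional, Tuple

# ligand name -> docking score (assumed convention: lower = better)
Scores = Dict[str, float]
# One pathway step: (docking scores, hypothetical substrate -> product table).
# A table of None means the ligand is unchanged (e.g. a binding protein).
Step = Tuple[Scores, Optional[Dict[str, str]]]


def rank_ligands(scores: Scores) -> Dict[str, int]:
    """Return each ligand's 1-based rank for one protein (1 = best score)."""
    ordered = sorted(scores, key=lambda ligand: scores[ligand])
    return {ligand: i + 1 for i, ligand in enumerate(ordered)}


def pathway_rank(pathway: List[Step]) -> List[Tuple[str, int]]:
    """Sum ranks along the pathway for each starting ligand.

    A candidate is kept only if every intermediate it generates was
    docked at the next step; the smallest rank sum is listed first.
    """
    ranked_steps = [(rank_ligands(s), products) for s, products in pathway]
    results: List[Tuple[str, int]] = []
    for start in pathway[0][0]:
        current: Optional[str] = start
        total = 0
        for ranks, products in ranked_steps:
            if current is None or current not in ranks:
                break  # chain is broken: not a self-consistent candidate
            total += ranks[current]
            if products is not None:
                current = products.get(current)
        else:
            results.append((start, total))
    return sorted(results, key=lambda item: item[1])


# Hypothetical toy data: three proteins, three candidate ligands.
binding_protein = {"A": -9.0, "decoy": -9.5, "B": -6.0}
epimerase = {"A": -8.0, "decoy": -5.0, "B": -7.0}
epimerase_products = {"A": "A_epi", "B": "B_epi", "decoy": "decoy_epi"}
downstream_enzyme = {"A_epi": -8.5, "B_epi": -6.5, "decoy_epi": -4.0}

pathway = [
    (binding_protein, None),
    (epimerase, epimerase_products),
    (downstream_enzyme, None),
]

for ligand, rank_sum in pathway_rank(pathway):
    print(ligand, rank_sum)
```

In this toy data the decoy scores best against the binding protein alone, but ligand A has the lowest rank sum once all three proteins are considered, which is the kind of enrichment the paragraph above describes.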
These computational predictions initiated a cascade of experimental validation. Since 2PMQ is from Roseovarius sp. HTCC2601, a pelagic organism that cannot be readily cultured, orthologues from Paracoccus denitrificans and Pelagibaca bermudensis were targeted. Protein samples were prepared in the EFI Protein Core and provided to the Structure Core for confirmation of the computationally predicted poses. Samples sent to the EN Bridging Project were used to test just four betaines: trans-4‑hydroxy-L-proline betaine (tHyp-B), L-proline betaine (L-Pro-B), D/L-carnitine, and glycine betaine. Both sets of experimental results validated the predictions; the liganded x-ray crystal structure closely matched the predicted pose, and both tHyp-B and L-Pro-B were readily epimerized/racemized at C2 with catalytic efficiencies (kcat/KM) that reached or exceeded 10³ M⁻¹ s⁻¹.
Many species of bacteria use betaines such as tHyp-B as osmoprotectants. This role was established in P. denitrificans through a battery of genetic and metabolomic studies by the EFI Microbiology Core. P. denitrificans was found to utilize both tHyp-B and L-Pro-B as carbon and nitrogen sources at low salt concentrations; addition of tHyp-B to high salt media offset growth inhibition, consistent with its use as an osmoprotectant. Furthermore, metabolomic experiments identified the metabolites that were predicted in the catabolism of tHyp-B to α‑ketoglutarate based on the chemistries characteristic of the various enzyme families. These metabolites were observed when the betaine was used to supplement growth, but not when succinate was used. Additional evidence was garnered through qRT‑PCR and gene disruption experiments, which unequivocally tied 2PMQ and the associated operon proteins to the degradation of tHyp-B under low salt conditions and to tHyp-B’s osmoprotectant properties during salt stress, thus confirming the function of both 2PMQ and the catabolic pathway.
Starting with an unliganded structure solved by the PSI, this work serves as prospective validation that pathway docking is a powerful approach for predicting in vitro enzymatic activity. While the success of this project depended on the Computation Core’s use of pathway docking to enhance the reliability of the predicted substrate specificities, the integration of multiple disciplines allowed the complete and unequivocal assignment of a nontrivial in vivo function for the entire metabolic pathway.
Computationally predicted and experimentally validated pathway for catabolism of trans-4‑hydroxy-L-proline betaine (tHyp-B) via “orphaned” enzyme 2PMQ.