This Current Topic presents the organization and operations of the EFI to the readership of Biochemistry, both to disseminate this information and to invite outside input and collaborations.
The Enzyme Function Initiative (EFI) was recently established to address the challenge of assigning reliable functions to enzymes discovered in bacterial genome projects; in this Current Topic, we review the structure and operations of the EFI. The EFI includes the Superfamily/Genome, Protein, Structure, Computation, and Data/Dissemination Cores that provide the infrastructure for reliably predicting the in vitro functions of unknown enzymes. The initial targets for functional assignment are selected from five functionally diverse superfamilies (amidohydrolase, enolase, glutathione transferase, haloalkanoic acid dehalogenase, and isoprenoid synthase), with five superfamily-specific Bridging Projects experimentally testing the predicted in vitro enzymatic activities. The EFI also includes the Microbiology Core that evaluates the in vivo context of in vitro enzymatic functions and confirms the functional predictions of the EFI. The deliverables of the EFI to the scientific community include (1) development of a large-scale, multidisciplinary sequence/structure-based strategy for functional assignment of unknown enzymes discovered in genome projects (target selection, protein production, structure determination, computation, experimental enzymology, microbiology, and structure-based annotation), (2) dissemination of the strategy to the community via publications, collaborations, workshops, and symposia, (3) computational and bioinformatic tools for using the strategy, (4) provision of experimental protocols and/or reagents for enzyme production and characterization, and (5) dissemination of data via the EFI's Website, http://enzymefunction.org. The realization of multidisciplinary strategies for functional assignment will begin to define the full metabolic diversity that exists in nature and will impact basic biochemical and evolutionary understanding, as well as a wide range of applications of central importance to industrial, medicinal, and pharmaceutical efforts.
Figure 4: Representative sequence similarity networks for the mandelate racemase (MR) subgroup of the enolase superfamily. Sequences are shown as nodes (dots); connections with BLASTP E values more stringent than a specified threshold are shown as edges (lines). (A) BLASTP E values of <10⁻⁴⁰. (B) BLASTP E values of <10⁻⁸⁰. As the BLASTP E value threshold is made more stringent, the sequences separate into discrete clusters; at <10⁻⁸⁰, many of the clusters are isofunctional families. Nodes colored gray have unknown functions.
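The thresholding described in the Figure 4 caption can be illustrated with a minimal Python sketch. It is not the EFI's software: the function name `cluster_by_evalue`, the sequence labels, and the pairwise E values are all invented for illustration. The sketch keeps an edge only when its E value is below the chosen threshold and reports the resulting connected components as clusters.

```python
from collections import defaultdict


def cluster_by_evalue(nodes, pairwise_evalues, threshold):
    """Return clusters (connected components) of a similarity network.

    An edge joins two sequences only if their E value is below
    `threshold`, i.e. more stringent than the cutoff.
    """
    parent = {n: n for n in nodes}

    def find(x):
        # Follow parent links to the component root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), evalue in pairwise_evalues.items():
        if evalue < threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_a] = root_b  # merge the two components

    clusters = defaultdict(set)
    for n in nodes:
        clusters[find(n)].add(n)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical sequences and pairwise E values (not EFI data).
nodes = ["A", "B", "C", "D", "E"]
pairwise_evalues = {
    ("A", "B"): 1e-120,
    ("B", "C"): 1e-95,
    ("C", "D"): 1e-55,
    ("D", "E"): 1e-100,
    ("A", "D"): 1e-45,
}

loose = cluster_by_evalue(nodes, pairwise_evalues, 1e-40)
strict = cluster_by_evalue(nodes, pairwise_evalues, 1e-80)
print(loose)   # one cluster containing all five sequences
print(strict)  # two clusters: A, B, C and D, E
```

With these invented values, the looser threshold leaves all five sequences in a single cluster, whereas the more stringent threshold removes the weaker edges and splits them into two clusters, mirroring the behavior shown in panels A and B.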
Figure 6: (A) Tm0936 (AH superfamily). Computationally predicted pose of the high-energy intermediate (green) superimposed on the experimental structure (red, with electron density contours) (43). (B) BC0371 (EN superfamily) in complex with substrate N-succinyl Arg, as predicted by homology modeling and docking (cyan) as well as determined by crystallography (yellow) (41). Both panels are reproduced with permission from the publisher. Copyright 2007 Nature Publishing Group.
Reprinted with permission from Biochemistry.
© 2011 American Chemical Society.