Project period: 2010 - present
Debra Dunaway-Mariano, UNM
Karen N. Allen, Boston University
Chemistry
Structure
Challenges for Function Assignment
Value to Integrated Strategy
The HAD superfamily derives its name from 2-haloacid dehalogenase, which functions in microbial degradation of chlorinated pollutants and was the first member to be structurally characterized. In addition to dehalogenases, HAD superfamily members include phosphoesterases, ATPases, phosphonatases, and sugar phosphomutases. However, the vast majority (~80%) of HAD superfamily members are thought to function as phosphatases (phosphohydrolases). HAD members are found in all three domains of life, and the >79,000 unique members identified to date correspond to multiple homologues per organism. For example, 28 are found in E. coli; 35 in Salmonella typhimurium; 31 in Pseudomonas aeruginosa; 30 in Mycobacterium tuberculosis; 31 in Bacillus cereus; 24 in Bacteroides fragilis; 24 in Streptococcus pneumoniae; 45 in Saccharomyces cerevisiae; 84 in Caenorhabditis elegans; 169 in Arabidopsis thaliana; 292 in Selaginella moellendorffii; and 183 in humans.
The core catalytic domain of HAD superfamily members contains a modified Rossmann fold with four highly conserved sequence motifs localized to loop regions (Figure HAD1). Residues within these motifs contribute catalytic features to the active site and are therefore used to identify HAD superfamily members. Substrate specificity and the occlusion or inclusion of solvent are regulated by “caps” inserted into the core domain. Caps can be inserted in a β-hairpin following β-strand 1 (C1) or after β-strand 3 (C2a or C2b), thereby adding modularity to the core structure (Figure HAD2). Both the C1 and C2 cap types undergo extensive movement during the catalytic cycle. In general, capped HADs process small metabolites that can be sequestered within the active site by cap closure. Macromolecular substrates (e.g., proteins or DNA) are processed by “capless” C0 HAD homologues, which provide a much larger contact area. Further complexity is added by fusion of HAD superfamily members with a variety of other functional domains.
Catalysis by all members of the HAD superfamily proceeds via two partial reactions (Figure HAD3). In the first step, a strictly conserved nucleophilic Asp attacks the electrophilic center of the substrate (most commonly phosphorus, but carbon in the case of 2-haloacid dehalogenases). Formation of the enzyme-bound intermediate displaces the substrate leaving group. In the second step, the enzyme-bound intermediate is hydrolyzed to regenerate the free enzyme. Asp is an ideal nucleophile for phosphatases because the phospho-Asp intermediate has moderate kinetic stability, and this stability can be tuned by the placement of active site residues that either accelerate or hinder hydrolysis by solvent water. Except for 2-haloacid dehalogenases, HAD superfamily members rely on coordination of Mg²⁺ to the nucleophilic Asp and the substrate phosphate to neutralize the highly anionic environment. The Mg²⁺ also contributes to the overall stability of the HAD fold.
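For the phosphatase case, the two partial reactions described above can be written as a simplified scheme. This is a sketch under stated assumptions, not a reproduction of Figure HAD3: E–Asp denotes the enzyme with its nucleophilic Asp side chain, R–OPO₃²⁻ is a generic phosphate monoester substrate, protonation states are idealized, proton transfers are shown as free H⁺ rather than assigned to a specific residue, and the Mg²⁺ cofactor is omitted for clarity.

```latex
\begin{align*}
% Step 1: Asp carboxylate attacks phosphorus; the leaving group R-OH departs
\text{E--Asp--CO}_2^{-} + \text{R--OPO}_3^{2-} + \text{H}^{+}
  &\longrightarrow \text{E--Asp--C(O)--OPO}_3^{2-} + \text{R--OH} \\
% Step 2: water hydrolyzes the phospho-Asp intermediate, regenerating the enzyme
\text{E--Asp--C(O)--OPO}_3^{2-} + \text{H}_2\text{O}
  &\longrightarrow \text{E--Asp--CO}_2^{-} + \text{HPO}_4^{2-} + \text{H}^{+} \\
% Net reaction: sum of steps 1 and 2
\text{R--OPO}_3^{2-} + \text{H}_2\text{O}
  &\longrightarrow \text{R--OH} + \text{HPO}_4^{2-}
\end{align*}
```

Summing the two steps cancels the enzyme, the phospho-Asp intermediate, and the proton, leaving the net hydrolysis of the phosphate ester.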
HAD phosphatases function in multiple metabolic contexts, including primary metabolism (e.g., serine and histidine biosynthesis), secondary metabolism (e.g., biosynthesis of capsular and lipid A carbohydrates), regulation (e.g., balancing dNTP pools via deoxyribonucleotidases), cell housekeeping (e.g., dephosphorylation of accumulating metabolites to relieve stalled metabolic pathways), and nutrient uptake (e.g., dephosphorylation of metabolites for transport). Experimental activity screens suggest that the typical HAD phosphatase has loose substrate specificity coupled with modest catalytic efficiency (kcat/KM ~10³ to 10⁴ M⁻¹ s⁻¹). Thus the surfeit of possible substrates, together with ambiguous physiological roles, makes functional assignment very challenging. However, these issues are also fundamental and pervasive problems in genomic enzymology that critically need to be addressed. Collaboration with the Computation and Superfamily/Genome Cores enables focused and informed functional predictions. These hypotheses are then tested by the HAD Bridging Project and, in a limited number of cases, the Microbiology Core en route to formulating a general strategy for functional assignment.
Representative References