This post is about an article published last week in the Journal of Biological Chemistry (JBC). Let me tell you why I found it so interesting. Metagenomic projects are filling the databases with a lot of new sequences, and will keep doing so. For enzymes, this means a huge list of sequences from newly sequenced genomes that, on preliminary screening, appear to have a potential enzymatic activity. However, most of them do not show any activity on substrates when tested. This is problematic for two reasons:
- There are hardly any distinguishing features in homologous sequences that can be used to tell active homologs from inactive ones
- In most cases, homologs act on different substrates, either exclusively or with mixed specificity
Thus, any feature that sheds light on substrate specificity (and so tells us whether the enzyme will be active on a given substrate) would be of immense help in screening for true positives more effectively. In this paper, Sukharnikov et al. used the Glycosyl Hydrolase 48 (GH48) family of enzymes, which have been shown to have cellulolytic activity. Basically, they are endoglucanases that cleave an internal glycosidic bond.
So, they took the GH48 sequences with known activity from CAZy, added other sequences picked from NCBI’s nr database using the Pfam GH48 domain information, did a multiple sequence alignment and built a tree. Using this tree, one can readily identify orthologs (one copy per genome, from a phylum that shares a common ancestor with the other species), paralogs (two or more copies per genome), and horizontally transferred genes (HTG), the last based on phyletic distribution and a probabilistic approach.
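The copy-number part of that classification can be sketched in a few lines of Python. This is only a toy illustration of the "one copy per genome vs. two or more copies" rule of thumb mentioned above, not the authors' pipeline: the genome names and sequence identifiers are made up, and real ortholog, paralog and HTG calls also need the tree and the phyletic distribution.

```python
from collections import defaultdict


def group_by_genome(hits):
    """Map each genome to the list of GH48 hit identifiers found in it."""
    by_genome = defaultdict(list)
    for genome, seq_id in hits:
        by_genome[genome].append(seq_id)
    return dict(by_genome)


def label_copy_status(hits):
    """Label each genome as single-copy (ortholog candidate) or
    multi-copy (paralog candidates), based on copy number alone."""
    return {
        genome: "single-copy" if len(seq_ids) == 1 else "multi-copy"
        for genome, seq_ids in group_by_genome(hits).items()
    }


# Hypothetical (genome, sequence id) pairs, e.g. parsed from a domain search.
hits = [
    ("genome_A", "A_gh48_1"),
    ("genome_B", "B_gh48_1"),
    ("genome_B", "B_gh48_2"),
    ("genome_C", "C_gh48_1"),
]

status = label_copy_status(hits)
for genome, label in sorted(status.items()):
    print(genome, label)
```

Copy number only produces candidates; whether a single-copy gene is a true ortholog or a horizontally acquired one still has to be read off the tree.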
It was clearly seen that the prokaryotic GH48 sequences shared a common ancestor; paralogs retained the conserved residues in the catalytic domain and showed “innovation” in the auxiliary domains (like the carbohydrate-binding module, or CBM). The interesting outcome of this analysis was the horizontally transferred genes (HTG), which moved from prokaryotic genomes to eukaryotic genomes (fungi and insects). To test this, one of the HTG genes, from Hahella chejuensis, was assayed on amorphous cellulose, and it showed cellulase activity.
By this time, you might wonder where I am going with all this. I left the best part for last: the authors solved the structure of the HTG GH48, and a comparison with other structural homologs showed that a particular omega loop facing the substrate-binding part of the protein has a different conformation. In other GH48 structures, this loop has an identical conformation, but not in the HTG GH48! Moreover, the insect GH48 sequences (obtained from metagenomic sequences) were seen to lack cellulolytic activity and to have chitinase activity instead, and this was attributed to the absence of the omega loop.
In summary, the authors suggest two things:
- The residues conserved across prokaryotic GH48 sequences can be used as a genomic signature for cellulolytic activity in prokaryotes
- The GH48 enzymes from metagenomic insect sequences have evolved to accommodate the bulkier chitin. To do so, they probably had to lose the omega loop.
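To make the idea of a "genomic signature" concrete, here is a minimal sketch of how such a screen could look on an alignment. Everything in it is assumed for illustration: the two toy sequences, the signature columns and residues, and the loop column range are hypothetical and are not taken from the paper.

```python
def matches_signature(aligned_seq, signature):
    """Return True if the aligned sequence has the expected residue
    at every signature column (0-based alignment columns)."""
    return all(aligned_seq[col] == res for col, res in signature.items())


def has_loop(aligned_seq, loop_start, loop_end, min_occupancy=0.5):
    """Return True if enough of the loop columns [loop_start, loop_end)
    are occupied by residues rather than gap characters."""
    region = aligned_seq[loop_start:loop_end]
    occupied = sum(1 for char in region if char != "-")
    return occupied / len(region) >= min_occupancy


# Toy alignment rows of equal length; '-' marks a gap. Hypothetical data.
alignment = {
    "bacterial_like": "MKDAGWLPSTEY",
    "insect_like":    "MKNAG----TEY",
}

# Hypothetical signature: alignment column -> expected conserved residue.
SIGNATURE = {2: "D", 10: "E"}
# Hypothetical alignment columns spanned by the omega loop.
LOOP_START, LOOP_END = 5, 9

results = {
    name: (matches_signature(seq, SIGNATURE),
           has_loop(seq, LOOP_START, LOOP_END))
    for name, seq in alignment.items()
}

for name, (signature_ok, loop_ok) in results.items():
    print(f"{name}: signature={signature_ok}, loop={loop_ok}")
```

In this toy case, the sequence that keeps both the conserved residues and the loop would be flagged as a cellulase candidate, while the one missing them would not. A real screen would take its columns from the published alignment and structure.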
So, it’s all in the loop…
UPDATE: The structure details of the H. chejuensis GH48 can be found here: http://www.rcsb.org/pdb/explore/explore.do?structureId=4fus
Sukharnikov, L., Alahuhta, M., Brunecky, R., Upadhyay, A., Himmel, M., Lunin, V., & Zhulin, I. (2012). Sequence, structure, and evolution of cellulases in the glycoside hydrolase family 48. Journal of Biological Chemistry. DOI: 10.1074/jbc.M112.405720