Monthly Archives: October 2012

Image courtesy: Google

Not always, but sometimes, one wants to flatten the interactions between a protein and a ligand. The aim is to unclutter the three-dimensional (3D) information into a 2D image. The advantage of such visualizations is that one gets to see the various interactions without any of them getting buried, and can concentrate on the crucial ones that are key to protein-ligand binding. These 2D representations are used in broadly two situations:

  1. Plotting the interactions of protein-ligand complexes in existing data (from the PDB)
  2. Plotting the interactions between a protein and a potential drug/small molecule from a molecular docking result. Here, the input could come from a single small-molecule docking run or from a virtual screening.

In this post, we will see three tools that help us in achieving the goal of plotting protein-ligand interactions.
LIGPLOT – For many years, Ligplot (1) has been the tool of choice for plotting 2D interactions. In fact, the PDBsum database generates Ligplot images for each protein-ligand complex. The two main things shown are hydrogen bonds and hydrophobic interactions.

Hydrogen bonds are indicated by dashed lines between the atoms involved, while hydrophobic contacts are represented by an arc with spokes radiating towards the ligand atoms they contact. The contacted atoms are shown with spokes radiating back.
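
To make the two interaction types a bit more concrete, here is a minimal sketch of how protein-ligand atom pairs could be sorted into hydrogen-bond candidates and hydrophobic contacts using distance cutoffs alone. This is not Ligplot's actual algorithm: the cutoff values, the element-based rules, the atom labels, and the function name `classify_contacts` are all assumptions made for illustration, and a real tool would also check donor/acceptor roles and bond geometry.

```python
import math

# Assumed cutoffs in angstroms; illustrative only, not Ligplot's parameters.
HBOND_MAX_DIST = 3.5        # polar heavy atom to polar heavy atom
HYDROPHOBIC_MAX_DIST = 4.0  # carbon to carbon

POLAR_ELEMENTS = {"N", "O"}


def classify_contacts(protein_atoms, ligand_atoms):
    """Return (hbond_candidates, hydrophobic_contacts).

    Each atom is a tuple (label, element, (x, y, z)).
    Each result entry is (protein_label, ligand_label, distance).
    """
    hbonds, hydrophobic = [], []
    for p_label, p_elem, p_xyz in protein_atoms:
        for l_label, l_elem, l_xyz in ligand_atoms:
            d = math.dist(p_xyz, l_xyz)
            if p_elem in POLAR_ELEMENTS and l_elem in POLAR_ELEMENTS:
                # Two polar heavy atoms close together: possible hydrogen bond.
                if d <= HBOND_MAX_DIST:
                    hbonds.append((p_label, l_label, round(d, 2)))
            elif p_elem == "C" and l_elem == "C" and d <= HYDROPHOBIC_MAX_DIST:
                # Two carbons close together: treated as a hydrophobic contact.
                hydrophobic.append((p_label, l_label, round(d, 2)))
    return hbonds, hydrophobic


# Made-up coordinates and labels, for illustration only.
protein = [
    ("SER195:OG", "O", (0.0, 0.0, 0.0)),
    ("LEU99:CD1", "C", (5.0, 0.0, 0.0)),
]
ligand = [
    ("LIG:N1", "N", (2.9, 0.0, 0.0)),
    ("LIG:C7", "C", (5.0, 3.8, 0.0)),
]
hbonds, hydrophobic = classify_contacts(protein, ligand)
print("hydrogen-bond candidates:", hbonds)
print("hydrophobic contacts:", hydrophobic)
```

In a 2D plot, the first list would become the dashed lines and the second the arcs with spokes.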


PoseView – This is a newer tool that came out two years ago (2). It generates Ligplot-like images but offers more features than Ligplot.

The 2D depiction shows hydrogen bonds as dashed lines between the interaction partners on either side. Hydrophobic interactions are illustrated as smooth contour lines between the respective amino acids and the ligand.

Recently, the PDB incorporated PoseView for the structures it holds. So, one can get the 2D plots straight from the PDB itself, in the Ligand section of each protein, for example here. The web interface for PoseView can be accessed here.


BINding ANAlyzer (BINANA) – This is probably the most recent protein-ligand representation tool (3). Although not exactly a 2D plotting tool, it has more features than Ligplot or PoseView: it can report electrostatic interactions, pi-pi stacking, cation-pi interactions, and more. The only downside is that it needs its input in .PDBQT format, which can be obtained via AutoDock Tools. The output can be visualized in VMD, thus turning the 2D back into 3D but with distinguishable features.



1. Wallace AC, Laskowski RA, & Thornton JM (1995). LIGPLOT: a program to generate schematic diagrams of protein-ligand interactions. Protein engineering, 8 (2), 127-34 PMID: 7630882

2. Stierand, K., & Rarey, M. (2010). Drawing the PDB: Protein−Ligand Complexes in Two Dimensions ACS Medicinal Chemistry Letters, 1 (9), 540-545 DOI: 10.1021/ml100164p

3. Durrant, J., & McCammon, J. (2011). BINANA: A novel algorithm for ligand-binding characterization Journal of Molecular Graphics and Modelling, 29 (6), 888-893 DOI: 10.1016/j.jmgm.2011.01.004


Protein folding funnel.
Image courtesy: Dill KA and Chan HS, Nat. Struct. Biol. 1997, 4 (1).

In 2008, a report published in Science indicated that we now know how proteins fold and that the protein folding problem is somewhat solved (1). However, recent research shows that protein folding is still not fully understood. In the current issue (30th October 2012) of the Proceedings of the National Academy of Sciences of the USA (PNAS), a special feature titled “Chemical Physics Of Protein Folding” presents current cutting-edge research in this area. In the introduction, the editors say (2):

Although the basic ideas about the folding energy landscape have turned out to be quite simple, entering even into some undergraduate textbooks, exploring their consequences in real systems has required painstaking intellectual analysis, as well as detailed computer simulations and experiments that still stretch the bounds of what is feasible.

The special feature covers not only the computational side but also the experimental approaches that complement most of the studies. The diversity of the area and the unresolved questions make protein folding an interesting topic for further research. The main reason is that computations open up possibilities for highly challenging single-molecule experiments, while the time scales involved challenge current computational power.

In summary, the following sums up the current research:

The topics discussed in this issue are only a small part of the work in the folding field. Nevertheless, they make clear that protein folding is a vibrant, living, interdisciplinary part of the natural sciences.

Access the articles here:

1. Service, R. (2008). Problem Solved* (*sort of) Science, 321 (5890), 784-786 DOI: 10.1126/science.321.5890.784

2. Wolynes, P., Eaton, W., & Fersht, A. (2012). From the Cover: Chemical physics of protein folding Proceedings of the National Academy of Sciences, 109 (44), 17770-17771 DOI: 10.1073/pnas.1215733109

This post is about an article published last week in the Journal of Biological Chemistry (JBC). Let me tell you why I found it very interesting. Metagenomic projects are filling, and will keep filling, the databases with a lot of new sequences. In the case of enzymes, there is going to be a huge list of sequences from new genomes that, on preliminary screening, appear to have potential enzymatic activity. However, most of them do not register any activity on substrates. This is problematic for two reasons:

  1. There are hardly any distinguishing features in homologous sequences that can be used to tell active from inactive enzymes
  2. In most cases, homologs act on different substrates, either exclusively or with mixed specificity

Thus, any feature that can shed light on substrate specificity (which can tell whether the enzyme will be active or not) would be of immense help in screening true positives more effectively. In this paper, Sukharnikov et al. used the Glycosyl Hydrolase 48 (GH48) family of enzymes, which have been shown to have cellulolytic activity. Basically, they are endoglucanases that cleave an internal glycosidic bond.

So, they took the GH48 sequences with known activity from CAZy, along with other sequences picked from NCBI’s nr database using the Pfam GH48 domain information, did a multiple sequence alignment and built a tree. Using this tree, one can easily find orthologs (one copy per genome, from a phylum that shares a common ancestor with another species), paralogs (two or more copies per genome), and horizontally transferred genes (HTG; identified from phyletic distribution and a probabilistic approach).
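
As a rough illustration of the copy-number part of that definition, the sketch below groups family members by genome and separates single-copy genomes (ortholog candidates) from multi-copy ones (paralog candidates). It is only a simplification of the idea in the paragraph above, not the authors' pipeline: the genome names, gene identifiers, and the function `split_by_copy_number` are hypothetical, and a real analysis would also use the tree and the phyletic distribution.

```python
from collections import defaultdict


def split_by_copy_number(hits):
    """Group family members by genome and split on copy number.

    hits: iterable of (genome, gene_id) pairs for one protein family.
    Returns (single_copy, multi_copy) dictionaries keyed by genome.
    """
    per_genome = defaultdict(list)
    for genome, gene_id in hits:
        per_genome[genome].append(gene_id)

    # One copy per genome: ortholog candidate; two or more: paralog candidates.
    single_copy = {g: ids[0] for g, ids in per_genome.items() if len(ids) == 1}
    multi_copy = {g: ids for g, ids in per_genome.items() if len(ids) > 1}
    return single_copy, multi_copy


# Hypothetical genome and gene identifiers, for illustration only.
hits = [
    ("genome_A", "gh48_a1"),
    ("genome_B", "gh48_b1"),
    ("genome_B", "gh48_b2"),
    ("genome_C", "gh48_c1"),
]
single_copy, multi_copy = split_by_copy_number(hits)
print("ortholog candidates:", single_copy)
print("paralog candidates:", multi_copy)
```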

It was clearly seen that the prokaryotic GH48 sequences shared a common ancestor; paralogs retained the conserved residues in the catalytic domain and showed “innovation” in the auxiliary domains (like the carbohydrate-binding module, or CBM). The interesting outcome of this analysis was the set of genes horizontally transferred (HTG) from prokaryotic genomes to eukaryotic genomes (Fungi and Insects). To test this, one of the HTG genes, from Hahella chejuensis, was tested on amorphous cellulose, and it showed cellulase activity.

By this time, you might wonder where I am leading with all this. I left the best part for last: the authors solved the structure of the HTG GH48, and when it is compared with other structural homologs, a particular omega loop facing the substrate-binding part of the protein shows a change in conformation. In other GH48 structures this loop has an identical conformation, but not in the HTG GH48! Moreover, the insect GH48 sequences (obtained from metagenomic sequences) were seen to lack cellulolytic activity and to have chitinase activity instead, and this was attributed to the absence of the omega loop.

In summary, the authors suggest two things:

  • For prokaryotic GH48 sequences to be cellulolytic, the conserved residues from the prokaryotes can be used as a genomic signature
  • The GH48 enzymes from metagenomic insect sequences have evolved to accommodate the bulkier chitin, for which they probably had to lose the omega loop.

So, it’s all in the loop…

UPDATE: The structure details of H. chejuensis can be found here –

Sukharnikov, L., Alahuhta, M., Brunecky, R., Upadhyay, A., Himmel, M., Lunin, V., & Zhulin, I. (2012). Sequence, structure, and evolution of cellulases in the glycoside hydrolase family 48 Journal of Biological Chemistry DOI: 10.1074/jbc.M112.405720

Proteins are always in motion. The movements may be “global” or “local”. Global motions, which are encoded in the architecture, arise from changes during binding to a substrate, whereas local movements, which complement global motions, are seen in the rearrangement of loops and the adjustment of residue side chains.

In their recent paper, Liu and Bahar systematically analyzed the “key residues that mediate structural dynamics”. They took 34 structurally and functionally varied enzymes and determined the relative mobility of each residue and the mutation propensity at that position. They used the Gaussian Network Model (GNM) to study the global modes and relative mobility, and Shannon’s Mutual Information (MI) to identify co-evolved residues, to demonstrate

the importance of structural adaptability in sustaining functional dynamics of the enzyme notwithstanding sequence variations that confer specificity

So, what does all this mean? In one sentence, this paper says that the residues in a protein that show the least mobility (relative to others in the same protein) are the most likely to be conserved. This also means that the correlation between a residue’s mobility and its degree of substitution may be the reason why we see a relatively small set of protein folds.
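
For readers who have not met GNM before, here is a small sketch of how a per-residue mobility can come out of it. GNM treats residues (usually the Cα atoms) as nodes joined by identical springs whenever they lie within a cutoff distance; the diagonal of the pseudo-inverse of the resulting Kirchhoff (connectivity) matrix is proportional to each residue's mean-square fluctuation. The 7.0 Å cutoff, the toy coordinates, and the function names are my assumptions, and the sketch sums over all modes instead of isolating the slowest global modes that the paper focuses on. It is written in plain Python to stay dependency-free; in practice one would use a linear algebra library.

```python
import math


def kirchhoff(coords, cutoff=7.0):
    """Connectivity matrix: -1 for node pairs within cutoff, degree on the diagonal."""
    n = len(coords)
    gamma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1.0
                gamma[i][i] += 1.0
                gamma[j][j] += 1.0
    return gamma


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix; is the contact network connected?")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def relative_mobility(coords, cutoff=7.0):
    """Diagonal of the Kirchhoff pseudo-inverse (all modes, arbitrary units).

    For a connected network, pinv(G) = inv(G + J/n) - J/n, where J is the
    all-ones matrix; this avoids an explicit eigendecomposition.
    """
    n = len(coords)
    gamma = kirchhoff(coords, cutoff)
    shifted = [[gamma[i][j] + 1.0 / n for j in range(n)] for i in range(n)]
    inverse = invert(shifted)
    return [inverse[i][i] - 1.0 / n for i in range(n)]


# Toy example: five nodes on a line, 3.8 angstroms apart (not a real protein).
coords = [(3.8 * i, 0.0, 0.0) for i in range(5)]
mobility = relative_mobility(coords)
print([round(m, 3) for m in mobility])
```

In this toy chain the end nodes come out as the most mobile and the middle node as the least, which is the kind of relative ranking the paper compares against sequence conservation.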

Take home points from this paper:

  • conserved residues have minimal fluctuations in global modes
  • an increase in sequence variability is accompanied by an increase in conformational mobility
  • co-evolved residues are either involved in substrate binding or assist the residues involved in substrate binding
  • finally, a mobility scale for the 20 amino acids that can be used for understanding dynamics in other proteins.
Liu, Y., & Bahar, I. (2012). Sequence Evolution Correlates with Structural Dynamics Molecular Biology and Evolution, 29 (9), 2253-2263 DOI: 10.1093/molbev/mss097
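
The co-evolution side of the analysis rests on Shannon's mutual information between pairs of alignment columns. Below is a minimal sketch of that quantity, computed from observed residue frequencies in a toy alignment. The alignment, the function name `mutual_information`, and the choice of log base 2 are my assumptions; the sketch does not handle gaps and applies none of the corrections that a published analysis such as this paper's may use.

```python
import math
from collections import Counter


def mutual_information(msa, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    msa: list of equal-length aligned sequences.
    MI = sum over residue pairs (a, b) of P(a,b) * log2(P(a,b) / (P(a) * P(b))).
    """
    n = len(msa)
    col_i = Counter(seq[i] for seq in msa)
    col_j = Counter(seq[j] for seq in msa)
    pairs = Counter((seq[i], seq[j]) for seq in msa)

    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((col_i[a] / n) * (col_j[b] / n)))
    return mi


# Toy alignment, for illustration only: columns 0 and 1 change together,
# while column 2 is fully conserved.
msa = ["AKL", "AKL", "DEL", "DEL"]
mi_covarying = mutual_information(msa, 0, 1)
mi_conserved = mutual_information(msa, 0, 2)
print(mi_covarying, mi_conserved)
```

Columns that always change together score high, while a fully conserved column carries no information about its partner and scores zero.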




Ribbon representation of Triosephosphate isomerase by Prof. Jane Richardson (Image Source: Wikipedia)

So, here goes my first post on structural bioinformatics. 🙂

In college, when I was flipping through the pages of a biochemistry textbook, I remember being completely mesmerized by the amazing images of proteins, which I later learnt were ribbon representations. Digging for more information, I came across the name of Prof. Jane Richardson, who pioneered this particular representation of proteins.

In my opinion, the 1981 publication The Anatomy and Taxonomy of Protein Structure is an important milestone in structural bioinformatics. (I know that I am not going chronologically.) I have heard from my PI that there used to be a coloring book of proteins made available from her lab, in which one could color the proteins by hand and understand the topology. I am sure it would have been an interesting thing to do while waiting for a script to finish its job!

Scientific importance of the work

  • An easy way to represent protein structures

    What do I understand from this blob? (Image source:

  • One can understand the topology of the protein easily (directionality shown in beta-strands)
  • It is not hard on the eyes to see the detail and at the same time understand the protein structure

The publication also classified proteins on the basis of structure, and this taxonomy of proteins forms the second part of the publication. In her own words:

 a suitable view was chosen (consistent for each subcategory of structure), and plotter output was obtained at a consistent scale (approximately 20Å per inch on the final drawings as reproduced here). The schematic was drawn on top of the plotter output for accuracy, with continual reference to the stereo for the third dimension. Loops, and to some extent β strands, were smoothed for comprehensibility, and shifts of 1 or 2Å were sometimes necessary in order to avoid ambiguity at crossing points. A uniform set of graphical conventions was adopted (see Section III,A,3 for explanation) in which β strands are shown as arrows, helices as spiral ribbons, and nonrepetitive structure as ropes. Location and extent of β strands and helices are sometimes based on published descriptions and hydrogen-bonding diagrams, but often must be judged from the stereo view itself. Very short β interactions are shown as arrows when they form part of a larger sheet but may be left out if they are isolated. Foreshortening, overlaps, edge appearance, and relative size change are used to provide depth cues.

So now if you are thinking of coloring a protein structure, print this image and the other images in the link above and use your imagination to fill in the colors. 🙂

FIG. 76. Parallel α/β: classic doubly wound β sheets.
Image source:


Jane S. Richardson (1981). The Anatomy and Taxonomy of Protein Structure Advances in Protein Chemistry, 34, 167-339 DOI: 10.1016/S0065-3233(08)60520-3


Wordle of Structural Bioinformatics definition according to Wikipedia.

Hi Everyone,

From today I will bring you, on a regular basis, articles related to Structural Bioinformatics. Two things propelled me to start this blog.

  1. I realized recently that structural bioinformaticians are still not taken seriously by some. (Yes, that’s right. We are sometimes the butt of jokes, usually during their water break!) No hard feelings, wet-lab guys and gals. Don’t worry, there is no acid-spewing acrimonious stuff coming here. I did not start this blog to get back at you. 😛
  2. Reading the Nature news article on one of the winners of the 2012 Nobel Prize in Chemistry, Prof. Kobilka. He published the ground-breaking work on an active GPCR in complex with a G protein.

So, you are asking what this blog is about? Well, to begin with, it is about recently published bioinformatics articles (and once in a while we will take a trip to the past) that specifically discuss proteins and their structures. Sounds good?

Let’s see how it proceeds and where this journey takes me. 🙂

Yours Truly,

A structural bioinformatician…