Archive

Structural Biology

We all have neighbors who help us in our hour of need. Some even go out of their way. In enzymes too, it seems, neighbors play a crucial role. Lafond et al., in their recent publication in the Journal of Biological Chemistry, report the involvement of neighboring chains of the same enzyme, a lichenase. Apart from stabilizing the quaternary structure (a trimer), the neighboring chains are also involved in the enzymatic activity.

Saccharophagus degradans is a marine bacterium that has been credited with the capacity to degrade diverse polysaccharide substrates. The list includes, but is not limited to, agar, cellulose, chitin, xylan, carboxymethylcellulose, avicel, laminarin, wheat arabinoxylan, glucomannan, lichenan, curdlan, pachyman, and others. Its genome has 19 coding regions for enzymes that belong to the same CAZy family, called GH5.

The GH5 class of enzymes are predominantly endoglucanases, i.e. they cleave an internal beta-glycosidic bond in the cellulose polymer. They are also characterized by sharing the same protein structural fold, namely the (alpha/beta)8 fold: eight beta strands alternating with helices to form a barrel. The enzyme that Lafond et al. named SdGluc5_26A also belongs to the GH5 family and has the classical (alpha/beta)8 fold. However, they also found a stretch of 38 residues at the N-terminus that seemed interesting. This N-terminus is not floppy, but binds to the active site of the neighboring chain.

Image of SdGluc5_26A made using PyMOL. (PDB id: 5a8n)

In the figure above, the Trp residue (shown as green sticks) binds specifically to the active site of the neighboring chain. See the figure of the trimer below to see how the chains interact. This arrangement gives SdGluc5_26A its lichenase activity. In the parlance of carbohydrate-active enzymes, this Trp binds to the -3 subsite of the active site.

Image of SdGluc5_26A trimer made using PyMOL. (PDB id: 5a8n)

So, the next step was to find out what happens to the activity of SdGluc5_26A when this protruding N-terminal sequence is deleted. It was observed that upon deletion, SdGluc5_26A behaved as an endo-beta(1,4)-glucanase. In other words, without this N-terminal part the enzyme switched its activity from an exo (chewing at the ends of the polymer) to an endo (chewing in the middle) mode of action.

Given that SdGluc5_26A can act on a variety of substrates, it is only logical to think that this 38-residue stretch plays an important role in substrate specificity. Now, the question is whether allostery or a cooperative mechanism could be behind the substrate binding. Something to chew upon! 😉

References:

  1. Lafond M, Sulzenbacher G, Freyd T, Henrissat B, Berrin JG, & Garron ML (2016). The quaternary structure of a glycoside hydrolase dictates specificity towards beta-glucans. The Journal of biological chemistry PMID: 26755730

 

 

The Mosquito Net (1912) by John Singer Sargent. Licensed under Public Domain via Wikimedia Commons

 

We all know how pesky mosquitoes can be. Did you know that a mosquito's ability to find a suitable host to feed on is due to thermotaxis? This behavior, being attracted to or repelled by high or low temperatures, is seen in other organisms as well, such as Drosophila melanogaster and Caenorhabditis elegans.

However, the behaviour is more pronounced among blood-feeding pests (kissing bugs, bedbugs, ticks, and mosquitoes including Aedes aegypti). Aedes aegypti is a vector for many flaviviral diseases (Dengue fever, Yellow fever, etc.). Until now, it was well established that thermotaxis requires specific thermosensors that activate the sensory signals for a subsequent flight response in a mosquito. However, how exactly they function was not resolved.

In a recent paper, Corfas and Vosshall [1] describe the use of a zinc-finger nuclease-mediated genome editing method to identify the role of two receptors, TRPA1 and GR19, in Aedes aegypti's attraction to heat. It was found that these receptors help the mosquito identify a host for feeding (in the temperature range of 43-50 deg Celsius); however, the mosquitoes avoid surfaces above 50 deg Celsius. [Read the recent editorial on genome editing in Genome Biology]

The sequence (923 residues long) of this receptor (Uniprot id: Q0IFQ4) has at least five transmembrane regions that are approximately 20-25 residues long. A cursory glance at homologous sequences shows that it shares 37% sequence identity with a de novo designed protein (PDB id: 2xeh).
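To make a number like "37% sequence identity" concrete, here is a minimal sketch of how percent identity can be computed from a pairwise alignment. The function name `percent_identity` and the toy sequences are hypothetical, and I am assuming identity is counted only over columns where neither sequence has a gap; alignment tools differ in the denominator they use, so reported values can vary slightly.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns (an assumption; other conventions exist)
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy aligned sequences, made up purely for illustration.
toy_a = "MKT-AYIAK"
toy_b = "MKTQAYLAR"
print(percent_identity(toy_a, toy_b))
```

In this toy pair, eight columns are compared and six of them match.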

A homology-modeled structure is available for the coiled-coil region (residues 189-338). Although the eLife paper does not talk about structure, I felt that it deserves a mention here, because of the structural biology/bioinformatics possibilities with this novel target. It is a suitable target for designing inhibitors that could potentially act as mosquito repellents.

Also, combined with the method described in my previous post on mutating transmembrane proteins as a method of making them crystallize, I guess the 3D structure of this important protein will come to light sooner!

Homology modeled region of TRPA1 (189-338), from ModBase

 

References:

  1. Corfas RA, & Vosshall LB (2015). The cation channel TRPA1 tunes mosquito thermotaxis to host temperatures. eLife, 4 PMID: 26670734
  2. Greppi C, Budelli G, & Garrity PA (2015). Some like it hot, but not too hot. eLife, 4

I am sharing this guest post of mine that was published in Cell’s Crosstalk: Biology in 3D Blog. Yes, the journal Cell!

Here is the link: http://www.cell.com/crosstalk/why-do-i-blog-about-structural-bioinformatics

Enjoy!

Why Do I Blog about Structural Bioinformatics?: Biology in 3D

When someone says that they have a blog, the stereotypical response would be, “About your travels?” or “Hmmm … Recipes … Must be a delicious blog!” And when one confesses that said blog is about scientific research, the jaw drops. I presume it has to do with the notion that blogging science is not that much fun!

Two things inspired me to become a blogger: (1) an amazing community of scientific bloggers at Research Blogging, who inspired me with their wonderful posts; and (2) my view that structural biology and structural bioinformatics are not getting the exposure they deserve. Thus inspired and motivated, I began blogging about four years ago, and was able to channel some of my thoughts and energy into my blog, called Getting to Know Structural Bioinformatics.

Guest author and blogger
Raghu Yennamalli

Why do I blog? Blogging is fun! For me, blogging is about sharing with the world recent research and tidbits on structural biology and bioinformatics. Most importantly, it is about sharing the excitement that I feel after reading a paper. In some sense, blogging about research is similar to a journal club, where I am able to share the latest research with my peers. However, unlike a journal club, the audience for my blog is the entire world.

Blogging is also dynamic and interactive, because it allows me to engage in conversation with others (specifically students) when they weigh in with their comments. Below I highlight some of the best practices that I’ve developed over the years that help me with balancing my research, teaching, and personal responsibilities with my blogging.

Selecting the paper

The main way I find articles that I want to blog about is by scouring through the table of contents of the journals I am interested in. Sometimes I also hear about exciting protein structures via friends and other blogs that I follow. I try to have a balanced approach and highlight structural work on systems that are “hot topics” as well as papers that just captured my interest and fancy.

In the early days of my blogging, I was trying to collate and compile tools and techniques that would come in handy for students working with protein structures. I wanted my blog to be a handy place for myself and others to find tips and tricks. Over time, the range of topics and papers I cover has broadened, and although I still cover a lot of method development work, I cover other topics as well. In general, once I make up my mind about the paper I want to blog about, I start reading it, give myself some time to soak in the method and outcome of the paper, and try to think critically about what gaps remain or what additional methods the authors could have used to make the paper better. Alternatively, I also analyze the paper’s novelty with respect to structural bioinformatics.

Composing the blog post

I should confess that the monthly posts in Protein Spotlight by Vivienne Baillie Gerritsen are my inspiration while composing posts. I love her writing style and also the manner in which artwork is included in every post, to make it fun to read. Like Protein Spotlight, blogs have the advantage of including other multimedia items, for example using animated gifs and YouTube videos that make the post much easier for the reader. So, I start finding an appropriate image from an art database that best fits the topic (of course, giving credit where it is due). When it is about a tool/software, I figure the best approach is to use said tool/software and include a “first-hand” experience of how I perceived it. Also, I try to include an additional tidbit or information that the authors mention in passing.

Balancing things

With an active teaching and research schedule, finding time to blog does become a challenge. I try to make it a fun process, so that it does not feel cumbersome. If one looks at the frequency of my posts, I try to maintain at least one post per month. Looking at others’ blogs at Research Blogging, I realize that one post a month is a low turnout, and I try to post as frequently as possible. Sometimes, the problem is sheer lack of time or not finding exciting enough material to blog about. However, this does not mean that exciting research is not out there. The key is to find a balance between blogging and other duties. I have had discussions with other bloggers who blog on other nonscience topics, and we observed that the main turnoff in blogging is when one delves deeper and over time a particular post becomes “work.” Maneuvering that roadblock is key to maintaining a successful blog.

In the end, as at the beginning, it all comes down to having fun and sharing with the world my excitement about the type of scientific research I enjoy. I think this is probably the feeling others who blog share as well, and I can see it in some of the blogs I follow, such as the following:


Raghu Yennamalli completed his PhD in Computational Biology and Bioinformatics in 2008 from Jawaharlal Nehru University. He conducted postdoctoral research at Iowa State University, University of Wisconsin-Madison, and Rice University. Currently, he is an Assistant Professor at Jaypee University of Information Technology. He can be contacted at ragothaman AT gmail DOT com.

In the 90s, morphing of two unrelated images was popular, mostly for entertainment purposes. For example: the famous video of Michael Jackson’s pop hit “Black or White”.

Courtesy: Google

This morphing method was also used to analyze changes in protein motions, such as domain rearrangements. A popular webserver, where you can get an animated gif of your protein’s motion (assuming you have two distinct conformations), is the Morph server (http://www2.molmovdb.org/) from Gerstein’s Lab. In many cases this gave us insight into how the protein could dynamically change from one form to another.
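As a rough illustration of the morphing idea, here is a minimal sketch that linearly interpolates atom coordinates between two conformations. This is not the Morph server's algorithm, which I am not reproducing here; it is the simplest possible stand-in. The function name `linear_morph` is hypothetical, and I am assuming the two conformations are already superimposed and list the same atoms in the same order.

```python
from typing import List, Tuple

Coord = Tuple[float, float, float]


def linear_morph(start: List[Coord], end: List[Coord], n_frames: int) -> List[List[Coord]]:
    """Return n_frames coordinate sets going linearly from start to end (inclusive)."""
    if len(start) != len(end):
        raise ValueError("both conformations must have the same number of atoms")
    if n_frames < 2:
        raise ValueError("need at least two frames (the two end states)")
    frames = []
    for i in range(n_frames):
        t = i / (n_frames - 1)  # 0.0 at the first frame, 1.0 at the last
        frame = [
            tuple(s + t * (e - s) for s, e in zip(atom_start, atom_end))
            for atom_start, atom_end in zip(start, end)
        ]
        frames.append(frame)
    return frames


# One made-up atom moving from the origin to (2, 4, 6) in three frames.
frames = linear_morph([(0.0, 0.0, 0.0)], [(2.0, 4.0, 6.0)], 3)
print(frames[1][0])
```

Straight-line interpolation can produce distorted bond lengths in the intermediate frames, which is one reason real morphing tools do more than this.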

The change in structural forms of a protein is not a trivial problem. We need to generate ensembles of protein structures for many purposes: 1) understanding conformational transition paths, 2) generating more realistic receptors for docking, 3) in turn, understanding the flexible and rigid parts of the protein, and a few other applications.

Until now, one could use Normal Mode Analysis and Molecular Dynamics methods to generate an ensemble. It is here that ConTemplate tries to bring a fresh perspective to generating an ensemble of structures.

ConTemplate mines the PDB for existing structures and gives the user a set of possible conformations. The main presumptions are that any given protein in the PDB has more than one available structure, and that additional conformations are available for proteins that undergo major conformational changes.

For the dataset created for ConTemplate, the maximum RMSD between two structures of the same protein is 5 Angstroms, and 69.2% of the proteins have an RMSD of less than 1 Angstrom. The method uses an interesting three-step process:

  1. Using the query, it searches for structural equivalents with the GESAMT aligner. Sequence alignments are then generated from these structural alignments.
  2. It runs BLAST to identify additional conformations for all structural equivalents obtained in step 1, and a representative template is identified.
  3. Finally, Modeller is used to build model structures in the various conformations using this template.
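The RMSD values quoted for the dataset can be made concrete with a small sketch. This is only an illustration of the RMSD formula, not ConTemplate's code: the function name `rmsd` is hypothetical, and I am assuming the two coordinate sets are already superimposed and contain the same atoms in the same order (no fitting is done here).

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """Root-mean-square deviation between two pre-superimposed coordinate sets."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must have the same number of atoms")
    if not coords_a:
        raise ValueError("coordinate sets must not be empty")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Two made-up atoms, each displaced by 1 Angstrom along x.
conf_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
conf_b = [(1.0, 0.0, 0.0), (4.8, 0.0, 0.0)]
print(rmsd(conf_a, conf_b))
```

In practice the structures would first be optimally superimposed, which is what a structural aligner takes care of.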

The advantage of ConTemplate is that it yields a more relevant set of conformations for the query protein. I tried running a query to the server and I would say that I got some interesting results. Screenshot below:

Superposition of models created in ConTemplate for PDB id: 1ECE

References:
Narunsky A, Nepomnyachiy S, Ashkenazy H, Kolodny R, & Ben-Tal N (2015). ConTemplate Suggests Possible Alternative Conformations for a Query Protein of Known Structure. Structure (London, England : 1993), 23 (11), 2162-70 PMID: 26455800

Reblogging this blog post

http://loonylabs.org/2015/11/24/protein-structure-biotechnology-personalized-medicines/

Professor Meiering and her colleagues were able to incorporate both structure and function into the design process by using bioinformatics to leverage information from nature. They then analyzed what they made and measured how long it took for the folded, functional protein to unfold and break down.

Using a combination of biophysical and computational analyses, the team discovered this kinetic stability can be successfully modeled based on the extent to which the protein chain loops back on itself in the folded structure. Because their approach to stability is also quantitative, the protein’s stability can be adjusted to naturally break down when it is no longer needed.

Reference:

Broom A, Ma SM, Xia K, Rafalia H, Trainor K, Colón W, Gosavi S, & Meiering EM (2015). Designed protein reveals structural determinants of extreme kinetic stability. Proceedings of the National Academy of Sciences of the United States of America, 112 (47), 14605-10 PMID: 26554002

Image reproduced under Creative Commons licence. Source: Wikimedia commons

The Cellular Prion Protein (PrPc), like Dr. Jekyll, converts into PrPSc, a fatal conformational form, like Mr. Hyde, and is responsible for a variety of neurodegenerative disorders. Instead of a potion, this molecular Jekyll and Hyde undergoes its conformational change in a low-pH environment, such as in endosomes. While many studies have examined how this conformational change happens, a recent paper has tried to list the structural and dynamic properties using Molecular Dynamics.

To list these properties, three structures were taken into consideration: one NMR structure (PDB id: 1QLX) and two X-ray structures (PDB id: 2W9E and 3HAK). Interestingly, the 3HAK structure is from a SNP variant of human PrPc, where Met129 is replaced by Val129. Furthermore, those who genetically have this variant are less susceptible to Prion diseases!

Structural alignment of 1QLX (blue), 2W9E (red), and 3HAK (orange) with Met129/Val129 shown as sticks. Image made using PyMOL

Using an in-house MD package called in lucem molecular mechanics (ilmm for short), Chen et al. simulated the three structures under two different pH conditions (pH 5 and pH 7) and at two different temperatures (298K/25C and 310K/37C), totaling about 3.6 microseconds of simulation. (For each structure under each condition, the MD simulation was performed in triplicate.)
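A quick back-of-the-envelope sketch of that simulation design: enumerating every structure, pH, temperature, and replicate combination gives the number of independent runs. The per-run length printed at the end is my own inference, assuming all runs were equally long; the post above only states the total.

```python
from itertools import product

structures = ["1QLX", "2W9E", "3HAK"]
ph_values = [5, 7]
temperatures_k = [298, 310]
replicates = 3

# One entry per independent simulation.
runs = [
    (structure, ph, temp, rep)
    for structure, ph, temp in product(structures, ph_values, temperatures_k)
    for rep in range(1, replicates + 1)
]

TOTAL_NS = 3600  # about 3.6 microseconds in total, expressed in nanoseconds

# Assumption: every run has the same length.
ns_per_run = TOTAL_NS / len(runs)

print(len(runs))
print(ns_per_run)
```

Under that equal-length assumption, the total works out to 36 runs of roughly 100 ns each.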

Analyzing the MD results, they found that at 37C and low pH the C-terminal globular domain was significantly destabilized.

  • Helix HA and its neighboring loop S1-HA were destabilized more in the SNP variant than in the other two structures at 37C and low pH. It is interesting to note that the S1-HA loop becomes a strand during the prion’s conversion.
  • At low pH, another helix HB destabilizes, where the His187 becomes solvent exposed, leading to partial unfolding of the C-terminus.
  • Two residues, Phe198 and Met134, converting from being part of the hydrophobic core to being exposed to the solvent may be involved in partial unfolding and might possibly provide aggregation sites.
  • The X-loop in the Val129 SNP variant’s structure took a different conformation that was not populated by the other two structures.
  • Formation of new secondary structure in the N-terminal region, either alpha helices or beta strands, is spontaneous. While both alpha and beta structure formed in the other two structures, alpha structure rarely formed in the SNP variant. (This N-terminal region is missing from the solved structures and hence was modeled; in each starting structure this region was unstructured.)

These results give more insights into the conversion of the benign form of human Prion to the infectious form.

References:

  1. Chen, W., van der Kamp, M., & Daggett, V. (2014). Structural and Dynamic Properties of the Human Prion Protein Biophysical Journal, 106 (5), 1152-1163 DOI: 10.1016/j.bpj.2013.12.053

I am sure this blog’s readers are aware of the PDB format. This format, created in the 1970s, is a standardized format for data derived from X-ray diffraction and NMR studies [1]. Until 2006, homology/theoretical models were also accepted for deposition, but not any more [2]. [See previous post on Protein Model Portal for submitting homology/theoretical models]

The current limitation of the PDB format is that a coordinate file with more than 62 chains or 99,999 atoms cannot be uploaded as a single file, and hence such structures were split into three or four separate PDB depositions. To overcome this limitation, a new format has been in the works, and recently the working group announced the new format recommendations. [3, 4]
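Here is a minimal sketch of where those two limits come from and how one might check a structure against them. The limits follow from the fixed-width layout of the legacy format: a one-character chain identifier (upper-case letters, lower-case letters, and digits) and a five-digit atom serial number. The function name `fits_legacy_pdb` is hypothetical.

```python
import string

# One-character chain IDs: A-Z, a-z, 0-9.
MAX_CHAINS = len(string.ascii_letters + string.digits)
# Five-digit atom serial number field.
MAX_ATOMS = 10**5 - 1


def fits_legacy_pdb(n_chains: int, n_atoms: int) -> bool:
    """True if a structure of this size fits in a single legacy PDB file."""
    return n_chains <= MAX_CHAINS and n_atoms <= MAX_ATOMS


print(MAX_CHAINS, MAX_ATOMS)
print(fits_legacy_pdb(n_chains=4, n_atoms=5000))
print(fits_legacy_pdb(n_chains=80, n_atoms=250000))
```

A large assembly such as a ribosome or a virus capsid fails one or both checks, which is why such entries had to be split.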

Not that I warned you. The first FAQ [4] on this link says:

What should every PDB user know about PDBx/mmCIF?
The PDB file format will be phased out in 2016.
PDBx/mmCIF will become the standard PDB archive format in 2014.

What is this new format?
To illustrate the changes, the first image shows the ATOM records in the current PDB format, and the second image shows the same information in the PDBx format.

The new PDBx format:

Did you observe any changes? Here is a comparative image below that has the ATOM records aligned one below the other.

The first thing that caught my eye was the order of the columns the new format is using, along with the extra decimal positions in the occupancy and B-factor columns. Now, if you look at the second image, you will see some extra lines before the ATOM records. These list the items in the “atom_site” category. The new format is organized into categories, and under each category there is a detailed description of what goes into it. For example, the ATOM records become what is called the ATOM_SITE category, which holds the information describing atomic positions.
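To show how those "extra lines" work, here is a minimal sketch that reads an atom_site loop: the header lines name the columns, and each data row is then matched to them. The item names are from the PDBx/mmCIF atom_site category, but the coordinate values are made up, and `parse_atom_site` is a hypothetical helper. It is a deliberately naive reader that splits on whitespace and does not handle quoted or multi-line values, so for real files a maintained mmCIF parser is the safer choice.

```python
from typing import Dict, List

# Made-up two-atom example in PDBx/mmCIF style.
SAMPLE = """\
loop_
_atom_site.group_PDB
_atom_site.id
_atom_site.type_symbol
_atom_site.label_atom_id
_atom_site.label_comp_id
_atom_site.label_asym_id
_atom_site.label_seq_id
_atom_site.Cartn_x
_atom_site.Cartn_y
_atom_site.Cartn_z
_atom_site.occupancy
_atom_site.B_iso_or_equiv
ATOM 1 N N  MET A 1 10.000 20.000 30.000 1.00 25.00
ATOM 2 C CA MET A 1 11.000 21.000 31.000 1.00 24.50
#
"""


def parse_atom_site(text: str) -> List[Dict[str, str]]:
    """Return one dict per atom, keyed by the atom_site item names."""
    headers: List[str] = []
    rows: List[Dict[str, str]] = []
    in_loop = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "loop_":
            in_loop, headers = True, []  # a new loop starts
        elif line.startswith("#"):
            in_loop = False  # '#' closes the current category
        elif not in_loop:
            continue
        elif line.startswith("_atom_site."):
            headers.append(line.split(".", 1)[1])  # column name
        elif line.startswith("_"):
            in_loop, headers = False, []  # a loop for some other category
        elif headers:
            values = line.split()
            if len(values) == len(headers):
                rows.append(dict(zip(headers, values)))
    return rows


atoms = parse_atom_site(SAMPLE)
print(atoms[1]["label_atom_id"], atoms[1]["Cartn_x"])
```

Because every value is looked up by name rather than by column position, the order of the columns no longer matters the way it does in the fixed-width PDB format.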

PDB to PDBx correspondences

This link describes what items in PDB correspond to the new PDBx format

The website (http://mmcif.wwpdb.org/) and the FAQ have tons of information. Check it out, and get familiar with the new format of PDB! You have two years to learn it. 🙂

References:

  1. http://www.rcsb.org/pdb/static.do?p=file_formats/pdb/index.html. Accessed: 2014-03-07. (Archived by WebCite® at http://www.webcitation.org/6NtpuqkvU)
  2. http://deposit.rcsb.org/depoinfo/depofaq.html. Accessed: 2014-03-07. (Archived by WebCite® at http://www.webcitation.org/6NtpzBByw)
  3. http://www.emdatabank.org/lrg_strct_dpstn.html. Accessed: 2014-03-07. (Archived by WebCite® at http://www.webcitation.org/6Ntq1KxKY)
  4. http://www.wwpdb.org/workshop/wgroup.html. Accessed: 2014-03-07. (Archived by WebCite® at http://www.webcitation.org/6Ntq3BY5u)
  5. http://mmcif.wwpdb.org/docs/faqs/pdbx-mmcif-faq-general.html. Accessed: 2014-03-07. (Archived by WebCite® at http://www.webcitation.org/6Ntq56e1O)

Yes, the extra “g” was intentional. You see, 2014 is the International Year of Crystallography declared by the United Nations. So, Crystallographers are “Bragg”ing about it! [You see what I did there? 🙂 ]

In this month’s issue (February 2014) of Biophysical Journal, the biophysicist couple Prof. Jane Richardson and David Richardson came out with an article that commemorates this special year. The number 54 is significant here.

  • In this issue they highlight 54 protein structures that basically, as they put it, “illuminated” the field of biophysics.
  • The number 54 also denotes the number of years since the structure of myoglobin was solved.
  • Additionally, if you wanted a year-long celebration, you would need at least one structure solved by X-ray crystallography per week. So, you can look at one molecular structure at a time and marvel at it. (As a bonus, you get two more.)

As I read the article, I was squeeing with delight, as it had hand drawn pictures of the earliest solved structures! Those pictures definitely upped the oomph factor for these proteins. The best part is this article is open access. So, I can make a slideshow of these structures! Lo and Behold!

If you want to add some unique information about any of these proteins, look out for the list to be available in Wikipedia. http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Biophysics#New_articles

Wait, that’s NOT all. If you go to IYCR2014 website, there is tons of information there. For example, if you go to events, you can look at the year long celebrations happening around the world.

If you point your browser to http://www.iycr2014.org/learn/educational-materials you will find a good list of things about crystallography one can learn about!

References:

  1. http://www.iycr2014.org/
  2. Jane S. Richardson and David C. Richardson (2014). Biophysical Highlights from 54 Years of Macromolecular Crystallography Biophysical Journal, 106 (3), 510-525 DOI: 10.1016/j.bpj.2014.01.001

Image Courtesy: PymolWiki

It is a fact that different space groups occur with very different frequencies in protein crystals. For example, the space group P212121 is the most frequent in protein crystals and occurs almost one-third of the time!

Why is this so? This was the question asked by Wukovitz and Yeates in their paper titled “Why protein crystals favour some space-groups over others” [1]
Comparing protein crystals with crystals of organic molecules reveals marked differences. The rules for molecular packing of organic molecules were proposed by Kitaigorodskii and became widely accepted. [2]

However, if we look at the distribution of space groups in organic molecules and in proteins, there are marked differences. Thus, the authors argue that the same criteria cannot be applied to proteins. One major difference is that protein crystals contain about 50% solvent by volume, while organic crystals are tightly packed with little free space. This results in a higher “coordination number” (10-14) for organic crystals than for proteins, where the number averages 7.5.

Based on all this, the authors devised a simple statistical measure to explain why certain space groups are preferred among the 65 biological space groups.

And the formula is:

D=S+L-C, where

D = total number of rigid-body degrees of freedom
S = number of meaningful degrees of freedom
L = number of independent parameters for describing the unit cell, and
C = minimum number of unique contacts required to make the set of symmetry related molecules

S, L, and C are all positive integers and are not adjustable parameters. The explanation this simple statistical analysis gives for protein crystals is: “For a particular space group only a certain number of rigid-body degrees of freedom are available for assembling the first few molecules before the internal structure of the crystal is completely defined. This number depends on the space group symmetry.”
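The formula is simple enough to sketch in a few lines. The function name `rigid_body_dof` is hypothetical, and the example values for P212121 (S = 6, L = 3, C = 2) are my own reading, assumed here for illustration: six rigid-body degrees of freedom for the first molecule, three independent cell lengths for an orthorhombic cell, and two contacts. Please check them against the table in the paper before relying on them.

```python
def rigid_body_dof(s: int, l: int, c: int) -> int:
    """D = S + L - C, following the definitions given above."""
    if min(s, l, c) <= 0:
        raise ValueError("S, L, and C must all be positive integers")
    return s + l - c


# Assumed example values for P212121 (see the caveat in the text above).
S_P212121 = 6  # 3 rotational + 3 translational degrees of freedom
L_P212121 = 3  # a, b, c (all angles fixed at 90 degrees)
C_P212121 = 2  # assumed minimum number of unique contacts

print(rigid_body_dof(S_P212121, L_P212121, C_P212121))
```

Since C is subtracted, a space group that needs fewer unique contacts to build its network ends up with a larger D.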

Three things limit the rigid-body degrees of freedom

  1. number of meaningful Rigid-body DOF for the first molecule in space
  2. the number of independent unit-cell parameters
  3. the number of intermolecular contacts to make a network

How to find C?
The problem of finding C is equivalent to the problem of identifying the minimal set of symmetry elements. For each space group, C can be determined by finding its minimal set of generators. The numbers range from 2 to 5.

The authors observed that the calculated value of D correlated with the observed frequency of the space group! That is, the higher the value of D, the more frequent the space group. Guess which space group had the highest D value?

Now the question comes back to “Why is P212121 more frequent?” The reason is that this space group is the least restrictive for the possible orientations and positions of the molecules in the crystal.

The authors do note that their analysis does not take into consideration the shape of the molecule, energetics, or packing efficiency, which could lead to answers for non-monomeric proteins in the asymmetric unit. According to the authors, P-1 (P1 with an overbar) has a D value of 8, and is predicted to be the most used space group for racemic protein mixtures.

References:

  1. Wukovitz SW, & Yeates TO (1995). Why protein crystals favour some space-groups over others. Nature structural biology, 2 (12), 1062-7 PMID: 8846217
  2. Kitaigorodskii AI. Organic Chemical Crystallography (1955) Consultants Bureau, New York (Originally published in Russian by Press of the Academy of Sciences of the USSR, Moscow)