Structural Biology

Help me, neighbor!

January 28, 2016

CAZy, PDB, protein folding, Structural Biology

We all have neighbors who help us in our hour of need. Some even go out of their way. In enzymes too, it seems, neighbors play a crucial role. Lafond et al., in their recent publication in the Journal of Biological Chemistry, report the involvement of neighboring chains of the same enzyme, a lichenase. Apart from stabilizing the quaternary structure (a trimer), the neighboring chains are also involved in the enzymatic activity.

Saccharophagus degradans is a marine bacterium credited with the capacity to degrade diverse polysaccharide substrates. The list includes, but is not limited to, agar, cellulose, chitin, xylan, carboxymethylcellulose, Avicel, laminarin, wheat arabinoxylan, glucomannan, lichenan, curdlan, and pachyman. Its genome has 19 coding regions for enzymes that belong to the same CAZy family, GH5.

GH5 enzymes are predominantly endoglucanases, i.e. they cleave an internal beta-glycosidic bond in the cellulose polymer. They also share the same protein structural fold, the (alpha/beta)8 fold, in which eight beta strands alternate with helices to form a barrel. The enzyme that Lafond et al. named SdGluc5_26A also belongs to the GH5 family and has the classical (alpha/beta)8 fold. However, they also found an interesting stretch of 38 residues at the N-terminus. This N-terminus is not floppy; it binds to the active site of the neighboring chain.

Image of SdGluc5_26A made using PyMOL. (PDB id: 5a8n)

In the figure above, the Trp residue (shown as green sticks) specifically binds to the active site of the neighboring chain. The figure of the trimer below shows how the chains interact. This arrangement gives SdGluc5_26A its lichenase activity. In the parlance of carbohydrate-active enzymes, this Trp binds to the -3 subsite of the active site.

Image of SdGluc5_26A trimer made using PyMOL. (PDB id: 5a8n)

So, the next step was to find out what happens to the activity of SdGluc5_26A when this protruding N-terminal sequence is deleted. Upon deletion, SdGluc5_26A behaved as an endo-beta(1,4)-glucanase. In other words, without this N-terminal part the enzyme switched from an exo-acting enzyme (chewing at the ends of the polymer) to an endo-acting one (chewing in the middle).

Given that SdGluc5_26A can act on a variety of substrates, it is only logical to think that this 38-residue stretch plays an important role in substrate specificity. Now, the question is whether allostery or a cooperative mechanism could govern substrate binding. Something to chew upon! 😉

References:

Lafond M, Sulzenbacher G, Freyd T, Henrissat B, Berrin JG, & Garron ML (2016). The quaternary structure of a glycoside hydrolase dictates specificity towards beta-glucans. The Journal of Biological Chemistry PMID: 26755730

Mosquitoes like it hot!

December 29, 2015

Blogging, homology modeling, mosquito, Structural Biology

The Mosquito Net (1912) by John Singer Sargent. Licensed under Public Domain via Wikimedia Commons

We all know how pesky mosquitoes can be. Did you know that a mosquito's ability to find a suitable host to feed on depends on thermotaxis? This behavior, being attracted or repelled by high or low temperature, is also seen in other organisms such as Drosophila melanogaster and Caenorhabditis elegans.

However, the behavior is more pronounced among blood-feeding pests (kissing bugs, bedbugs, ticks, and mosquitoes, including Aedes aegypti). Aedes aegypti is a vector for many flaviviral diseases (dengue fever, yellow fever, etc.). Until now, it was well established that thermotaxis requires specific thermosensors that activate the sensory signals for a subsequent flight response in a mosquito. However, exactly how they function was not resolved.

In a recent paper, Corfas and Vosshall [1] describe the use of zinc-finger nuclease-mediated genome editing to identify the role of two receptors, TRPA1 and GR19, in Aedes aegypti's attraction to heat. They found that these receptors help the mosquito identify a host for feeding (in the temperature range of 43-50 deg Celsius), while the mosquitoes avoid surfaces above 50 deg Celsius. [Read the recent editorial on genome editing in Genome Biology]

The sequence of this receptor (UniProt id: Q0IFQ4) is 923 residues long and has at least five transmembrane regions, each approximately 20-25 residues long. A cursory glance at homologous sequences shows that it shares 37% sequence identity with a de novo designed protein (PDB id: 2xeh).
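For readers who want to see what a figure like "37% sequence identity" means in practice, here is a minimal sketch of one common way to compute it from a pairwise alignment. The function name, the gap character, and the toy sequences are my own assumptions for illustration; they are not taken from the paper or from any specific tool, and other conventions for the denominator (for example, the length of the shorter sequence) give different numbers.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only the columns in which both sequences carry a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy alignment with made-up residues: 4 identical columns out of 5 ungapped ones.
print(percent_identity("ACDE-G", "ACDFHG"))
```

The input must already be aligned (for example, the output of a BLAST or structure-based alignment); this sketch does not perform the alignment itself.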

The homology-modeled structure below shows the coiled-coil region (residues 189-338). Although the eLife paper does not talk about structure, I felt that it deserves a mention here because of the structural biology and bioinformatics possibilities this novel target offers. It is a suitable target for designing inhibitors that could potentially act as mosquito repellents.

Also, combined with the method described in my previous post on mutating transmembrane proteins as a method of making them crystallize, I guess the 3D structure of this important protein will come to light sooner!

Homology modeled region of TRPA1 (189-338), from ModBase

References:

Corfas RA, & Vosshall LB (2015). The cation channel TRPA1 tunes mosquito thermotaxis to host temperatures. eLife, 4 PMID: 26670734
Greppi C, Budelli G, & Garrity PA (2015). Some like it hot, but not too hot. eLife, 4

My Guest Post in Cell’s Crosstalk Blog

December 19, 2015

Bioinformatics, Blogging, Science Writing, Structural Biology

I am sharing this guest post of mine that was published in Cell’s Crosstalk: Biology in 3D Blog. Yes, the journal Cell!

Here is the link: http://www.cell.com/crosstalk/why-do-i-blog-about-structural-bioinformatics

Enjoy!

Why Do I Blog about Structural Bioinformatics?:Biology in 3D

Published December 17, 2015 | 15:00 EST

When someone says that they have a blog, the stereotypical response would be, “About your travels?” or “Hmmm … Recipes … Must be a delicious blog!” And when one confesses that said blog is about scientific research, the jaw drops. I presume it has to do with the notion that blogging science is not that much fun!

Two things inspired me to become a blogger: (1) an amazing community of scientific bloggers at Research Blogging, who inspired me with their wonderful posts; and (2) my view that structural biology and structural bioinformatics are not getting the exposure they deserve. Thus inspired and motivated, I began blogging about four years ago, and was able to channel some of my thoughts and energy into my blog, called Getting to Know Structural Bioinformatics.

Guest author and blogger
Raghu Yennamalli

Why do I blog? Blogging is fun! For me, blogging is about sharing with the world recent research and tidbits on structural biology and bioinformatics. Most importantly, it is about sharing the excitement that I feel after reading a paper. In some sense, blogging about research is similar to a journal club, where I am able to share the latest research with my peers. However, unlike a journal club, the audience for my blog is the entire world.

Blogging is also dynamic and interactive, because it allows me to engage in conversation with others (specifically students) when they weigh in with their comments. Below I highlight some of the best practices that I’ve developed over the years that help me with balancing my research, teaching, and personal responsibilities with my blogging.

Selecting the paper

The main way I find articles that I want to blog about is by scouring through the table of contents of the journals I am interested in. Sometimes I also hear about exciting protein structures via friends and other blogs that I follow. I try to have a balanced approach and highlight structural work on systems that are “hot topics” as well as papers that just captured my interest and fancy.

In the early days of my blogging, I was trying to collate and compile tools and techniques that would come in handy for students working with protein structures. I wanted my blog to be a handy place for myself and others to find tips and tricks. Over time, the range of topics and papers I cover has broadened, and although I still cover a lot of method development work, I cover other topics as well. In general, once I make up my mind about the paper I want to blog about, I start reading it, give myself some time to soak in the method and outcome of the paper, and try to think critically about what gaps remain or what the authors could have done to make the paper better. Alternatively, I also analyze the paper’s novelty with respect to structural bioinformatics.

Composing the blog post

I should confess that the monthly posts in Protein Spotlight by Vivienne Baillie Gerritsen are my inspiration while composing posts. I love her writing style and also the manner in which artwork is included in every post, to make it fun to read. Like Protein Spotlight, blogs have the advantage of including other multimedia items, such as animated gifs and YouTube videos, that make the post much easier for the reader. So, I start by finding an appropriate image from an art database that best fits the topic (of course, giving credit where it is due). When it is about a tool/software, I figure the best approach is to use said tool/software and include a “first-hand” experience of how I perceived it. Also, I try to include an additional tidbit of information that the authors mention in passing.

Balancing things

With an active teaching and research schedule, finding time to blog does become a challenge. I try to make it a fun process, so that it does not feel cumbersome. If one looks at the frequency of my posts, I try to maintain at least one post per month. Looking at others’ blogs at Research Blogging, I realize that one post a month is a low turnout, and I try to post as frequently as possible. Sometimes, the problem is sheer lack of time or not finding exciting enough material to blog about. However, this does not mean that exciting research is not out there. The key is to find a balance between blogging and other duties. I have had discussions with other bloggers who blog on other nonscience topics, and we observed that the main turnoff in blogging is when one delves deeper and over time a particular post becomes “work.” Maneuvering that roadblock is key to maintaining a successful blog.

In the end, as at the beginning, it all comes down to having fun and sharing with the world my excitement about the type of scientific research I enjoy. I think this is probably the feeling others who blog share as well, and I can see it in some of the blogs I follow, such as the following:

Protein Spotlight by Vivienne Baillie Gerritsen
Byte Size Biology by Iddo Friedberg
What You’re Doing is Rather Desperate by Neil Saunders
And on the lighter side, the Tumblr blog “What should we call Grad school”

Raghu Yennamalli completed his PhD in Computational Biology and Bioinformatics in 2008 from Jawaharlal Nehru University. He conducted postdoctoral research at Iowa State University, University of Wisconsin-Madison, and Rice University. Currently, he is an Assistant Professor at Jaypee University of Information Technology. He can be contacted at ragothaman AT gmail DOT com.

Alternate conformations

December 4, 2015

Bioinformatics, homology modeling, PDB, Structural Biology

In the 90s, morphing of two unrelated images was popular, mostly for entertainment purposes. A famous example is the video for Michael Jackson’s pop hit “Black or White”.

Courtesy: Google

This morphing method was also used to analyze protein motions, such as domain rearrangements. A popular webserver, where you can get an animated gif of your protein’s motion (assuming you have two distinct conformations), is the Morph server (http://www2.molmovdb.org/) from Gerstein’s Lab. In many cases this gave us insight into how the protein could dynamically change from one form to another.

Changing the structural form of a protein is not a trivial problem. We need to generate ensembles of protein structures for many purposes: 1) understanding conformational transition paths, 2) generating more realistic receptors for docking, 3) understanding, in turn, the flexible and rigid parts of the protein, and a few other applications.

Until now, one could use normal mode analysis and molecular dynamics methods to generate an ensemble. It is here that ConTemplate tries to bring a fresh perspective to generating an ensemble of structures.

ConTemplate mines the PDB for existing structures and gives the user a set of possible conformations. The main presumptions are that any given protein in the PDB has more than one structure available, and that additional conformations are available for proteins that undergo major conformational changes.

In the dataset created for ConTemplate, the maximum RMSD between two structures of the same protein is 5 Angstroms, and 69.2% of the proteins have an RMSD of less than 1 Angstrom. The method uses an interesting three-step process:

1. Using the query, it searches for structural equivalents with the GESAMT aligner. Sequence alignments are then generated from the structural alignments.
2. It runs BLAST to identify additional conformations for all structural equivalents obtained in step 1. A representative template is identified.
3. Finally, Modeller is used to build model structures in various conformations using this template.
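Since RMSD is the yardstick used to describe the ConTemplate dataset, here is a minimal sketch of how it can be computed for two conformations. It assumes the two coordinate sets are already superposed and list the same atoms in the same order; the function name and the toy coordinates are hypothetical, and this is not ConTemplate's or GESAMT's own code.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the matching atoms of the two conformations.
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy example: the second "conformation" is the first one shifted by 1 Angstrom in z.
conformation_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
conformation_2 = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(conformation_1, conformation_2))
```

In practice the structures would first be superposed (for example with a structural aligner), because RMSD computed on unsuperposed coordinates mostly reflects how the two files happen to be placed in space.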

The advantage of ConTemplate is that it yields a more relevant set of conformations for the query protein. I tried running a query to the server and I would say that I got some interesting results. Screenshot below:

Superposition of models created in ConTemplate for PDB id: 1ECE

References:
Narunsky A, Nepomnyachiy S, Ashkenazy H, Kolodny R, & Ben-Tal N (2015). ConTemplate Suggests Possible Alternative Conformations for a Query Protein of Known Structure. Structure (London, England : 1993), 23 (11), 2162-70 PMID: 26455800

Designer proteins helping biomedicine

November 27, 2015

Bioinformatics, protein folding, Structural Biology

Reblogging this blog post

http://loonylabs.org/2015/11/24/protein-structure-biotechnology-personalized-medicines/

Professor Meiering and her colleagues were able to incorporate both structure and function into the design process by using bioinformatics to leverage information from nature. They then analyzed what they made and measured how long it took for the folded, functional protein to unfold and break down.

Using a combination of biophysical and computational analyses, the team discovered this kinetic stability can be successfully modeled based on the extent to which the protein chain loops back on itself in the folded structure. Because their approach to stability is also quantitative, the protein’s stability can be adjusted to naturally break down when it is no longer needed.

Reference:

Broom A, Ma SM, Xia K, Rafalia H, Trainor K, Colón W, Gosavi S, & Meiering EM (2015). Designed protein reveals structural determinants of extreme kinetic stability. Proceedings of the National Academy of Sciences of the United States of America, 112 (47), 14605-10 PMID: 26554002

Prions – From Dr. Jekyll to being Mr. Hyde

March 16, 2014

Bioinformatics, Blogging, Molecular Dynamics, point mutation, prion, protein folding, Structural Biology

Image reproduced under Creative Commons licence. Source: Wikimedia commons

The cellular prion protein (PrP^c), like Dr. Jekyll, converts into PrP^Sc, a fatal conformational form, like Mr. Hyde, and is responsible for a variety of neurodegenerative disorders. Instead of using a potion, this molecular Jekyll and Hyde undergoes its conformational change in a low-pH environment, such as in endosomes. While many past studies have examined how this conformational change happens, a recent paper has tried to list the structural and dynamic properties using molecular dynamics.

To list these properties, three structures were taken into consideration: one NMR structure (PDB id: 1QLX) and two X-ray structures (PDB ids: 2W9E and 3HAK). Interestingly, the 3HAK structure is from a SNP variant of human PrP^c in which Met129 is replaced by Val129. Furthermore, those who carry this variant are less susceptible to prion diseases!

Structural alignment of 1QLX (blue), 2W9E (red), and 3HAK (orange) with Met129/Val129 shown as sticks. Image made using PyMOL

Using an in-house MD package called in lucem molecular mechanics (ilmm for short), Chen et al. simulated the three structures under two pH conditions (pH 5 and pH 7) and two temperatures (298 K/25 C and 310 K/37 C), totaling about 3.6 microseconds of simulation. (For each structure under each condition, the MD simulation was performed in triplicate.)
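To get a feel for the size of this simulation campaign, the bookkeeping can be sketched as below. The sketch assumes that every structure was run at every pH and temperature combination and that all runs were of equal length; neither assumption is stated in the post, so the per-run length is only a rough estimate.

```python
from itertools import product

structures = ["1QLX", "2W9E", "3HAK"]
ph_values = [5, 7]
temperatures_k = [298, 310]
replicates = [1, 2, 3]  # each condition was simulated in triplicate

# Every combination of structure, pH, temperature and replicate is one MD run.
runs = list(product(structures, ph_values, temperatures_k, replicates))

total_ns = 3600  # "about 3.6 microseconds", expressed in nanoseconds
n_runs = len(runs)
ns_per_run = total_ns / n_runs  # assumption: all runs are equally long

print(n_runs, ns_per_run)
```

Under these assumptions the total corresponds to 36 runs of roughly 100 nanoseconds each.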

Analyzing the MD results, they found that at 37 C and low pH the C-terminal globular domain was significantly destabilized:

Destabilization of helix HA and its neighboring loop S1-HA was greater in the SNP variant than in the other two structures at 37 C and low pH. It is interesting to note that the S1-HA loop becomes a strand during the prion’s conversion.
At low pH, another helix, HB, destabilizes: His187 becomes solvent exposed, leading to partial unfolding of the C-terminus.
Two residues, Phe198 and Met134, move from the hydrophobic core to being solvent exposed; they may be involved in partial unfolding and might provide aggregation sites.
The X-loop in the Val129 SNP variant’s structure took a conformation that was not populated by the other two structures.
The N-terminal region spontaneously forms new secondary structure, as either alpha or beta strands. While formation of both alpha and beta strands was seen in the other two structures, alpha strands were rarely formed in the SNP variant. (This N-terminal region is missing from the solved structures; it was therefore modeled, and it was unstructured in each starting structure.)

These results give more insights into the conversion of the benign form of human Prion to the infectious form.

References:

Chen W, van der Kamp M, & Daggett V (2014). Structural and Dynamic Properties of the Human Prion Protein. Biophysical Journal, 106 (5), 1152-1163 DOI: 10.1016/j.bpj.2013.12.053

PDBx format!

March 8, 2014

Bioinformatics, PDB, Structural Biology

I am sure this blog’s readers are aware of the PDB format. This format, created in the 1970s, is a standardized format for data derived from X-ray diffraction and NMR studies [1]. Until 2006, homology/theoretical models were also accepted for deposition, but not any more [2]. [See previous post on Protein Model Portal for submitting homology/theoretical models]

The current limitation of the PDB format is that a structure with more than 62 chains or 99,999 atoms cannot be uploaded as a single coordinate file, so such structures were split into three or four separate PDB depositions. To overcome this limitation, a new format has been in the works, and recently the working group announced the new format recommendations. [3, 4]
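Both limits follow from the fixed-width layout of the PDB coordinate records: the atom serial number has five columns and the chain identifier is a single character. The sketch below reproduces the two numbers; treating the chain alphabet as upper-case letters, lower-case letters and digits is my assumption about where the figure 62 comes from, and the helper name is hypothetical.

```python
import string

ATOM_SERIAL_WIDTH = 5                      # columns 7-11 of an ATOM record
MAX_ATOMS = 10 ** ATOM_SERIAL_WIDTH - 1    # largest 5-digit serial number

# Assumed single-character chain alphabet: A-Z, a-z, 0-9.
CHAIN_ID_ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits
MAX_CHAINS = len(CHAIN_ID_ALPHABET)


def fits_legacy_pdb(n_chains: int, n_atoms: int) -> bool:
    """True if a structure of this size can be written as one legacy PDB file."""
    return n_chains <= MAX_CHAINS and n_atoms <= MAX_ATOMS


print(MAX_ATOMS, MAX_CHAINS)
print(fits_legacy_pdb(n_chains=4, n_atoms=12000))
print(fits_legacy_pdb(n_chains=80, n_atoms=150000))
```

A large assembly such as a ribosome or a virus capsid fails this check on either count, which is why such entries had to be split.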

Note that I warned you. The first FAQ [5] on this link says:

What should every PDB user know about PDBx/mmCIF?
The PDB file format will be phased out in 2016.
PDBx/mmCIF will become the standard PDB archive format in 2014.

What is this new format?
To illustrate the changes, the first image shows the ATOM records in the current PDB format, and the second image shows the same information in the PDBx format.

The new PDBx format:
Did you observe any changes? The comparative image below has the ATOM records aligned one below the other.
The first thing that caught my eye was the order of the columns in the new format, followed by the extra decimal places in the occupancy and B-factor columns. If you look at the second image, you will also see some extra lines before the ATOM records. These list the items in the “atom_site” category. The new format is organized into categories, and under each category there is a detailed description of what goes into it. For example, the ATOM records now fall under the ATOM_SITE category, which holds the information describing atomic positions.
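To make the contrast concrete, here is a minimal sketch of how an ATOM record of the old format is read: every field lives at a fixed column position, so parsing is just string slicing. The column ranges follow the published PDB format description as I understand it; the example record, its coordinate values, and the function name are made up for illustration.

```python
# One ATOM record, assembled field by field so the fixed columns stay visible.
EXAMPLE_LINE = (
    "ATOM  "      # cols 1-6:   record name
    "    1"       # cols 7-11:  atom serial number
    " "           # col 12:     unused
    " N  "        # cols 13-16: atom name
    " "           # col 17:     alternate location indicator
    "MET"         # cols 18-20: residue name
    " "           # col 21:     unused
    "A"           # col 22:     chain identifier
    "   1"        # cols 23-26: residue sequence number
    " "           # col 27:     insertion code
    "   "         # cols 28-30: unused
    "  27.340"    # cols 31-38: x coordinate (Angstroms)
    "  24.430"    # cols 39-46: y coordinate
    "   2.614"    # cols 47-54: z coordinate
    "  1.00"      # cols 55-60: occupancy
    "  9.67"      # cols 61-66: B-factor
    "          "  # cols 67-76: unused in this sketch
    " N"          # cols 77-78: element symbol
)


def parse_atom_record(line: str) -> dict:
    """Slice one ATOM/HETATM line of the legacy PDB format by column position."""
    if not line.startswith(("ATOM", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    line = line.ljust(80)  # pad short lines so slicing never runs past the end
    return {
        "serial": int(line[6:11]),
        "name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain_id": line[21].strip(),
        "res_seq": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": float(line[54:60]),
        "b_factor": float(line[60:66]),
        "element": line[76:78].strip(),
    }


atom = parse_atom_record(EXAMPLE_LINE)
print(atom["name"], atom["res_name"], atom["chain_id"], atom["x"], atom["y"], atom["z"])
```

The fixed widths are exactly what limits the old format. In PDBx/mmCIF the values in an atom_site row are separated by whitespace and named by the header lines that precede them, so a field can grow without breaking the layout.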

PDB to PDBx correspondences

This link describes what items in PDB correspond to the new PDBx format

The website (http://mmcif.wwpdb.org/) and the FAQ have tons of information. Check it out, and get familiar with the new format of PDB! You have two years to learn it. 🙂

References:

http://www.rcsb.org/pdb/static.do?p=file_formats/pdb/index.html. Accessed: 2014-03-07. (Archived by WebCite^® at http://www.webcitation.org/6NtpuqkvU)
http://deposit.rcsb.org/depoinfo/depofaq.html. Accessed: 2014-03-07. (Archived by WebCite^® at http://www.webcitation.org/6NtpzBByw)
http://www.emdatabank.org/lrg_strct_dpstn.html. Accessed: 2014-03-07. (Archived by WebCite^® at http://www.webcitation.org/6Ntq1KxKY)
http://www.wwpdb.org/workshop/wgroup.html. Accessed: 2014-03-07. (Archived by WebCite^® at http://www.webcitation.org/6Ntq3BY5u)
http://mmcif.wwpdb.org/docs/faqs/pdbx-mmcif-faq-general.html. Accessed: 2014-03-07. (Archived by WebCite^® at http://www.webcitation.org/6Ntq56e1O)

Know any high schoolers?

March 6, 2014

Structural Biology, visualization

Structural View of HIV/AIDS:

A Video Challenge for High School Students

http://www.rcsb.org/pdb/101/static101.do?p=education_discussion/educational_resources/videochallenge/video_challenge_2014.html

Something to Bragg about!

February 11, 2014

Structural Biology, visualization

Yes, the extra “g” was intentional. You see, 2014 is the International Year of Crystallography declared by the United Nations. So, Crystallographers are “Bragg”ing about it! [You see what I did there? 🙂 ]

In this month’s issue (February 2014) of Biophysical Journal, the biophysicist couple Prof. Jane Richardson and David Richardson came out with an article that commemorates this special year. The number 54 is significant here:

In this issue they highlight 54 protein structures that basically, as they put it, “illuminated” the field of biophysics.
The number 54 also denotes the number of years since the structure of myoglobin was solved.
Additionally, if you want a year-long celebration, you need at least one structure solved by X-ray crystallography per week, so you can look at one molecular structure at a time and marvel at it. (As a bonus, you get two more.)

As I read the article, I was squeeing with delight, as it had hand drawn pictures of the earliest solved structures! Those pictures definitely upped the oomph factor for these proteins. The best part is this article is open access. So, I can make a slideshow of these structures! Lo and Behold!

If you want to add some unique information about any of these proteins, look out for the list to be available on Wikipedia. http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Biophysics#New_articles

Wait, that’s NOT all. If you go to the IYCR2014 website, there is tons of information there. For example, if you go to events, you can look at the year-long celebrations happening around the world.

If you point your browser to http://www.iycr2014.org/learn/educational-materials you will find a good list of things about crystallography one can learn about!

References:

http://www.iycr2014.org/
Jane S. Richardson and David C. Richardson (2014). Biophysical Highlights from 54 Years of Macromolecular Crystallography Biophysical Journal, 106 (3), 510-525 DOI: 10.1016/j.bpj.2014.01.001

P212121 – The most frequently seen space group in protein crystals

January 19, 2014

Science Writing, Structural Biology, visualization

Image Courtesy: PymolWiki

It is a fact that different space groups occur with non-uniform frequency in protein crystals. For example, the space group P212121 is the most frequent in protein crystals and occurs almost one-third of the time!

Why is this so? This was the question asked by Wukovitz and Yeates in their paper titled “Why protein crystals favour some space-groups over others”. [1]

The rules for molecular packing in crystals of organic molecules were proposed by Kitaigorodskii and became widely accepted. [2]

However, if we look at the distribution of space groups in organic molecules and in proteins, there are marked differences. Thus, the authors argue that the same criteria cannot be applied to proteins. One major difference is that protein crystals contain about 50% solvent by volume, while organic crystals are jam-packed with little free space. This results in a higher “coordination number” (10-14) for organic crystals than for proteins, where it averages 7.5.

Based on all this, the authors tried to devise a simple statistical measure that can explain why certain space groups are preferred among the 65 biological space groups.

And the formula is:

D = S + L - C, where

D = total number of rigid-body degrees of freedom
S = number of meaningful degrees of freedom
L = number of independent parameters for describing the unit cell, and
C = minimum number of unique contacts required to make the set of symmetry related molecules

All three are positive integers and are not adjustable parameters. The explanation given by a simple statistical analysis for protein crystals is “For a particular space group only a certain number of rigid-body degrees of freedom are available for assembling the first few molecules before the internal structure of the crystal is completely defined. This number depends on the space group symmetry.”

Three things limit the rigid-body degrees of freedom:

the number of meaningful rigid-body degrees of freedom for the first molecule in space
the number of independent unit-cell parameters
the number of intermolecular contacts needed to make a network
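The bookkeeping behind D = S + L - C is simple enough to sketch in a few lines. The class name is hypothetical and the S, L and C values below are illustrative placeholders that I picked, not values quoted from the paper; the real ones have to be worked out per space group as the authors describe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpaceGroupCounts:
    name: str
    s: int  # meaningful rigid-body degrees of freedom of the first molecule
    l: int  # independent unit-cell parameters
    c: int  # minimum number of unique contacts needed to build the network

    @property
    def d(self) -> int:
        """Total rigid-body degrees of freedom, D = S + L - C."""
        return self.s + self.l - self.c


# Illustrative placeholder inputs, not values taken from the paper.
groups = [
    SpaceGroupCounts("hypothetical group X", s=6, l=3, c=2),
    SpaceGroupCounts("hypothetical group Y", s=5, l=2, c=3),
]

# Rank the groups from the least to the most restrictive (highest D last).
for group in sorted(groups, key=lambda g: g.d):
    print(group.name, group.d)
```

A higher D means more freedom is left for the molecules to find a workable packing, which is the quantity the authors compare against observed space-group frequencies.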

How to find C?
The problem of finding C is equivalent to the problem of identifying the minimal set of symmetry elements. For each space group, C can be determined by finding the minimal set of generators. The numbers range from 2 to 5.

The authors observed that the calculated value of D correlated with the observed frequency of the space group! That is, the higher the value of D, the more frequent the space group. Guess which space group had the highest D value?

Now the question comes back to “Why is P212121 more frequent?” The reason is that this space group is the least restrictive for the possible orientations and positions of the molecules in the crystal.

The authors do note that their analysis does not take into consideration the shape of the molecule, energetics, or packing efficiency, which could lead to answers for non-monomeric proteins in the asymmetric unit. According to the authors, the centrosymmetric space group P-1 (P1 with an overbar) has a D value of 8 and is predicted to be the most used space group for racemic protein mixtures.

References:

Wukovitz SW, & Yeates TO (1995). Why protein crystals favour some space-groups over others. Nature structural biology, 2 (12), 1062-7 PMID: 8846217
Kitaigorodskii AI. Organic Chemical Crystallography (1955) Consultants Bureau, New York (Originally published in Russian by Press of the Academy of Sciences of the USSR, Moscow)

—Getting to know Structural Bioinformatics

for the curious one
