This paper came up in my journal club recently. In my previous post I touched upon LPMOs, short for Lytic Polysaccharide Monooxygenases, which are basically oxidative enzymes.

This interesting group of enzymes comes in three basic types, classified by the site of attack: LPMO1 (Type I) when oxidation occurs at the C1 carbon, LPMO2 (Type II) when oxidation occurs at the C4 carbon, and LPMO3 (Type III) when either the C1 or the C4 carbon can be attacked. These types are spread across the four CAZy families into which LPMOs are categorized (AA9, AA10, AA11, and AA13).

Having said this, identifying the type of an LPMO is not a trivial task. This specificity for cleaving a particular bond, or regiospecificity, is characterized by time-consuming chromatography experiments (HPAEC-PAD): they are time-course studies that involve incubating the enzyme with the substrate for long periods. If aldonic acids are detected in the experiments, the enzyme is C1 cleaving (Type I); if 4-gemdiol-aldoses are detected, it is a C4 cleaving (Type II) LPMO.

Given this complex identification protocol, any shortcut to identify the regiospecificity is welcome, and that’s what Danneels et al. have attempted in their paper published in PLOS ONE. Specifically, they propose an indicator diagram as a way to identify regiospecificity.

To test this, they used Hypocrea jecorina’s LPMO9A (which shows both C1 and C4 cleavage) and performed site-directed mutagenesis on key aromatic residues involved in substrate binding, creating mutants that are selective for either C1 or C4 cleavage. To compare the wild type with the mutants, the authors plot the rate of release of aldonic acids against that of 4-gemdiol-aldoses, which gives the indicator diagram. Basically, if one calculates the slope of the line and it lies closer to the x-axis (release of aldonic acids), then the enzyme’s regiospecificity is for C1 oxidation, i.e. Type I LPMO activity. If it lies closer to the y-axis, it is Type II LPMO activity.
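The slope-based reading of the indicator diagram can be sketched in a few lines of Python. This is only an illustration of the idea, not the authors' procedure: the function name `classify_regiospecificity`, the 30 and 60 degree cut-offs, and the release rates used in the example are all assumptions made up for this sketch.

```python
import math


def classify_regiospecificity(c1_rate, c4_rate, low_deg=30.0, high_deg=60.0):
    """Classify an LPMO from the release rates of its oxidised products.

    c1_rate: release rate of aldonic acids (x-axis of the diagram).
    c4_rate: release rate of 4-gemdiol-aldoses (y-axis of the diagram).
    low_deg / high_deg: hypothetical angle cut-offs, chosen for illustration.
    """
    if c1_rate < 0 or c4_rate < 0:
        raise ValueError("release rates must be non-negative")
    if c1_rate == 0 and c4_rate == 0:
        raise ValueError("no oxidised products detected")

    # Angle of the line through the origin: 0 degrees lies on the x-axis
    # (pure C1 oxidation), 90 degrees lies on the y-axis (pure C4 oxidation).
    angle = math.degrees(math.atan2(c4_rate, c1_rate))

    if angle < low_deg:
        return "Type I (C1)"
    if angle > high_deg:
        return "Type II (C4)"
    return "Type III (C1/C4)"


# Made-up rates in arbitrary units, purely to exercise the function.
for name, c1, c4 in [("mostly C1", 10.0, 1.0),
                     ("mixed", 5.0, 5.0),
                     ("mostly C4", 1.0, 10.0)]:
    print(name, "->", classify_regiospecificity(c1, c4))
```

Working with the angle rather than the raw slope avoids a division by zero when no aldonic acids are released at all.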

It would be interesting to see this type of indicator diagram applied for enzyme activity identification for new LPMO enzymes, and also for enzyme engineering studies on LPMO.

Reference: B. Danneels, M. Tanghe, H. Joosten, T. Gundinger, O. Spadiut, I. Stals and T. Desmet, “A quantitative indicator diagram for lytic polysaccharide monooxygenases reveals the role of aromatic surface residues in HjLPMO9A regioselectivity”, PLOS ONE, 2017.

Enzyme discovery is always a hot topic for industry and biochemists, since it brings both large commercial benefits and advances in our current knowledge base. Unlike the early days, when enzyme discovery relied upon assays, bioinformatics approaches are now used to speed up the process.

Lytic polysaccharide monooxygenases (LPMOs) are the latest family of enzymes to have affected the biofuel industry, specifically the cellulosic bioethanol industry. These enzymes, previously thought to bind to polymers of carbohydrates, are now understood to boost the enzymatic process of degrading recalcitrant crystalline polysaccharides. LPMOs are probably among the enzymes that bind to the crystalline surfaces of polymers, and so could hopefully reduce the cost and time of the pre-processing step in biomass to bioethanol conversion.

As of now, CAZy identifies four families as consisting of LPMOs (AA9, AA10, AA11, and AA13), which are now called Auxiliary Activity enzymes. However, newer families of LPMOs may well exist, and it depends on how or when we identify them. Voshol et al. in their recent paper discuss a bioinformatics-based approach, validated using expression data, for the discovery of novel LPMO families, and report the existence of an “LPMO14” family that is active on sugars such as beta-1,3-glucans.

In short, they took 14 known LPMOs from the four families and generated an HMM profile, which they used to scan six genomes and identify 7 LPMO14 genes. They were also able to find LPMO genes in other organisms (such as Drosophila, bivalves, corals, etc.) where the function of LPMOs is not known.

While this data sounds promising, further identification and characterization using standard assays would confirm that this is indeed a new family of LPMOs. Also, the authors do not mention why they stuck to 14 known LPMO structures to generate the HMM profile, when there are nearly 50 structures currently deposited in the PDB.

Reference: G. Voshol, E. Vijgenboom and P. Punt, “The discovery of novel LPMO families with a new Hidden Markov model”, BMC Research Notes, vol. 10, no. 1, 2017.

The area of protein-protein interactions (PPI) is always exciting, as proteins can be either monogamous or promiscuous with their interaction partners. Another hot topic these days in computational biology is multiscale modelling, which refers to analyzing a system at the atomistic scale, at a global scale, and at the scales in between. A few years ago, a new method for co-evolutionary analysis of residues made news, and it has since been used to gain insights into other systems. Read the blog post by Bosco Ho about the groundbreaking work on Direct Coupling Analysis (DCA).


Example of a protein-protein interaction. By Dcrjsr – Own work, CC BY 3.0

So, when two orthogonal approaches are used to understand a PPI system, the work obviously becomes interesting. A recent paper in eLife reports exactly that. Malinverni et al. report the use of coarse-grained simulation coupled with atomistic molecular dynamics simulation and data from DCA to identify the evolutionarily conserved residues that cause the specific interaction between Hsp70 and Hsp40.

As these two proteins are part of the chaperone machinery, any insight into Hsp70’s ability to bind to proteins along with Hsp40 is crucial for understanding this short-lived interaction. As mentioned in the article, a vast amount of data (in terms of NMR, mutagenesis, etc.) is available that can be incorporated into this multiscale modelling.

The predicted interaction model is not only statistically significant, but also correlates well with the known experimental data.

Malinverni D, Jost Lopez A, De Los Rios P, Hummer G, Barducci A: Modeling Hsp70/Hsp40 interaction by multi-scale molecular simulations and coevolutionary sequence analysis. eLife 2017, 6.


Cartoon representation of the molecular structure of the protein registered with the 2fft code. By Jawahar Swaminathan and MSD staff at the European Bioinformatics Institute – Public Domain

Intrinsically disordered proteins are thought to be fully functional, yet they do not conform to a single conformation, so determining their structure via crystallography becomes problematic. Many intrinsically disordered proteins have been studied and analyzed using NMR methods; however, the question of why proteins are intrinsically disordered is still debated.

When viewing X-ray diffraction data, some residues have no corresponding electron density, and so they are marked as missing residues. These regions are highly mobile and are considered intrinsically disordered. For some proteins, the entire sequence is considered intrinsically disordered.

It is a widely accepted fact that sequence dictates structure, and structure in turn dictates function. So, is the “disordered-ness” encoded in the genome and, if so, to what extent? This and related questions led Basile et al. at Stockholm University, Sweden, to delve deeper, and they have narrowed it down to GC content. Their work has been published in the latest issue of PLoS Computational Biology.

Using computational methods, they analyzed 400 eukaryotic genomes and looked specifically into the so-called orphan genes. They categorized the age of the proteins using the ProteinHistorian tool and compared the old and young proteins. They found that the

…selective pressure to change amino acids in a protein is stronger than the one to change the GC content. At low GC ancient proteins are more disordered than expected for random sequence while at high GC they are less.

The codons of the three disorder-promoting amino acids (Ala, Pro, and Gly) are high in GC content. However,

At high GC the youngest proteins become more disordered and contain less secondary structure elements, while at low GC the reverse is observed. We show that these properties can be explained by changes in amino acid frequencies caused by the different amount of GC in different codons.
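The GC richness of the Ala, Pro, and Gly codons mentioned above is easy to check against the standard genetic code. The short Python sketch below is my own illustration and is not taken from the paper; the helper name `mean_codon_gc` and the choice of comparison residues (Lys, Asn, Ile, Phe) are assumptions for this example.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order
# (TTT, TTC, TTA, TTG, TCT, ...); '*' marks stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def mean_codon_gc(amino_acid):
    """Average GC fraction over all codons of a one-letter amino acid."""
    codons = [c for c, aa in CODON_TABLE.items() if aa == amino_acid]
    if not codons:
        raise ValueError(f"unknown amino acid: {amino_acid!r}")
    gc_fractions = [(c.count("G") + c.count("C")) / 3 for c in codons]
    return sum(gc_fractions) / len(gc_fractions)


# Disorder-promoting residues first, then a few AT-rich ones for contrast.
for aa in "APGKNIF":
    print(aa, round(mean_codon_gc(aa), 2))
```

Ala (GCN), Pro (CCN), and Gly (GGN) each average 5/6 GC, because their first two codon positions are always G or C, whereas Lys (AAA, AAG) averages only 1/6.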


  1.  Basile, W., Sachenkova, O., Light, S., & Elofsson, A. (2017). High GC content causes orphan proteins to be intrinsically disordered PLOS Computational Biology, 13 (3) DOI: 10.1371/journal.pcbi.1005375

With increasing computational power (aka GPU) that can be accessed these days, it is no wonder that performing all-atom molecular dynamics simulation for a longer time, with duplicates and/or triplicates, has become easier.
Two publications report all-atom MD data with significant implications in two diverse areas. The first one concerns the popular CRISPR-Cas9 system and the second one the Dengue virus.

These data should pave the way for more insights.

CRISPR-Cas9 all atom simulation (total of 400-600ns data)
Zuo Z, & Liu J (2016). Cas9-catalyzed DNA Cleavage Generates Staggered Ends: Evidence from Molecular Dynamics Simulations. Scientific reports, 5 PMID: 27874072

Entire Dengue viral envelope complex simulation (1 microsecond data)
Marzinek JK, Holdbrook DA, Huber RG, Verma C, & Bond PJ (2016). Pushing the Envelope: Dengue Viral Membrane Coaxed into Shape by Molecular Simulations. Structure (London, England : 1993), 24 (8), 1410-20 PMID: 27396828



Wanted to share this exciting news with you! Biophysical Journal has created a collection of papers that describe tools and software that can be routinely used in biological research. Editor Prof. Leslie Loew mentions that the full-text of articles in this collection will be freely available until February 25, 2016.

…a year ago Biophysical Journal called for papers in a new class of articles called Computational Tools (CTs). These papers are limited to five pages in length and describe software for analysis of experimental data, modeling and/or simulation, or database services. All are required to be freely accessible and open to the research community. In addition to following the usual review criteria of novelty and importance, reviewers of CTs are asked to test drive the software and judge its usability.

Of the thirteen papers, some are directly related to Structural Biology and Bioinformatics. So, here’s my “curated” list.
Article Title
CHARMM-GUI HMMM Builder for Membrane Simulations with the Highly Mobile Membrane-Mimetic Model


Slow diffusion of the lipids in conventional all-atom simulations of membrane systems makes it difficult to sample large rearrangements of lipids and protein-lipid interactions. Recently, Tajkhorshid and co-workers developed the highly mobile membrane-mimetic (HMMM) model with accelerated lipid motion by replacing the lipid tails with small organic molecules. The HMMM model provides accelerated lipid diffusion by one to two orders of magnitude, and is particularly useful in studying membrane-protein associations. However, building an HMMM simulation system is not easy, as it requires sophisticated treatment of the lipid tails. In this study, we have developed CHARMM-GUI HMMM Builder to provide users with ready-to-go input files for simulating HMMM membrane systems with/without proteins. Various lipid-only and protein-lipid systems are simulated to validate the qualities of the systems generated by HMMM Builder with focus on the basic properties and advantages of the HMMM model. HMMM Builder supports all lipid types available in CHARMM-GUI and also provides a module to convert back and forth between an HMMM membrane and a full-length membrane. We expect HMMM Builder to be a useful tool in studying membrane systems with enhanced lipid diffusion.

Article Title
MDTraj: A Modern Open Library for the Analysis of Molecular Dynamics Trajectories


As molecular dynamics (MD) simulations continue to evolve into powerful computational tools for studying complex biomolecular systems, the necessity of flexible and easy-to-use software tools for the analysis of these simulations is growing. We have developed MDTraj, a modern, lightweight, and fast software package for analyzing MD simulations. MDTraj reads and writes trajectory data in a wide variety of commonly used formats. It provides a large number of trajectory analysis capabilities including minimal root-mean-square-deviation calculations, secondary structure assignment, and the extraction of common order parameters. The package has a strong focus on interoperability with the wider scientific Python ecosystem, bridging the gap between MD data and the rapidly growing collection of industry-standard statistical analysis and visualization tools in Python. MDTraj is a powerful and user-friendly software package that simplifies the analysis of MD data and connects these datasets with the modern interactive data science software ecosystem in Python.

Article Title
MDN: A Web Portal for Network Analysis of Molecular Dynamics Simulations


We introduce a web portal that employs network theory for the analysis of trajectories from molecular dynamics simulations. Users can create protein energy networks following methodology previously introduced by our group, and can identify residues that are important for signal propagation, as well as measure the efficiency of signal propagation by calculating the network coupling. This tool, called MDN, was used to characterize signal propagation in Escherichia coli heat-shock protein 70-kDa. Two variants of this protein experimentally shown to be allosterically active exhibit higher network coupling relative to that of two inactive variants. In addition, calculations of partial coupling suggest that this quantity could be used as part of the criteria to determine pocket druggability in drug discovery studies.

Article Title
Multidomain Assembler (MDA) Generates Models of Large Multidomain Proteins


Homology modeling predicts protein structures using known structures of related proteins as templates. We developed MULTIDOMAIN ASSEMBLER (MDA) to address the special problems that arise when modeling proteins with large numbers of domains, such as fibronectin with 30 domains, as well as cases with hundreds of templates. These problems include how to spatially arrange nonoverlapping template structures, and how to get the best template coverage when some sequence regions have hundreds of available structures while other regions have a few distant homologs. MDA automates the tasks of template searching, visualization, and selection followed by multidomain model generation, and is part of the widely used molecular graphics package UCSF CHIMERA (University of California, San Francisco). We demonstrate applications and discuss MDA’s benefits and limitations.

Article Title
RedMDStream: Parameterization and Simulation Toolbox for Coarse-Grained Molecular Dynamics Models


Coarse-grained (CG) models in molecular dynamics (MD) are powerful tools to simulate the dynamics of large biomolecular systems on micro- to millisecond timescales. However, the CG model, potential energy terms, and parameters are typically not transferable between different molecules and problems. So parameterizing CG force fields, which is both tedious and time-consuming, is often necessary. We present RedMDStream, a software for developing, testing, and simulating biomolecules with CG MD models. Development includes an automatic procedure for the optimization of potential energy parameters based on metaheuristic methods. As an example we describe the parameterization of a simple CG MD model of an RNA hairpin.

Article Title
A Web Interface for Easy Flexible Protein-Protein Docking with ATTRACT


Protein-protein docking programs can give valuable insights into the structure of protein complexes in the absence of an experimental complex structure. Web interfaces can facilitate the use of docking programs by structural biologists. Here, we present an easy web interface for protein-protein docking with the ATTRACT program. While aimed at nonexpert users, the web interface still covers a considerable range of docking applications. The web interface supports systematic rigid-body protein docking with the ATTRACT coarse-grained force field, as well as various kinds of protein flexibility. The execution of a docking protocol takes up to a few hours on a standard desktop computer.

Article Title
ReaDDyMM: Fast Interacting Particle Reaction-Diffusion Simulations Using Graphical Processing Units


ReaDDy is a modular particle simulation package combining off-lattice reaction kinetics with arbitrary particle interaction forces. Here we present a graphical processing unit implementation of ReaDDy that employs the fast multiplatform molecular dynamics package OpenMM. A speedup of up to two orders of magnitude is demonstrated, giving us access to timescales of multiple seconds on single graphical processing units. This opens up the possibility of simulating cellular signal transduction events while resolving all protein copies.

Article Title
Local Perturbation Analysis: A Computational Tool for Biophysical Reaction-Diffusion Models


Diffusion and interaction of molecular regulators in cells is often modeled using reaction-diffusion partial differential equations. Analysis of such models and exploration of their parameter space is challenging, particularly for systems of high dimensionality. Here, we present a relatively simple and straightforward analysis, the local perturbation analysis, that reveals how parameter variations affect model behavior. This computational tool, which greatly aids exploration of the behavior of a model, exploits a structural feature common to many cellular regulatory systems: regulators are typically either bound to a membrane or freely diffusing in the interior of the cell. Using well-documented, readily available bifurcation software, the local perturbation analysis tracks the approximate early evolution of an arbitrarily large perturbation of a homogeneous steady state. In doing so, it provides a bifurcation diagram that concisely describes various regimes of the model’s behavior, reducing the need for exhaustive simulations to explore parameter space. We explain the method and provide detailed step-by-step guides to its use and application.


  2. Qi Y, Cheng X, Lee J, Vermaas JV, Pogorelov TV, Tajkhorshid E, Park S, Klauda JB, & Im W (2015). CHARMM-GUI HMMM Builder for Membrane Simulations with the Highly Mobile Membrane-Mimetic Model. Biophysical journal, 109 (10), 2012-22 PMID: 26588561
  3. McGibbon RT, Beauchamp KA, Harrigan MP, Klein C, Swails JM, Hernández CX, Schwantes CR, Wang LP, Lane TJ, & Pande VS (2015). MDTraj: A Modern Open Library for the Analysis of Molecular Dynamics Trajectories. Biophysical journal, 109 (8), 1528-32 PMID: 26488642
  4. Ribeiro AA, & Ortiz V (2015). MDN: A Web Portal for Network Analysis of Molecular Dynamics Simulations. Biophysical journal, 109 (6), 1110-6 PMID: 26143656
  5. Hertig S, Goddard TD, Johnson GT, & Ferrin TE (2015). Multidomain Assembler (MDA) Generates Models of Large Multidomain Proteins. Biophysical journal, 108 (9), 2097-102 PMID: 25954868
  6. Leonarski F, & Trylska J (2015). RedMDStream: Parameterization and Simulation Toolbox for Coarse-Grained Molecular Dynamics Models. Biophysical journal, 108 (8), 1843-7 PMID: 25902423
  7. de Vries SJ, Schindler CE, Chauvot de Beauchêne I, & Zacharias M (2015). A web interface for easy flexible protein-protein docking with ATTRACT. Biophysical journal, 108 (3), 462-5 PMID: 25650913
  8. Biedermann J, Ullrich A, Schöneberg J, & Noé F (2015). ReaDDyMM: Fast interacting particle reaction-diffusion simulations using graphical processing units. Biophysical journal, 108 (3), 457-61 PMID: 25650912
  9. Holmes WR, Mata MA, & Edelstein-Keshet L (2015). Local perturbation analysis: a computational tool for biophysical reaction-diffusion models. Biophysical journal, 108 (2), 230-6 PMID: 25606671

We all have neighbors who help us in our hour of need. Some even go out of their way. In enzymes too, it seems, neighbors play a crucial role. Lafond et al. in their recent publication in the Journal of Biological Chemistry report the involvement of neighboring chains of the same enzyme, a lichenase. Apart from stabilizing the quaternary structure (a trimer), the neighboring chains are also involved in the enzymatic activity.

Saccharophagus degradans is a marine bacterium that has been credited with the capacity to degrade diverse polysaccharide substrates. The list includes, but is not limited to, agar, cellulose, chitin, xylan, carboxymethylcellulose, avicel, laminarin, wheat arabinoxylan, glucomannan, lichenan, curdlan, pachyman, and others. Its genome has 19 coding regions for enzymes that belong to the same CAZy family, called GH5.

Enzymes of the GH5 class are predominantly endoglucanases, i.e. they cleave an internal beta-glycosidic bond in the cellulose polymer. They are also characterized by sharing the same protein structural fold, namely the (alpha/beta)8 fold: eight beta strands alternate with helices to form a barrel. The enzyme that Lafond et al. named SdGluc5_26A also belongs to the GH5 family and has the classical (alpha/beta)8 fold. However, they also found a stretch of 38 residues at the N-terminus that seemed interesting. This N-terminus is not floppy, but binds to the active site of the neighboring chain.


Image of SdGluc5_26A made using PyMOL. (PDB id: 5a8n)

In the figure above, the Trp residue (shown as green sticks) specifically binds to the active site of the neighboring chain. See the figure of the trimer below to see how the chains interact. This arrangement gives SdGluc5_26A its lichenase activity. In the parlance of carbohydrate-active enzymes, this Trp binds to the -3 subsite of the active site.

Image made using PyMOL. (PDB id: 5a8n)

Image of SdGluc5_26A trimer made using PyMOL. (PDB id: 5a8n)

So, the next step was to find out what happens to the activity of SdGluc5_26A when this protruding N-terminal sequence is deleted. Upon deletion, SdGluc5_26A behaved as an endo-beta(1,4)-glucanase. In other words, without this N-terminal part the enzyme switched its activity from an exo-acting enzyme (chewing at the ends of the polymer) to an endo-acting one (chewing in the middle).

Given that SdGluc5_26A can act on a variety of substrates, it is only logical to think that this 38-residue stretch plays an important role in substrate specificity. Now, the question is whether allostery or a cooperative mechanism is behind the substrate binding. Something to chew upon! 😉


  1. Lafond M, Sulzenbacher G, Freyd T, Henrissat B, Berrin JG, & Garron ML (2016). The quaternary structure of a glycoside hydrolase dictates specificity towards beta-glucans. The Journal of biological chemistry PMID: 26755730