What to do with so many models?

I ran into this issue while working with PDB structures solved using Nuclear Magnetic Resonance (NMR). Which model should I choose among the 10 or 20 deposited? As a general rule of thumb, Model 1 is usually taken for further analysis. Is that rule universal for all NMR structures? Some studies on this topic make it interesting to revisit.

At the outset, having an ensemble to work with is a goldmine of data, for more than one reason. Furnham et al. [1] argue that, just as for NMR, it would be great to have ensembles created for X-ray structures too.

Such ensembles would be especially valuable in structural bioinformatics and rational drug design. For example, they would alter how local environments around residues are calculated and considered; this would have an impact on structural alignments, fold recognition, prediction of protein-protein interactions and docking.

A minimized average structure is obtained by aligning the structures and finding the average position in Cartesian space for each atom across the ensemble. One measure of how closely the minimized average structure relates to the ensemble is the deviation of its phi and psi torsion angles from those of the ensemble; another is the deviation of the chi torsion angles. Analysis of the distributions of these angles found no good correlation between an ensemble of structures and a single structure chosen to represent it [2].
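To make the averaging step concrete, here is a minimal sketch in Python with NumPy. It assumes the models have already been parsed into an array of shape (n_models, n_atoms, 3) with identical atom ordering, and it superposes every model onto the first one using the Kabsch algorithm before averaging. The function names are my own, and the energy minimization that turns a raw average into a "minimized" average is not included. The last helper picks the deposited model nearest the mean, which is one common way of choosing a representative.

```python
import numpy as np

def superpose(mobile, reference):
    """Rigidly fit `mobile` onto `reference` (both n_atoms x 3) with the Kabsch algorithm."""
    mob_center = mobile.mean(axis=0)
    ref_center = reference.mean(axis=0)
    P = mobile - mob_center          # centered mobile coordinates
    Q = reference - ref_center       # centered reference coordinates
    U, _, Vt = np.linalg.svd(P.T @ Q)
    # Guard against an improper rotation (a reflection).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return P @ R.T + ref_center

def average_structure(models):
    """Superpose all models onto the first one and average each atom's position."""
    models = np.asarray(models, dtype=float)
    fitted = np.array([superpose(m, models[0]) for m in models])
    return fitted.mean(axis=0), fitted

def rmsd(a, b):
    """Root mean squared deviation between two already superposed structures."""
    return float(np.sqrt(((a - b) ** 2).sum(axis=1).mean()))

def closest_to_mean(models):
    """Index of the model with the lowest RMSD to the average coordinates."""
    mean_coords, fitted = average_structure(models)
    deviations = [rmsd(m, mean_coords) for m in fitted]
    return int(np.argmin(deviations)), deviations
```

Note that the raw average can have distorted bond lengths and angles where the models disagree, which is why it is normally minimized before use; the closest-to-mean model avoids that problem because it is a real member of the ensemble.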

In considering NMR-derived structures, it is vital to take into account that parts of the structure, usually on the surface of the protein, are not well determined, either through inherent flexibility or through lack of data. In general, for the protein core the result is comparable to a medium-resolution (2.0 to 2.3 Å) crystal structure [3].
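One rough way to see which parts of an ensemble are poorly defined is to compute, for each atom, its root mean squared fluctuation about the mean position across the models. The sketch below assumes the models are already superposed and stored as an array of shape (n_models, n_atoms, 3). The function names and the 1.0 Å cutoff are illustrative assumptions of mine, not values taken from the cited papers.

```python
import numpy as np

def per_atom_rmsf(fitted_models):
    """Per-atom root mean squared fluctuation about the mean position.

    `fitted_models` must already be superposed, shape (n_models, n_atoms, 3).
    """
    coords = np.asarray(fitted_models, dtype=float)
    mean_coords = coords.mean(axis=0)
    sq_dev = ((coords - mean_coords) ** 2).sum(axis=2)  # (n_models, n_atoms)
    return np.sqrt(sq_dev.mean(axis=0))

def poorly_defined_atoms(fitted_models, cutoff=1.0):
    """Indices of atoms whose fluctuation exceeds `cutoff` (an arbitrary example value)."""
    return np.flatnonzero(per_atom_rmsf(fitted_models) > cutoff)
```

A large spread on its own does not say whether a region is genuinely flexible or simply short of restraints, so this is best read alongside the restraint data.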

In the words of Sutcliffe [2]:

if an NMR structure is deposited as an ensemble of structures then it is advisable to study this ensemble as a whole rather than select a single structure to represent it.

Specifically, as described in his paper, one would want to find out whether the number of experimentally derived restraints is low (about 2 to 3) or high (about 15 to 20). This is easily obtained by running PROCHECK-NMR [4] (http://www.ebi.ac.uk/thornton-srv/software/PROCHECK/nmr_manual/). The NMR restraints can be downloaded from PDB: look for “NMR Restraints” under the “Download Files” tab.

Still, if one thinks that an average structure is the best way to go for a particular protein of interest, one could use CARON [5] or CYANA [6]. The latter works better if you have access to the experimental restraints from the authors.

I would love to hear about this from an NMRist, if that’s what they are called! 🙂


  1. Furnham, N., Blundell, T., DePristo, M., & Terwilliger, T. (2006). Is one solution good enough? Nature Structural & Molecular Biology, 13 (3), 184-185 DOI: 10.1038/nsmb0306-184
  2. Sutcliffe, M. (1993). Representing an ensemble of NMR-derived protein structures by a single structure Protein Science, 2 (6), 936-944 DOI: 10.1002/pro.5560020607
  3. MacArthur, M., & Thornton, J. (1993). Conformational analysis of protein structures derived from NMR data Proteins: Structure, Function, and Genetics, 17 (3), 232-251 DOI: 10.1002/prot.340170303
  4. Laskowski, R., Rullmann, J., MacArthur, M., Kaptein, R., & Thornton, J. (1996). AQUA and PROCHECK-NMR: Programs for checking the quality of protein structures solved by NMR Journal of Biomolecular NMR, 8 (4) DOI: 10.1007/BF00228148
  5. Sikic K, & Carugo O (2009). CARON–average RMSD of NMR structure ensembles. Bioinformation, 4 (3), 132-3 PMID: 20198187
  6. Gottstein, D., Kirchner, D., & Güntert, P. (2012). Simultaneous single-structure and bundle representation of protein NMR structures in torsion angle space Journal of Biomolecular NMR, 52 (4), 351-364 DOI: 10.1007/s10858-012-9615-8
  1. Ross said:

    When I present a single model from an ensemble, I use the structure closest to an unbiased mean. I find it by calculating the Root Mean Squared Deviation (RMSD) of the atomic coordinates of each structure with the program UWMN (Hartshorn and Caves, University of York). This can be done over the whole molecule, or over selected residues only if you have flexible regions.


    • ragothamanyennamalli said:

      Hi Ross,
      Can I call you NMRist? 🙂 Thanks for stopping by and for suggesting UWMN. Is this available as a script to run? I couldn’t find it via Google.

  2. Ross said:

    Hi, if you send me an email at

    r.thomson.1 @ research.gla.ac.uk (without spaces), I can send you the program. It runs on Linux.

