Intrinsically disordered proteins are thought to be fully functional, yet they do not conform to a single conformation, which makes determining their structure by crystallography problematic. Many intrinsically disordered proteins have been studied using NMR methods; however, the question of why proteins are intrinsically disordered is still debated.
When X-ray diffraction data are examined, some residues have no corresponding electron density and are therefore marked as missing residues. These regions are highly mobile and are considered intrinsically disordered. For some proteins, the entire sequence is considered intrinsically disordered.
It is widely accepted that sequence dictates structure, and structure in turn dictates function. So is the “disordered-ness” encoded in the genome and, if so, to what extent? This and related questions led Basile et al. at Stockholm University, Sweden, to delve deeper and narrow the answer down to GC content. Their work has been published in the latest issue of PLoS Computational Biology.
Using computational methods, they analyzed 400 eukaryotic genomes, looking specifically at the so-called orphan genes. They categorized the age of the proteins using the ProteinHistorian tool and compared the old and young proteins. They found that the
…selective pressure to change amino acids in a protein is stronger than the one to change the GC content. At low GC ancient proteins are more disordered than expected for random sequence while at high GC they are less.
The three disorder-promoting amino acids (Ala, Pro, and Gly) are all encoded by codons with high GC content. However,
At high GC the youngest proteins become more disordered and contain less secondary structure elements, while at low GC the reverse is observed. We show that these properties can be explained by changes in amino acid frequencies caused by the different amount of GC in different codons.
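The link between codon GC content and amino acid frequencies can be illustrated with a short sketch. It is not the authors' analysis pipeline; it only assumes the standard genetic code and computes, for each amino acid, the average fraction of G or C bases across its synonymous codons. The names `CODON_TABLE`, `mean_codon_gc`, and `mean_gc` are made up for this illustration.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (first base varies slowest). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def mean_codon_gc(codon_table):
    """Return the mean GC fraction of the codons for each amino acid."""
    gc_counts = defaultdict(list)
    for codon, aa in codon_table.items():
        if aa == "*":  # skip stop codons
            continue
        gc_counts[aa].append(sum(base in "GC" for base in codon))
    # Each codon has three bases, so divide by 3 * number of codons.
    return {aa: sum(counts) / (3 * len(counts)) for aa, counts in gc_counts.items()}


mean_gc = mean_codon_gc(CODON_TABLE)

# List amino acids from the most GC-rich codons to the least.
for aa in sorted(mean_gc, key=mean_gc.get, reverse=True):
    print(f"{aa}: {mean_gc[aa]:.2f}")
```

In this toy calculation Ala, Pro, and Gly come out on top, since all of their codons start with two G or C bases. This is consistent with the idea that a GC-rich sequence will tend to encode more of these residues.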
- Basile, W., Sachenkova, O., Light, S., & Elofsson, A. (2017). High GC content causes orphan proteins to be intrinsically disordered. PLOS Computational Biology, 13(3). DOI: 10.1371/journal.pcbi.1005375