Journal of Physical Chemistry B, Vol.124, No.16, 3252-3260, 2020
Interplay between Conformational Strain and Intramolecular Interaction in Protein Structures: Which of Them Is Evolutionarily Conserved?
By computing the strain energies of peptide fragments within protein structures, together with their intramolecular interaction energies, we attempt to reveal general biophysical trends behind secondary structure formation in the context of protein evolution. Our "protein basis set" consisted of 1143 representatives of different folds obtained from the curated SCOPe database. For each member of the set, the strain and intramolecular interaction energies were calculated on a "rolling tripeptide" basis, employing the DFT-D3/COSMO-RS method for the former and a QM-calibrated force field (MM) method for the latter. The calculated data, strain and interactions, were correlated with the conservation of amino acid residues in secondary structure elements and with the level of residue burial within the protein three-dimensional structure. This allowed us to formulate several observations concerning fundamental differences between the two main secondary structure motifs: alpha-helices and beta-strands. We have shown that strong interaction is one of the determining characteristics of beta-sheet formation, at least at the level of tripeptides (and likely penta- or heptapeptides, too), and that the beta-strand is the prevailing secondary structure in the strongly interacting regions of protein folds conserved by evolution. On the other hand, low strain was not shown to be an important physicochemical property conserved by evolution, nor does it correlate with the propensity for the alpha-helix or the beta-strand. Finally, although strong interaction is connected to some degree with residue burial, we demonstrate that these two characteristics should rather be regarded as two complementary factors. These findings represent an important contribution to understanding protein folding from first principles, an approach complementary to ongoing efforts to solve the protein folding problem by knowledge-based approaches and machine learning.
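The "rolling tripeptide" scheme mentioned in the abstract can be pictured as a sliding window of three consecutive residues moved along the chain one residue at a time. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the function names `rolling_tripeptides` and `per_residue_average` are hypothetical, the energies passed in are placeholder numbers rather than DFT-D3/COSMO-RS or MM results, and averaging over all windows that cover a residue is one possible way (assumed here, not taken from the paper) of turning per-fragment values into per-residue values that could then be compared with conservation or burial.

```python
from typing import List, Tuple


def rolling_tripeptides(sequence: str, window: int = 3) -> List[Tuple[int, str]]:
    """Return (start_index, fragment) for every overlapping window of the chain."""
    if window < 1:
        raise ValueError("window must be at least 1")
    # A chain of n residues yields n - window + 1 overlapping fragments.
    return [(i, sequence[i:i + window]) for i in range(len(sequence) - window + 1)]


def per_residue_average(values: List[float], n_residues: int, window: int = 3) -> List[float]:
    """Average per-fragment values over all fragments covering each residue.

    ASSUMPTION: this mapping from fragments to residues is illustrative only;
    the abstract does not state how per-residue quantities were derived.
    """
    if len(values) != n_residues - window + 1:
        raise ValueError("number of values does not match the number of windows")
    totals = [0.0] * n_residues
    counts = [0] * n_residues
    for start, value in enumerate(values):
        for j in range(start, start + window):
            totals[j] += value
            counts[j] += 1
    return [t / c for t, c in zip(totals, counts)]


# Hypothetical five-residue chain with placeholder per-fragment energies.
fragments = rolling_tripeptides("ACDEF")
placeholder_energies = [1.0, 2.0, 3.0]  # one made-up value per tripeptide
residue_values = per_residue_average(placeholder_energies, n_residues=5)
```

Terminal residues are covered by a single tripeptide and interior residues by up to three, so any per-residue quantity built this way is smoother in the interior of the chain than at its ends.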