Full Abstract

Full Abstract No. 31

Authors: Maxim Chapovalov (1), Roland L. Dunbrack (2)
Fox Chase Cancer Center; (1) Maxim.Chapovalov@fccc.edu, (2) RL_Dunbrack@fccc.edu

Title: Using Statistical Analysis of Electron Density to Evaluate Protein Side-Chain Conformations and Rotamer Disorder

Representative figure/table:

Full abstract:

The backbone-dependent rotamer library (http://dunbrack.fccc.edu/bbdep) is used in many areas of structure analysis and prediction. A number of side-chain prediction programs, including SCWRL, use the library's backbone-dependent probabilities and dihedral angles. The library is also used in many protein design efforts and in some docking programs that treat side-chain conformational changes in binding. Because the rotamer library is only as good as the data that go into it, we decided to investigate the correlation between side-chain conformations and electron density as calculated from the experimental structure factors in the X-ray experiment.

We obtained a list of 412 high-resolution structures from the PISCES server (http://dunbrack.fccc.edu/pisces), selected from all PDB entries with available structure factors, resolution better than 1.6 Å, and R-factor better than 0.20, such that no two sequences share more than 20% sequence identity. We used the program CNS (Crystallography and NMR System; Brunger et al. (1998), Acta Cryst. D 54:905) to calculate electron densities from the structure factors and the Cartesian coordinates (for phase determination) downloaded from the Protein Data Bank. We calculated the electron density at each side-chain atomic position by integrating the density within a 1.5 Å-radius sphere centered on the atom's location in the PDB file, weighted with a decreasing exponential function. These densities were then normalized by dividing by the average density of the backbone, calculated in the same way.
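
The weighted integration and backbone normalization described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure actually run with CNS: the `density` callable is a hypothetical stand-in for a sampled electron density map, and the weight form exp(-r/decay), the `decay` length, and the grid `step` are all assumed, since the abstract specifies only a 1.5 Å sphere and "a decreasing exponential function".

```python
import math

def weighted_density(density, center, radius=1.5, decay=1.0, step=0.25):
    """Approximate the weighted integral of density(x, y, z) over a sphere.

    Assumptions (not from the abstract): the weight is exp(-r / decay), and
    the integral is a Riemann sum on a cubic grid with spacing `step`
    (all lengths in angstroms).
    """
    cx, cy, cz = center
    n = int(math.ceil(radius / step))
    total = 0.0
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            for k in range(-n, n + 1):
                dx, dy, dz = i * step, j * step, k * step
                r = math.sqrt(dx * dx + dy * dy + dz * dz)
                if r > radius:
                    continue  # grid point lies outside the sphere
                weight = math.exp(-r / decay)
                total += weight * density(cx + dx, cy + dy, cz + dz)
    return total * step ** 3  # multiply by the voxel volume

def normalized_density(density, atom_pos, backbone_positions, **kwargs):
    """Divide an atom's weighted density by the mean backbone value,
    with the backbone atoms evaluated in exactly the same way."""
    backbone = [weighted_density(density, p, **kwargs) for p in backbone_positions]
    mean_backbone = sum(backbone) / len(backbone)
    return weighted_density(density, atom_pos, **kwargs) / mean_backbone
```

Because the side-chain and backbone values are computed identically, any constant factor in the weighting cancels in the ratio; a uniform density therefore gives a normalized value of 1.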

We found that the average density of gamma atoms (atoms CG, SG, OG, OG1, and CG1 in PDB coordinates) varies substantially with the chi1 dihedral angle: it is highest at the staggered positions (60, 180, and 300 degrees) and lowest at the eclipsed positions (0, 120, and 240 degrees) of an sp3-sp3 hybridized covalent bond. This result provides some statistical evidence that so-called non-rotameric side chains have very low density and are more likely to be average positions of two interconverting rotameric conformations.
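
One way to tabulate density against chi1, and to label a side chain as rotameric or non-rotameric, is sketched below. The 30-degree bin width and the 30-degree tolerance around each staggered position are illustrative assumptions; the abstract does not state the cutoffs that were used.

```python
STAGGERED = (60.0, 180.0, 300.0)  # staggered chi1 positions, in degrees

def angular_distance(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def nearest_rotamer(chi1):
    """Return the staggered position closest to chi1."""
    return min(STAGGERED, key=lambda c: angular_distance(chi1, c))

def is_rotameric(chi1, tolerance=30.0):
    """True if chi1 lies within `tolerance` degrees of a staggered position.
    The tolerance is a hypothetical cutoff chosen for illustration."""
    return angular_distance(chi1, nearest_rotamer(chi1)) <= tolerance

def mean_density_by_bin(samples, width=30.0):
    """Average normalized density per chi1 bin.

    `samples` is an iterable of (chi1_degrees, normalized_density) pairs;
    the result maps each bin's lower edge to the mean density in that bin.
    """
    sums = {}
    for chi1, density in samples:
        key = int((chi1 % 360.0) // width) * width
        total, count = sums.get(key, (0.0, 0))
        sums[key] = (total + density, count + 1)
    return {key: total / count for key, (total, count) in sorted(sums.items())}
```

Negative angles are handled by the modulo, so a chi1 of -65 degrees is treated as 295 degrees and assigned to the 300-degree rotamer.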

To investigate side-chain disorder further, we rotated a pseudoatom about chi1 and measured the electron density as a function of the dihedral angle. We found four common scenarios. For most side chains, the crystallographic position agreed well with the maximum in the electron density. For a significant fraction of side chains, the crystallographic position coincided with one electron density peak, usually the maximum, but there was another peak in electron density of nearly the same magnitude. This was especially common for small polar residues such as serine, with one third of serine residues exhibiting this kind of disorder. In a few cases, a non-rotameric side chain was placed roughly between two distinct rotameric peaks in the electron density. More commonly, a non-rotameric side chain was placed within a large smear of density encompassing two rotameric positions.

Using a conservative criterion for disorder, about 9% of side chains demonstrated multiple occupied rotamers at chi1. This is lower than what is seen in NMR experiments through an analysis of alpha-beta coupling constants (West and Smith (1998), J. Mol. Biol. 280:867), which documented a value of 28% in a sample of 10 proteins. However, it is much higher than the roughly 2% documented in the same PDB entries via partial occupancies and multiple coordinates for the XG (gamma) atom. Crystallographic structures are of course less likely to show disorder than NMR structures, due to crystal packing and conditions of temperature, salt, and pH. Crystal packing affects up to 20% of all side chains and up to 60% of surface side chains (Shelenkov and Dunbrack, unpublished).

We used electron density estimates at atomic positions from the PDB to determine whether our side-chain prediction program SCWRL3.0 (Canutescu et al. (2003), Protein Sci., 12:2001) predicts side chains with well-determined density better than those with low density. For the top decile of density, SCWRL3.0 predicts 88% of chi1 angles correctly (within 40° of the X-ray structure), while for the bottom decile of density the figure is 66%. This indicates that some of the error in side-chain prediction is likely due to SCWRL predicting one occupied rotamer while the X-ray coordinates reflect the other occupied rotamer. For whole side chains, RMSD values decrease steadily as electron density increases. This is especially true for the longer side chains. The effect is smallest in the aromatic side chains, as one might expect.
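
The decile comparison can be sketched as below: side chains are sorted by normalized density, split into ten equal groups, and the fraction of chi1 predictions within 40° of the X-ray value is computed for each group. The record layout (density, predicted chi1, X-ray chi1) and the function names are hypothetical; only the 40° tolerance and the use of deciles come from the text.

```python
def chi_error(predicted, observed):
    """Absolute angular difference in degrees, accounting for wraparound."""
    d = abs(predicted - observed) % 360.0
    return min(d, 360.0 - d)

def accuracy_by_decile(records, tolerance=40.0, bins=10):
    """Fraction of correct chi1 predictions per density bin.

    `records` is a list of (density, chi1_predicted, chi1_xray) tuples.
    Bins are ordered from lowest to highest density; an empty bin yields None.
    """
    ordered = sorted(records, key=lambda r: r[0])
    n = len(ordered)
    result = []
    for b in range(bins):
        chunk = ordered[b * n // bins:(b + 1) * n // bins]
        if not chunk:
            result.append(None)
            continue
        hits = sum(1 for _, pred, obs in chunk if chi_error(pred, obs) <= tolerance)
        result.append(hits / len(chunk))
    return result
```

The wraparound in `chi_error` matters here: a prediction of 350° against an observed 20° is an error of 30°, not 330°, and so counts as correct.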

We have developed a system, called Woodchuck, for evaluating side-chain quality with electron density as described above. We have used Woodchuck to identify the top 50% of side chains of each residue type and to build from them a new backbone-dependent rotamer library, using a Bayesian statistical method modified from our published work (Dunbrack and Cohen (1997), Protein Sci., 6:1661). This library is built from 1500 structures with resolution of 1.8 Å or better. As with other criteria for restricting data to the best-determined side chains (Lovell et al. (2000), Proteins, 40:389), dihedral angles in the new library have much lower variances, and strained rotamers become even rarer than in older libraries. Once the library has been validated and tested, we plan to use the identified single- and multiple-conformation side chains as a test set for predicting side-chain conformations and rotameric disorder using Boltzmann calculations within SCWRL. Until now, such methods have been tested against multiple structures of the same protein showing different conformations in different PDB entries.