Can we predict small-molecules binding affinities?

In a recent post, Derek Lowe, from “In The Pipeline”, asks his readership “If we could just walk right up and calculate the free energies of binding events reliably, what would you most want such calculations to be able to do for you? What would convince you that they’re actually believable? And how close to you think that we actually are to that?” We tried to briefly answer some of these questions. How close are we to predict small molecules binding free energy?

I admit that I was driven to answer this question by the air of scepticism rising for Derek’s post. The same scepticism modelers often receive from their experimentalists counterparts, which is sometimes appropriate, but at times completely mistaken. Is it justified in this case? I’ll let the readers decide.

There is a wide range solutions for this problem starting from the rigorous but expensive free energy perturbation (FEP) methods, through end-point free energy methods as Linear Response Approximation/Linear Interaction Energy and Molecular Mechanics Poisson–Boltzmann/Generalized Born Surface Area to the less accurate but very fast scoring functions. Let’s review those in reverse order.

Scoring functions

Small molecule docking methods usually use a simplified and fast energy model to try to identify the most stable conformation of the ligand-protein complex. Usually these are based on an empirical force field with a simple solvent model. Reliance on a single bound conformation in effect assumes that this conformation is the only one that is significantly occupied. These scoring functions are perhaps less accurate but provide a good first guess that can than be followed by a more rigorous estimation.

Wang et al. (2004) performed an extensive comparison of 14 different scoring functions using a set of 800 protein-ligand complexes with known binding constants from PDBBind. They found that X-Score, DrugScore, Sybyl::ChemScore, and Cerius2::PLP demonstrated moderate correlations (R = 0.45?0.57) between their binding scores and the experimentally determined binding constants for the entire 800 protein?ligand complexes set. These four scoring functions reproduced the known binding constants of the entire test set with a standard deviation of 1.8?2.0 log units. In a more recent (2009) study in a similar framework the correlation coefficients between the experimental binding constants and the binding scores computed by these scoring functions ranged from 0.545 to 0.644. RosettaLigand’s scoring function had a correlation coefficient of 0.63 with experimental small molecule binding energy across a diverse set of 229 protein- small molecule complexes. Finally, in a study supported by the Scoring Function Consortium (SFC), a collaborative effort with various pharmaceutical companies and the Cambridge Crystallographic Data Center, different regression models were used to train a suite of scoring functions based on a plethora of descriptors. These worked beautifully on the training sets and went as high as 0.72 in 10-fold cross validation. However on an independent test set (N=800) from PDBBind, the correlation was only 0.58.

The conclusion is that scoring functions provide a fast and predictive tool for the task of estimating a ligand binding affinity given a high-resolution crystal structure of the interaction, however there is much room for improvement in terms of accuracy.

Molecular Mechanics Poisson–Boltzmann/Generalized Born Surface Area

These methods use Molecular Dynamics (MD) simulations of the free ligand, free protein, and their complex as a basis for calculating the average potential and solvation energies. The MD runs themselves are typically based on an empirical force field and an explicit solvent model. The resulting snapshots are post-processed and re-scored with either the PBSA or GBSA implicit solvent model. Averaging over each trajectory allows the changes in mean potential and solvation energy to be calculated. The change in configurational entropy is then estimated by obtaining the average entropy change over these snapshots.

These methods provide very good results for specific systems, for instance on a series of 7 compounds bound to avidin MM-PBSA predicted the binding values with a correlation of R^2=0.92. On 22 PfDHFR inhibitors a correlation of R^2=0.93 was achieved (even when using a single minimized structure instead of the MD ensemble), and a similar performance of R^2=0.9 for 18 urokinase ligands. However for other systems such as the p38 MAP Kinase, they were shown to be less adequate. (See also Validation and use of the MM-PBSA approach for drug discovery.)

Linear Interaction Energy / Linear Response Approximation

In the linear interaction energy method two MD simulations are ran with an empirical force field: one for the ligand in solution, and the other for the ligand in the protein binding site. Snapshots saved from the simulations represent Boltzmann ensembles of conformations and are used to compute the Boltzmann-averaged electrostatic and van der Waals interaction energies of the ligand with its environment in the bound and free states. The binding free energy is then estimated as a linear combination of the differences in potentials between the bound and free state over Boltzmann averages.

Good correlation was shown for these types of approaches both with generic weights (usually noted alpha and beta) for the different terms, e.g. on 8 Plm II Inhibitors as well as with fitted weights per system. In a very recent comparison of free energy calculation methods, Singh and Warshel, show an R^2=0.93 correlation of free energy prediction for 22 different protein-ligand complexes. The methods compared in this study, microscopic Linear Response Approximation (LRA/b version) and its variants including the Linear Interaction Energy (LIE) to the more approximated and considerably faster scaled Protein Dipoles Langevin Dipoles (PDLD/S-LRA version).


It would seem that for specific systems LIE/LRA or MM/PB(GB)SA can be the method of choice yielding very accurate results, and is proposed to be used as a second stage for validation of docking predictions, or for de-novo ligand design. There is certainly more work needed to be done in improving scoring functions, but these are ‘good enough’ (and fast enough) for screening large databases of small-molecules.

I hope this answers Derek’s questions and wish him a productive conference.

This post was (very) largely based on the excellent review by Gilson and Zhou:
Gilson, M., & Zhou, H. (2007). Calculation of Protein-Ligand Binding Affinities
Annual Review of Biophysics and Biomolecular Structure, 36 (1), 21-42 DOI: 10.1146/annurev.biophys.36.040306.132550

Be Sociable, Share!

Enjoyed this Post ?

Subscribe by E-mail:

Subscribe in a reader. Follow us on twitter.

  • Darrell

    Great article! I really enjoy these concise summaries of recent research that highlight important general guidelines or rules of thumb for computational biology/chemistry. Keep ’em coming!

  • admin

    Thanks Darrell!

  • Lab Rat

    Interesting article! I think your right that experimental scientists have a tendency to distrust (and underestimate) the ability of models, especially in biology (more than chemistry and physics). We had a modelling team last summer working with my project and I was continually impressed by how much the managed to achieve and how useful their work was.

  • Per

    I think you are grossly exaggerating how well these techniques perform.

    In my experience scoring functions does not normally give any meaningful correlation with experimental activity. They can however to some degree differentiate between actives and inactive compounds.

    Articles describing MM-PBSA and LIE results are, as far as I have seen, exclusively based on small sets of con generic ligands and are therefore of limited value.

Powered by WordPress | Aeros Theme | WordPress Themes
© 2009 Rosetta Design Group LLC