It is highly desirable to have a scoring function that provides guidance for the design of compounds with optimized bioactivity. HYDE, as implemented in the SeeSAR software package, is such a scoring function. Its basic principle is a balanced assessment of the energetics of desolvation. To optimize the signal-to-noise ratio, three major factors are taken into consideration:
Based on these, energetically favorable and unfavorable contributions to the binding affinity can be assessed at the atomic level. The HYdration and DEsolvation terms are determined using octanol/water partition coefficients of small molecules. We do not calibrate on affinity data or in any other way! HYDE is therefore generally applicable to all protein targets. It reflects the Gibbs free energy of binding while considering only the essential interactions of protein-ligand complexes.
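To illustrate how a partition coefficient carries energetic information, the textbook relation ΔG = -RT ln(10) · logP converts an octanol/water logP value into a free energy of transfer from water to octanol. The following minimal sketch shows only this standard thermodynamic relation; it is not the HYDE formula, and the function name and the default temperature of 298.15 K are assumptions made for illustration.

```python
import math

# Molar gas constant in kJ/(mol*K).
GAS_CONSTANT_KJ = 8.314462618e-3


def transfer_free_energy(log_p, temperature_k=298.15):
    """Free energy (kJ/mol) of transfer from water to octanol.

    Uses dG = -R * T * ln(10) * logP. A positive logP (compound prefers
    octanol) gives a negative, i.e. favorable, transfer energy.
    Hypothetical helper for illustration; not part of HYDE or SeeSAR.
    """
    return -math.log(10) * GAS_CONSTANT_KJ * temperature_k * log_p


if __name__ == "__main__":
    # Illustrative logP values only, not measured data.
    for log_p in (-1.0, 0.0, 2.0):
        energy = transfer_free_energy(log_p)
        print(f"logP = {log_p:+.1f}  ->  dG_transfer = {energy:+.2f} kJ/mol")
```

At room temperature, one logP unit corresponds to roughly 5.7 kJ/mol, which gives a sense of the energy scale that partition data can resolve.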
HYDE successfully selects the correct binding mode in 93% of complexes in re-docking calculations on the Astex diverse set. In virtual screening experiments using the DUD dataset, it also showed significant enrichment, with a mean AUC of 0.77 across all protein targets with few or no structural defects. As part of these studies, we also carried out a very detailed analysis of the data that revealed interesting pitfalls. On the PDBbind 2007 coreset, HYDE achieves a correlation coefficient of 0.62 between the experimental binding constants and the predicted binding energy, the best performance on this dataset among all well-established scoring functions that have not been trained on this data. Furthermore, it has been demonstrated that HYDE is able to distinguish between strong binders, weak binders, and non-binders in congeneric compound series. The earlier absence of repulsion and strain terms made HYDE not entirely applicable to conformationally strained or clashing poses; these terms are now considered in an optimization phase prior to the actual score assessment.
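For readers less familiar with the two metrics quoted above, the sketch below shows how an AUC and a correlation coefficient can be computed from scores. It rests on several assumptions: lower (more negative) scores are taken to mean stronger predicted binding, the correlation is taken to be Pearson's r (the text does not say which type was used), and the function names and toy numbers are invented for illustration and are not results from the studies.

```python
import math


def roc_auc(active_scores, decoy_scores):
    """Probability that a random active is scored better than a random decoy.

    Assumes lower scores are better; ties count as half a win.
    """
    wins = 0.0
    for active in active_scores:
        for decoy in decoy_scores:
            if active < decoy:
                wins += 1.0
            elif active == decoy:
                wins += 0.5
    return wins / (len(active_scores) * len(decoy_scores))


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy scores in arbitrary units, purely illustrative.
    actives = [-30.0, -25.0, -10.0]
    decoys = [-20.0, -5.0, -1.0]
    print(f"AUC = {roc_auc(actives, decoys):.2f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of actives from decoys.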
The greatest benefit of HYDE is that it yields a very intuitive atom-based score, which can be mapped onto the ligand and protein atoms. This allows direct visualization of the score and consequently facilitates analysis of protein-ligand complexes during the lead optimization process. The user can immediately identify energetically unfavorable arrangements, such as an H-bonding group without a counterpart in an otherwise hydrophobic pocket. Medicinal chemists will immediately have ideas on how to alter a given structure in order to gain activity. The interface allows these changes to be tried out interactively, as on a virtual workbench.
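The idea of an atom-based score can be sketched in a few lines: per-atom contributions add up to a total, and atoms with unfavorable contributions are singled out for inspection. This is only a sketch under stated assumptions. The `AtomScore` type, the function names, the sign convention (negative means favorable), and the example values are all hypothetical and do not represent the SeeSAR API or real HYDE output.

```python
from dataclasses import dataclass


@dataclass
class AtomScore:
    atom_name: str
    contribution: float  # kJ/mol; negative = favorable (assumed convention)


def total_score(atoms):
    """Sum of all per-atom contributions."""
    return sum(atom.contribution for atom in atoms)


def unfavorable_atoms(atoms, threshold=0.0):
    """Atoms whose contribution exceeds the threshold, worst first."""
    flagged = [atom for atom in atoms if atom.contribution > threshold]
    return sorted(flagged, key=lambda atom: atom.contribution, reverse=True)


if __name__ == "__main__":
    # Hypothetical per-atom values, for illustration only.
    ligand = [
        AtomScore("C1", -2.5),
        AtomScore("O7", 3.0),   # e.g. an H-bonding group without a counterpart
        AtomScore("N3", -4.0),
        AtomScore("C9", 0.5),
    ]
    print("total:", total_score(ligand))
    for atom in unfavorable_atoms(ligand):
        print("unfavorable:", atom.atom_name, atom.contribution)
```

Sorting the flagged atoms by contribution mirrors the way a color-coded view draws attention to the worst offenders first.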
One of the great advantages of HYDE is that it is based solely on physicochemical properties and agrees very well with energy estimates that have been cited in the standard literature for decades, without being calibrated or adjusted on experimental affinity data. HYDE is therefore a generally applicable scoring function that works well in a range of different scenarios.