It is currently possible to define library sets covering billions of virtual products from generally available starting materials and a single synthetic sequence of a few steps. Attempts to evaluate compound properties relying on the starting material structures miss the topology provided by the reaction sequence, while traditional enumeration methods become impractically slow as library size increases. And since reaction sequences themselves are combinatorial, the full scope of the problem is larger by several orders of magnitude.
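To make the scale argument concrete, the following is a minimal arithmetic sketch. The reagent counts and the number of routes are hypothetical figures chosen for illustration, not numbers from this work, and the calculation assumes that every reagent combination reacts and that routes do not share products.

```python
from math import prod

def library_size(reagents_per_step):
    """Number of products of one synthetic sequence: the product of the
    reagent counts per step (assumes every combination is valid)."""
    return prod(reagents_per_step)

# Hypothetical reagent counts for a three-step sequence.
single_route = library_size([2000, 2000, 2000])

# Hypothetical: 100 alternative routes of the same shape, ignoring overlap.
all_routes = 100 * single_route

print(f"{single_route:.1e} products for one route")
print(f"{all_routes:.1e} products across 100 routes")
```

Under these assumptions a single route already yields billions of products, and combining routes adds further orders of magnitude.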
We have developed the Syntheverse – a synthetically accessible part of the compound universe – and a ligand-based virtual screening method to mine this space for similarity. When queried with an active molecule, in a few minutes compounds are suggested which are synthetically accessible via one or more of the existing synthetic routes. Such output provides library design ideas for hit follow-up from screening or lead hopping into novel series. Examples of design, evaluation and generation of virtual hits will be presented.
We will introduce a method which catches the two aforementioned birds (chemical complexity and chemical universe) with one stone: by cleverly searching a fragment space on the fly without the need to enumerate compounds, the computational overhead is kept to a minimum, and thus search times are low (minutes for 10^10 molecules). Secondly, if the fragment space is composed of the chemistry available in-house, the results obtained are much more likely to be synthesizable, as the chemical reaction protocol is automatically delivered together with the hits.
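The idea of searching a fragment space without enumerating its products can be illustrated with a toy sketch. This is not the method presented in the talk: it assumes a two-component space, hypothetical fragment names, and a product score that is simply the sum of precomputed per-fragment similarity contributions. Under those assumptions, a best-first search returns the top-ranked products while visiting only a small part of the space.

```python
import heapq

def top_k_products(scores_a, scores_b, k):
    """Return the k best (score, frag_a, frag_b) combinations without
    enumerating all len(scores_a) * len(scores_b) products.

    scores_a / scores_b map fragment names to per-fragment similarity
    contributions; a product's score is assumed to be their sum.
    """
    a = sorted(scores_a.items(), key=lambda kv: -kv[1])
    b = sorted(scores_b.items(), key=lambda kv: -kv[1])
    if not a or not b:
        return []
    # Max-heap via negated scores, seeded with the best pair.
    heap = [(-(a[0][1] + b[0][1]), 0, 0)]
    seen = {(0, 0)}
    results = []
    while heap and len(results) < k:
        neg, i, j = heapq.heappop(heap)
        results.append((-neg, a[i][0], b[j][0]))
        # Expanding both neighbours keeps the next-best pair in the heap.
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(a) and nj < len(b) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (-(a[ni][1] + b[nj][1]), ni, nj))
    return results

# Hypothetical fragments and similarity contributions.
acids = {"acid_1": 0.5, "acid_2": 0.25, "acid_3": 0.125}
amines = {"amine_1": 0.5, "amine_2": 0.375}
hits = top_k_products(acids, amines, k=3)
```

Each hit carries the names of the fragments it was built from, which mirrors the point that the synthesis route is delivered together with the hit.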
We will show a few validation cases from the industry, and look at the properties of one publicly available fragment space which contains 12 billion molecules.
Hyde is a new scoring function considering the essential interactions in protein-ligand complexes. Hyde consistently describes hydrogen bonds, the hydrophobic effect and desolvation. Its basic principle is a well-balanced assessment of the energetics of desolvation. Unlike most other scoring functions, Hyde is not calibrated against affinity data. Instead, we use octanol/water partition data of small molecules for calibration. Energetically favorable and unfavorable contributions to the binding affinity can be assessed on an atomic level. Energetically unfavorable arrangements can easily be identified. Thus, Hyde provides intuitive guidance for the design of compounds with optimized bioactivity. Combined with a visual front end, Hyde can be used for an interactive optimization of small molecule binders.
It is highly desirable to have a scoring function that provides guidance for the design of compounds with optimized bioactivity. Hyde is such a scoring function. Its basic principle is a balanced assessment of the energetics of desolvation, based on which, energetically favorable and unfavorable contributions to the binding affinity can be assessed on an atomic level.
It has been demonstrated previously that Hyde is able to distinguish between strong binders, weak binders, and non-binders. We have coupled Hyde to an efficient numerical optimizer and integrated it with our graphical user interface for optimum ease of use. Hyde's meaningful atomic contributions can now be visualized. The medicinal chemist simply sees energetically unfavorable arrangements and can interactively try out ideas for altering a given structure to gain potency, as on a virtual workbench.
Lead discovery often starts from small fragment binders for which experimental evidence has been found in an active site. These fragment binders can then be developed into leads with improved affinity by
These tasks can now be accomplished with a novel software tool, LeadIT, which was primarily designed for mixed medicinal and computational chemistry teams. The approach uses an indexed 3D fragment database which is searched interactively. The medicinal chemist can give immediate feedback on the synthetic feasibility of the results, and interesting compounds can be saved and elaborated further.
We will show the basic principles of this approach as well as a few retrospective examples which demonstrate its usefulness.
Many aspects make structure-based virtual screening a difficult and time-consuming task. Compound libraries have to be compiled and preprocessed, ideally customized for each screening campaign; the target protein has to be prepared carefully, analyzing protonation states, co-factors, protein flexibility and interfacial water molecules. Parameters of the docking and scoring calculation have to be adjusted and sometimes pharmacophoric constraints have to be defined. Finally, numerous potential hits have to be analyzed in order to make a final compound selection. Given this complexity, it would be highly desirable to bring structure-based virtual screening much closer to interactive design. This would allow playing with assumptions and revising decisions in the preparatory phase. In this talk, we summarize our efforts towards this goal, presenting novel methods for dataset preprocessing, fast screening, effective scoring, and complex visualization.
In the past two decades of structure-based drug design we have seen several ups and downs of the field. With the first protein crystal structures, rational drug design (mostly based on structure-based de novo design) emerged. It quickly became clear that the prediction of synthetic accessibility and binding affinity are the cornerstones of these methods. Today, we are still lacking reliable prediction methods, resulting in the primary use of statistical learning approaches. Recent discussions on the predictivity of QSAR models and the existence of activity cliffs show that these methods are no black-box tools either. On the other hand, numerous examples of successful modeling applications can be found in the literature, impressively showing the relevance of structure-based modeling today. In our opinion, it is the mixture of computational search power and the medicinal chemist's intuition that makes structure-based modeling successful. In this talk, we will focus on computational tools supporting this interactive design process.
Structure diagrams are the universal language of chemistry. They revolutionized our understanding of chemical structure and influence our daily theoretical work with molecules. In nearly all fields of chemistry, ranging from organic synthesis to toxicology, we generalize from molecular structures to chemical patterns. We often do so by naming functional groups or molecular building blocks. More sophisticated are linear chemical pattern languages like SMARTS. Due to their flexibility and expressive power, they are almost omnipresent today. Although such patterns are hardly human-readable, no graphical representation of chemical patterns has been available so far. Based on structure diagrams, we present such a visualization concept, covering all aspects of chemical pattern languages such as the usage of logic expressions and the description of atomic and bond features, and its implementation in the software tool SMARTSviewer. The resulting visual patterns are easy to read and therefore open up the opportunity for every chemist to work with generalized patterns.
Virtual screening is predominantly used to mine enumerated compound libraries. Such an application is inherently limited to millions of molecules at most, even if massive computer resources are employed. However, chemistry, even if restricted to well-established, robust and straightforward reactions, gives access to compound spaces that are easily several orders of magnitude larger than that. We have taken virtual screening to a whole new dimension by being able to search such truly virtual chemistry spaces based on similarity, without ever fully enumerating all products.
The similarity search in the chemistry space is accomplished by a descriptor-based method. Descriptor-based similarity searches are known to be extremely fast and suitable for high-throughput virtual screening, whereas shape-based methods are considered to be more accurate but significantly slower. We have combined these two approaches to gain the best of both worlds: the speed of the descriptor-based search and the accuracy of the shape matching. Application examples and results of benchmark studies will be presented.
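One common way to combine a fast and a slow similarity measure is a two-stage screen, sketched below. How the two approaches are actually coupled in this work is not stated here, so the pipeline is an assumption. The fingerprints, compound names and the `slow_score` function are hypothetical stand-ins, with `slow_score` playing the role of a shape-matching score.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0

def two_stage_screen(query_fp, library, slow_score, keep=2):
    """Rank the library by a fast descriptor similarity, then rescore only
    the top `keep` candidates with the slower, more accurate function."""
    prefiltered = sorted(
        library, key=lambda name: tanimoto(query_fp, library[name]), reverse=True
    )[:keep]
    return sorted(prefiltered, key=slow_score, reverse=True)

# Hypothetical fingerprints (sets of on-bit indices).
query = {1, 2, 3, 4}
library = {"cpd_a": {1, 2, 3, 4, 5}, "cpd_b": {1, 2, 3}, "cpd_c": {7, 8}}

# Hypothetical stand-in for an expensive shape-based score.
shape_scores = {"cpd_a": 0.6, "cpd_b": 0.9, "cpd_c": 0.1}
ranked = two_stage_screen(query, library, shape_scores.get, keep=2)
```

The expensive function is called only on the `keep` survivors of the prefilter, so its cost does not grow with library size.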
Fragment-based drug design has had a tremendous impact on the discovery of new and promising lead compounds. Building up needles into potent molecular binders in a rational manner is not only a more viable approach to lead discovery; one is also not limited to a single scaffold and thus has more possibilities to reach into areas of the binding pocket that might not have been reached otherwise.
However, there are also targets for which we already know a multitude of binders that occupy different subpockets of the active site. Suppose, for example, that there are four subpockets, A, B, C, and D, and that compound 1 occupies A and B while compound 2 occupies C and D. Wouldn't it be easy to simply take the best of both compounds and combine the parts that sit in the individual subpockets, such that we arrive at compound 3 occupying all four subpockets? Exactly this is done in hybrid design, i.e. merging compounds into new molecular entities that have the wonderful property of being anchored in all subpockets of the parent compounds.
We show a method by which we can introduce only one single, rigid fragment to merge the compounds. This ensures not disturbing the location of the parts binding to the subpockets too much. An indexed 3D fragment library is searched, thus search times are minimal. We also show a few example cases in which this method has been successfully used.
In times of restructurings, more computation has to be accomplished by MedChems themselves. For software development, this means acknowledging strong boundary conditions:
Several years of research have now addressed these issues, and we are now ready to present a collection of tools we think are of high interest to the MedChem personnel.
The software is a monolithic suite that not only supports drag and drop of mol2, sdf, pdb files, but also lets users drag result 2D and 3D graphics into PowerPoint, Word etc. with a single click.
Novel computational tasks included are:
Experience at several big pharma companies has shown that the time saved can be reinvested in core MedChem tasks. We will showcase example workflows highlighting successes and take a glance at the science behind them.