key: cord-0059735-9zundawc
authors: Temml, Veronika; Kutil, Zsofia
title: Structure-based molecular modeling in SAR analysis and lead optimization
date: 2021-03-04
journal: Comput Struct Biotechnol J
DOI: 10.1016/j.csbj.2021.02.018
sha: 5ba7cffe792df277a99484cafff88e82d69c7f61
doc_id: 59735
cord_uid: 9zundawc

In silico methods like molecular docking and pharmacophore modeling are established strategies in lead identification. Their successful application for finding new active molecules for a target is reported by a plethora of studies. However, once a potential lead is identified, lead optimization, with the focus on improving potency, selectivity, or pharmacokinetic parameters of a parent compound, is a much more complex task. Even though in silico molecular modeling methods could save considerable time and cost by rationally filtering synthetic optimization options, they are employed less widely in this stage of research. In this review, we highlight studies that have successfully used computer-aided SAR analysis in lead optimization and showcase sound methodology and easily accessible in silico tools for this purpose.

In this review, we focus on the utilization of molecular modeling techniques, primarily molecular docking but also pharmacophore modeling and molecular dynamics (MD) simulations, in the hit-to-lead optimization process.

To this end, we have surveyed the research papers deposited in Web of Science using the query phrases (1) docking "lead optimization" or "lead identification and optimization" (420 results) and (2) pharmacophore "lead optimization" or "lead identification and optimization" (156 results). Subsequently, we have excluded the publications (a) containing only in silico work not verified by appropriate in vitro methods, (b) using in silico methods only to explain observed results rather than as part of the optimization process, or (c) lacking the optimization process. The final selection of 48 publications is presented in this review together with comments on various aspects of molecular modeling techniques that have to be considered, especially when they are used in the optimization process. The objective of this review is to showcase sound methodology and to guide readers through this fragmented topic to the relevant sources of information, like specialized reviews and verified databases. We believe that, given the low ratio of publications successfully utilizing in silico methods in the lead optimization process, such a guidepost is highly useful for multiple audiences. This review is by no means meant to be an exhaustive overview of all publications using in silico methods in the optimization process. While we do not aim at a detailed analysis of the docking and pharmacophore modeling processes, we provide a basic overview of the theory and recent developments, pointing the reader to several excellent review articles that have been written on this topic.

The ground rules of medicinal chemistry have remained mostly unchallenged since the 1970s, when Topliss published his famous scheme for optimizing aromatic substitution patterns for maximal bioactivity [1]. From the very beginning, this kind of systematic optimization has been accompanied by computational analyses, like the Hansch and Free-Wilson analyses, that aim to quantify the influence specific substitutions have on a defined bioactivity [2]. Over the last decades, a vast array of computational methods for medicinal chemistry purposes has become available. Many novel activities are found in pharmacophore-based virtual screening campaigns or by other 2D and 3D similarity-based methods [3]. Nearly every publication describing a novel activity also proposes a binding mechanism predicted in a docking simulation, often refined by follow-up MD simulations that provide more details on ligand binding.

Still, the role of computational methods becomes smaller over the course of drug development. However helpful, all of these methods are also a double-edged sword. On one hand, they can save a lot of resources and unnecessary work; on the other hand, they can tempt the user to oversimplify challenges by looking only at the scope of the simulation and not beyond it. Furthermore, we always have to consider that methods based on already known ligands and binding patterns might prevent us from finding uncommon (and often also highly innovative) ligands. It is, therefore, key to employ a clean methodology that includes as much known data as possible and does not exclude "inconvenient" data points, e.g. atypical molecules like natural products. In summary, molecular modeling methods have established a supporting role for themselves during hit discovery [4, 5]; still, they also have a lot to offer during the hit-to-lead optimization phase when used by an educated user.

Molecular docking has been used to predict binding conformations and interactions between ligands and protein binding sites for over 40 years, starting with Levinthal, who aimed to simulate the arrangement of sickle cell hemoglobin molecules (Hb-S) in the tubular fibers of sickle cells [6]. Since then, a plethora of molecular docking programs have been written and applied, primarily to predict binding modes of bioactive molecules. Molecular docking programs for protein-ligand docking are defined by three core functions: (1) the computational representation of protein and ligand; (2) the docking algorithm, which is used to solve the optimization problem of fitting the ligand into the protein binding pocket and generating possible binding poses; and (3) the scoring function (SF), which aims to quantify the quality of the calculated binding poses. All of these components have to balance accuracy and calculation time and therefore work with simplified representations of the physical reality of protein-ligand binding. In this section, we would like to introduce those concepts, but we also refer the readers to the specialized reviews listed in Table 1 for more information.

Table 1. Selected reviews focused on the docking algorithms and SFs.
Reference  Short description
Docking algorithms
[18]  Describes the main concepts behind biologically inspired algorithms applied to molecular docking simulations.
[19]  Reviews the implementation of genetic algorithms in drug design and quantitative structure-activity relationship.
[20]  Describes several molecular docking search algorithms, and the programs which apply such methodologies.
[21]  Reviews key aspects relevant to the field of molecular docking from forces important in molecular recognition up to the library design.
[22]  Briefly covers some of the applications of genetic algorithms in the field of drug design.
SFs
[23]  Analyzes machine-learning SFs for drug lead optimization in the 2015-2019 period.
[24]  Summarizes the progress of traditional machine learning-based SFs in the last few years and provides insights into recently developed deep learning-based SFs.
[17]  Reviews basic types of SF based on an up-to-date classification scheme together with suitable application areas and shortcomings, challenges, and potential future study directions.
[25]  Highlights the impact of using QM methods to investigate the docking of ligands to protein targets by using QM-computed parameters to either develop or to be used directly as SFs.
[26]  Describes the authors' work on developing the PDBbind database and the CASF benchmark together with relevant work done by other researchers.

V. Temml and Z. Kutil, Computational and Structural Biotechnology Journal 19 (2021) 1431-1444

Due to the size and complexity of biopolymers, a full representation of a complete protein with all atomic coordinates is usually not feasible for high-throughput simulations. The structural data, acquired from X-ray crystallography or, lately, cryogenic electron microscopy, is therefore reduced and simplified for docking simulations. A segment of the protein that contains the investigated binding pocket is selected and prepared. In the most basic approaches, the protein is simply seen as a surface, which is enhanced by electrostatic properties [7]. Such approaches are still in use for protein-protein docking, where a more complex representation is not feasible [8]. In protein-ligand docking, the selected binding site is represented in more detail. In the oldest docking methods, both the protein and the ligands were regarded as rigid (rigid docking). These approaches were quickly replaced by flexible docking methods, which treat the protein mostly as rigid but optimize the conformations of the ligand to fit the binding pocket. In more advanced methods, the receptor also has a degree of flexibility, allowing e.g. individual amino acids to assume different orientations [9, 10]. Full receptor flexibility is mostly taken into account in docking refinement with MD simulations.

Flexible docking algorithms can be roughly divided into energy-based methods and stochastic methods. Energy-based methods aim to represent the binding free energy as a function of the binding geometry. Stochastic methods work by randomly changing the translational, rotational, and torsional degrees of freedom of a molecule [11]. The poses resulting from these changes are then evaluated by the SF and kept if they represent an improvement and discarded if they do not. To avoid the algorithm finishing in a local energy minimum, worse poses are accepted on a random basis (e.g. with a Monte Carlo algorithm [12]). One of the most widely applied groups of docking algorithms are genetic algorithms (employed by software programs like GOLD and AutoDock). The principle of genetic algorithms is shown in Fig. 1.
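The stochastic search idea described above (random changes, keeping improvements, and occasionally accepting worse poses to escape local minima) can be illustrated with a minimal Metropolis Monte Carlo sketch. Everything here is an assumption for illustration: the "pose" is a single number, `toy_score` is a hypothetical one-dimensional energy landscape rather than a real SF, and the step size and temperature are arbitrary. Real docking programs perturb translations, rotations, and torsions and use far more elaborate scoring.

```python
import math
import random


def toy_score(pose):
    """Hypothetical 1D 'energy': global minimum near x = -2.3, local minimum near x = 2.2."""
    return 0.1 * pose ** 4 - pose ** 2 + 0.3 * pose


def metropolis_accept(delta, temperature, rng):
    """Always accept improvements; accept worse poses with probability exp(-delta / T)."""
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / temperature)


def monte_carlo_search(start, steps=5000, step_size=0.5, temperature=0.5, seed=0):
    """Random-walk search that remembers the best-scoring pose seen so far."""
    rng = random.Random(seed)
    current, current_score = start, toy_score(start)
    best, best_score = current, current_score
    for _ in range(steps):
        # Random perturbation of the (toy) degree of freedom.
        candidate = current + rng.uniform(-step_size, step_size)
        candidate_score = toy_score(candidate)
        if metropolis_accept(candidate_score - current_score, temperature, rng):
            current, current_score = candidate, candidate_score
            if current_score < best_score:
                best, best_score = current, current_score
    return best, best_score


if __name__ == "__main__":
    # Start in the basin of the local minimum and let the search explore.
    print(monte_carlo_search(start=2.0))
```

The temperature controls how readily worse poses are accepted: a very low value turns the search into a purely greedy descent that stays in the starting basin, while a higher value trades precision for broader exploration.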

To quantify the quality of binding poses, docking programs use SFs. SFs can be categorized into physics-based and empirical SFs, although many hybrids have since been developed. Physics-based SFs consist of a sum of energy terms from a force field (e.g. the Merck or AMBER force field) that represent interaction energies of the protein-ligand complex, internal ligand energy, and sometimes also solvation models [15]. Empirical SFs are based on regression analysis of structural descriptors and experimentally determined affinity data of known protein-ligand complexes [16]. Scoring is a hugely controversial topic in molecular docking, and a wide variety of SFs and post-processing rescoring functions have been developed [17].
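To make the notion of a physics-based SF more concrete, the following minimal sketch sums a Lennard-Jones and a Coulomb term over all ligand-protein atom pairs. It is an illustration under stated assumptions, not the functional form of any particular program: a single well depth `epsilon` and optimal distance `r_min` are used for every atom pair, the dielectric constant is an arbitrary choice, and internal ligand energy and solvation are omitted. Atoms are given as hypothetical `(x, y, z, partial_charge)` tuples with coordinates in Å.

```python
import math

# Converts e^2 / Å to kcal/mol (commonly used value in molecular mechanics).
COULOMB_CONSTANT = 332.06


def lennard_jones(r, epsilon, r_min):
    """12-6 van der Waals term; equals -epsilon at r == r_min."""
    ratio = r_min / r
    return epsilon * (ratio ** 12 - 2.0 * ratio ** 6)


def coulomb(r, q1, q2, dielectric=4.0):
    """Electrostatic term with a constant dielectric (an assumed simplification)."""
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * r)


def pair_score(ligand_atoms, protein_atoms, epsilon=0.2, r_min=3.8):
    """Sum of pairwise interaction energies; lower (more negative) is better."""
    total = 0.0
    for lx, ly, lz, lq in ligand_atoms:
        for px, py, pz, pq in protein_atoms:
            r = math.dist((lx, ly, lz), (px, py, pz))
            total += lennard_jones(r, epsilon, r_min) + coulomb(r, lq, pq)
    return total


if __name__ == "__main__":
    ligand = [(0.0, 0.0, 0.0, 0.3)]
    pocket = [(3.8, 0.0, 0.0, -0.3), (0.0, 4.5, 0.0, 0.0)]
    print(round(pair_score(ligand, pocket), 3))
```

An empirical SF would instead fit the weights of such terms (or of other structural descriptors) to measured affinities by regression.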

Beyond the theoretical background of molecular docking, we would like to stress that while docking is a valuable tool to find out how a ligand molecule binds to a target protein, it can also lead an inexperienced user to over-interpret the results. A docking program will almost always propose one or several possible binding poses, and it is the user who needs to judge the quality of the result. Unfortunately, many studies are published in which docking is used with very low quality standards. Often the biological experiments do not necessarily translate to binding affinity to the specified target, docking scores are presented as "activities", and only a few studies fully include their workflow validation. Haphazard simulations are almost certainly meaningless for lead structure optimization. It is therefore key to rely on high-quality data and extensive validation when constructing a fine-tuned docking workflow that should be able to predict the activity of structural modifications in a quantitative manner. The optimal docking workflow is simplified in Fig. 2 and discussed in the next section.

Fig. 1. The docking calculation process starts with the evaluation of the initial population, derived from a starting conformation. Then a mutation factor that ensures variation in the genes is introduced. The next step is crossbreeding: the genes are combined, creating a new (offspring) population of chromosomes. This process is iterated until (a) the fitness level is satisfactory, (b) the population has converged and does not produce offspring significantly different from the previous generation, or (c) the fixed number of iterations is reached [13, 14].
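The genetic-algorithm loop outlined for Fig. 1 (evaluate a population, mutate, crossbreed, iterate until a stop criterion is met) can be sketched in a few lines. This is a toy illustration, not the implementation used by GOLD or AutoDock: a chromosome is assumed to be a list of torsion angles, the hypothetical `fitness` simply measures the distance to a known optimum instead of calling an SF, and only stop criteria (a) and (c) are implemented.

```python
import random


def fitness(chromosome, target):
    """Toy fitness: 0 is optimal, more negative is worse (stand-in for an SF)."""
    return -sum((gene - t) ** 2 for gene, t in zip(chromosome, target))


def mutate(chromosome, rng, rate=0.3, scale=5.0):
    """Perturb each gene with probability `rate` to keep variation in the population."""
    return [g + rng.gauss(0.0, scale) if rng.random() < rate else g for g in chromosome]


def crossover(parent_a, parent_b, rng):
    """Single-point crossover; requires chromosomes with at least two genes."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]


def genetic_search(target, pop_size=40, generations=200, tolerance=1e-2, seed=1):
    rng = random.Random(seed)
    population = [[rng.uniform(-180.0, 180.0) for _ in target] for _ in range(pop_size)]
    for _ in range(generations):                    # (c) fixed number of iterations
        ranked = sorted(population, key=lambda c: fitness(c, target), reverse=True)
        best = ranked[0]
        if -fitness(best, target) < tolerance:      # (a) satisfactory fitness level
            break
        parents = ranked[: pop_size // 2]           # the fitter half may reproduce
        offspring = [best]                          # elitism: the best pose survives
        while len(offspring) < pop_size:
            a, b = rng.sample(parents, 2)
            offspring.append(crossover(mutate(a, rng), mutate(b, rng), rng))
        population = offspring
    best = max(population, key=lambda c: fitness(c, target))
    return best, fitness(best, target)


if __name__ == "__main__":
    torsions, score = genetic_search(target=[60.0, -120.0, 30.0])
    print([round(t, 1) for t in torsions], round(score, 3))
```

Because the best chromosome is always carried over unchanged, the best fitness can never decrease from one generation to the next; a convergence test on the population, criterion (b), could be added by tracking how much this value changes over recent generations.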

2.1.2. Practical background - data and software selection, workflow validation, and recent advances in the field

The main resource for any publicly available structural data is the Protein Data Bank (https://www.rcsb.org/), a constantly updated database of all published protein structures [27], invaluable for molecular modeling purposes [28]. All structural data available on the target should be carefully examined, since the crystallized protein often differs from the wildtype and the researcher needs to judge if a modified structure is still relevant for their study (e.g. if the investigated binding site is conserved in the mutant structure). If the resolution of the crystal structure is sufficient for molecular modeling (ideally below 2.5 Å), the electron density map should be checked to identify flexible or otherwise badly resolved parts of the protein. Many flexible proteins assume different binding conformations upon binding to different small-molecule ligands. If the investigated scaffold is known beforehand, it is beneficial to select a structure whose co-crystallized ligand is similar to the investigated scaffold. Next, to build an optimal docking workflow, it is helpful to have a high-quality test set of known ligands and non-binding compounds to determine if the workflow can discriminate between them. It must be stressed that, aside from the docking software, the quality of the data used for the modeling and the way they are processed (e.g. protein preparation and conformer generation) is the most important factor [29].

Various studies were performed to benchmark the docking software and their SFs (see Table 2; reviewed in [17, 30, 31]). Nevertheless, due to the number of programs and the variable nature of the targets and ligands, it is almost impossible to recommend a single software or method. As shown in Table 2, programs that did not perform well in one study were well rated in studies focused on other targets or ligands (e.g. AutoDock Vina). Therefore, we suggest following the procedures in the benchmarking studies and using multiple docking programs during validation to determine the one that yields the best results for a particular research question. We are aware that the availability of docking software often plays a role in its selection, and we have specified it in Tables 4 and 6. Still, users should keep in mind that not all commercial and open-source molecular modeling software is fully compatible.

The minimum validation of a docking workflow is the redocking of the co-crystallized ligand into the empty binding pocket of the protein. The calculated pose is then compared to the bioactive conformation from the crystal structure and the root-mean-square deviation (RMSD) is calculated. Typically, RMSD values below 2 Å are considered acceptable, although depending on the size and flexibility of the ligand and the quality of the crystal structures, researchers might set a more ambitious benchmark [37]. If multiple structures for a protein are available, they can be used for cross-docking validation. A co-crystallized ligand is docked to another protein structure and the poses are compared to the co-crystallized complex to see if the orientation and interactions in the binding site can be reproduced correctly [38] (Fig. 3).
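The redocking check described above boils down to a heavy-atom RMSD between the docked pose and the crystallographic pose. A minimal sketch follows, assuming that both poses are given as lists of `(x, y, z)` coordinates in Å, in the same atom order and in the same protein frame, so that no superposition is applied. Symmetry-equivalent atoms (e.g. in a flipped phenyl ring) are not handled here, which dedicated tools usually correct for; the function names and the example coordinates are hypothetical.

```python
import math


def rmsd(pose, reference):
    """Root-mean-square deviation between two equally ordered coordinate lists (in Å)."""
    if not pose or len(pose) != len(reference):
        raise ValueError("poses must be non-empty and contain the same number of atoms")
    squared_sum = sum(math.dist(a, b) ** 2 for a, b in zip(pose, reference))
    return math.sqrt(squared_sum / len(pose))


def redocking_acceptable(pose, reference, threshold=2.0):
    """Apply the commonly used 2 Å criterion; stricter thresholds can be passed in."""
    return rmsd(pose, reference) < threshold


if __name__ == "__main__":
    crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
    docked = [(x + 1.0, y, z) for x, y, z in crystal]  # pose shifted by 1 Å along x
    print(rmsd(docked, crystal), redocking_acceptable(docked, crystal))
```

For cross-docking, the same comparison applies once the two protein structures have been aligned on their binding sites, so that ligand coordinates are expressed in a common frame.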

Ideally, the workflow is also able to enrich active over inactive compounds in a theoretical validation with a test set of known actives and inactives from the literature or standardized benchmarking datasets [40]. In this case, the user is assuming that bioactivity correlates directly with binding affinity. While this is true for specialized methods like radioligand binding assays, factors other than direct ligand binding may play a role, to varying degrees, in other assay types. It is therefore critical to evaluate how many other factors influence the outcome of the assay used and whether activity in the assay translates quantitatively to binding affinity.

Table 3. Selected publications dealing with the advances in docking, recent research, and/or specialized reviews.
Reference  Short description
[50]  Examines a representative set of currently used computational approaches to identify repurposable drugs for COVID-19.
[51]  Introduces a public-private partnership that has been established worldwide and then describes the background, frame, activities, and uniqueness of this partnership.
[52]  Reviews the experience with Ebola and Zika virus drug development, its implications for SARS-CoV-2 drug discovery, gaps in the field, and computational approaches applied.
[53]  Provides a review of the available computational methods that employ water molecules for the analysis of macromolecules' properties and structure dynamics.
[54]  Research article; application of Gaussian Boson Samplers for prediction of accurate molecular docking configurations.
[55]  Intends to provide readers with guidance for practically applying MM/PBSA and MM/GBSA in drug design and related research fields.
[56]  An opinion of a panel of scientists from the industry who work at the interface of machine learning and pharma on the past, present, and future role of AI for ADME/Tox in drug discovery and development.
[31]  Covers the current field of in silico docking to nucleic acids, available programs, as well as challenges faced in the field.
[57]  Introduces in silico approaches and tools that have been developed to predict drug metabolism and fate, and assesses their potential to facilitate the virtual discovery of promising drug candidates.
[58]  Introduces the three-dimensional matched molecular pairs concept and discusses the successful applications.
[59]  Discusses the development and application of strategies for the structure-based design of cancer-targeting peptides against GRP78.
[60]  Outlines the evolution of decoy selection in benchmarking databases as well as current benchmarking databases that tend to minimize the introduction of biases, and proposes recommendations for the selection and the design of benchmarking datasets.
[61]  Focuses on the specifics of docking calculations with covalent ligands.
[62]  Discusses applications and practical aspects of MD simulations with mixed solvents.
[63]  Provides an overview of protein-peptide docking methods and outlines their capabilities, limitations, and applications in structure-based drug design.
[64]  Examines the successes, limitations, and new avenues for modeling metalloenzyme inhibitors and metallodrugs.
[65]  Gives a historical account of the development of ensemble docking and discusses some pertinent methodological advances in conformational sampling.
[66]  Presents an overview of the evolution of structure-based drug discovery techniques in the study of the ligand-target recognition phenomenon, going from static molecular docking toward enhanced MD strategies.
[67]  Summarizes progress in the prediction of RNA-ligand interactions, available methods for calculating the mode of binding for various small RNA molecules, the range of SFs for ligand ranking, accommodating RNA flexibility, and example studies.
[68]  Provides an overview of the state of the art of experimental and computational approaches for investigating drug metabolism.

The validity of the workflow is represented by its ability to rank active over inactive compounds, which can be represented in the receiver operating characteristic (ROC) curve. All compounds found in a test set are ranked according to their score. Each true active is counted on the vertical scale, while each false positive is counted on the horizontal scale, until all compounds are included and the curve reaches the top right corner of the plot. Every ROC curve above the diagonal is an improvement over random selection. The integral of the ROC curve is the area under the curve (AUC), which additionally contains information about the "early enrichment" in the dataset if the actives are ranked higher than inactives [41].

When docking is used for structure optimization, all experimental data that relate to the binding mode, e.g. from mutational studies or protein fishing experiments, should be gathered. It should be stressed that, in contrast to other in silico modeling applications, e.g. environmental toxicity assessment studies where in vitro evaluations are not desired [42-44], in silico lead optimization stands and falls with the combination of those methods.
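The ROC construction described above (rank by score, step up for each active, step right for each inactive, integrate the curve) can be sketched as follows. The sketch assumes that lower docking scores are better, which is a common but not universal convention and can be switched with `higher_is_better`; tied scores are not treated specially, and the example scores and labels are invented for illustration.

```python
def roc_points(scores, is_active, higher_is_better=False):
    """Return (false-positive rate, true-positive rate) points of the ROC curve."""
    ranked = sorted(zip(scores, is_active), key=lambda p: p[0], reverse=higher_is_better)
    n_actives = sum(1 for _, active in ranked if active)
    n_inactives = len(ranked) - n_actives
    if n_actives == 0 or n_inactives == 0:
        raise ValueError("the test set needs at least one active and one inactive")
    true_pos = false_pos = 0
    points = [(0.0, 0.0)]
    for _, active in ranked:
        if active:
            true_pos += 1      # step up on the vertical axis
        else:
            false_pos += 1     # step right on the horizontal axis
        points.append((false_pos / n_inactives, true_pos / n_actives))
    return points


def area_under_curve(points):
    """Trapezoidal integral of the ROC curve; 0.5 corresponds to random ranking."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


if __name__ == "__main__":
    docking_scores = [-10.0, -9.0, -8.0, -7.0, -6.0, -5.0]   # hypothetical values
    actives = [True, True, False, True, False, False]
    print(round(area_under_curve(roc_points(docking_scores, actives)), 3))
```

In the example, one inactive compound is ranked above one active, so the AUC falls slightly short of the ideal value of 1.0.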

If docking is used in the context of a lead structure optimization, the demands on the SF become a lot more challenging. Instead of just discriminating between active and inactive, a quantitative relationship between the docking score and measured bioactivity should be achieved. Enyedy et al. showed that automated SFs have a hard time outperforming the correlation of simple physicochemical properties such as molecular weight or clogP [45]. Šinko conducted a study where 68 PDB crystal structures of complexes between acetylcholinesterase (AChE) and its ligands were evaluated by different SFs (LigScore1, LigScore2, PLP1, PLP2, Jain, PMF, and PMF04) to see if they could establish a quantitative correlation to their activity. The best results were achieved by the PLP2 function with a coefficient of determination (r²) of 0.591. However, physicochemical parameters, like the number of heavy atoms or the number of sp²-hybridized atoms, also performed with an r² above 0.5 [46].
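The comparison described above, an SF versus a trivial descriptor such as the heavy-atom count, can be reproduced on one's own data with a few lines. The sketch below computes r² for a simple linear fit as the squared Pearson correlation; all numbers are invented placeholders and do not come from the cited studies.

```python
def r_squared(x, y):
    """Coefficient of determination of a simple linear fit (squared Pearson correlation)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long series with at least two points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("a constant series has no defined correlation")
    return s_xy ** 2 / (s_xx * s_yy)


if __name__ == "__main__":
    # Hypothetical data for six analogues: measured pIC50, docking score, heavy atoms.
    pic50 = [5.1, 5.8, 6.3, 6.9, 7.4, 8.0]
    docking_score = [-6.2, -7.5, -6.9, -8.1, -8.0, -9.3]
    heavy_atoms = [21, 24, 25, 28, 30, 33]
    print("SF vs pIC50:         ", round(r_squared(docking_score, pic50), 3))
    print("heavy atoms vs pIC50:", round(r_squared(heavy_atoms, pic50), 3))
```

If a size descriptor explains the measured activities as well as the docking score does, the score is likely tracking molecular size rather than specific interactions, which is the caveat raised by the studies cited above.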

Alternatively, it can be useful to identify key interactions that are vital for the activity of a specific scaffold [47]. This can lead to empirically customized SFs, which are especially useful in guiding the optimization process [48]. An exemplary theoretical study by Levoin et al., conducted as early as 2008, aimed to construct docking-based virtual screening workflows to select high-affinity histamine H3 receptor ligands while also excluding activity on the hERG channel and CYP2D6, two prominent antitargets. For their chosen activity thresholds, they achieved an accuracy of at least 70% with these methods [49].

The advances in the field and the popularity of docking have led to novel applications in specific areas of drug development. We would like to mention, e.g., the docking of ligands to nucleic acids and the docking of specific ligands like peptides, but due to the broad range of these developments, it is beyond the scope of this review to discuss them in detail. Nevertheless, we have selected 20 representative specialized reviews and research publications that deal with various directions of recent developments, and we list them in Table 3 to make this information easier for the reader to find. In addition, we hope that the following "Example" section will guide less advanced docking users in the right direction.

Table 4. The selected studies using molecular docking in successful lead optimization.
1 The possible drug disposition in pathophysiological conditions are taken from referenced studies. 2 In IC50, Ki or KD. 3 Data missing. 4 Reported improvement. 5 Thermal-shift assay. 6 DNA polymerase stop assay. 7 Improvement in metabolic stability. 8 Commercial software. 9 Academic. 10 Freeware.

Molecular docking has been successfully used in drug screening campaigns, but its ability to help in lead optimization is still generally questioned. If we plan to use it for this purpose, we have to realize that it is not a stand-alone technique but should be embedded in a workflow of different in silico as well as experimental techniques. In Table 4, we have summarized the studies we found that successfully integrated docking into the optimization process. These studies share a good research methodology, where molecular docking was connected with in vitro verification and led to optimized compounds. The major reasons why we have excluded the remaining publications are: (1) the authors used docking just to elucidate the mechanism of action of their optimized compounds or performed retrospective molecular modeling on previously optimized compounds (~13% of the query results); (2) the authors only proposed the optimization by in silico methods but did not perform it (~3%); (3) the authors did not use appropriate methodology to validate their docking results (e.g. validation of molecular docking in animals or in models that are too far removed from the direct target-ligand binding interaction; ~1%); (4) it was not clear how the authors implemented docking in the optimization process (~1%); (5) the authors identified the hit but did not develop it further (~1%); (6) the docking approaches did not lead to a more potent compound (~1%). Those reasons mainly point towards user-based issues in the implementation of docking in the optimization process, and we hope that the sound methodology discussed in the following section will help to improve the reputation of molecular docking in hit-to-lead optimization.

When we aim to optimize a specific activity, it is vital to be familiar with the exact molecular mechanism of action. For example, Granchi et al. employed molecular docking to optimize a reversible monoacylglycerol lipase (MAGL) inhibitor [69].
Most known MAGL inhibitors act via an unfavorable irreversible mechanism of action; it was therefore vital in this case to retain the reversibility of inhibition. A docking simulation was conducted and the resulting poses were clustered to identify the dominant pose orientations and interaction patterns. This process was used to select poses that underwent an MD simulation to check their reliability. Synthesizing the structures that fared best in the computer simulations led to a compound (17b) with a tenfold improved Ki value compared to the lead compound. Additionally, aware of the fact that this class of compounds could act as artifacts and promiscuous bioactive molecules, the authors carried out a number of control and verification experiments, including Pan Assay Interference Compounds (PAINS) analysis. The acronym PAINS was first used by Baell and Holloway, who described a number of substructural features that can help to identify false-positive compounds and included them in publicly available filter-it software [70]. The electronic filters formulated to recognize PAINS can process hundreds or thousands of compounds in seconds and are a very useful tool in current medicinal chemistry. Still, even the authors of this concept later wrote: "It has become increasingly clear that overzealous or simplistic use of these filters may inappropriately exclude a useful compound from consideration and inappropriately tag a useless compound as worthy of development" [71]. We thus highly recommend including this type of analysis in the optimization process while keeping its limitations in mind. It might be tempting to employ such filter tools at the beginning of a virtual screening campaign to exclude all potentially problematic compounds from the start. This can, however, lead to overlooking interesting hit structures.

Another crucial aspect of optimization by docking is the interpretation of the interaction pattern with the protein. It should be noted that docking primarily describes reversible inhibition without covalent binding. There are, however, some specialized approaches (see [61]) where docking is used to predict covalent (irreversible) binding. This is usually accomplished by anticipating the reaction that takes place and predicting if the reaction partners will be in close enough vicinity to perform the reaction. Non-covalent binding is usually driven by electrostatic interactions, either by ionic interactions or by weaker polar interactions such as hydrogen bonds. These bonds do not only occur directly between the ligand and the protein but are also often mediated by water molecules. Different interaction points often vary in their relevance for the overall activity. While interactions with the catalytic or substrate-binding residues are most likely to play an important role, it is also key to systematically analyze all interactions in multiple crystal structure complexes. A handy freeware tool for conducting interaction analysis is the protein-ligand interaction profiler (PLIP), a web server that generates lists of interacting residues for a list of given PDB entries (available at www.projects.biotec.tu-dresden.de/plip-web) [72].

If possible, compounds that do not possess a suspected key interaction point should also be synthesized and tested to verify the relevance of the interaction by loss of activity. Such proof of principle was shown in several cited studies. We would like to highlight a 2015 study by Stornaiuolo et al., where a structure-based docking approach was used to develop the first direct activator of BCL-2-associated X protein (BAX), a pro-apoptotic member of the B-cell lymphoma-2 family [73]. The authors of this study carefully prepared a design strategy by the analysis of their target and its small [...] of the synthesized compounds were measured and the whole SAR was rationalized. The resulting lead compound was further evaluated in cellular assays and in vivo in mouse models, showing efficacy in tumor mass reduction together with the absence of gross toxicity.

When moving further along in the drug development process, the effects on the whole organism become more and more relevant. A marvelously low IC50 on an isolated target is worth nothing if the drug can never reach the intended target. ADMET properties should, therefore, be considered early on in the drug development process to avoid inauspicious surprises in in vivo trials. There are already many online prediction tools from a wide variety of sources available to profile compounds for their drug-likeness and their ability to be orally absorbed. Online predictors like SwissADME (http://www.swissadme.ch/) [74] give a good idea about oral bioavailability, the ability to penetrate the blood-brain barrier, and the likelihood of a compound binding to metabolic enzymes and some anti-targets. BioTransformer, an open-source prediction tool for drug metabolism, was presented at the beginning of 2019. It uses a knowledge- and machine learning-based approach to predict putative metabolites of an input molecule under different conditions (e.g. gut microbiota and water/soil microbiota) [75]. An excellent resource to stay on top of novel tools and databases that become available is http://www.click2drug.org/, a comprehensive list of computer-aided drug design (CADD) software.

Table 5. Selected publications dealing with the pharmacophore modeling basics and advances.
Reference  Short description
[115]  Research article; introduces a target-specific drug design method based on a deep learning algorithm and a water pharmacophore that can autonomously generate a series of target-favorable compounds.
[116]  Research article; describes the BioChemical Library update and shows (among other things) how the authors' models can be decomposed into human-interpretable pharmacophore maps to aid in hit/lead optimization.
[117]  Provides a brief introduction to the pharmacophore modeling concept and presents examples of applications in the specific field of natural product chemistry.
[112]  Overview of the basic pharmacophore modeling concept together with recent developments in the field.
[118]  Research article; presents a new approach that incorporates flexibility based on extensive MD simulations of protein-ligand complexes into structure-based pharmacophore modeling and virtual screening.
[119]  Discusses foundations and caveats of scaffold hopping approaches and analyzes recent methodological developments.
[120]  Introduces three of the fertile directions in approaching the biological activity by chemical structural causes: (i) the special computing trace of the algebraic structure-activity relationship, (ii) the minimal topological difference (MTD), and (iii) comparative molecular similarity indices analysis.
[121]  Different approaches to the generation of pharmacophore models are compared together with their strengths and weaknesses.
[122]  Reviews pharmacophore techniques that are used for modeling ADME properties (with a special focus on pharmacophore models reported for various cytochrome P450 enzymes).
[123]  Reviews pharmacophore modeling studies focused on antitargets.
[124]  Describes the specifics in the construction of structure-based pharmacophores including (i) protein structure preparation, (ii) binding site detection, (iii) pharmacophore feature definition, and (iv) pharmacophore feature selection.
[125]  Focuses on the synergistic combination of pharmacophore modeling with other molecular modeling approaches such as the hot spot analysis of protein binding sites, MD, and docking.
[126]  A brief overview of structure-activity relationship methods in ligand-based drug design followed by a more detailed presentation of issues and limitations associated with empirical energy functions and conformational sampling methods.
[127]  Insight into different approaches implemented by the 3D pharmacophore modeling packages like Catalyst, MOE, Phase, and LigandScout.
[128]  Describes computational models for key nuclear hormone receptor binding sites generated by a ligand-based approach.

Nevertheless, even docking itself can be utilized in the improvement of ADMET properties. In a study published in 2017 that focused on the optimization of bis-amide derivatives as CSF1R inhibitors, the authors employed a docking workflow in GLIDE to evaluate possible replacements for a metabolically labile and poorly permeable methyl piperazine group in the lead compound. Compounds with good chances of retaining activity were synthesized and tested. Replacement of the methyl piperazine group and further optimization led to more stable compounds and oral bioavailability in a mouse model [76]. Another example of a successful docking application in improving ADMET properties can be found in [77]. The lead phosphoinositide 3-kinase (PI3K) inhibitor was metabolically unstable because of rapid glucuronidation of the phenol moiety. Based on the X-ray structure of PI3K, the authors concentrated their SAR on the interactions with Asp836 and Lys833, but the results of this approach were not successful. The subsequent virtual docking experiments took into consideration not only Asp841 and Tyr867, but also Asp836 and Lys833. Based on this docking simulation, the authors identified aminopyrimidine as a bioisostere of phenol and designed compounds showing PI3K inhibitory activity comparable to the lead compound and greatly improved metabolic stability. The final compound showed strong tumor growth inhibition in a KPL-4 breast cancer xenograft model in vivo.

Table 6 The selected studies using pharmacophore modeling and other approaches in successful lead optimization.

Docking can be valuable when investigating multi-target inhibitors since it allows us to compare the binding mode within two different proteins that often share binding site similarities. Here, we would like to discuss one successful and one unsuccessful example of this application.
Giustiniano et al. used docking-based virtual screening to find dual inhibitors of the murine double minute (MDM) 2 and 4 homologs [78]. Inhibitors that target MDM2 selectively induce the upregulation of MDM4. Targeting both proteins, therefore, increases the chance of an effective clinical response. Visual inspection of the results of docking to MDM2, performed with the Glide docking software, led the authors to the synthesis of a compound effective against both MDM2 and MDM4. The lead structure was then optimized in an in silico guided process to retain both activities, which led to a nanomolar inhibitor of both targets. The binding mode predicted for the series was then confirmed with NMR spectroscopy.

However, docking showed its limitations in a study performed by Cheung et al. The authors performed docking experiments for the binding sites of the two target enzymes 5-lipoxygenase (5-LO) and microsomal prostaglandin E synthase (mPGES-1) to gain more insights into their SAR. The docking was performed with GOLD and the docking poses were analyzed by taking into account three factors: the presence of key hydrogen bonds, the docking score, and the similarity of the resulting poses. Nevertheless, the detailed analysis of the results revealed that the docking simulations underscore the similarities between the two binding modes, explaining why so many of the derivatives showed activity against both 5-LO and mPGES-1 [48].

In-depth in silico analysis was used by a research team led by Artem Cherkasov to explore various binding sites on the androgen receptor (AR). The AR contains a ligand-binding domain, which serves as a binding site for both androgens and other small molecule ligands. A particularly dangerous mutation in prostate cancer, however, produces AR-V7, a mutant that lacks the ligand-binding domain and therefore cannot be controlled by the common androgen deprivation therapy [79]. Inhibitors of the AR targeting alternative binding sites are therefore of great interest for the treatment of prostate cancer. The group used in silico methods to identify two alternative binding sites on the DNA binding domain of the AR [80]. Docking was used to identify inhibitors of the D-BOX dimerization interface of the AR DNA binding domain [81].

The last study that we would like to showcase is a textbook example of docking applications in lead discovery and lead optimization. Lyu et al. used a large-scale docking approach to evaluate 170 million make-on-demand compounds that can be formed by 130 popular chemical reactions against AmpC β-lactamase (AmpC) and the D4 dopamine receptor [82]. They identified a phenolate inhibitor of AmpC, which revealed a group of inhibitors without known precedent, and optimized this compound, lowering the Ki 29-fold. Against the D4 dopamine receptor, 81 new chemotypes were discovered, 30 of which showed submicromolar activity, including a 180-pM subtype-selective agonist of the D4 dopamine receptor. This study was executed under rare conditions that also allowed the researchers to address several key questions of molecular docking. The authors not only verified their hits and leads in vitro (over 500 molecules), but they also co-crystallized selected compounds with AmpC and confirmed their fidelity to the docking predictions. This large and verified compound data set allowed them to evaluate SFs and to take a closer look at the performance of the human versus the computer. The compounds were distributed into bins covering the highest-ranking, mid-ranking, and low-ranking scores. The hit rates followed the score after a plateau defined by the highest-ranking molecules. The hit rates of sets selected by docking score alone and by human visual evaluation were comparable (at around 24%). Nevertheless, the molecules prioritized by human inspection typically had better affinities and included a disproportionate number of the most potent compounds. This study shows that it is possible to achieve a quantitative correlation between score and experimental activity (at least for the D4 receptor) and that molecular docking is extremely helpful in drug discovery, but it still confirms the need for human expert input.
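The relationship between docking score and hit rate described above can be explored on any set of experimentally tested compounds by binning them by score. The following is a minimal sketch of such an analysis and not the procedure used in [82]; the function name, the bin edges, and the toy scores are invented for illustration, and lower scores are assumed to be better.

```python
from collections import defaultdict


def hit_rate_by_score_bin(compounds, bin_edges):
    """Return the experimental hit rate per docking-score bin.

    compounds: iterable of (docking_score, is_active) pairs.
    bin_edges: ascending score boundaries; bin i covers [edges[i], edges[i+1]).
    Bins that contain no tested compound are omitted from the result.
    """
    tested = defaultdict(int)
    hits = defaultdict(int)
    for score, is_active in compounds:
        for i in range(len(bin_edges) - 1):
            if bin_edges[i] <= score < bin_edges[i + 1]:
                tested[i] += 1
                hits[i] += int(is_active)
                break
    return {
        (bin_edges[i], bin_edges[i + 1]): hits[i] / tested[i]
        for i in sorted(tested)
    }


# Invented toy data: (docking score, active in the assay?)
toy_compounds = [
    (-65.0, True), (-62.0, False), (-61.0, True), (-60.5, False),
    (-55.0, True), (-52.0, False), (-51.0, False), (-50.5, False),
    (-45.0, False), (-42.0, False),
]
rates = hit_rate_by_score_bin(toy_compounds, [-70.0, -60.0, -50.0, -40.0])
for score_range, rate in rates.items():
    print(score_range, f"{rate:.2f}")
```

A hit rate that falls monotonically from the best-scoring bin to the worst-scoring bin, as in this toy set, is the kind of pattern that suggests the score carries information about activity for the target at hand.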

Pharmacophore modeling has found a place in drug discovery, especially for fast and effective screening for new bioactive molecules [108]. A pharmacophore is defined by IUPAC as "an ensemble of steric and electronic features that is necessary to ensure the optimal supramolecular interactions with a specific biological target and to trigger (or block) its biological response" [109]. These abstracted three-dimensional interaction patterns can be used to examine key interaction points that define ligand binding and to determine which parts of a molecule can be modified while retaining or even improving ligand binding. Established pharmacophore models for a target binding site can then be compared to the pharmacophores of conformational libraries in a virtual screening that requires comparatively little calculation time [110]. Since a pharmacophore only defines an interaction as a general interaction type, e.g. as a hydrophobic area or a hydrogen bond acceptor, a pharmacophore screening can also perform scaffold hopping and propose different functional groups for bioisosteric replacement [111].

There are two general approaches to generate pharmacophore models: (i) the structure-based, built on an experimentally elucidated protein-ligand complex, and (ii) the ligand-based approach, where known ligands are superimposed, to elucidate common structural elements and pharmacophore features.

A variety of molecular modeling software packages are available for pharmacophore modeling. They use different approaches to four key aspects of pharmacophore modeling: calculation of the pharmacophore and feature definitions, conformational sampling, screening algorithm, and SF. Several pharmacophore modeling software programs are reviewed in depth in [112]. They are mainly distinguished by the available types of features and how they are placed. Some features like ionic interactions or hydrogen bonds are available in all programs and their definition is rather straightforward based on the physical properties of these bonds. Many programs also contain more specialized features, e.g. metal binding features for different metals, or even allow the user to define their own features in accordance with their requirements [110, 113].

The screening algorithm then compares the query pharmacophore to the pharmacophores of molecules within a conformational library. Similar models in different software programs can find different hit molecules within the same database, due to differences in the algorithm [114]. Finally, the fit of a compound to the pharmacophore model is quantified by an SF (often called pharmacophore fit value). Since the fit into these simplified models is a more clearly defined task than the fit to an actual binding pocket, SFs are not as controversial as they are in molecular docking. Recent developments aim to combine pharmacophore modeling with MD simulations, creating dynamic pharmacophores that account for the dynamic process of protein/ligand binding [112]. For more information about the basics of pharmacophore modeling and the recent developments in the field, please see the publications listed in Table 5.
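To make the idea of comparing a query pharmacophore with the features of a conformer more concrete, the following sketch represents each pharmacophore as a list of typed 3D points and tests whether the query can be mapped onto a conformer so that all inter-feature distances agree within a tolerance. This is a deliberately simplified illustration under our own assumptions (feature labels, coordinates, and the 1.0 Å tolerance are invented); it is not the algorithm of any of the software packages mentioned above, and a pure distance comparison cannot distinguish mirror-image arrangements.

```python
from itertools import permutations
from math import dist


def matches(query, candidate, tol=1.0):
    """Check whether a conformer satisfies a pharmacophore query.

    query, candidate: lists of (feature_type, (x, y, z)) tuples.
    Returns True if every query feature can be assigned to a distinct
    candidate feature of the same type such that all pairwise
    inter-feature distances differ by at most `tol` (in Angstrom).
    Brute-force search; only suitable for small feature sets.
    """
    n = len(query)
    for assignment in permutations(range(len(candidate)), n):
        # Feature types must agree for every assigned pair.
        if any(query[i][0] != candidate[j][0] for i, j in enumerate(assignment)):
            continue
        # All pairwise distances must agree within the tolerance.
        if all(
            abs(
                dist(query[a][1], query[b][1])
                - dist(candidate[assignment[a]][1], candidate[assignment[b]][1])
            ) <= tol
            for a in range(n)
            for b in range(a + 1, n)
        ):
            return True
    return False


# Invented example: hydrogen bond acceptor (HBA), donor (HBD), hydrophobic (H).
query = [("HBA", (0.0, 0.0, 0.0)), ("HBD", (3.0, 0.0, 0.0)), ("H", (0.0, 4.0, 0.0))]
conformer_hit = [
    ("H", (1.0, 5.0, 1.0)), ("HBA", (1.0, 1.0, 1.0)),
    ("HBD", (4.0, 1.0, 1.0)), ("PI", (9.0, 9.0, 9.0)),
]
conformer_miss = [("HBA", (0.0, 0.0, 0.0)), ("HBD", (3.0, 0.0, 0.0)), ("H", (0.0, 9.0, 0.0))]

print(matches(query, conformer_hit))   # same geometry, translated, one extra feature
print(matches(query, conformer_miss))  # hydrophobic feature too far away
```

Real screening tools additionally handle feature directionality, exclusion volumes, and partial matches, and they summarize the quality of the overlay in a fit value rather than a yes/no answer.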

Constructing a pharmacophore model requires in-depth knowledge of the target protein. Many proteins have a variety of different binding sites, and even for a single binding site there are usually multiple binding modes, since different ligands interact with different parts of the pocket (Fig. 4A and B). Subtypes of proteins often contain very small differences that can determine the selectivity of a pharmacophore, e.g. in the case of the Src kinase family, a hydrophobic pocket is specific for the lymphocyte-specific kinase Lck. If a pharmacophore model is constructed to reflect that, it contains a hydrophobic feature within that pocket (Fig. 4A). A pharmacophore based on a non-selective inhibitor does not contain a feature there and can therefore also find molecules that bind to other kinases that do not have the side pocket (Fig. 4B and C) [129, 130].

Fig. 4 (caption, beginning not recovered): ... This kinase lacks the binding pocket near the ATP-binding cleft seen in A and B. The inhibitors share a common binding mode in the ATP-binding cleft, but PP2 binds selectively by making additional contacts in a deep, hydrophobic pocket present in the Src kinase family [129, 130]. This phenomenon is visualized with the software PyMOL (left side of the panel; [39]), but even better reflected by pharmacophore models created with the software LigandScout (presented on the right side; [110]), which show the feature important for the selective binding. The yellow spheres represent hydrophobic features, the blue star is a positive ionizable feature, the green arrow is a hydrogen bond donor feature, and the red arrow marks a hydrogen bond acceptor feature. These pharmacophore models can serve for the search of new Src kinase family selective inhibitors. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Since one pharmacophore model can only represent a single binding mode, it is often necessary to create multiple models for a target to cover its active space [131]. Usually, automatically generated models from a single crystallographic complex need to be further refined by using a test set of known binding compounds. As for docking, it is vital to carefully curate the dataset and to use only experimental data that show direct ligand/target interactions, undisturbed by interactions with other targets or assay components, to arrive at a high-quality model [132]. The test set allows us to calculate several typical quality parameters that can be used to compare individual models and select the best one. The enrichment factor (Ef) shows how much a model increases the yield of actives compared to random selection. The Ef is the ratio between the number of true positives (tp) divided by all found virtual hits (vh) and the number of actives (A) divided by the total number of compounds in the database (N) [133]:

Ef = (tp/vh)/(A/N)

Model quality can also be represented in a ROC curve. In this case, the compounds are ranked by the fit value, and all compounds that are not found by the model are counted with a fit value of zero and represented by a line to the upper right corner of the plot. In a final experimental validation step, the model is sometimes used to screen a large database of available compounds. The best hits are then biologically evaluated, and the gained activity data can in turn be used to refine the model even further [134].
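Both quality parameters are easy to compute once a test set has been screened. The sketch below implements the enrichment factor exactly as defined above and estimates the area under the ROC curve from fit values; the function names and the toy numbers are our own illustrative assumptions. Compounds that the model does not retrieve receive a fit value of zero, and ties are counted as half, which corresponds to the straight line to the upper right corner of the ROC plot.

```python
from math import isclose


def enrichment_factor(tp, vh, A, N):
    """Ef = (tp/vh) / (A/N), using the symbols defined in the text."""
    return (tp / vh) / (A / N)


def roc_auc(fit_values, labels):
    """Area under the ROC curve from fit values (higher = better fit).

    labels: 1/True for known actives, 0/False for inactives or decoys.
    Computed as the fraction of (active, inactive) pairs in which the
    active has the higher fit value; tied pairs count as one half.
    """
    actives = [f for f, lab in zip(fit_values, labels) if lab]
    inactives = [f for f, lab in zip(fit_values, labels) if not lab]
    wins = sum((a > i) + 0.5 * (a == i) for a in actives for i in inactives)
    return wins / (len(actives) * len(inactives))


# Invented toy screen: 1000 compounds, 20 actives, 50 virtual hits, 10 true positives.
ef = enrichment_factor(tp=10, vh=50, A=20, N=1000)
print(f"Ef = {ef:.1f}")

# Invented toy ranking: three actives followed by three inactives;
# a fit value of 0.0 means "not retrieved by the model".
fits = [0.9, 0.8, 0.0, 0.7, 0.0, 0.0]
labels = [1, 1, 1, 0, 0, 0]
print(f"AUC = {roc_auc(fits, labels):.3f}")
```

In the toy screen the model retrieves actives ten times more frequently than random selection would. The Ef depends on the fraction of actives in the database, so values are only comparable between models evaluated on the same test set.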

The application of pharmacophore modeling in lead optimization is often about identifying new interaction points that might increase the potency of an inhibitor. For example, two pharmacophore models of similar inhibitors bound to the same binding site of glutamate carboxypeptidase II [135] help to explain why a single modification that introduces a new hydrogen bond acceptor feature increases potency (Fig. 5). In an optimization process, this new feature can then be used to select molecules with a better chance of high binding affinity.

De novo design is sometimes also guided by pharmacophore models, which are used to evaluate synthesis proposals. To this end, computational structure generators aim to cover the chemically possible space and to generate structures that are synthetically feasible with reasonable effort [136]. Instead of using databases with known and available molecules, databases generated by structure generators can be used to discover novel chemical entities. The program PhDD combines these methods by generating structures that fit a specific pharmacophore [137].

Pharmacophores have also been used to design multi-target ligands by combining known pharmacophores of two targets, sometimes with computational methods [138], sometimes simply by creating chimeric molecules [138, 139]. More examples of the application of pharmacophore modeling in hit-to-lead optimization can be found in the following section.

Pharmacophore-based thinking is a common strategy in lead optimization, even if it is not always supported by computational methods (see Figs. 4 and 5). Many medicinal chemists simply divide a molecule into different areas and functionalities and then modify them by replacing functional groups with moieties of similar functionality. Even though pharmacophore-based in silico modeling is not used in lead optimization as often as docking, several high-quality publications using pharmacophore models could be found. The reasons why we excluded most of the retrieved studies are similar to those in the case of docking: mostly, the authors (1) used pharmacophore modeling only to elucidate the mechanism of action of already optimized compounds, (2) only proposed the optimization but did not perform it, (3) did not use appropriate methodology to validate their modeling results, etc. These reasons again point towards user-based issues and not towards limitations of pharmacophore modeling itself. The following section shows the successful application of pharmacophore modeling in the hit-to-lead optimization process.

This strategy was, for example, used to design a ligand for botulinum neurotoxin serotype A light chain (BoNT/A LC), a zinc metalloprotease that cleaves SNARE (soluble N-ethylmaleimide-sensitive fusion protein attachment protein receptor) proteins. Proteolysis of SNARE proteins inhibits the exocytosis of acetylcholine into neuromuscular junctions and results in life-threatening flaccid paralysis [140]. Burnett et al. focused on the pharmacophore-based identification and development of non-Zn(II)-coordinating, non-peptidic small molecule inhibitors of BoNT/A LC [141]. The authors decided to use a pharmacophore modeling strategy because they had been unable to generate an X-ray co-crystal of their lead candidate in complex with BoNT/A LC, and synthetic chemistry guided by structure-based molecular docking studies had failed to provide a derivative with increased potency [142]. Due to the absence of reliable binding site models, they used three-dimensional search queries derived from their gas-phase pharmacophore for BoNT/A LC inhibition. This model defined separation between the overlaps of several different, non-zinc(II)-coordinating small molecule chemotypes and resulted in a new, more potent structural hybrid possessing a Ki = 600 nM [141].

In contrast, Fu et al. successfully integrated in silico pharmacophore modeling, MD simulations, and co-crystallization studies [143]. They focused on the development of inhibitors targeting poly(ADP-ribose) polymerase-1 (PARP1). PARP1 is an enzyme involved in the self-repair of cellular DNA damage, and its inhibitors are used for breast cancer therapy. In this study, the authors screened the DrugBank and ZINC databases (in total 40 215 compounds) via docking and successfully co-crystallized one of the hit compounds. To explore how to modify and further optimize their hit compound, they constructed a structure-based pharmacophore model. This model was based on the co-crystal structure of their hit compound as well as ten other co-crystal structures of PARP inhibitors from the literature. Using structure-based pharmacophore features, they designed and synthesized new derivatives of their hit and tested their PARP1 inhibition activities. The first series of compounds had negligible effects on PARP1 activity, which led the authors to re-evaluate the X-ray crystal structure and the pharmacophore model. The second round of synthesis led to several compounds with significantly enhanced activity and finally to a compound with a novel chemical scaffold, a unique binding interaction with the PARP1 protein, and an almost 20-fold lower IC50 in comparison to the parent compound. The binding mode of this compound was investigated with 10-ns MD simulations.

Crystal structures provide a "frozen" picture of a ligand-protein complex, a system that is, in fact, dynamic in its biological environment. This raises the question of how ligand-protein interactions change in the dynamic state. MD simulations aim to answer this question and are often combined with other methods to analyze changes in the binding pattern. MD simulations are frequently used to refine docking poses and elucidate binding affinity patterns [144], as also shown by Fu et al. for PARP1 [143], but there are also examples of their combination with pharmacophore modeling. Wieder et al. investigated the differences between pharmacophore models built from the original protein-ligand complexes obtained from the PDB and pharmacophore models built on the final structures of an MD simulation [118, 145]. They showed that the resulting models differ in feature number and type from those built on the original complexes. Furthermore, some of them displayed a better enrichment of active compounds than the original models. This shows that a combination of these techniques could lead to more successful models in the future.

Another study successfully integrating several in silico methods is a project by Shan and Zheng conducted in 2009 [146]. The authors aimed at the optimization of an inhibitor of a dishevelled PDZ domain, a potential cancer therapeutic target [147]. To this end, they prepared a pharmacophore model from their previously identified hit compound and two non-binders. The pharmacophore derived from the hit complex structure was examined, and essential components were selected based on the differences between their hit and two similar compounds that do not bind to the PDZ domain. Using this model, they screened the ChemDiv database with an algorithm that combines similarity search and docking procedures. The virtual hits were analyzed with pharmacophore modeling, and the 15 selected virtual hits were examined with NMR spectroscopy for their binding affinity towards the dishevelled PDZ domain. All tested compounds showed improvement over the original compound, the best of them with a 30-fold improvement in KD.

In a series of studies, Schuster and Vuorinen built and then refined pharmacophore models of 11β-hydroxysteroid dehydrogenases (11β-HSD) 1 and 2. The 11β-hydroxysteroid dehydrogenases are enzymes regulating the intracellular availability of glucocorticoids and the activation of glucocorticoid receptors, and their inhibition has considerable therapeutic potential for glucocorticoid-associated diseases [148] [149] [150]. While the aim of these studies was not lead optimization, they illustrate how a pharmacophore-based virtual screening approach can be refined with experimental data to optimally predict activity and selectivity for a specific target. At the beginning of Schuster's and Vuorinen's modeling studies, no X-ray crystal structure of 11β-HSD 1 was available. Accordingly, they employed ligand-based pharmacophore models as virtual screening tools for the identification of novel classes of 11β-HSD inhibitors [151]. First models with the ability to identify 11β-HSD inhibitors were generated and experimentally validated. The results of the validation run and more experimental data from virtual hits of the models were then used to evaluate and refine them. The final models had a higher hit rate and better selectivity among 11β-HSD subtypes [134, 151, 152].

As shown by the variety of success stories published in the scientific literature, molecular modeling techniques have the ability to support the drug development process at various stages. It is, however, vital not to oversimplify the complex problems faced in drug development. The meaningful application of computational models requires careful curation of datasets, extensive model validation, and thorough analysis and contextualization of experimental results.

Even more than in other CADD applications, SAR optimization requires high quality standards, which we would like to sum up in the following questions as a take-home message:

What is my target? Even though it seems redundant to stress this, it is vital to thoroughly familiarize oneself with the available literature on the target structure. Many studies suffer from poor background reading which often leads to mistakes that could have easily been avoided or the repetition of experiments that have already been conducted by other groups. Different binding sites, binding modes, and structural changes undergone by the protein upon binding to different ligands are usually well described in the publications related to the available structural data and can help to select the right workflow for a lead optimization problem.

How do the biological test systems work? The test systems that were and are used to gather activity data should be thoroughly scrutinized. It is very common that the applied assay systems do not directly reflect ligand/protein binding making it difficult to use the generated data for the optimization of an in silico workflow. Especially when multiple different binding modes are possible it becomes key to determine how they can be experimentally distinguished. It is also important to be aware of putative assay interference, often caused by PAINS and other well-known promiscuous motifs.

How good is my data? This again is a less self-explanatory point than one might think. The large amount of activity data collected in public databases like ChEMBL (ebi.ac.uk) tempts the user to use automated scripts to create activity datasets for validation on the fast track. However, this often leads to low data quality in the set. Different assay types are mixed up, and the sets contain data that are not representative of direct target-ligand binding. Different binding sites might not always be investigated separately, and the intermixing of all these different types of data leads to contradictions and, in the end, poor models. A smaller high-quality dataset can lead to better results.
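As an illustration of what a deliberate, rule-based curation step might look like in contrast to an unfiltered download, the sketch below keeps only exact measurements from direct binding assays on a single target and merges replicate values per compound. All field names and records are hypothetical and do not correspond to the schema of ChEMBL or any other database; the filter criteria themselves have to be chosen per project after reading the underlying publications.

```python
from statistics import median


def curate(records, target_id):
    """Build a small activity set for one target from raw records.

    records: list of dicts with the hypothetical keys
        'compound', 'target', 'assay_type', 'relation', 'ic50_nM'.
    Keeps only records that (i) belong to the requested target,
    (ii) come from a direct binding assay, and (iii) report an exact
    value (relation '='), then merges replicates by their median.
    """
    per_compound = {}
    for rec in records:
        if rec["target"] != target_id:
            continue
        if rec["assay_type"] != "binding":
            continue
        if rec["relation"] != "=":
            continue
        per_compound.setdefault(rec["compound"], []).append(rec["ic50_nM"])
    return {cpd: median(values) for cpd, values in per_compound.items()}


# Invented raw records mixing assay types, targets, and censored values.
raw = [
    {"compound": "cpd1", "target": "T1", "assay_type": "binding", "relation": "=", "ic50_nM": 40.0},
    {"compound": "cpd1", "target": "T1", "assay_type": "binding", "relation": "=", "ic50_nM": 60.0},
    {"compound": "cpd1", "target": "T1", "assay_type": "cell-based", "relation": "=", "ic50_nM": 900.0},
    {"compound": "cpd2", "target": "T1", "assay_type": "binding", "relation": ">", "ic50_nM": 10000.0},
    {"compound": "cpd3", "target": "T2", "assay_type": "binding", "relation": "=", "ic50_nM": 5.0},
]
clean = curate(raw, "T1")
print(clean)
```

Here only the two binding measurements of cpd1 survive; the cell-based value, the censored value, and the measurement on a different target are excluded instead of being silently averaged into the set.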

What are my software tools? There is a reason why none of the many available software solutions for docking and pharmacophore modeling has become dominant in the field. The success of algorithms and SFs hinges largely on their suitability for a specific project or target. Empirical SFs often outperform energy-based functions when they are applied to structures that are also represented in the dataset they were trained on. They might, however, fail if presented with a novel binding mode that is not sampled in the training data. If available, use multiple different programs and approaches and select the most suitable one in a thorough validation process. While this takes time, it leads to higher-quality simulations and results.

Is my workflow valid? Employ all available data to test and validate your workflow. If simple validation strategies like re-docking and cross-docking are performed, they vastly improve the reliability of any predictions made by the workflow.
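A re-docking check is typically scored by the root-mean-square deviation (RMSD) between the docked pose and the crystallographic pose of the same ligand. The following is a minimal sketch under stated assumptions: both poses are given as coordinate lists in the same atom order and in the same receptor frame (so no superposition is performed), symmetry-equivalent atoms are not handled, and the 2.0 Å success cutoff is a commonly used convention that we adopt here as an assumption rather than a value taken from this review.

```python
from math import sqrt


def pose_rmsd(pose_a, pose_b):
    """RMSD (in Angstrom) between two poses of the same ligand.

    pose_a, pose_b: equal-length lists of (x, y, z) heavy-atom
    coordinates in identical atom order and in the same receptor frame.
    """
    if len(pose_a) != len(pose_b) or not pose_a:
        raise ValueError("poses must be non-empty and have the same number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b)
    )
    return sqrt(squared / len(pose_a))


def redocking_success(crystal_pose, docked_pose, cutoff=2.0):
    """True if the docked pose reproduces the crystal pose within `cutoff`."""
    return pose_rmsd(crystal_pose, docked_pose) <= cutoff


# Invented two-atom example: one pose shifted by 1 A, one by 3 A along z.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
docked_close = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0)]
docked_far = [(0.0, 0.0, 3.0), (1.5, 0.0, 3.0)]

print(pose_rmsd(crystal, docked_close), redocking_success(crystal, docked_close))
print(pose_rmsd(crystal, docked_far), redocking_success(crystal, docked_far))
```

For cross-docking, the same comparison is made after docking a ligand into a structure of the target that was solved with a different ligand, which additionally tests how sensitive the workflow is to the chosen receptor conformation.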

Can my workflow be refined? Adapt and refine your computational models as new data becomes available. Build a system to learn from every compound that is tested. Build specialized SFs for your specific compound class. Find relevant key interactions and aim to elucidate the dominant binding orientation. In many cases also automated approaches can be used to analyze vast amounts of data and extract the relevant findings.

What else? Structure-based molecular modeling can be a very powerful and valuable tool in lead optimization if it is used with care and sound methodology. We hope the readers of this review can use it to find the best way to employ molecular modeling techniques in their own optimization projects. 

Utilization of operational schemes for analog synthesis in drug design

Free Wilson Analysis. Theory, applications and its relationship to Hansch analysis

Drug metabolism in preclinical drug development: a survey of the discovery process, toxicology, and computational tools

Computational chemistry-driven decision making in lead generation

Efficient drug lead discovery and optimization

Hemoglobin interaction in sickle cell fibers. I: Theoretical approaches to the molecular contacts

A geometric approach to macromolecule-ligand interactions

Protein docking using a single representation for protein surface, electrostatics, and local dynamics

Ligand docking to proteins with discrete side-chain flexibility

Molecular docking to ensembles of protein structures

Protein-ligand docking: current status and future challenges

MCDOCK: a Monte Carlo simulation approach to the molecular docking problem

Automated docking using a Lamarckian genetic algorithm and an empirical binding free energy function

Protein docking using a genetic algorithm

Classification of current scoring functions

Molecular docking and structure-based drug design strategies

An overview of scoring functions used for protein-ligand interactions in molecular docking

Bio-inspired algorithms applied to molecular docking simulations

Genetic algorithm optimization in drug design QSAR: Bayesian-regularized genetic neural networks (BRGNN) and genetic algorithm-optimized support vectors machines (GA-SVM)

Molecular docking algorithms

Molecular recognition and docking algorithms

Genetic algorithms in molecular recognition and design

Machine-learning scoring functions for structure-based drug lead optimization

From machine learning to deep learning: Advances in scoring functions for protein-ligand docking

Implementing QM in docking calculations: is it a waste of computational time?

Forging the basis for developing protein-ligand interaction scoring functions

The protein data bank

The protein data bank (PDB), its related services and software tools as key components for in silico guided drug discovery

High-quality dataset of protein-bound ligand conformations and its application to benchmarking conformer ensemble generators

Software for molecular docking: a review

Challenges and current status of computational methods for docking small molecules to nucleic acids

Comprehensive evaluation of ten docking programs on a diverse set of protein-ligand complexes: the prediction accuracy of sampling power and scoring power

Benchmarking of different molecular docking methods for protein-peptide docking

Comprehensive evaluation of fourteen docking programs on protein-peptide complexes

Comparative assessment of seven docking programs on a nonredundant metalloprotein subset of the PDBbind refined

Validation of molecular docking programs for virtual screening against dihydropteroate synthase

The performance of several docking programs at reproducing protein-macrolide-like crystal structures

Docking validation resources: protein family and ligand flexibility experiments

The PyMOL Molecular Graphics System

Benchmarking sets for molecular docking

Virtual screening workflow development guided by the "receiver operating characteristic" curve approach. Application to high-throughput docking on metabotropic glutamate receptor subtype 4

Design of environmentally friendly neonicotinoid insecticides with bioconcentration tuning and Bi-directional selective toxic effects

Environment-friendly PCN derivatives design and environmental behavior simulation based on a multi-activity 3D-QSAR model and molecular dynamics

High ultraviolet sensitivity of phthalic acid esters with environmental friendliness after modification through pharmacophore modeling associated with the solvation effect

Can we use docking and scoring for hit-to-lead optimization

Assessment of scoring functions and in silico parameters for AChE-ligand interactions as a tool for predicting inhibition potency

Endogenous metabolites of vitamin E limit inflammation by targeting 5-lipoxygenase

Discovery of a benzenesulfonamide-based dual inhibitor of microsomal prostaglandin E2 synthase-1 and 5-lipoxygenase that favorably modulates lipid mediator biosynthesis in inflammation

Refined docking as a valuable tool for lead optimization: application to histamine H3 receptor antagonists

Lessons from the COVID-19 pandemic for advancing computational drug repurposing strategies

A public-private partnership to enrich the development of in silico predictive models for pharmacokinetic and cardiotoxic properties

Déjà vu: stimulating open drug discovery for SARS-CoV-2

Applications of water molecules for analysis of macromolecule properties

Molecular docking with Gaussian Boson Sampling

End-point binding free energy calculation with MM/PBSA and MM/GBSA: strategies and applications in drug design

Opportunities and challenges using artificial intelligence in ADME/Tox

In silico approaches and tools for the prediction of drug metabolism and fate: a review

Exploring structure-activity relationships with three-dimensional matched molecular pairs-a review

Structure-based design for binding peptides in anti-cancer therapy

Decoys selection in benchmarking datasets: overview and perspectives

Docking of covalent ligands: challenges and approaches

Solvents to fragments to drugs: MD applications in drug design

Protein-peptide docking: opportunities and challenges

Metal-ligand interactions in drug design

Ensemble docking in drug discovery

Bridging molecular docking to molecular dynamics in exploring ligand-protein recognition process: an overview

Modeling of ribonucleic acid-ligand interactions

Predicting drug metabolism: experiment and/or computation?

Structural optimization of 4-chlorobenzoylpiperidine derivatives for the development of potent, reversible, and selective monoacylglycerol lipase (MAGL) inhibitors

New substructure filters for removal of pan assay interference compounds (PAINS) from screening libraries and for their exclusion in bioassays

Seven year itch: pan-assay interference compounds (PAINS) in 2017-utility and limitations

PLIP: fully automated protein-ligand interaction profiler

Structure-based lead optimization and biological evaluation of BAX direct activators as novel potential anticancer agents

SwissADME: a free web tool to evaluate pharmacokinetics, drug-likeness and medicinal chemistry friendliness of small molecules

BioTransformer: a comprehensive computational tool for small molecule metabolism prediction and metabolite identification

Design, synthesis and optimization of bis-amide derivatives as CSF1R inhibitors

Lead optimization of a dihydropyrrolopyrimidine inhibitor against phosphoinositide 3-kinase (PI3K) to improve the phenol glucuronic acid conjugation

Computer-aided identification and lead optimization of dual murine double minute 2 and 4 binders: structure-activity relationship studies and pharmacological activity

Androgen receptor plasticity and its implications for prostate cancer therapy

Best practices of computer-aided drug discovery: lessons learned from the development of a preclinical candidate for prostate cancer with a new mechanism of action

Selectively targeting the dimerization interface of human androgen receptor with smallmolecules to treat castration-resistant prostate cancer

Ultra-large library docking for discovering new chemotypes

Design, synthesis and structure-activity relationships of a novel class of sulfonylpyridine inhibitors of Interleukin-2 inducible T-cell kinase (ITK)

Identification of a novel and selective series of Itk inhibitors via a template-hopping strategy

Virtual screening and structure-based discovery of indole acylguanidines as potent beta-secretase (BACE1) inhibitors

Use of structure-based drug design approaches to obtain novel anthranilic acid acyl carrier protein synthase inhibitors

Integrated strategy for lead optimization based on fragment growing: the diversity-oriented-target-focused-synthesis approach

SAR studies on truxillic acid mono esters as a new class of antinociceptive agents targeting fatty acid binding proteins

Design, synthesis, SAR, and molecular modeling studies of acylthiocarbamates: a novel series of potent non-nucleoside HIV-1 reverse transcriptase inhibitors structurally related to phenethylthiazolylthiourea derivatives

Structurebased design, synthesis, and biological evaluation of conformationally restricted novel 2-alkylthio-6-[1-(2,6-difluorophenyl)alkyl]-3,4-dihydro-5-alkylpyrimidin-4(3H)-on es as non-nucleoside inhibitors of HIV-1 reverse transcriptase

Gyrase ATPase domain as an antitubercular drug discovery platform: structure-based design and lead optimization of nitrothiazolyl carboxamide analogues

Stabilization of Gquadruplex DNA with platinum(II) Schiff base complexes: luminescent probe and down-regulation of c-myc oncogene expression

Discovery of selective 4-Amino-pyridopyrimidine inhibitors of MAP4K4 using fragment-based lead identification and optimization

Structure-based optimization of morpholino-triazines as PI3K and mTOR inhibitors

From the cyclooxygenase-2 inhibitor celecoxib to a novel class of 3-phosphoinositidedependent protein kinase-1 inhibitors

Synthesis and evaluation of novel inhibitors of Pim-1 and Pim-2 protein kinases

2-(4-Chlorophenyl)-2-oxoethyl 4-benzamidobenzoate derivatives, a novel class of SENP1 inhibitors: virtual screening, synthesis and biological evaluation

Discovery of novel aldose reductase inhibitors using a protein structurebased approach: 3D-database search followed by design and synthesis

Progresses in the pursuit of aldose reductase inhibitors: the structure-based lead optimization step

Design, synthesis and molecular modelling studies of novel 3-acetamido-4-methyl benzoic acid derivatives as inhibitors of protein tyrosine phosphatase 1B

Bicyclic and tricyclic thiophenes as protein tyrosine phosphatase 1B inhibitors

Lead optimization toward proof-of-concept tools for huntington's disease within a 4-(1h-pyrazol-4-yl)pyrimidine class of Pan-JNK inhibitors

Lead optimization of isocytosine-derived xanthine oxidase inhibitors

Isocytosinebased inhibitors of xanthine oxidase: design, synthesis, SAR, PK and in vivo efficacy in rat model of hyperuricemia

Discovery of a rhodanine class of compounds as inhibitors of Plasmodium falciparum enoyl-acyl carrier protein reductase

Identification of novel molecular scaffolds for the design of MMP-13 inhibitors: a first round of lead optimization

Discovery of tarantula venom-derived NaV1.7-inhibitory JzTx-V peptide 5-Br-Trp24 analogue AM-6120 with systemic block of histamine-induced pruritis

Pharmacophore modeling and applications in drug discovery: challenges and recent advances

Glossary of terms used in medicinal chemistry

LigandScout: 3-D pharmacophores derived from proteinbound ligands and their use as virtual screening filters

Scaffold-hopping" by topological pharmacophore search: a contribution to virtual screening

Next generation 3D pharmacophore modeling

PHASE: a novel approach to pharmacophore modeling and 3D database searching

Pharmacophore modeling for COX-1 and -2 inhibitors with LigandScout in comparison to Discovery Studio

Target-specific drug design method combining deep learning and water pharmacophore

General Purpose Structure-Based drug discovery neural network score functions with humaninterpretable pharmacophore maps

Applications of the pharmacophore concept in natural product inspired drug design

Common hits approach: combining pharmacophore modeling and molecular dynamics simulations

Recent advances in scaffold hopping

Chemical structure-biological activity models for pharmacophores' 3D-interactions

Generation of three-dimensional pharmacophore models

Pharmacophore modeling for ADME

Pharmacophore modeling for antitargets

From the protein's perspective: the benefits and challenges of protein structure-based pharmacophore modeling

Pharmacophore modelling: a forty year old approach and its modern synergies

Computational ligand-based rational design: role of conformational sampling and force fields in model development

Molecule-pharmacophore superpositioning and pattern matching in computational drug design

A ligand-based approach to understanding selectivity of nuclear hormone receptors PXR, CAR, FXR, LXRa, and LXRb

Structural analysis of the lymphocyte-specific kinase Lck in complex with non-selective and Src family selective kinase inhibitors

Protein kinase inhibition by staurosporine revealed in details of the molecular interaction with CDK2

Methods for generating and applying pharmacophore models as virtual screening filters and for bioactivity profiling

Evaluation of the performance of 3D virtual screening protocols: RMSD comparisons, enrichment assessments, and decoy selection-what can we learn from earlier mistakes

Improving structure-based virtual screening by multivariate analysis of scoring data

Pharmacophore model refinement for 11b-hydroxysteroid dehydrogenase inhibitors: search for modulators of intracellular glucocorticoid concentrations

Novel b-and camino acid-derived inhibitors of prostate-specific membrane antigen

Computer-based de novo design of drug-like molecules

PhDD: A new pharmacophore-based de novo design method of drug-like molecules combined with assessment of synthetic accessibility

Discovery of the first dual inhibitor of the 5-lipoxygenase-activating protein and soluble epoxide hydrolase using pharmacophore-based virtual screening

Inhibitors of the Arachidonic Acid Cascade: Interfering with Multiple Pathways

Highly specific interactions between botulinum neurotoxins and synaptic vesicle proteins

Pharmacophore-guided lead optimization: the rational design of a non-zinc coordinating, sub-micromolar inhibitor of the botulinum neurotoxin serotype a metalloprotease

Three-dimensional database mining identifies a unique chemotype that unites structurally diverse botulinum neurotoxin serotype A inhibitors in a three-zone pharmacophore

Crystal structure-based discovery of a novel synthesized PARP1 inhibitor (OL-1) with apoptosisinducing mechanisms in triple-negative breast cancer

Are automated molecular dynamics simulations and binding free energy calculations realistic tools in lead optimization? an evaluation of the linear interaction energy (LIE) method

Comparing pharmacophore models derived from crystal structures and from molecular dynamics simulations

Optimizing Dvl PDZ domain inhibitor by exploring chemical space

Identification of a specific inhibitor of the dishevelled PDZ domain

Transgenic amplification of glucocorticoid action in adipose tissue causes high blood pressure in mice

A transgenic model of visceral obesity and the metabolic syndrome

Metabolic syndrome without obesity: Hepatic overexpression of 11beta-hydroxysteroid dehydrogenase type 1 in transgenic mice

The discovery of new 11b-hydroxysteroid dehydrogenase type 1 inhibitors by common feature pharmacophore modeling and virtual screening

Characterization of activity and binding mode of glycyrrhetinic acid derivatives inhibiting 11b-hydroxysteroid dehydrogenase type 2

Generation of ligand-based pharmacophore model and virtual screening for identification of novel tubulin inhibitors with potent anticancer activity

Discovery, characterization, and lead optimization of 7-azaindole nonnucleoside HIV-1 reverse transcriptase inhibitors

Identification of a novel selective small-molecule inhibitor of protein arginine methyltransferase 5 (PRMT5) by virtual screening, resynthesis and biological evaluations

Discovery of N6-phenyl-1H-pyrazolo[3,4-d]pyrimidine-3,6-diamine derivatives as novel CK1 inhibitors using common-feature pharmacophore model based virtual screening and hit-to-lead optimization

Fragment-based discovery of subtype-selective adenosine receptor ligands from homology models

ones: a new class of selective A1 adenosine receptor antagonists

Rational design, discovery, and synthesis of a novel series of potent growth hormone secretagogues

Discovery of novel acyl coenzyme a: cholesterol acyltransferase inhibitors: pharmacophore-based virtual screening, synthesis and pharmacology

Identification of potent and selective small-molecule inhibitors of caspase-3 through the use of extended tethering and structure-based drug design

Discovery of novel and selective adenosine A2A receptor antagonists for treating Parkinson's disease through comparative structure-based virtual screening

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.