key: cord-0144656-feqy4m29 authors: Cofala, Tim; Elend, Lars; Mirbach, Philip; Prellberg, Jonas; Teusch, Thomas; Kramer, Oliver title: Evolutionary Multi-Objective Design of SARS-CoV-2 Protease Inhibitor Candidates date: 2020-05-06 journal: nan DOI: nan sha: 18f936e64748ec2d1f367006299234fb0882dc7d doc_id: 144656 cord_uid: feqy4m29 Computational drug design based on artificial intelligence is an emerging research area. At the time of writing this paper, the world suffers from an outbreak of the coronavirus SARS-CoV-2. A promising way to stop the virus replication is via protease inhibition. We propose an evolutionary multi-objective algorithm (EMOA) to design potential protease inhibitors for SARS-CoV-2's main protease. Based on the SELFIES representation the EMOA maximizes the binding of candidate ligands to the protein using the docking tool QuickVina 2, while at the same time taking into account further objectives like drug-likeliness or the fulfillment of filter constraints. The experimental part analyzes the evolutionary process and discusses the inhibitor candidates. At the time of writing this paper, researchers around the globe are searching for a vaccine or an effective treatment against the 2019 novel coronavirus (SARS-CoV-2). One strategy to limit virus replication is protease inhibition. A biomolecule called ligand binds to a virus protease enzyme and inhibits its functional properties. For SARS-CoV-2 the crystal structure of its main protease M pro has been solved, e.g. by Jin et al. [17] . The search for a valid protease inhibitor can be expressed as optimization problem. As not only the binding of the ligand is an important objective, but also further properties like drug-likeliness or filter properties, we comprise the molecule search problem as multi-objective optimization problem, which we aim to solve with evolutionary algorithms. This paper is structured as follows. In Section 2 we shortly repeat the basics of protease inhibition and the connection to the novel coronavirus. Section 3 gives an overview of related work on evolutionary molecule design.In Section 4 we introduce molecule metrics, which we aim to optimize with the EMOA that is presented in Section 5. The experimental part in Section 6 presents our experimental results and discusses the evolved molecules. Conclusions are drawn in Section 7, where also prospective future research directions are presented. As of late 2019, a novel respiratory disease named COVID-19 spread worldwide. is caused by SARS-CoV-2, which belongs to the coronavirus family like the well-known severe acute respiratory syndrome coronavirus (SARS-CoV). As RNA virus SARS-CoV-2's replication mechanism hijacks the cell mechanisms for replication. An essential part of the virus replication process is a cleavage process, in which the virus protease enzyme cuts long precursor polyproteins into mature non-structural proteins, see Fig. 1 . If a ligand biomolecule binds to the protease it can prevent and inhibit this cleavage process. A ligand binds to the target protein in a so-called pocket based on various non-covalent interactions like hydrophobic interactions, hydrogen bonding, π-stacking, salt bridges, and amide stacking [15] . With the proper ligand, the protease cleavage process is inhibited, in practice measured by the half maximal inhibitory concentration IC 50 corresponding to the inhibitory substance quantity needed to inhibit 50% of the protease process. The protease inhibitor is the target of the drug design process, which we aim to find with evolutionary search. Computational modeling of protein-ligand binding is a complex process depending on protein-ligand geometry, chemical interactions as well as various constraints and properties like hydration and quantum effects. Complex molecular dynamics computations are often too expensive in computational drug design. Instead, docking tools like AutoDock [25] , see Section 4, are supposed to be sufficient for a coarse binding affinity estimation based on a simplification of the physical reality. For SARS-CoV-2 the crystal structure of its main protease M pro is known, e.g. [8, 17, 36] . Various attempts to design inhibitors have been made recently, e.g., based on known protease inhibitors for other viruses [7, 19] , based on virtual screening [14] , and computational drug design [24] . Methods for de novo drug design can be categorized in different ways [10, 5] . Some works construct molecules directly from atoms [11, 28] , while others use chemical fragments as their smallest building block [29] . The goal also varies among publications and is sometimes to find drugs that bind to a specific protein binding site like in our work or [29, 35] , and other times the goal is to generate any drug-like molecules as in [11, 30] . ADAPT [29] is a fragment-based method that optimizes for molecule to bind to a specific binding site using a genetic algorithm on an acyclic graph-representation consisting of chemical fragments. The fitness of a resulting compound is evaluated through a docking simulation with a target protein binding site and common drug-likeness indicators. On the other hand, Douguet et al. [11] use the SMILIES representation and as such work on the level of atoms instead of fragments. In contrast to our work their genetic algorithm optimizes for drug-likeness only instead of binding to a specific ligand. Furthermore, the algorithm is single-objective and simply weighs the different properties in a fitness function using constant coefficients. Similarly, Nigam et al. [28] present a genetic algorithm on the SMILIES representation for general molecule design. The method increases diversity by using a deep neural network as an adaptive fitness function to penalize longsurviving molecules. In contrast to methods like ours that try to stay inside the distribution of drug-like molecules, the genetic algorithm is free to explore the chemical space in its entirety. Finally, LigBuilder [35] is a software tool for drug design that is based on a genetic algorithm. It allows optimizing for the interesting quality of binding to multiple targets, which enables tackling more complex diseases with a single drug without the risk of drug-drug interactions that comes with combination drugs (treatment with multiple compounds). Brown et al. [6] have utilized an approach for multi-objective optimization of molecules applying a graph-based representation of molecules. The multi-objective evolutionary algorithm applies a Pareto ranking scheme for the optimization process. Wagner et al. [33] have developed a tool which identifies potential CNS drugs by means of a multi-objective optimization. The molecules have been optimized for six physical properties. In contrast to the approach presented here, this tool is not based on evolutionary algorithms but on medical knowledge. A short review, which focuses on the multi-objective optimization of drugs, is given in [27] . In this context, different problem definitions and various Multiobjective optimization methods are summarized. Since there are many competing approaches to drug and molecule design, benchmarking platforms like MOSES [30] or GuacaMol [5] have emerged. These propose fixed datasets and metrics to measure and compare the generative abilities of different algorithms. Among the benchmarked algorithms are random sampling, variational autoencoders, generative adversarial networks, Monte-Carlo tree search, and others. In computational drug design, molecule metrics define the optimization objectives. This section introduces the five metrics our optimization approach is based on. The major objective in protease inhibitor search is the protein-ligand binding affinity. A widespread tool for this metric is the automated docking tool AutoDock [25] , which will also be used by the OpenPandamics 1 activities to fight COVID-19. AutoDock performs very fast calculations of the binding energy by using gridbased look-up tables. For this purpose, the protein is embedded in a grid. The binding energy of all individual atoms of the ligand is calculated at all positions of the grid using semi-empirical force field methods. Using a Lamarckian genetic algorithm, the best binding position and binding energy of the complete ligand can be determined with the help of the look-up tables. Through various improvements, the accuracy and especially the performance of AutoDock has been significantly improved. In AutoDock Vina [32] a hybrid scoring function based on empirical and knowledge-based data is used instead of the force field method. QuickVina [32] and QuickVina 2 [1] mainly improve the search algorithm by performing the most complex part of the optimization only for very promising ligand positions. We use QuickVina 2 for the calculation of the binding energies of our proposed ligands, as it provides very good results at high performance. For the sake of simplicity we will use binding affinity score and docking score synonymously. The informative value of QuickVina 2 binding scores may be limited due to a simplification of various physical properties, such as the neglect of water molecules and the changing electrical properties of ligand and protein when they interact with each other. However, it has been shown by Gaillard [16] that AutoDock Vina binding scores outperform various computational docking methods and Quickvina 2 achieves very comparable results with Autodock Vina [1] . To estimate whether a molecule can be used as a drug, its similarity to other existing drugs can be considered. This is based on the fact that many important physiochemical properties of drugs follow a certain distribution. Lipinski's rule of five [22] which specifies ranges of values for different molecular properties such as size, is frequently used. A major disadvantage, however, is that this rule is only a rule of thumb and only checks whether its criteria are met or not. Among modern drugs there are molecules that violate more than one of Lipinski's rules. A modern approach by Bickerton et al. [4] is based on multi-criteria optimization and the principle of desirability. Instead of a fixed value range, all relevant molecular properties are evaluated by an individual desirability function. A single score (QED) is then determined by geometrically averaging all desirability functions. In addition to the similarity to known drugs, the similarity to naturally occurring biomolecules (natural products) is also an important metric. Natural products have numerous bioactive structures that were created and validated by nature in an evolutionary process. Ertl et al. [12] have studied the key differentiating features of natural and synthetic molecules and developed a measure of similarity to natural products. This score is based on structural characteristics of the molecules, such as the number of aromatic rings and the distribution of nitrogen and oxygen atoms. Medical chemical filters are used to exclude molecules that are toxic due to their structural nature. Potentially unstable molecules whose metabolites may be toxic are also not suitable as drugs. We use the MCFs and PAINS filters described by Polykovskiy [30] as a Boolean indicator metric. For drug design it is not only important to find a molecule with the desired properties, but also a synthesizable one. Ertl and Schuffenhauer [13] create a method to estimate the synthetic accessibility of drug-like molecules and achieve a high agreement with manual estimations by experts. Such a method can easily be incorporated into a search process and we use it as one of our optimization goals, too. A different approach to the synthesis problem is taken by Segler et al. [31] . Instead of estimating synthetic accessibility, their symbolic AI driven approach searches for actual synthesis routes of desired target molecules with a combination of Monte Carlo tree search and neural networks encoding rules for reaction centers. Table 1 shows the value ranges and the optima of the five used metrics. For our experiments we unify these values to a range of [0, 1], where 0 is the optimum, as we will describe under Section 5.2. This section presents the evolutionary approach for the protease inhibitor design. For searching in the design space of biomolecules we use evolutionary algorithms (EAs), which are biologically inspired population-based search heuristics. We employ the evolution strategy oriented (µ + λ) population model [3] . A solution is defined by a string based on the self-referencing embedded strings (SELFIES) representation [21] , which is an advancement of the simplified molecular-input line-entry system (SMILIES) [34] representation. Figure 2 pictures an exemplary molecule with its structural formula and the corresponding SMILIES and SELFIES representations. Each string consist of symbols, encoding the occurring atoms, bindings, branches and ring sizes. SELFIES implements a formal grammar, and the interpretation of a symbol depends on derivation rules and state of derivation. In contrast to SMILIES, SELFIES strings are always syntactically correct and therefore always yield valid molecules [21] . The EA's initial population consists of individuals with randomly generated strings representation of a fixed length. Since multiple SELFIES strings can be translated to the same SMILIES string, the resulting SMILIES string is compared to a global list of all previously generated individuals. Individuals with a representation that already occurred are discarded and a new individual is generated. This process is repeated until the population consist of unique individuals and also applies for the generation of offspring individuals. Since every SELFIES string corresponds to a valid molecule and every molecule can be expressed in SELFIES representation, the design space can be explored by applying random mutations to the strings -more precisely the SELFIES symbols of which the string is composed. Offspring solutions are created by choosing a random individual from the parental population. Each child is mutated with the following mutation operations with defined probabilities: Replacement is applied independently for every symbol with a probability of p r . The symbol is replaced by a random SELFIES symbol. Insertion is applied with probability p i . A random symbol is inserted at a random position in the individual's representation. Deletion is applied with probability p d and deletes a randomly chosen symbol of the individual's representation. The new symbols are drawn from a set of symbols inspired by [21] . This set has been extended with benzene as a separate, composed symbol, to increase the likelihood of its occurrence and ease the generation of complex molecules. Additionally, each symbol is assigned a weighting parameter to adjust the probability with which the is is randomly selected. This weighting can be used to increase the likelihood of more common symbols (e.g. [C]) in contrast to more complex ones (e.g. branches and ring structures). For the selection operator the fitness f (x) of each solution candidate is evaluated based on the molecule metrics binding affinity score, QED, filters, NP, and SA introduced in Section 4. To increase the comparability, each metric is scaled to the range between 0 (best possible score) and 1 (worst possible score). The binding affinities are scaled with regard to the experimentally chosen minimum of −15 kcal/mol and maximum of 1 kcal/mol and clipped to the range between 0 and 1 with soft clipping [20] . For the single-objective baseline experiments each individual is assigned a single composed fitness value. We use a weighted sum fitness of the n introduced metrics: Individuals are evaluated in parallel. Therefore, the respective SELFIES strings are converted to the SMILIES representation. MOSES is then used for the calculation of QED, NP, and SA as well as for the application of the PAINS and MCF filters. The docking score for each compound is determined by QuickVina 2. Therefore, RDKit 2 and MGLTools 3 are used to generate PDB and PDBQT files for the respective SMILIES representation. The binding energy is calculated in regards to the COVID-19 M pro (PDB ID: 6LU7 [23] ) 4 with the search grid being centered around the native ligand position and sized to 22 × 24 × 22 Å 3 . The exhaustiveness is maintained at its default value of 8. The objectives presented in Section 4 may be contradictory. For example, in preliminary experiments, we discovered that molecules with high AutoDock binding scores suffer from low QED scores. As the choice of predefined weights for objectives is difficult in advance, a multi-objective approach may be preferable in practice. In our multi-objective optimization setting in molecule space M with fitness functions f 1 , . . . , f n to minimize we seek for a Pareto set NSGA-II [9] is known to be able of approximating a Pareto set with a broad distribution of solutions in objective space, i.e., of the Pareto front. After nondominated sorting, µ non-dominated solutions maximizing the crowding distance, which corresponds to the sum of Manhattan distances between the neighboring solutions in objective space. For comparison of different multi-objective runs we also employ the S-metric measuring the dominated hypervolume in objective space with regards to a dominated reference point [2] . In this section we experimentally analyze the single-objective and the NSGA-II approaches for the protease inhibitor candidate search. For the experimental analyses, the following settings are applied. A (10 + 100)-EA is used for the singleobjective run i.e., in each generation from 10 parents 100 offspring candidate molecules are generated with the mutation operators introduced in Section 5.1 with mutation probabilities p r = 0.05, p i = 0.1, and p d = 0.1 applying plus selection. For multi-objective runs the number of parents is increased to 20 to achieve a broader distribution of solutions in objective space. No crossover is applied. Individuals are limited to a length of 80 SELFIES tokens oriented to the setting by Krenn et al. [21] . All runs are terminated after 200 generations and are repeated 20 times. Figure 3 shows the development of the previously explained normalized metrics in single-and multi-objective runs. For the single-objective runs, the best individuals according to fitness are chosen in each generation and their metrics are averaged over all runs. The optimization process concentrates on improving docking score, QED, and NP. As expected, an improvement of one metric may result in a deterioration of another, e.g., as of generation 140, when QED and NP deteriorate in favor of SA and docking score. For multi-objective runs, the best individuals for each metric are chosen in each generation and then averaged over all runs. A steady improvement with regard to all objectives is achieved here, but has to be paid with regard to deteriorations in other objectives that are not shown here. Figure 4 shows three different two-dimensional slices of the Pareto front that compare docking score to QED, NP, and SA. A Pareto front is shown for every 10th generation and their colors start at light blue for the first generation and end at dark blue for the final generation. The plots illustrate NSGA-II's ability to generate solutions with different degrees of balance between docking score and the plotted metric. In the course of the optimization process the front of non-dominated solutions has the expected tendency to move towards the lower left. This is also reflected by the S-metric, which, in average over all runs improves from 0.10 ± 0.03 in the first to 0.20 ± 0.05 in the last generation. In the slice plots deteriorations are possible due to improvements in the remaining three objectives. A comparison of final experimental results of the single-objective and NSGA-II runs is presented in Table 2 . For NSGA-II the best achieved values for each objective are shown corresponding to the corner points of the Pareto front approximation. For comparison, corresponding metric values are shown for N3 proposed as ligand in the PDB database as well as for Lopinavir, the HIV main protease inhibitor [18] . Docking scores achieved by the single objective optimization process show that the best values even overcome the scores of N3 and Lopinavir. Lopinavir and N3 bind similarly strong to M pro . NSGA-II achieves promising values for all metrics. The broad coverage of objective function values offers the practitioner a huge variety of interesting candidates. However, some of Table 2 : Experimental results of weighted-sum single-objective approach, the best values per objective for NSGA-II, the N3 ligand (from PDB 6LU7), and Lopinavir (a prominent drug candidate). Statistical evaluation for the NSGA-II method is calculated based on the best 20 individuals per objective. marks a minimization objective, while marks a maximization objective. From our observations we conclude that the SELFIES representation with our mutation operators are able to robustly achieve molecules of a certain quality. However, we expect the quality of the results to improve with mechanisms that allow the development of larger molecules to overcome fitness plateaus and local optima. Figure 5 compares the populations of the last generation of a typical single-objective and NSGA-II run. The solutions in the single-objective population are similar to each other, while the solutions in the last NSGA-II population maintain a higher diversity of molecule properties. In the following we present interesting protein inhibitor candidates evolved with the single-and multi-objective approaches. In our experiments we made three main observations. The molecules generated have a strong tendency to contain aromatic ring structures. Candidates with good drug-likeliness are comparatively short. Candidates with high docking scores often have unrealistic geometries. In Figure 6 we present a list of six promising protease inhibitors (PI) candidates with properties as radar plots, structural formulas, and chemical names. PI-I (a) to PI-III (c) are results from single-objective runs, while PI-IV (c) to PI-VI (f) show candidates generated by NSGA-II. Points near the border of the radar plot represent better values, e.g., a zero value lies at the corner of a plot. All candidates fulfill the filter condition. PI-1 achieves a high SA value with a reasonable docking score. PI-II achieves an excellent docking score with −9.7 kcal/mol. PI-III, PI-IV, and PI-VI achieve excellent drug-likeliness QED with good docking results around −7.0 kcal/mol. An interesting candidate balancing all objectives is PI-V with docking score −7.7 kcal/mol and QED value of 0.75. Last, we visualize how the ligand candidates are located in the M pro protein pocket optimized by QuickVina 2. Figure 7 shows candidates (a) PI-I and (b) PI-V in their M pro pockets. In this paper we introduced an evolutionary multi-objective approach to evolve protein inhibitor candidates for the M pro of SARS-CoV-2, which could be a starting point for drug design attempts, aiming at optimizing the QuickVina 2-based protein-ligand binding scores and further important objectives like QED druglikeliness and filter properties. In the experimental part we have shown that the evolutionary processes are able to evolve interesting inhibitor candidates. Many of them achieve promising metrics with ordinary structures, but also unconventional candidates have been evolved that may be worth for a deeper analysis. As the informative value of QuickVina 2 binding scores and also the further metrics may be limited in practice, we understand our approach as AI-assisted virtual screening of the chemical biomolecule space. Future research will focus on the improvement of protein-ligand models for more detailed and more efficient binding affinity models. Further, we see potential to improve the SELFIES representation in terms of bloated strings that represent comparatively small molecules and mechanisms to guarantee their validity. Fast, accurate, and reliable molecular docking with QuickVina 2 SMS-EMOA: Multiobjective selection based on dominated hypervolume Evolution strategies -A comprehensive introduction Quantifying the chemical beauty of drugs GuacaMol: Benchmarking Models for de Novo Molecular Design A Graph-Based Genetic Algorithm and Its Application to the Multiobjective Evolution of Median Molecules The FDA-approved Drug Ivermectin inhibits the replication of SARS-CoV-2 in vitro Structure-based design of antiviral drug candidates targeting the SARS-CoV-2 main protease A fast and elitist multiobjective genetic algorithm: NSGA-II Evolutionary algorithms for de novo drug design -A survey A genetic algorithm for the automated generation of small organic molecules: Drug design using an evolutionary algorithm Natural Product-likeness Score and Its Application for Prioritization of Compound Libraries Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions Inhibitors for Novel Coronavirus Protease Identified by Virtual Screening of 687 Million Compounds A systematic analysis of atomic protein-ligand interactions in the PDB Evaluation of AutoDock and AutoDock Vina on the CASF-2013 Benchmark Structure of M pro from COVID-19 virus and discovery of its inhibitors Safety and antiviral activity of lopinavir/ritonavirbased therapy in human immunodeficiency virus type 1 (HIV-1) infection Potential Inhibitor of COVID-19 Main Protease (Mpro) From Several Medicinal Plant Compounds by Molecular Docking Study Neural Network-Based Approach to Phase Space Integration Self-Referencing Embedded Strings (SELFIES): A 100% robust molecular string representation Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings 6LU7: The crystal structure of COVID-19 main protease in complex with an inhibitor N3 Inhibition of the Main Protease 3CL-pro of the Coronavirus Disease 19 via Structure-Based Ligand Design and Molecular Modeling AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility NGLview-interactive molecular graphics for Jupyter notebooks Multi-objective optimization methods in drug design Augmenting Genetic Algorithms with Deep Neural Networks for Exploring the Chemical Space A genetic algorithm for structure-based de novo design Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models Planning chemical syntheses with deep neural networks and symbolic AI AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading Central Nervous System Multiparameter Optimization Desirability: Application in Drug Discovery SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules LigBuilder V3: A Multi-Target de novo Drug Design Approach Crystal structure of SARS-CoV-2 main protease provides a basis for design of improved α-ketoamide inhibitors We would like to thank Ahmad Reza Mehdipour, Max Planck Institute of Biophysics, Frankfurt, Germany, for useful suggestions and comments that helped to improve this manuscript.