key: cord-0234641-nbqn1xjv authors: Huang, Kexin; Fu, Tianfan; Khan, Dawood; Abid, Ali; Abdalla, Ali; Abid, Abubakar; Glass, Lucas M.; Zitnik, Marinka; Xiao, Cao; Sun, Jimeng title: MolDesigner: Interactive Design of Efficacious Drugs with Deep Learning date: 2020-10-05 journal: nan DOI: nan sha: 91b0a0595f785e1a4de374240453d35d4b3f0f87 doc_id: 234641 cord_uid: nbqn1xjv The efficacy of a drug depends on its binding affinity to the therapeutic target and pharmacokinetics. Deep learning (DL) has demonstrated remarkable progress in predicting drug efficacy. We develop MolDesigner, a human-in-the-loop web user-interface (UI), to assist drug developers leverage DL predictions to design more effective drugs. A developer can draw a drug molecule in the interface. In the backend, more than 17 state-of-the-art DL models generate predictions on important indices that are crucial for a drug's efficacy. Based on these predictions, drug developers can edit the drug molecule and reiterate until satisfaction. MolDesigner can make predictions in real-time with a latency of less than a second. A drug's efficacy depends on various factors. Among others, two crucial factors are binding affinity that measures the strength of drug-protein target interactions and pharmacokinetics (PK) indices that measure how the body reacts to the drug [13] . PK indices include absorption, distribution, metabolism, excretion, and toxicity (ADMET). However, to obtain these values in a traditional wet lab setting is notoriously expensive and time-consuming [4] . Deep learning (DL) models have recently shown high accuracy in predicting the binding affinity and various ADMET indices [9, 11, 8] . Demonstrated Technology. Here, we propose a humans-in-the-loop web user interface (UI) called MolDesigner to help drug developers leverage DL to improve drug designs. Given a drug molecule, MolDesigner predicts various important indices from state-of-the-art DL models, powered by our DeepPurpose library [8] . They provide real-time feedback to help drug developers modify the molecular structure and iterate on this process until the models start to show the drug candidate has reasonable measures and properties. This way, drug developers can leverage DL models to rapidly design a drug that has not only high efficacy to the target protein but also has ideal ADMET properties. MolDesigner is especially useful for lead optimization where drug developer makes chemical modification on the molecular structure to improve the drug's quality. In MolDesigner, the user (drug designer) first specifies the target protein (e.g., amino acid sequence) of interest. The user can then begin drawing the initial drug candidate or reading the molecular structure from a file using any popular chemical data format. In less than one second latency, MolDesigner returns predictions, representing binding affinity to the target protein along with predictions for 16 crucial PK indices. The user can then review predictions and make further chemical modifications based on the expert knowledge, thus iteratively refining the drug candidate's structure until the candidate molecule is predicted to have the desired properties. This draw-predict cycle can be repeated as needed until the user is satisfied. The user can also specify the DL model to make 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Demonstration Track. predictions. We provide five DL models for binding affinity prediction and three models for each of the PK properties. In total, more than 50 DL models power the backend of MolDesigner. Implementation. We use DeepPurpose [8] to rapidly generate DL models for drug-target interaction (DTI) and drug property predictions. We then use Gradio [1] to set up the web UI. We use the BindingDB dataset for DTI prediction and collect 16 datasets from various sources for the PK indices. The PK indices include solubility, lipophilicity, Caco-2, human-intestinal-absorption (HIA), P-gp inhibition, bioavailability, blood-brain-barrier (BBB), plasma-protein-binding rate (PPBR), various CYP inhibitors, half-life, clearance, and clinical toxicity. We provide the following DL models for drug-protein binding affinity prediction. For modeling drugs, we use Message Passing Neural Network (MPNN) [6] , Convolutional Neural Network (CNN) [10] , Deep Neural Network (DNN) on Morgan fingerprints [12] . For modeling proteins, we have CNN and DNN, which are trained on proteins' amino acid composition fingerprints. For drug property prediction, MolDesigner supports MPNN, CNN, and DNN on Morgan fingerprints. The Gradio interface also supports taking screenshots and flagging outputs for easy tracking and future reference. Novelty. MolDesigner makes the following contributions. 1) Existing drug discovery interfaces focus on one task [5, 2, 3] , such as DTI or one of ADMET prediction tasks. In contrast, MolDesigner is the first to integrate 16 ADMET predictions and binding affinity in a single click. 2) Earlier work focuses on a demo of prediction model where a separate output page is generated after user input [5, 2, 3] . In contrast, MolDesigner is designed to keep humans in the loop. MolDesigner focuses on molecular design, enabling rapid predictions, exploration, and interactive adjustments as the user modifies chemical reactions. It provides a unified interface where the user can see both drug molecules and predicted values on the same page. 3) MolDesigner is powered by over 50 state-of-the-art DL models, whereas prevailing tools are only designed to use one model. For ADMET prediction, the current web UI mainly uses classic machine learning methods. 4) MolDesigner is fast. The overall prediction latency is less than one second, allowing rapid molecular design. Interaction with Virtual Audience. We will start with a demo showcasing how our framework aids drug design. Then, depending on the audience's background, we will design two interactive activities. For a computational audience with no knowledge of drug development, we will have a game where users will play the role of chemists developing drugs for one of the COVID-19 targets [7] . We will start from the drug that already has a high affinity to the target and let the user make chemical modifications. As the user draws drugs, we will demonstrate how predicted drug properties vary with molecular structure. To engage with drug discovery experts in the audience, we will ask them to propose a target of interest and a potential drug molecule. We will ask them what changes to the molecular structure would they introduce to improve predicted properties, and see if our model can corroborate their hypothesis, followed by discussion. For the audience to engage with the UI, we will use remote control features to draw on the screen. Gradio: Hassle-free sharing and testing of ML models in the wild DT-Web: a web-based application for drug-target interaction and drug combination prediction through domain-tuned network-based inference Drug-target interaction prediction: databases, web servers and computational models Key factors in the rising cost of new drug discovery and development ADMETlab: a platform for systematic ADMET evaluation based on a comprehensively collected ADMET database Neural message passing for quantum chemistry A SARS-CoV-2 protein interaction map reveals targets for drug repurposing Deep-Purpose: a deep learning library for drug-target interaction prediction and applications to repurposing and screening Modeling industrial ADMET data with multitask networks Imagenet classification with deep convolutional neural networks Deepdta: deep drug-target binding affinity prediction Extended-connectivity fingerprints Target identification and mechanism of action in chemical biology and drug discovery