key: cord-0844554-rv8yosr7 authors: Unajak, Sasimanas; Sawatdichaikul, Orathai; Songtawee, Napat; Rattanabunyong, Siriluk; Tassnakajon, Anchalee; Areechon, Nontawith; Hirono, Ikuo; Kondo, Hidehiro; Khunrae, Pongsak; Rattanarojpong, Triwit; Choowongkomon, Kiattawee title: Homology modeling and virtual screening for antagonists of protease from yellow head virus date: 2014-02-22 journal: J Mol Model DOI: 10.1007/s00894-014-2116-9 sha: 451d679d22735e853442bcdd98e2091c7fb68680 doc_id: 844554 cord_uid: rv8yosr7 Yellow head virus (YHV) is one of the causative agents of shrimp viral disease. The prevention of YHV infection in shrimp has been developed by various methods, but it is still insufficient to protect the mass mortality in shrimp. New approaches for the antiviral drug development for viral infection have been focused on the inhibition of several potent viral enzymes, and thus the YHV protease is one of the interesting targets for developing antiviral drugs according to the pivotal roles of the enzyme in an early stage of viral propagation. In this study, a theoretical modeling of the YHV protease was constructed based on the folds of several known crystal structures of other viral proteases, and was subsequently used as a target for virtual screening—molecular docking against approximately 1364 NCI structurally diversity compounds. A complex between the protease and the hit compounds was investigated for intermolecular interactions by molecular dynamics simulations. Five best predicted compounds (NSC122819, NSC345647, NSC319990, NSC50650, and NSC5069) were tested against bacterial expressed YHV. The NSC122819 showed the best inhibitory characteristic among the candidates, while others showed more than 50 % of inhibition in the assay condition. These compounds could potentially be inhibitors for curing YHV infection. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s00894-014-2116-9) contains supplementary material, which is available to authorized users. Yellow head virus (YHV) has been emerging for the past two decades in Thailand. Virulence of the virus is known to cause devastation in shrimp farming 2-3 days after the onset of disease. The infected shrimp exhibits yellow coloration on cephalothorax area and pale color throughout the body. The enormous economical loss of shrimp production in Thailand by YHV infection started from 1991-1995. In the latter years of yellow head disease outbreak, the diagnostic of YHV contamination in post-larvae must be determined prior to culturing. Therefore, the promising preventive methods of the disease have to be considered. The attenuation or inhibition of YHV propagation is one of the preventive methods for developing antiviral agents nowadays. YHV is classified in the order of Nidovirales, genus Okavirus, and family Roniviridae. The rod-shaped virus contains positive single-stranded RNA as a genomic material with approximately 26 kbp. Structural protein components of the virus are composed of three major types including gp116, gp64 spike glycoprotein, and p20 nucleocapsid protein [1, 2] . Among viral genetic variation, the genotypes of YHV are closely related to the Gill-associated virus (GAV) that is less virulent (or provides less virulence). A recent study revealed that YHV are composed of five open reading frames (ORFs) [3] . Among those, the ORF1ab encodes for chymotrypsin-like proteinase (3CLP), nidovirus accessory protease, and RNA-dependent RNA polymerase (RdRp), which have been known to be involved in the replication complexes, and also contained the metal-ion binding domain (MIB) [4] . Chymotrypsin-like proteinase (3CLP) was first and well characterized in GAV. The enzyme employs a "Cys-His catalytic dyad" and possesses trans-cleavage activity [4, 5] . 
The sequence alignment showed that 3CLP of GAV successfully located between I 2866 and G 3006 in YHVand interestingly, the putative 3CLP of YHV shared remarkably high identities (up to 87.9 %) in their amino acid sequences. Moreover, the two proteolytic cleavage sites for 3CLP; 2838 LVTHE↓VNTGN 2847 and 6455 KVNHE↓LYHVA 6464 (the cleavage site is indicated by ↓) are conserved between the GAV and YHV sequences. However, the specific role of 3CLP in YHV is yet unclear. It has been postulated that the 3CLP in YHV is involved in the regulation of viral replication complex activity [4] , like 3CLP in other Coronaviruses that process the downstream replicase domains including RdRp and helicase enzyme [6] . The other encoding protease that is found in both YHV and CLP is Nidovirus accessory protease that is widely known as papain-like protease (PLP). This enzyme contains the canonical "Cys-His catalytic dyad" and a distinctive α+β fold that is typically found in proteins that function as transcription factor involved in subgenomic messenger RNA (sgmRNA) synthesis [3] . However, the sequence alignment showed that, unlike the 3CLP, the PLP of YHVand GAV share only 73.7 % identities in their amino acid sequences [3] . The prevention of viral propagation has focused on this key enzymatic reaction as well [7, 8] . The inhibition of YHV propagation was only accomplished by YHV-protease dsRNA [9, 10] . Therefore, this viral protease could possibly be a potential target in therapeutic approach, likewise the development of viral protease inhibitors (PI) for HIV-1, HCV, and SARS, illustrate a high achievement in drug pharmaceutical. The 3CLP-specific inhibitors were rationally designed based on the three-dimensional structure of the viral proteases with computer-assisted molecular modeling. The Xray crystallographic determining the three-dimensional structure thus provide a better in-depth understanding of the association of proteins and their other target peptides. 
Unfortunately, crystal structure of YHV protease is not yet available; a computer-assisted homology modeling of 3CLP of YHV is currently in progress in order to obtain the molecular basis of protease cleavage site. Therefore, regarding the successful therapeutic approach targeting viral components for HIV-1, HCV and SARS, the YHV-3CLP protease should be the best target inhibiting viral propagation in YHV. In this study, the homology modeling approach was used to construct the 3D structure of the YHV protease. Molecular docking based virtual screening of synthetic small molecules from the National Cancer Institute (NCI) diversity set against the YHV protease model was performed. The program Autodock4 is used as the docking tool in this study for several reasons. First, AutoDock is widely used among docking programs in the market. In 2006, AutoDock was the most cited docking software. Second, AutoDock is free of charge to access. Last, AutoDock has also been shown to be useful in blind docking in which the location of the binding site is not known. It is very fast, provides high quality predictions of ligand conformations, and good correlations between predicted inhibition constants and experimental ones. The complexes of the proteasehit ligand were subsequently investigated for intermolecular interactions by molecular dynamics simulations. Multiple sequence alignment and homology modeling Sequence analyses, homology modeling and molecular mechanics calculation were performed on the Discovery Studio 2.5 package (Accelrys Inc., CA, USA). The amino acid sequence of the yellow head virus (YHV) protease was retrieved from the NCBI database (www.ncbi.nlm.nih.gov), covering 196 amino acid residues (Ser 2828 to Asp 3023 ) of the YHV replicase polyprotein 1a (Accession number: ACA21300). 
All of the template coordinates used for homology modeling were retrieved from the protein data bank (www.pdb.org), which include the eight protease enzymes from both microorganisms (tobacco etch virus, entry 1LVM; rhinovirus, entry 1CQQ; hepatitis A virus, entry 1HAV; Staphylococcus aureus, entry 1AGJ; Achromobacter lyticus, entry 1ARB) and higher organisms (bovine β-trypsin, entry 5PTP; human factor B, entry 1DLE; human heparin binding protein, entry 1A7S). All eight template sequences were multiple-aligned according to their structural superimposition of the Cα atoms on the "align and superimpose proteins" protocol. The YHV protease sequence was then automatically aligned to a set of aligned template sequences on the "align multiple sequences" protocol of which the pairwise alignment algorithm is modified from the ClustalW 1.8 [11] . This alignment protocol allows keeping the existing aligned sequences of the templates as a rigid element and thus the pre-aligned target sequence could be mobile. Sequence identities and similarities between the target and template sequences were calculated on the GeneDoc v2.7 [12] . The structural model of the YHV protease was constructed based on the unconventional sequence alignment between the target and template profile using the Modeller 9v4 [13] implemented in the "build homology models" protocol. Insertion and deletions in the target sequence with respect to the templates were defined as loops and further refined within this protocol. The best quality model with highest discrete optimized protein energy (DOPE) score was subsequently subjected to the "minimization" protocol based on the charmm22 force field [14] to remove unreasonable atomic contacts (steepest descent and conjugate gradient methods until the model reaching 0. 1 kcal mol −1 Å -1 convergence). 
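The sequence identities reported between the target and each template over the aligned regions can be illustrated with a minimal sketch. It assumes that identity is counted only over alignment columns in which both sequences carry a residue; the exact definition used by GeneDoc may differ, and the function name and example strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither sequence has a gap.

    Assumption: both inputs are rows taken from the same alignment,
    so they must have the same length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0   # columns with a residue in both sequences
    identical = 0  # compared columns where the residues match
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a.upper() == b.upper():
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical toy alignment, not taken from the paper.
toy_target = "GHC-DKS"
toy_template = "GHCADRS"
toy_identity = percent_identity(toy_target, toy_template)
```

In the toy alignment, six columns are compared and five match, so the gap column does not dilute the result.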
The stereochemical qualities of the energy-minimized model were assessed by PROCHECK v3.5 [15] via submitting the coordinates to the JCSG structure validation server (www.jcsg.org). The 1364 files (in sdf MDL MOL format) of the NCI diversity dataset were obtained from the Office of the Associated Director of the Developmental Therapeutics Program (DTP), Division of Cancer Treatment and Diagnosis, National Cancer Institute, more information is available at NCI/DTP Open Chemical Repository [16] . As a description from the official NCI/DTP page, these small molecules from NCI dataset have been collected from almost 140,000 compounds on plates. They were screened following the criteria of DTP, which are hydrogen bond acceptor, hydrogen bond donor, positive charge, aromatic, hydrophobic, acid, base, and pharmacophores. Therefore, the selections of 1364 compounds have passed the formal mentioned criteria. Thereby, this set of ligand is suitable to use as a model for finding the antagonists of protease from YHV. The 3D structure of ligands were prepared by adding hydrogen atoms and rectifying an appropriate bond order and the chiral center check (stereo-chemistry) using LigPrep 2.2 with the default setting [17] . The output 1523 ligand conformations were stored in SYBYL mol2 format. The rotational bonds of ligands were treated as flexible using python script, prepare_ligand4.py. This script was implemented in the AutoDock program [18] [19] [20] . The minimized structure of YHV-protease model was obtained from the previous section (homology modeling and minimization processes). AutoDockTools version 1.5.2 was used to assign the Kollman united atom charges and the solvent parameter to this protein, as well as to perform the grid set-up process. The grid boxes were generated around the catalytic residues (C152, H30, D70; as presented in Fig. 2c ). The H30 protonated state was set to be HIP (add H atoms both δ and ε positions) by the N position point to D40. 
The grid point spacing was 0.375 Å, with box sizes of 60, 60, and 60 Å in each dimension. AutoGrid 4.0 was used to calculate the grid affinity maps for the following atom types: A (aromatic carbon), C, HD, N, NA (hydrogen-bond-accepting N), OA (hydrogen-bond-accepting O), S, SA (hydrogen-bond-accepting S), Cl, F, Br, I, P, and e (electrostatic). The protein-ligand docking was performed with 50 trial runs using the Lamarckian genetic algorithm search. The population size was set to 150. The docking results were sorted by the lowest estimated free energy of binding (AutoDock4 score) of each ligand. This score is calculated from the following equation: Estimated Free Energy of Binding = Final Intermolecular Energy + Final Total Internal Energy + Torsional Free Energy − Unbound System's Energy. At the same time, all of these results were rescored using the FRED program [21]. The anchor binding modes of the ligands and protein were analyzed using the SiMMap server (http://simmap.life.nctu.edu.tw/) [22].
The ligand-YHV protease complexes obtained from molecular docking were subsequently subjected to molecular dynamics (MD) simulations. The AMBER99SB-ILDN force field [23] was used to simulate the protein structures, and the ionization state of amino acid residues was set according to the standard protocol. To mimic the catalytic conditions of cysteine proteases, the side chain of His30 was protonated at both the δ- and ε-nitrogen atoms (HIP form), while the side chain of Cys152 was deprotonated (minus cysteine; CYM form). The N- and C-terminal ends of the amino acid chains were capped with acetyl (ACE) and methyl amino (NME) groups, respectively. The GROMACS topology files for each ligand were generated using the ACPYPE script [24], in which the general AMBER force field (GAFF) parameters [25] were applied and atomic partial charges were calculated using the AM1-BCC method [26], implemented in the Antechamber module. Each complex was solvated in a rectangular box of TIP3P water [27], keeping a distance of 1.0 nm between the solutes and the sides of the solvent box.
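For the docking step, the AutoDock4 score used to sort the ligands is the sum of the final intermolecular energy, the final total internal energy, and the torsional free energy, minus the unbound system's energy. The sketch below combines these terms and sorts ligands from lowest (most favorable) to highest score. The class name, ligand labels, and energy values are hypothetical placeholders, not results from this study.

```python
from typing import NamedTuple


class DockingEnergies(NamedTuple):
    """Energy terms (kcal/mol) as reported per docked pose by AutoDock4."""
    intermolecular: float  # final intermolecular energy
    internal: float        # final total internal energy
    torsional: float       # torsional free energy
    unbound: float         # unbound system's energy


def estimated_binding_free_energy(e: DockingEnergies) -> float:
    """Estimated free energy of binding: (1) + (2) + (3) - (4)."""
    return e.intermolecular + e.internal + e.torsional - e.unbound


def rank_ligands(results: dict[str, DockingEnergies]) -> list[tuple[str, float]]:
    """Sort ligands by ascending score, so the best binder comes first."""
    scored = [(name, estimated_binding_free_energy(e)) for name, e in results.items()]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical values for illustration only.
example = {
    "ligand_B": DockingEnergies(-7.0, -0.5, 1.0, -0.5),
    "ligand_A": DockingEnergies(-9.5, -1.0, 1.5, -1.0),
}
ranking = rank_ligands(example)
```

Because the internal and unbound terms often cancel, the ranking is usually driven by the intermolecular and torsional terms.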
Sodium ions were added to neutralize the charge of the system. The MD simulations were carried out under explicit-solvent periodic boundary conditions using GROMACS v4.5.5 [28] [29] [30]. Each solvated system was energy-minimized in two steps using the steepest descent method, either until the maximum force on any atom was smaller than 1000 kJ mol−1 nm−1 or until additional steps resulted in a potential energy change of less than 1 kJ mol−1, to reduce undesirable atomic contacts. First, position restraints with a force constant of 1000 kJ mol−1 nm−1 were applied to all heavy atoms of the proteins and ligands (if present), allowing water molecules and counterions to relax their positions. Second, the restraints on proteins and ligands were released, allowing all atoms in the system to move freely. Afterward, the energy-minimized systems were equilibrated in three phases with the position restraints described earlier. The first step heated each system from 50 to 300 K over 50 ps under the NVT condition using the Berendsen thermostat [31, 32]. The following step was conducted under the NPT condition at 1 bar pressure over 100 ps using the Parrinello-Rahman barostat [33, 34]. Once each system was sufficiently equilibrated around the target temperature and pressure, the position restraints were gradually reduced to zero kJ mol−1 nm−1 over four rounds of 50 ps NPT simulations. After the equilibrations, an unrestrained production run was performed under the NPT condition with snapshots collected every 1 ps. For all dynamics runs, the LINCS algorithm [35] was applied to constrain all hydrogen-related bond lengths, facilitating the use of a 2 fs time step. A short-range nonbonded interaction cut-off distance of 1.0 nm was used. The particle mesh Ewald method was used to account for long-range electrostatics [36, 37]. All dynamics analyses were performed using tools available within the GROMACS suite [38].
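The equilibration and production settings described above translate into integration step counts once a time step is fixed. The following sketch does that arithmetic, assuming the 2 fs time step enabled by the LINCS constraints; the stage labels are hypothetical names chosen for illustration, and the intermediate restraint force constants are left out because the text does not state them.

```python
def steps_for(duration_ps: float, timestep_fs: float = 2.0) -> int:
    """Number of integration steps needed to cover duration_ps."""
    return round(duration_ps * 1000.0 / timestep_fs)


# Stage durations taken from the protocol text; labels are illustrative.
equilibration_ps = {
    "nvt_heating_50_to_300K": 50.0,
    "npt_1bar": 100.0,
    "npt_restraint_release_round_1": 50.0,
    "npt_restraint_release_round_2": 50.0,
    "npt_restraint_release_round_3": 50.0,
    "npt_restraint_release_round_4": 50.0,
}

equilibration_steps = {name: steps_for(ps) for name, ps in equilibration_ps.items()}
total_equilibration_ps = sum(equilibration_ps.values())

# Snapshots every 1 ps correspond to this output interval, in steps.
snapshot_interval_steps = steps_for(1.0)
```

Under these assumptions the equilibration covers 350 ps in total, and a snapshot every 1 ps means writing coordinates every 500 steps.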
Hydrogen bonds observed in the ligand-YHV protease complexes and their percentage occupancy were also analyzed from the MD trajectories using the tool available in the GROMACS suite. A hydrogen bond is defined by the default geometrical criteria: donor-acceptor distance ≤ 0.35 nm and proton-donor-acceptor angle ≤ 30°. Grace v5.1.2 [39] was used to plot the 2D data, and the 3D images were created and rendered with PyMOL v1.3.1 [40].
Fluorogenic peptide YHV substrate was dissolved in dimethyl sulfoxide (DMSO) at 25 mM and diluted with YHV reaction buffer (20 mM Tris-HCl pH 7.5, 200 mM NaCl, 1 mM EDTA and 1 mM DTT) to obtain 100 μM YHV substrate. YHV protease inhibitors (a gift from NCI) were resuspended in DMSO to obtain 6.7 μM and diluted with YHV reaction buffer to obtain 75 nM YHV protease inhibitors. Purified YHV protease enzyme (a gift from Dr. Pongsak Khunrae) was dissolved in the same YHV reaction buffer. The YHV inhibition assay was performed by incubating 30 μl of YHV protease with an equal volume of 75 nM inhibitor at room temperature for 3 min. Then, fluorogenic peptide YHV substrate diluted in YHV reaction buffer was added. The reaction mixture containing YHV protease without inhibitor was used as the negative control, while the reaction mixture without YHV protease was used as the positive control. The reactions were measured using a fluorescence microplate reader with excitation/emission at 340/485 nm for 5 min. The rate of reaction was calculated as % activity.
We performed a comparative modeling approach using multiple templates followed by energy minimization to construct a theoretical model of the YHV protease. Because multiple alignment is the key step in homology modeling, our model was built cautiously, since there is only limited sequence identity to the templates.
Ab initio modeling and protein threading are now very effective and invaluable methods for structural prediction of proteins with no or low template identity, and their online versions are currently accessible [41] [42] [43] [44] [45] [46] [47]. In our case, the search for template coordinates available from the protein data bank, performed on several BLAST and threading web servers, did not provide us with consensus template structures or a satisfactory model (data not shown), and none of them exhibited an acceptable percentage of identity/similarity to the YHV protease. The tertiary structure of the YHV protease should have a significantly conserved fold like other proteases, and this allowed the development of our own method for the target-template sequence alignment, as described in the Materials and methods section.
Fig. 1 Structural comparison of YHV protease to eight template proteases. Secondary structure assignment is based on the sequence alignment as described in Methods. The amino acid residues responsible for β-strand and α-helical structures are shown in green and red colors, respectively. The catalytic triad residues are indicated in yellow rectangles
The model templates used in this study were selected according to an extremely high structural similarity in the two-domain antiparallel β-barrel fold, which is the hallmark of trypsin-like proteases, even though these structures exhibit a limited sequence identity to the YHV protease, ranging from 14 to 20 % of the aligned regions (Fig. 1 and Table 1).
The selected templates were three cysteine proteases from tobacco etch virus (PDB entry 1LVM) [48] , rhinovirus (PDB entry 1CQQ) [49] , hepatitis A virus (PDB entry 1HAV) [50] , and the other five proteases of non-viral proteins, bovine β-trypsin (PDB entry 5PTP) [51] , protease domain of human factor B (PDB entry 1DLE) [52] , epidermolytic toxin B from Staphylococcus aureus (PDB entry 1AGJ), lysine-specific protease I from Achromobacter lyticus (PDB entry 1ARB) [53] , and trypsin-like folded human heparin binding protein (PDB entry 1A7S) [54] . After energy minimization of the homology model, results of Ramachandran plots from PROCHECK showed that more than 95 % of the residues were in the most favored and additional allowed regions, indicating a good quality model could be obtained (Supplementary data Fig. S1 ). This final model adopted the two-domain antiparallel βbarrel fold-like structure resembling the templates (Fig. 2a and b) with less than 3 Å of the C α -atom root mean square deviation (RMSD) between the model and such templates ( Table 1) . Although the overall fold of the YHV protease seemed closer to those of the non-viral serine proteases than the three viral cysteine proteases, the atomic coordinates of the catalytic triad residues, His30, Asp(Glu)70, and Cys152 (numbering of YHV protease) were highly conserved among the four viral C-type cysteine proteases; yellow head, tobacco etch, rhino-and hepatitis A viruses (Fig. 2c) . These also indicated a high accuracy of the modeled YHV protease and it was reasonable to use this model for subsequent docking experiments. Structural analysis of 3c-proteases in several species revealed that there were three conserved residues, His, Asp or Glu, and Cys, known as the catalytic triad. They located on the center between two lobes of anti-parallel b-strands. 
According to the nomenclature of Schechter and Berger [55] , these catalytic residues positioned around the S 2 to S' 2 sub-sites of protease enzyme to serve P 2 to P' 2 positions of peptide substrate. Furthermore, most of the 3c-proteases preferred glutamine as the P1 site for substrate, and in some cases the minor preference seems to be glutamic acid [4] . Generally, residues of S 1 pocket are Thr147 and His167 (YHV protease numbering). In addition, the surrounding residues around the S 1 subsite were also conserved among 3c-protease. This neighbor pocket is named "pocket A" which is composed of Lys148, Asp149, Gly170, Ala172, and Ser173. In order to identify the small molecules that specifically bind to YHV protease, a docking study analysis was performed against small molecules from NCI diversity dataset. All 1364 compounds from NCI diversity II were docked into the modeled structure of YHV-protease, 43 compounds were selected from this screening step based on docking scores. The expected results showed several sets of the ligand-protein binding mode. Therefore, the FRED program combinations with SiMMap server (http://simmap.life.nctu.edu.tw/) were used to justify the reliable possibility of the binding mode. The SiMMap server was used to identify the site-moiety map between the functional group(s) of these ligands to the residues around the binding site (Fig. 3) . As shown in the SiMMap moieties, there were four spheres which represent two types of interactions. H moiety refered to the hydrogen bonding, while V moiety refered to van der Waals force. Anchor H1 illustrated the binding residues; Asp149 and Cys152 prefer to form hydrogen bonding interactions with the following functional groups; hydroxyl, 1°amine, azo (also named diimide or diazene compound), hydrazine and carboxamide moieties. Similar to anchor H1, H2 specified the consistent functional groups with the position Gly170 which are ether, carbonyl, amide, alcohol, and sulfonyl groups, respectively. 
For anchor H3, Ser173 interacts with ether, nitro, sulphonyl, carboxy, and carboxamide groups. In addition, the van der Waals interactions between three residues around the binding pocket (Lys148, Gly170, and Ala172) and hydrophobic moieties (phenol, benzene, alkene, etc.) are represented by the V1 anchor. These analyses helped us to understand the binding modes of the ligands and to classify the ligands into groups according to their interaction modes. The 43 screened small molecules can be categorized into three groups. Group one contains 32 compounds that are able to form hydrogen bonds with residues in the catalytic cleft and/or pocket A (Fig. 4). Moreover, these compounds also show cation-π and π-π interactions with those residues. Compounds in this group share a common aromatic nucleus and a carboxamide group, a peptide-bond-like structure that mimics the peptide bond. Group two is composed of only two compounds, which present only cation-π interactions. Group three consists of seven compounds with various types of core structures, which interact with residues around the catalytic cleft through weak van der Waals interactions. After all viral protease inhibitors with crystal structures were analyzed in depth, the general features of the protease inhibitors were found to always include the carboximide group and aromatic moieties; the inhibitors at present are: Leupeptins, Saquinavir, Nelfinavir, Ritonavir, Indinavir, Amprenavir, Lopinavir, Chymostatins, Antipain, Elastinal, and β-MAPI. These features were also found in our 11 candidate compounds from the 32 compounds in group one, which contain aromatic moieties and carboximide, hydroxy, carbonyl, ether, and ester groups.
Fig. 5 Comparison between (a) the known protease inhibitor Nelfinavir and (b) NSC319990 oriented in the binding site of YHV protease, in both 2D and 3D views, as well as (c) the superimposed structures and the three common interactions
All 11 two-dimensional structures as well as properties were presented in Table 2 . In order to better understand the mode of actions of these small molecules via strong hydrogen bonds, the orientations of Nelfinavir, a representative of known protease inhibitors, and compound NSC319990 (the best docking score with carboximide functional group, Rank3) have been compared. The post-docking analyses revealed that one-half of NSC319990 is oriented at the same area as Nelfinavir. These two molecules lay along the sub-sites S 2 (H30, S67-D70, I168 and V169) to S 1 (T147, K148, D149 and H167) of YHV protease. There are three common types of binding between these two compounds as presented in Fig. 5 . First, π-π interaction between 6-membered aromatic ring and residue His30 (one of the catalytic triad) have been observed from both compounds. Second, the dipole-dipole electrostatic interaction between ligands and binding residues in pocket A, Asp149, and Cys152 were formed. N-atom of decahydroisoquinoline of Nelfinavir, and S-atom of benzo(d)thiazole of NSC319990 held the negative dipole and acted as nucleophile to form interaction with either thiol group of Cys152 or carboxylic group of Asp149. Third, the van der Waals interactions between the residues Gly170 and Ala172, which were defined as V1 anchor, and the fused rings of both compounds have been monitored. Considering at the compound NSC319990, as mentioned above, one-half of this compound is perfectly occupied on the binding pocket along the sub-site S 2 to S 1 of YHV protease. Moreover, another-half of NSC319990 lies on the S3 sub-pocket (M137, I168-I171, and A180-T182). These observations suggested that the screened compounds might possess the protease inhibitor properties as the known one, Nelfinavir. 
These ligands not only formed strong hydrogen bonds with the surrounding residues in the binding pocket but have also been reported as potent inhibitory agents targeting several proteins, including the malarial parasite plastid (Table 2). The stability of the complexes between the first five hit compounds (NSC122819, NSC345647, NSC319990, NSC50650, and NSC5069) and YHV protease was evaluated by performing MD simulations (Fig. 4). The backbone RMSD values of each snapshot with respect to the initial structures obtained before energy minimization were plotted over the course of the 15 ns simulation time (Supplementary data Fig. S2). Although the RMSD values for the apo simulation tended to deviate more from the initial structure, those for the complex simulations reached equilibrium well after t = 5 ns and fluctuated by no more than ≈0.1 nm until the end of the simulations, indicating stability of the complex systems. Potential energies for interactions between the hit compounds and the enzyme were calculated as the sum of the electrostatic and van der Waals (vdW) energies over the last 5 ns of the simulations (Fig. 6). The vdW interactions were found to be stronger than the electrostatic interactions in all cases. The compounds NSC345647 and NSC319990 contribute more interaction energy than the other three. These results are consistent with the docking observations in that both compounds are the first two hits among the five. The numbers of hydrogen bonds between the five hit compounds and the enzyme were then monitored throughout the 15 ns of all complex simulations (Supplementary data Fig. S3). The interaction pairs and their percentage occupancies are listed in Table 3. The analyses showed that the compounds NSC345647 and NSC319990 exhibit the largest hydrogen bond contributions among the five complexes, as expected (Table 3).
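The hydrogen-bond occupancies in Table 3 follow from applying the geometric criteria given in the Methods (donor-acceptor distance ≤ 0.35 nm and hydrogen-donor-acceptor angle ≤ 30°) to every saved frame. The sketch below shows that calculation for a single donor/hydrogen/acceptor triplet. It illustrates the criterion only and is not the GROMACS implementation; the function names and the toy coordinates are hypothetical.

```python
import math

Vec = tuple[float, float, float]


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _norm(v: Vec) -> float:
    return math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)


def is_hbond(donor: Vec, hydrogen: Vec, acceptor: Vec,
             max_dist_nm: float = 0.35, max_angle_deg: float = 30.0) -> bool:
    """True if the triplet meets the distance and angle cut-offs.

    The angle is measured at the donor, between the donor->hydrogen
    and donor->acceptor vectors. Coordinates are in nm.
    """
    d_acc = _sub(acceptor, donor)
    d_hyd = _sub(hydrogen, donor)
    dist = _norm(d_acc)
    bond = _norm(d_hyd)
    if dist == 0.0 or bond == 0.0 or dist > max_dist_nm:
        return False
    cos_angle = sum(x * y for x, y in zip(d_acc, d_hyd)) / (dist * bond)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return math.degrees(math.acos(cos_angle)) <= max_angle_deg


def occupancy_percent(frames: list[tuple[Vec, Vec, Vec]]) -> float:
    """Percentage of frames in which the hydrogen bond is present."""
    if not frames:
        return 0.0
    hits = sum(1 for d, h, a in frames if is_hbond(d, h, a))
    return 100.0 * hits / len(frames)


# Toy frames (nm): linear and close, too far, bent by 90 degrees, linear again.
linear = ((0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.3, 0.0, 0.0))
far = ((0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.4, 0.0, 0.0))
bent = ((0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.3, 0.0))
toy_occupancy = occupancy_percent([linear, far, bent, linear])
```

With snapshots saved every 1 ps, an occupancy such as 95.8 % means the criterion is met in that fraction of the analyzed frames.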
In particular, the compound NSC319990 makes strong hydrogen bonds with several residues, including the catalytic triad of the protease (Fig. 7). Among these, the interactions between the N3 and N4 atoms of the compound and the side chain of Asp70 are the most stable, with the highest bond occupancies (95.8 % and 69.5 %, respectively). The Asp149 side chain (in pocket A) also makes strong hydrogen bonds with the N atoms of the compound (63.4 % and 61.6 %). Furthermore, the π-π interaction between the 6-membered aromatic ring of the compound and His30 was observed (data not shown). These strong hydrogen bonds observed in the NSC319990 complex are in agreement with the observations from molecular docking. The same first five hit compounds were tested in the YHV inhibitory assay using bacterially expressed YHV protease. The relative inhibition results are shown in Fig. 8. Both NSC122819 and NSC345647 showed strong inhibition against YHV under our assay conditions, while NSC319990, NSC50650, and NSC5069 showed more than 50 % inhibition. The best docked-score compound, NSC122819, also showed the strongest inhibition. Further experiments will be necessary to understand the inhibiting mechanism of these compounds.
Fig. 7 Representation of the interactions between compound NSC319990 and YHV protease. The compound is shown in stick (C atoms in cyan) and three strong binding residues (Asp70, Asp149, and Gly170) are indicated. Hydrogen bonds are shown in magenta dotted lines. Distances of the important hydrogen bonds were plotted over the course of 15 ns simulation time
Fig. 8 Percentage of YHV inhibition of the first five hit NCI compounds against bacterially expressed YHV protease using a fluorescence-based assay. The final concentration of the compounds was 75 nM
In this study, the homology model of YHV protease was created using a multiple-template technique. The modeled structure was used in virtual screening studies against a set of compounds in the NCI database.
This investigation proposed a set of promising small molecules to inhibit or decrease the activity of protein YHV protease from the in silico studies model. The inhibiting activity of the best five docked-score compounds were tested against YHV protease in vitro. All of them showed a good degree of inhibition. The best dockedscore, NSC122819, also showed the best inhibition. This confirmed our successful homology and docking protocols to predict the potential inhibitors against YHV. These compounds can be further tested against yellow head virus in cells and farm levels in the future. Identification and analysis of gp116 and gp64 structural glycoproteins of yellow head nidovirus of Penaeus monodon shrimp Rapid identification of coronavirus replicase inhibitors using a selectable replicon RNA RNA transcription analysis and completion of the genome sequence of yellow head nidovirus The 3C-like proteinase of an invertebrate nidovirus links coronavirus and potyvirus homologs Gillassociated virus of Penaeus monodon prawns: an invertebrate virus with ORF1a and ORF1b genes related to arteri-and coronaviruses Virus-encoded proteinases and proteolytic processing in the Nidovirales Viral protease inhibitors NS3 protease from flavivirus as a target for designing antiviral inhibitors against dengue virus YHV-protease dsRNA inhibits YHV replication in Penaeus monodon and prevents mortality Therapeutic inhibition of yellow head virus multiplication in infected shrimps by YHV-protease dsRNA CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice GeneDoc: a tool for editing and annotating multiple sequence alignments Comparative protein modelling by satisfaction of spatial restraints All-atom empirical potential for molecular modeling and dynamics studies of proteins † {PROCHECK}: a program to check the stereochemical quality of protein structures Ligprep, v2. 2. 
Schrödinger, LLC Grid-based hydrogen bond potentials with improved directionality A semiempirical free energy force field with charge-based desolvation Automated docking using a Lamarckian genetic algorithm and an empirical binding free energy function SiMMap: a web server for inferring site-moiety map to recognize interaction preferences between protein pockets and compound moieties Improved side-chain torsion potentials for the Amber ff99SB protein force field ACPYPE-AnteChamber PYthon Parser interfacE Development and testing of a general amber force field Fast, efficient generation of high-quality atomic charges. AM1-BCC model: II. Parameterization and validation Comparison of simple potential functions for simulating liquid water GROMACS: fast, flexible, and free GROMACS 4: algorithms for highly efficient, load-balanced, and scalable molecular simulation GROMACS 4.5: a high-throughput and highly parallel open source molecular simulation toolkit Molecular dynamics with coupling to an external bath Canonical sampling through velocity rescaling Polymorphic transitions in single crystals: a new molecular dynamics method Constant pressure molecular dynamics for molecular systems LINCS: a linear constraint solver for molecular simulations Particle mesh Ewald: an N⋅log(N) method for Ewald sums in large systems A smooth particle mesh Ewald method DeLano WL The PyMOL molecular graphics system I-TASSER: a unified platform for automated protein structure and function prediction Protein structure prediction on the Web: a case study using the Phyre server GenTHREADER: an efficient and reliable protein fold recognition method for genomic sequences Structure prediction for CASP8 with all-atom refinement using Rosetta GeneSilico protein structure prediction meta-server Protein annotation and modelling servers at University College London Protein structure prediction and analysis using the Robetta server Structural basis for the substrate specificity of tobacco etch virus 
protease Structure-assisted design of mechanism-based irreversible inhibitors of human rhinovirus 3C protease with potent antiviral activity against multiple rhinovirus serotypes The refined crystal structure of the 3C gene product from hepatitis A virus: specific proteinase activity and RNA recognition Solvent structure in crystals of trypsin determined by X-ray and neutron diffraction New structural motifs on the chymotrypsin fold and their potential roles in complement factor B The structure of Staphylococcus aureus epidermolytic toxin A, an atypic serine protease The primary structure and structural characteristics of Achromobacter lyticus protease I, a lysine-specific serine protease On the size of the active site in proteases. I. Papain