key: cord-0770008-9odz9ek9
authors: Hong, Mei; Mandala, Venkata; McKay, Matthew; Shcherbakov, Alexander; Dregni, Aurelio; Kolocouris, Antonios
title: Structure and Drug Binding of the SARS-CoV-2 Envelope Protein in Phospholipid Bilayers
date: 2020-09-24
journal: Res Sq
DOI: 10.21203/rs.3.rs-77124/v1
sha: af214232bb91bca54f2c61d6b6a9cdfbe6955610
doc_id: 770008
cord_uid: 9odz9ek9

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the causative agent of the ongoing COVID-19 pandemic. Successful development of vaccines and antivirals against SARS-CoV-2 requires a comprehensive understanding of the essential proteins of the virus. The envelope (E) protein of SARS-CoV-2 assembles into a cation-selective channel that mediates virus budding, release, and host inflammation response. E blockage reduces virus pathogenicity while E deletion attenuates the virus. Here we report the 2.4 Å structure and drug-binding site of E's transmembrane (TM) domain, determined using solid-state nuclear magnetic resonance (NMR) spectroscopy. In lipid bilayers that mimic the endoplasmic reticulum Golgi intermediate compartment (ERGIC) membrane, ETM forms a five-helix bundle surrounding a narrow central pore. The middle of the TM segment is distorted from the ideal α-helical geometry due to three regularly spaced phenylalanine residues, which stack within each helix and between neighboring helices. These aromatic interactions, together with interhelical Val and Leu interdigitation, produce a pore that is dehydrated compared to the viroporins of influenza and HIV viruses. Hexamethylene amiloride and amantadine bind shallowly to polar residues at the N-terminal lumen, while acidic pH affects the C-terminal conformation. These results indicate that SARS-CoV-2 E forms a structurally robust but bipartite channel whose N- and C-terminal halves can interact with drugs, ions and other viral and host proteins semi-independently.
This structure establishes the atomic basis for designing E inhibitors as antiviral drugs against SARS-CoV-2.

intensities (Fig. S2a, b), indicating that the protein is immobilized in the lipid membranes at ambient temperature. Two-dimensional (2D) 15 N-13 C and 13 C-13 C correlation spectra show well-resolved peaks for most residues (Fig. 1c, d) with 13 C and 15 N linewidths of 0.5 ppm and 0.9 ppm, respectively, indicating that the protein conformation is highly homogeneous. We assigned the chemical shifts using 3D correlation NMR experiments (Fig. S3a, Table S1). These chemical shifts indicate that residues 14-34 form the α-helical core of the TM domain (Fig. S3b, c). Comparison of spectra between the two membranes and at different temperatures (Fig. S2d-f) indicates that the N-terminal segment (residues E8-I13) is dynamic at high temperature but has α-helical propensity, while the C-terminal segment (residues T35-R38) is more rigid but displays temperature-dependent conformations. Acidic pH perturbed the chemical shifts of C-terminal residues L34-R38 (Fig. S4), supporting the conclusion that the C-terminal segment is conformationally plastic. The temperature insensitivity of the protein spectra suggests that ETM is oligomerized in lipid bilayers. To determine the oligomeric structure, we prepared two mixed labeled protein samples to measure intermolecular distances. An equimolar mixture of 13 C-labeled protein and 4-19 F-Phe-labeled protein (Fig. S1e) was used to measure intermolecular 13 C-19 F distances using the REDOR technique 18 (Fig. 2a). ETM contains three regularly spaced phenylalanine (Phe) residues, Phe20, Phe23 and Phe26, at the center of the TM segment. 1D and 2D 13 C NMR spectra were measured without and with 19 F pulses. The resulting difference spectra show the signals of carbons that are in close proximity to a fluorinated Phe on a neighboring helix (Fig. 2b, Fig. S5a-c).
As expected, residues V17 to L31 are affected by 4-19 F-Phe, while residues I13 to S16 and A36 to R38 show no REDOR dephasing. Moreover, the three Phe's display two resolved 19 F chemical shifts, indicating that one of the residues has a distinct sidechain conformation. A 2D 13 C-19 F correlation spectrum (Fig. 2c) shows a cross peak between the -118 ppm 19 F signal and A22 Cβ, indicating that this -118 ppm peak is due to either F20 or F23. The -113 ppm 19 F peak shows strong cross peaks with aromatic and numerous aliphatic 13 C chemical shifts. Since F20 and F26 are too far away from each other to form intermolecular contacts, the -118 ppm 19 F peak must be assigned to F20, while F23 and F26 resonate at -113 ppm. To constrain the interhelical packing at the two termini of the TM domain, we prepared a 13 C and 15 N mixed labeled sample, and measured 2D NHHC correlation spectra, which exhibit 15 N-13 C correlation peaks that are exclusively intermolecular (Fig. 2d) . These experiments together yielded 35 interhelical 13 C-19 F distance restraints and 52 interhelical 15 N-13 C correlations, which are crucial for determining the oligomeric structure of ETM. To further constrain the E channel architecture, we measured the water accessibilities of different residues using water-edited 2D 15 N-13 C correlation experiments (Fig. 2e, Fig. S5d ) 19, 20 . Water 1 H magnetization transfer is the highest to the N-terminal residues, the least to the central residues L17 to A32, and moderate to the C terminus (Fig. 2f) . Thus, the hydration gradient of the protein is primarily along the bilayer normal. The preferential hydration of the N-terminus is especially manifested by the high water-transferred intensity of L19 compared to T30, despite favorable chemical exchange to the Thr sidechain. For the dehydrated center of the TM domain, L28 and V25 show higher hydration than their neighboring residues, suggesting that these residues face the pore. 
A complementary lipid-edited experiment (Fig. 2g) showed much higher intensities for the Phe sidechain carbons than the corresponding water-transferred intensities, indicating that the Phe's are more lipid-facing. The ERGIC-bound ETM shows two-fold lower water accessibility than the closed state of influenza BM2 at the same neutral pH (Fig. 2f). We calculated the structure of ETM using the measured 56 (ϕ, ψ) torsion angles, 87 interhelical distance restraints (Tables S2, S3), and 196 intrahelical 13 C-13 C contacts obtained from 250 ms 13 C spin diffusion spectra (Fig. S6) 21. We disambiguated the direction of interhelical contacts from one helix to the two neighboring helices by considering the pore- versus lipid-facing positions of the residues, the helical distortion between F20 and F23 (Fig. S3b), and the interhelical 13 C-19 F Phe-Phe contacts (Fig. S7e). The lowest-energy structure ensemble, with a heavy-atom RMSD of 2.4 Å (Table 1), shows a long and tight five-helix bundle with a vertical length of ~35 Å for residues V14-L34. The channel diameter varies from 11 Å to 14 Å, based on the Cα-Cα distances between helices i and i+2 for pore-facing residues (Fig. 3a). The helical bundle is primarily left-handed, although a minor conformer of right-handed bundle is also seen. Each helix is tilted by an angle of about 10° from the bilayer normal (Fig. 3b); however, this orientation is not uniform, because the helix is not ideal, and exhibits a significant rotation angle change, or twist, between residues F20-F23 10,17. The pore of the channel is occupied by mostly hydrophobic residues, including N15, L18, L21, V25, L28, A32 and T35 (Fig. 3b-d, Fig. S8a), explaining the poor hydration of the protein. The N-terminus pore is constricted by N15, which forms interhelical sidechain hydrogen bonds (Fig. 3g) 22. Mutation of N15, as well as V25, is known to abolish cation conduction 13.
The helix-helix interface is stabilized by aromatic stacking of F23 and F26 (Fig. 3e, g) and van der Waals packing among methyl-rich residues such as the V29-L31-I33 triad (Fig. 3f) . These extensive hydrophobic interactions create a tighter helical bundle than the influenza viroporin BM2 and the HIV-1 viroporin Vpu (Fig. S8b) . To investigate how the E pentamer interacts with drugs, we measured the chemical shifts of the protein in the presence of HMA and AMT. At a drug : protein molar ratio of 4 : 1, HMA caused significant chemical shift perturbations (CSPs) to N-terminal residues, including T9, G10, T11, I13 and S16, followed by modest CSPs for the C-terminal A36 and L37 (Fig. 4a-c) . This trend is consistent with the micelle data 10,17 , but the CSPs in bilayers are much larger than in micelles, with the N-terminal 9 TGT 11 triplet giving CSPs of 0.35-0.70 ppm. Moreover, in bilayers CSPs were observed at only 4-fold drug excess, while in micelles CSPs were observed at higher drug excesses of 10 to 31-fold 10,17 . The higher sensitivity to drug in lipid bilayers suggests that the bilayer-bound protein conformation is more native. Docking based on these CSPs found that HMA intercalates shallowly into the N-terminal lumen with a distribution of orientations (Fig. 4d, Fig. S9 ), suggesting a dynamic binding mode where HMA exchanges between multiple helices and inhibits cation conduction by steric occlusion of the pore. Within the ensemble of docked structures, more HMA molecules point the guanidinium into the pore and the hexamethylene ring to the lipid headgroups than the reverse orientation. AMT caused smaller CSPs than HMA (Fig. 4c, Fig. S10a, b) , but the site of binding remains at the N-terminus. Using the 3-19 F probe on the adamantane, we measured protein-drug proximities using 13 C-19 F REDOR. The spectra showed modest dephasing for the N-terminal N15 and C-terminal I33 (Fig. S10c-e) , in qualitative agreement with the observed CSPs. 
The larger CSPs of HMA than AMT are consistent with the micromolar EC50 reported for the HCoV-229E E protein 6 compared to the millimolar binding affinities of AMT to SARS-CoV-2 E 7 . Which structural features of this ETM pentamer might be responsible for cation conduction? The N-terminal part of ETM contains a conserved (E/D/R)1x(G/A)3xxhh(N/Q)8 motif (Fig. 1b) , where h is a hydrophobic residue. The most exposed residue, E8, belongs to a dynamic N-terminus whose residues (e.g. T9 and G10) manifest intensities only at high temperature ( Fig. S2d-f ). The E8 sidechain carboxyl is deprotonated at neutral pH and protonated at acidic pH, as seen in the 13 C chemical shifts (Fig. S2c) . We hypothesize that the protonation equilibria of this loose ring of E8 quintet, together with the anionic lipids in the ERGIC membrane, may regulate the ion selectivity of ETM at the channel entrance. A ring of negatively charged Glu residues has been observed as selectivity filters in the hexameric Ca 2+ -selective Orai channels 23 and designed K + channels 24 . The third residue of the motif, G10, is conserved among coronaviruses to be small and flexible, thus permitting N-terminus motion. The last residue of the motif, N15, is conserved to be either Asn or Gln, whose polar sidechains can coordinate ions as well as forming interhelical hydrogen bonds to stabilize the channel 22 . At the C-terminal end, the conserved small residues A32 and T35 provide an open cavity for ions. In contrast to these small polar residues, the central portion of the TM domain contains four layers of hydrophobic residues, L18, L21, V25 and L28, which narrow the pore radius to ~2 Å (Fig. 3d) . This narrow pore can only permit a single file of water molecules or ions, thus partially dehydrating any ions that move through the pore. Thus, the structure shown here may represent the closed state of SARS-CoV-2 E, while the open state may have a larger and more hydrated pore. 
We note, however, that narrow pores with multiple hydrophobic layers have been observed in ion channels, including the tetrameric K + channel TMEM175 25 and the pentameric bestrophin channels 26,27. Thus, it is possible to achieve charge stabilization and ion selectivity in such a hydrophobic environment, although the detailed mechanisms remain to be understood. The bilayer-bound structure of SARS-CoV-2 E has similarities to, as well as differences from, the micelle-bound structure 10,17. In micelles, the ETM helix also displays a kink and the N-terminus is similarly disordered, but the handedness of the helical bundle and the identity of pore-facing residues vary with the detergent. For example, in LMPG micelles, F26 and T30 point to the lumen rather than lipids. Thus, the membrane-mimetic environment appears to influence E's oligomeric structure. Compared to influenza and HIV-1 viroporins, the SARS-CoV-2 E helical bundle is tighter and more rigid. AM2 and BM2's TM domains have a higher percentage of polar residues such as His and Ser. As a result, M2 forms wider and more hydrated pores (Fig. S8b) 9,28. The HIV-1 Vpu TM domain has a similarly high percentage of hydrophobic residues as SARS-CoV-2 E, but forms a shorter (~20 Å vertical length) pentameric helical bundle with more tilted helices (~20°) 29,30. The E helical bundle is more immobilized than M2 and Vpu 31, and does not undergo whole-body fast uniaxial rotation at high temperatures in DMPX membranes (Fig. S2). This immobilization suggests that the protein may interact extensively with lipids. Finally, the helix distortion at F20-F23 may cause the two halves of E's TM domain to respond independently to environmental factors such as pH, membrane composition 16, and other viral and host proteins. This membrane-bound ETM structure suggests that small-molecule E inhibitors should bind with high affinity to both the acidic E8 and the polar N15 in order to occlude the N-terminal entrance of the protein.
The membrane topology of SARS-CoV-2 E is now recognized to be Nlumen-Ccyto based on antibody-detected selective permeabilization assays 32 and glycosylation data 33. This orientation would prime the protein to conduct Ca 2+ out of the ERGIC lumen to activate the host inflammasome 5. Thus, small-molecule drugs should ideally be targeted and delivered to the ERGIC and Golgi of host cells to maximally inhibit SARS-CoV-2 E 34.

The difference spectrum (orange) shows residues that are close to the fluorines.
(c) 2D 13 C-19 F correlation spectrum allows assignment of the -118 ppm peak to F20 due to a cross peak with A22, while the -113 ppm peak is assigned to F23/F26 based on correlations with F23, F26, and V24/V25. A 1D 1 H-19 F CP spectrum is overlaid on the left.
(d) 2D NHHC correlation spectrum of mixed 13 C and 15 N labeled ETM, measured using 0.5 ms (red) and 1 ms (black) 1 H mixing. All peaks arise from interhelical contacts. Selected assignments are given.
(e) Residue-specific water accessibilities of ERGIC-bound ETM, obtained from the intensity ratios of water-edited spectra measured with 9 ms and 100 ms 1 H mixing. Higher values (blue) indicate higher water accessibility.
(f) Residue-specific N-Cα cross peak intensity ratios in the 9 ms and 100 ms water-edited spectra of ETM (black). Closed and open circles indicate resolved and overlapped peaks, respectively. For comparison, the water-edited intensities for the high-pH closed state of the influenza BM2 channel (blue squares) are much higher, indicating that the ETM pore is drier than the BM2 pore.
(g) Water-edited and lipid-edited 1D 13 C spectra of ERGIC-membrane bound ETM. The Phe signals are high in the lipid-edited spectra but very low in the water-edited spectra, indicating that the three Phe residues are poorly hydrated and point to the lipids or the helix-helix interface.
The gene encoding the full-length SARS-CoV-2/Wuhan-Hu-1 envelope (E) protein (NCBI reference sequence YP_009724392.1, residues 1-75) was purchased from Genewiz. The gene encoding the TM domain (residues 8-38, ETGTLIVNSVLLFLAFVVFLLVTLAILTALR) was isolated using PCR and cloned into a Champion pET-SUMO plasmid (Invitrogen). The plasmid was transformed into E. coli BL21 (DE3) cells (Invitrogen) to express the SUMO-ETM fusion protein containing an N-terminal His6 tag (Fig. S1a). The construct's DNA sequence was verified by Sanger sequencing (Genewiz). A glycerol cell swab stored at -70°C was used to start a 10 mL LB culture containing 50 μg/mL kanamycin. The starter culture was used to inoculate 2 L of LB media. Cells were grown at 37°C until an OD600 of 0.6-0.8 and were harvested by centrifugation for 10 minutes at 20°C and 4,400x g. The cells harvested from LB were resuspended in 1 L of M9 media (pH 7.8, 48 mM Na2HPO4, 22 mM KH2PO4, 8.6 mM NaCl, 4 mM MgSO4, 0.2 mM CaCl2, 50 mg kanamycin) containing 1 g/L 15 N-NH4Cl. The cells were incubated in M9 media for 30 min at 18°C, then 1 g/L U-13 C glucose dissolved in 5 mL sterile H2O and 3 mL 100x MEM vitamins were added. The cells were grown for another 30 min, then protein expression was induced by addition of 0.4 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) along with 2 g/L U-13 C glucose in 10 mL sterile H2O. Additional IPTG was added after 1 hour to bring the final concentration to 0.8 mM. Protein expression proceeded overnight for 16 hours at 18°C, reaching an OD600 of 2.5. The cells were spun down at 4°C and 5,000 rpm for 10 min and resuspended in 35 mL Lysis Buffer I (pH 8.0, 50 mM Tris-HCl, 100 mM NaCl, 1.0% Triton X-100, 0.5 mg/mL lysozyme, 10 μL benzonase nuclease, 1 mM Mg 2+ , 10 mM imidazole). Cells were lysed at 4°C by sonication (5 sec on and 5 sec off) for 1 hour using a probe sonicator. The soluble fraction of the cell lysate was separated from the inclusion bodies by centrifugation at 17,000x g for 20 min at 4°C.
The supernatant was loaded onto a gravity-flow chromatography column containing ~6 mL nickel affinity resin (Profinity IMAC, BioRad) pre-equilibrated with Lysis Buffer I. The fractions were bound to the resin for 1 hour by gentle rocking at 4°C. The column was washed with 50 mL of Wash Buffer I (pH 8.0, 50 mM Tris-HCl, 100 mM NaCl, 0.1% n-dodecyl-β-D-maltoside (DDM), 30 mM imidazole). SUMO-ETM was eluted with 10-15 mL Elution Buffer (pH 8.0, 50 mM Tris-HCl, 100 mM NaCl, 0.1% DDM, 250 mM imidazole) (Fig. S1b). The eluted protein was diluted to one-third of the original concentration by adding twice the elution volume of Dilution Buffer (pH 8.0, 50 mM Tris-HCl, 100 mM NaCl, 0.1% DDM) to reduce the imidazole concentration before protease cleavage. Approximately 20% of the protein was also found in the insoluble membrane and inclusion body fraction. To purify this fraction, the pelleted mass was resuspended in Lysis Buffer II (Lysis Buffer I with added 6 M urea) and rocked gently at 4°C overnight. Soluble protein was isolated by centrifugation at 17,000x g for 20 min at 4°C. Nickel affinity column chromatography proceeded as described above for the soluble fraction, except that Wash Buffer II (Wash Buffer I with added 3 M urea) was used in place of Wash Buffer I. The purified SUMO-ETM fusion protein from both the soluble and inclusion body fractions was cleaved by addition of 1 : 10 (w/w) SUMO protease : SUMO-ETM and 5 mM tris(2-carboxyethyl)phosphine (TCEP) for 2 hours at room temperature with gentle rocking. The cleavage efficiency was assessed by analytical HPLC and was typically ~75%. ETM was purified using preparative RP-HPLC on a Varian ProStar 210 System using an Agilent C3 column (5 μm particle size, 21.2 mm × 150 mm). The protein was eluted using a linear gradient of 5-99% (9:1, acetonitrile : isopropanol) : water containing 0.1% trifluoroacetic acid over 35 minutes at a flow rate of 10 mL/min (Fig. S1c).
The purified protein was dried down to a film under a stream of nitrogen gas and placed under vacuum overnight. The protein film was stored at -20°C. Typical yield of the purified protein was 10 mg/L of M9 media. Labeling efficiency was ~94% as estimated by MALDI mass spectrometry (Fig. S1d). U-13 C-labeled ETM and U-15 N-labeled ETM were expressed and purified following the same protocol, but substituting 15 N-NH4Cl or 13 C-glucose with unlabeled reagents. A glycerol cell swab was used to start a 10 mL LB culture containing 50 μg/mL kanamycin. The starter culture was then used to inoculate 2 L of M9 media (pH 7.8, 48 mM Na2HPO4, 22 mM KH2PO4, 8.6 mM NaCl, 4 mM MgSO4, 0.2 mM CaCl2, 50 mg kanamycin) containing 3 g/L unlabeled glucose and 1 g/L unlabeled NH4Cl. The cells were grown in M9 media at 37°C for 8 hours until an OD600 of 0.5. The cells were collected by centrifugation at 4,400x g for 10 min at 20°C, then concentrated into a fresh 1 L M9 culture and incubated at 30°C for 60 min. Subsequently, 1.5 g/L of glyphosate was added to halt the pentose phosphate pathway for aromatic amino acid synthesis 35, followed by addition of 115 mg L-Trp, 115 mg L-Tyr and 400 mg of 4-19 F-L-Phe to the culture. After 30 min, IPTG was added to a final concentration of 0.4 mM, and protein expression was allowed to proceed at 30°C for 5.5 hours. The cells were collected by centrifugation at 4,400x g for 10 min at 4°C. The pellet was stored at -70°C until purification. Cell lysis and protein purification followed the same protocol as outlined above, except that the ETM peak during preparative HPLC was collected in two fractions of approximately 1 min each. Fluorine incorporation in the two fractions was measured using MALDI mass spectrometry. The first fraction had the higher fluorine incorporation, with 83% of the protein having all three Phe residues labeled with 19 F, which corresponds to a per-residue labeling efficiency of 94% (Fig. S1e).
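The relation between the 83% and 94% figures can be checked with a short calculation. The sketch below assumes that the three Phe sites are labeled independently and with equal probability, which is an assumption made here for illustration rather than something stated in the text; the function name is hypothetical.

```python
def per_residue_efficiency(fully_labeled_fraction: float, n_sites: int) -> float:
    """Return the per-site labeling probability p such that p**n_sites equals
    the fraction of molecules labeled at all n_sites.
    Assumes independent, equally probable labeling at every site."""
    if not 0.0 <= fully_labeled_fraction <= 1.0:
        raise ValueError("fully_labeled_fraction must lie between 0 and 1")
    if n_sites < 1:
        raise ValueError("n_sites must be at least 1")
    return fully_labeled_fraction ** (1.0 / n_sites)


# 83% of the protein carries 19F on all three Phe residues (F20, F23, F26).
p = per_residue_efficiency(0.83, 3)
print(f"per-residue labeling efficiency: {p:.2f}")
```

Under this independence assumption, the cube root of 0.83 is close to 0.94, consistent with the per-residue value quoted above.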
Only this fraction was used to prepare the mixed 13 C and 19 F labeled protein for interhelical distance measurement. The final yield of the Phe-fluorinated ETM expression was 1.5 mg/L of M9 media. The protocol was originally tested using 100 mg/L of 4-19 F-Phe, 1.0 g/L of glyphosate, 6 g/L unlabeled glucose and with expression at 18°C for 5.5 hours, which yielded a much lower per-residue labeling efficiency of ~35%. Eight protein samples in two different lipid membranes were prepared for this study. Five membrane samples contained 13 C, 15 N-labeled ETM and one contained 13 C-labeled ETM. Another sample contained a 1 : 1 mixture of 13 C-labeled protein : 15 N-labeled protein. The last sample contained a 1 : 1 mixture of 13 C-labeled protein : 4-19 F-Phe-labeled protein. Six of the eight samples were prepared in a pH 7.5 Tris buffer (20 mM Tris-HCl, 5 mM NaCl, 2 mM ethylenediaminetetraacetic acid (EDTA) and 0.2 mM NaN3). One sample was prepared in a pH 5 citrate buffer with calcium (20 mM citrate, 5 mM CaCl2 and 0.2 mM NaN3), while the final sample was prepared in the same pH 5 citrate buffer without calcium chloride. Chemical shift assignment and interhelical distance measurements were conducted on ETM bound to an ERGIC-mimetic membrane 36,37 containing 1-palmitoyl-2-oleoyl-glycero-3-phosphocholine (POPC), 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphoethanolamine (POPE), bovine phosphatidylinositol (PI), 1-palmitoyl-2-oleoyl-sn-glycero-3-phospho-L-serine (POPS), and cholesterol (Chol). The POPC : POPE : PI : POPS : Chol molar ratios were 45 : 20 : 13 : 7 : 15. All lipids were purchased from Avanti Polar Lipids. The membrane has a protein : lipid molar ratio (P : L) of 1 : 20, and 2-4 mg 13 C, 15 N-labeled protein was used for most 2D and 3D correlation experiments. The intermolecular NHHC spectra were measured using a sample containing 4 mg each of 13 C-labeled ETM and 15 N-labeled ETM.
This mixture was reconstituted into the ERGIC membrane at a P : L of 1 : 10 to increase the experimental sensitivity. 13 C-19 F REDOR experiments were conducted on 3.7 mg total of 1 : 1 mixed 13 C-labeled and fluorinated ETM bound to the ERGIC membrane at P : L = 1 : 10. To reconstitute ETM into lipid bilayers, we dissolved 2 mg protein in 1 mL trifluoroethanol (TFE) and mixed with appropriate amounts of lipids in 400 μL chloroform. For the HMA-bound sample, HMA was dissolved in TFE (1 mg/100 μL) and added to the protein-lipid mixture. The organic solvents were removed under a gentle stream of nitrogen gas, and the film was dried under vacuum at room temperature overnight. The proteoliposome film was resuspended in 3 mL of pH 7.5 sample buffer by vortexing and sonicating 2-3 times for 5 min until the suspension was homogenous. This was followed by 7 freeze-thaw cycles between a 42°C water bath and liquid nitrogen. The proteoliposomes were then pelleted using ultracentrifugation for 3 hours at 164,000x g and 4°C. The pellet was dried in a desiccator or under a gentle stream of nitrogen gas to a final hydration level of ~40% by mass and then packed into an appropriate MAS rotor using a benchtop centrifuge. Drug binding to ETM was assessed in a "DMPX" membrane consisting of 1,2-dimyristoyl-snglycero-3-phosphocholine (DMPC) : 1,2-dimyristoyl-sn-glycero-3-phospho-(1'-rac-glycerol) (DMPG) at a 80% : 20% molar ratio. The mixture was chosen to maintain the same 20% anionic lipid fraction as the ERGIC membrane. A drug-free sample contained 2 mg of U-13 C, 15 N-labeled ETM bound to the membrane at a P : L of 1 : 20. The sample containing 5-(N,N-hexamethylene)amiloride (HMA) was prepared using a protein : drug (P : D) molar ratio of 1 : 1, with HMA (0.2 mg) added during organic solution mixing. The same P : L of 1:20 as the apo sample was used. 
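The statement that the DMPX membrane matches the 20% anionic lipid fraction of the ERGIC-mimetic membrane can be verified from the molar ratios given in this section. The sketch below assumes that PI, POPS and DMPG are the anionic components and that the remaining lipids and cholesterol carry no net charge; the dictionary layout and function name are illustrative only.

```python
# Molar ratios as stated in the text (arbitrary units; only ratios matter).
ERGIC = {"POPC": 45, "POPE": 20, "PI": 13, "POPS": 7, "Chol": 15}
DMPX = {"DMPC": 80, "DMPG": 20}

# Assumption: these are the components carrying a net negative charge.
ANIONIC = {"PI", "POPS", "DMPG"}


def anionic_fraction(composition: dict) -> float:
    """Mole fraction of anionic components in a membrane composition."""
    total = sum(composition.values())
    anionic = sum(v for name, v in composition.items() if name in ANIONIC)
    return anionic / total


for label, comp in (("ERGIC", ERGIC), ("DMPX", DMPX)):
    print(f"{label}: {anionic_fraction(comp):.0%} anionic")
```

Both compositions give the same anionic mole fraction when cholesterol is counted in the total.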
After initial spectra showed only small CSPs, we titrated an additional 0.6 mg of HMA in 6 μl dimethyl sulfoxide (DMSO) into the proteoliposome, giving a P : D of 1 : 4. The solubility of HMA in aqueous solutions was very low (< 0.1 mg/ml), necessitating the use of DMSO. 3-19 Famantadine (AMT) was titrated into the proteoliposome stepwise, from an initial P : D molar ratio of 1 : 1 to a final P : D of 1 : 8. The protein/lipid molar ratio of the sample is 1 : 15. The fluorinated AMT has high solubility in water, thus can be mixed with the membrane directly. For the 13 C-19 F REDOR experiments, the sample was packed in a 1.9 mm MAS rotor, while chemical shift measurements were conducted in a 3.2 mm MAS rotor on the 800 MHz spectrometer. Chemical shift changes under acidic pH and with added calcium were assessed in the same "DMPX" membrane. The sample with 5 mM CaCl2 at pH 5 contained 2 mg of U-13 C, 15 N-labeled ETM bound to the membrane at a P : L of 1 : 20, while the sample without calcium contained 2 mg of U-13 C-labeled ETM bound to the membrane at a P : L of 1 : 20. The synthetic protocol (Scheme S1) used for preparation of F-Amt was adapted from that described by Jasys and coworkers (Jasys were conducted to measure distances between 4-19 F-Phe-labeled and 13 C-labeled ETM, and to measure dipolar dephasing of 13 C-labeled ETM by 3-19 F-AMT. Additional specific parameters for the NMR experiments are given in Table S5 . NMR spectra were processed in the TopSpin software while chemical shifts were assigned in Sparky 45 . TALOS-N 46 was used to calculate (ϕ, ψ) torsion angles after converting the 13 C chemical shifts to the DSS scale. Residue-specific chemical shift differences between drug bound and apo samples were calculated from the measured 13 C and 15 N chemical shifts according to: (1) 2D heatmaps of normalized water-edited 2D NCA spectra were generated using an in-house Python script that removes spectral noise while calculating intensity ratios. 
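A noise-filtered intensity ratio of this kind can be sketched as follows. This is not the authors' in-house script: it is a minimal NumPy illustration of the thresholding rule described in this section (signal below 3.5 times the noise level is set to zero in the S spectrum and to a large number in the S0 spectrum, and the ratio is scaled by the number of scans). The function name, argument names and the fill value are assumptions.

```python
import numpy as np


def noise_filtered_ratio(s, s0, noise_s, noise_s0,
                         scans_s=1, scans_s0=1, k=3.5, fill=1e12):
    """Ratio map S/S0 with noise regions forced toward zero.

    s, s0      : 2D intensity arrays (short and long 1H mixing, respectively)
    noise_s/s0 : average noise level estimated from an empty spectral region
    scans_s/s0 : number of scans, used to scale each spectrum
    k          : threshold in multiples of the noise level
    fill       : large value written into sub-threshold S0 points (assumed)
    """
    s = np.array(s, dtype=float)    # copies, so the inputs are not modified
    s0 = np.array(s0, dtype=float)
    s[s < k * noise_s] = 0.0        # sub-threshold S points contribute nothing
    s0[s0 < k * noise_s0] = fill    # sub-threshold S0 points drive the ratio to ~0
    return (s / scans_s) / (s0 / scans_s0)


# Toy 2x2 "spectra" with unit noise level in both.
S = [[10.0, 1.0], [8.0, 0.5]]
S0 = [[20.0, 30.0], [2.0, 1.0]]
ratio = noise_filtered_ratio(S, S0, noise_s=1.0, noise_s0=1.0)
print(ratio)
```

Filling sub-threshold S0 points with a large number avoids division by near-zero noise values, so that only positions with real signal in both spectra give a non-negligible ratio.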
The intensities of the 9 ms and 100 ms spin diffusion spectra of the ERGIC-bound ETM were read using the NMRglue package 47 . Spectral intensity was noise filtered by setting signal lower than 3.5 times the average noise level in an empty region of the 2D spectrum to zero for the S spectrum and to a large number for the S0 spectrum 28, 48 . The intensities were divided and scaled by the number of scans to obtain a 2D contour map, I9 ms/I100 ms. The water accessibility data for the closed high-pH state of influenza BM2 proton channel (Fig. 2f) for comparison with the ETM data were originally measured in 2D 13 C-13 C correlation spectra with 4 ms (S) and 100 ms (S0) 1 H-1 H spin diffusion 28 . To enable comparison with the ETM wateredited spectra measured at 9 ms and 100 ms 1 H mixing, we scaled the BM2 S (4 ms) /S0 (100 ms) ratios by the integrated aliphatic intensity ratio of 1.976 between the 1D BM2 water-edited spectra with 9 ms and 4 ms 1 H mixing. This scaling factor was verified to be accurate for two resolved sites, T24 and G26, in the 1D 13 C spectra of BM2. Simulation of the 13 C-19 F REDOR curves 13 C-19 F REDOR data were simulated using the SIMPSON software 49 . The simulations accounted for finite 19 F and 13 C 180° pulse lengths and 19 F pulse imperfections by co-adding REDOR curves for 19 F flip angles of 180° to 145° using a normal distribution centered at 180° with a standard deviation of 15° 43 . The simulations also included 19 F chemical shift anisotropy (CSA), which was obtained from the 19 F CSA sideband patterns measured at 293 K under 14 kHz MAS. The sideband intensities were fit using the Solids Lineshape Analysis module in Topspin. The best-fit CSA was δCSA = 55±2 ppm and η = 0.6±0.1 for the 19 F peak at δiso = -113.5 ppm and δCSA = 53±2 ppm and η = 0.5±0.1 for the 19 F peak at δiso = -117.5 ppm. These CSAs indicate that all three 4-19 F-Phe residues are immobilized. REDOR distance analysis required two other considerations. 
First, the 1 : 1 mixing of 13 C-labeled and 19 F-labeled peptides means that only 50% of all 13 C-labeled helices have an adjacent 19 F-labeled helix. Thus, the lowest possible REDOR S/S0 value is 0.5. Second, while most 13 C-19 F REDOR restraints came from 2D 13 C-13 C resolved peaks, dephasing of sidechain carbons was obtained from 1D 13 C spectra with resonance overlap. These overlapped peaks will not experience complete dipolar dephasing if some of the carbons contributing to an overlapped signal are far from a fluorine. We first identified the residues experiencing dephasing by 19 F from the 2D 13 C-13 C correlation spectra. These peaks then guided the assignment of the 1D 13 C-19 F spectra. For example, both A22 and A32 Cβ resonate at 16.6 ppm, but only A22 Cα is dephased in the 2D 13 C-13 C spectrum (Fig. 2b). Thus, we assigned the 16.6 ppm dephased signal in the 1D 13 C-19 F REDOR spectra to A22 Cβ. Making the reasonable assumption that each Ala Cβ contributes equal intensity, we account for this overlap factor by correcting the experimental dephasing (S/S0)exp values according to: (2) where f is the fraction of an overlapped 13 C peak that is dephased by 19 F. For example, for the 2-fold overlapped 16.6-ppm Ala Cβ peak with f = 2, the lowest possible (S/S0)exp value is ~0.75, which gives a minimal (S/S0)adj of ~0.0. The random uncertainty σ(S/S0)exp of the measured (S/S0)exp values was propagated from the signal-to-noise ratios (SNRs) of the REDOR S0 and S spectra. The upper and lower limits for the (S/S0)adj values were obtained by adding or subtracting the σ(S/S0)exp to the (S/S0)exp values before using equation (2), respectively. Best-fit distances were obtained as the distance with the lowest χ 2 value between the (S/S0)adj values and simulated S/S0 intensities. Upper and lower distance limits were specified using the upper and lower limits for the (S/S0)adj values calculated as described above. For an upper limit of (S/S0)adj >0.95 indicating a negative contact (i.e.
dephasing was not significant), an upper limit of 50 Å was used. The final lower and upper distance limits for structure calculation were set by multiplying the uncertainty obtained in this manner by 2 times or by choosing distances that are 2.0 Å from the best-fit value, whichever was larger, to loosen the constraints. Initial structure calculation attempts using ambiguous interhelical contacts, where a central helix can contact both neighboring helices, did not converge. Thus, we generated parallel pentameric models (Fig. S7) to specify the 13 C-19 F and NHHC intermolecular distance restraints in a directional fashion where possible. The models take into account the water-and lipid-edited spectra (vide infra) to pinpoint the pore-facing versus lipid-facing orientation of the residues. An ideal helix model that puts N15, L19, V25, L31 and T35 to be pore-facing and Phe sidechains to be lipid-facing does not satisfy all the experimental constraints (Fig. S7a) . The measured Cβ secondary shifts (Fig. S3b) , with L21 having a 1.4 ppm downfield-shifted Cβ compared to the average of all other helical Leu residues, indicate that the helix is disordered between residues F20 and F23, consistent with previous solution NMR data 10 . Given this disorder, we generated four alternative models (Fig. S7b-e) that satisfy the measured interhelical Phe-Phe 13 C-19 F contacts. Only one model with F26-F23 interhelical contact adequately reproduces the key features of the experimental data. This model was then used to disambiguate the NHHC and 13 C-19 F distance restraints (Tables S2, S3) , by mainly considering only residues that are less than four residues away in the primary sequence and that are in close proximity between two helical wheels. With this approach, 42 of the 87 interhelical restraints were set to be unambiguous. We calculated the ETM structure for residues 8-38 using XPLOR-NIH 50 hosted on the NMRbox computing platform 51 . The calculation contained two stages. 
In the first, annealing, stage, five extended ETM monomers were placed in a parallel pentamer geometry with each monomer located 20 Å from the center of the pentamer. A total of 120 independent XPLOR-NIH runs were performed with 5,000 steps of torsion angle dynamics at 5,000 K followed by annealing to 20 K in decrements of 20 K with 100 steps at each temperature. After the annealing, final energy minimizations in torsion angle and Cartesian coordinates were carried out. The five monomers were restrained to be identical in the annealing step using the non-crystallographic symmetry term PosDiffPot and the translational symmetry term DistSymmPot. Chemical-shift-derived (ϕ, ψ) torsion angles predicted by TALOS-N were implemented with the XPLOR dihedral angle restraint term CDIH, with ranges set to the larger of twice the TALOS-N predicted uncertainty and 20°. The interhelical distance restraints (Tables S2, S3) were implemented using the NOE potential. For the NHHC constraints, distance upper limits were set to 9.0 Å and 11.5 Å for 500 μs and 1000 μs of 1H-1H mixing, respectively. Negative REDOR contacts, i.e., 13C sites without dephasing, were implemented as two NOE restraints, one to each neighboring helix. Implicit hydrogen bonds using the hydrogen-bonding database potential term HBDB were implemented during annealing to favor formation of the α-helical conformation. Finally, standard XPLOR potentials were used to restrain the torsion angles using a structural database with the term TorsionDB, and standard bond angles and lengths were set with terms BOND, ANGL, IMPR and RepelPot. The structures were sorted by energy, using all the potentials in the calculation. The scales for all potentials are given in Table S4.

In the second, structure refinement, stage, the three lowest-energy structures from the annealing stage were used as independent inputs.
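As a rough illustration of the stage-one settings above, the following Python sketch enumerates a linear cooling schedule and applies the CDIH range rule. It is not XPLOR-NIH code: the function names are hypothetical, and counting both the start and end temperatures as cooling steps is an assumption.

```python
def cooling_schedule(t_start, t_end, decrement):
    """Temperatures (K) visited when cooling linearly from t_start to t_end.

    Assumption: both end points are included as temperature steps.
    """
    if decrement <= 0 or t_start < t_end:
        raise ValueError("need decrement > 0 and t_start >= t_end")
    return list(range(t_start, t_end - 1, -decrement))


def cdih_range(talos_uncertainty_deg, floor_deg=20.0):
    """Dihedral restraint range: the larger of twice the TALOS-N uncertainty and 20 degrees."""
    return max(2.0 * talos_uncertainty_deg, floor_deg)


stage1 = cooling_schedule(5000, 20, 20)   # 5,000 K to 20 K in decrements of 20 K
steps_per_temperature = 100

print(len(stage1), stage1[0], stage1[-1])        # number of temperatures, first, last
print(len(stage1) * steps_per_temperature)       # cooling steps after the high-temperature dynamics
print(cdih_range(6.0), cdih_range(15.0))         # hypothetical TALOS-N uncertainties in degrees
```

Under these assumptions the schedule visits 250 temperatures, i.e. 25,000 cooling steps per run in addition to the 5,000 high-temperature steps.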
A total of 64 independent XPLOR-NIH runs from each of the three starting structures were performed with 5,000 steps of torsion angle dynamics at 1,000 K followed by annealing to 20 K in decrements of 10 K with 100 steps at each temperature. This was followed by final energy minimizations in torsion angle and Cartesian coordinates. All the potentials employed in annealing were also used during refinement, with two additions. The 13C-13C correlations were implemented as intramolecular NOE distance restraints with an upper limit of 8.0 Å. Inter-residue cross peaks to long hydrophobic side chains such as Phe, Ile, and Leu were sometimes violated, and consequently the upper limits for these 5% of restraints were increased to 12.0 Å. Explicit hydrogen bonds for residues I13 (hydrogen-bonded to V17) through N15 (hydrogen-bonded to L19) and F23 (hydrogen-bonded to L27) through T30 (hydrogen-bonded to L34) were substituted for implicit hydrogen bonds, using the same HBDB potential. Finally, the scales of the NOE, Repel, and TorsionDB potentials were increased (Table S4). All 192 structures from the three sets of refinement runs were pooled and sorted using the CDIH, NOE, HBDB, BOND, ANGL, IMPR, Repel and Repel14 potentials, while excluding the PosDiffPot, DistSymmPot and TorsionDB potentials. The ten structures with the lowest energies across the specified potentials were included in the final structural ensemble.

Graphical images depicting the structures were generated in PyMOL v2.3.4. The reported channel radii were calculated using the HOLE program 52 and represent the radius of the largest sphere that can be accommodated without overlapping the van der Waals surface of any atom at each XY plane along the Z channel coordinate, which is collinear with the bilayer normal and the putative direction of ion permeation. The cutoff radius for the calculation was set to 5 Å.
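The channel-radius definition above can be made concrete with a simplified Python sketch. This is not the HOLE implementation: it fixes the sphere center on the channel axis at each Z position, whereas HOLE searches for the best center within each plane. The function name and the toy coordinates are hypothetical.

```python
import math


def pore_radius_profile(atoms, z_values, axis_xy=(0.0, 0.0)):
    """Largest sphere centered on the channel axis that touches no atom, for each z.

    atoms: iterable of (x, y, z, vdw_radius) in Å.
    Simplification: the sphere center stays on the axis (no in-plane search).
    """
    profile = []
    for z in z_values:
        center = (axis_xy[0], axis_xy[1], z)
        # Gap between the center and each atom's van der Waals surface
        radius = min(math.dist(center, (x, y, az)) - r for x, y, az, r in atoms)
        profile.append((z, max(radius, 0.0)))
    return profile


# Toy pore: five carbon-like atoms (vdW radius 1.7 Å) on a ring of radius 5 Å at z = 0
ring = [
    (5.0 * math.cos(2 * math.pi * k / 5), 5.0 * math.sin(2 * math.pi * k / 5), 0.0, 1.7)
    for k in range(5)
]
profile = pore_radius_profile(ring, [-2.0, 0.0, 2.0])
for z, r in profile:
    print(f"z = {z:+.1f} Å, radius = {r:.2f} Å")
```

In the toy example the narrowest point is at the plane of the ring, where the radius is the ring radius minus the van der Waals radius.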
The output from HOLE was visualized in PyMOL by setting the van der Waals radius of the HOLE-generated spheres ('SPH') to the b-factor values of the SPH output by HOLE.

The coordinate file for HMA was generated from bond connectivity using the Chem3D module of ChemDraw Professional 18.1. Ligand geometry was optimized within Chem3D using the MM2 energy minimization module. Docking was performed using the HADDOCK 2.4 webserver with the ensemble of the ten lowest-energy protein structures calculated in XPLOR-NIH. The docking was constrained only with an active list. Active residues were defined as the N-terminal residues with significant chemical shift perturbations (CSPs) (Fig. 4c): T9, G10, T11, I13, and S16. Several docking runs were conducted using constraints to this list of residues on all helices, one of the five helices, and different combinations of two of the five helices. Docking calculations were performed using default settings in the HADDOCK 2.4 webserver interface, except that the solvent for the final structural refinement was varied between DMSO and water. The N- and C-termini were set as uncharged because the structural model does not include the full protein sequence. Passive residues were automatically defined around the active residues with a 6.5 Å surface radius cutoff. Non-polar hydrogens were removed from the calculation, and 2 partitions for random exclusion of Ambiguous Interaction Restraints (AIRs) were used (50% of AIRs were randomly excluded in the calculation). The docking used 1000 structures for rigid-body docking with 5 trials of rigid-body minimization. Semi-flexible refinement was done with 200 structures selected from the rigid-body minimization stage. Final refinement with explicit solvent (DMSO or water) was performed on all 200 structures from semi-flexible refinement. Output structures were aligned and analyzed in PyMOL 2.3.5.
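The random exclusion of AIRs mentioned above can be pictured with a minimal Python sketch. It is not HADDOCK code. It assumes one restraint per active residue per helix, hypothetical chain labels A to E, and that with N partitions one partition (about 1/N of the restraints) is left out in a given docking trial, which matches the 50% exclusion stated for 2 partitions.

```python
import random

ACTIVE = ["T9", "G10", "T11", "I13", "S16"]   # active residues listed in the text
HELICES = ["A", "B", "C", "D", "E"]           # hypothetical chain labels for the pentamer


def split_airs(restraints, n_partitions=2, seed=None):
    """Shuffle the restraints and leave out every n-th one (about 1/n of the total)."""
    rng = random.Random(seed)
    shuffled = list(restraints)
    rng.shuffle(shuffled)
    excluded = [r for i, r in enumerate(shuffled) if i % n_partitions == 0]
    kept = [r for i, r in enumerate(shuffled) if i % n_partitions != 0]
    return kept, excluded


airs = [f"{helix}:{residue}" for helix in HELICES for residue in ACTIVE]
kept, excluded = split_airs(airs, n_partitions=2, seed=0)
print(len(airs), len(kept), len(excluded))
```

Because a different random subset is dropped in each trial, no single restraint has to be satisfied in every docked model, which makes the protocol tolerant of an occasional incorrect active residue.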
In docking with DMSO as the refinement solvent, the 200 refined structures were grouped into 5 clusters, with 154 structures belonging to the lowest-energy cluster 1. The four best structures of this cluster had an average HADDOCK score of -29.8 ± 2.1 and a Z-score of -1.7. Similar results were obtained with water as the refinement solvent: the 200 refined structures were grouped into three clusters, where cluster 1 contained 170 structures. The four best structures of this cluster had a HADDOCK score of -29.3 ± 1.9, with a Z-score of -1.4. The majority of docked structures converged to a state where HMA partitioned to the N-terminal entry cavity of the channel, but the HMA orientation is variable. Visual inspection showed three distinct orientations: 1) HMA tilted and diving into the pore with the hexamethylene ring facing up (Fig. S9a), 2) HMA tilted and diving into the pore with the ring facing down (Fig. S9b), and 3) HMA lying horizontally across the top of the channel, with the guanidinium intercalated between two helices (Fig. S9c). Among the 32 lowest-energy structures in the DMSO and water docking results, 13/32 structures belonged to the first mode (ring up), 6/32 to the second mode (ring down), and 13/32 to the third mode (horizontal). All these modes indicate a pore-occlusion mechanism, similar to the amantadine inhibition mechanism of the influenza AM2 proton channel 9. Given the hydrophobic nature of HMA, another possible binding mode would be drug binding from the lipid side to the exterior of the helical bundle. This mode could explain the chemical shift perturbation of the lipid-facing S16, but the mechanism of inhibition would be indirect allosteric narrowing of the pore, and would require multiple drug molecules to bind each pentamer to preserve the symmetry, as only a single set of peaks is observed in the drug-bound protein spectra. Thus we consider this mechanism less likely than direct occlusion of the pore.
The lipid-binding mode was only observed in docking runs in which only one or two helices contained the active residues.

Cα correlation spectra of high-pH ETM with 5 mM NaCl and low-pH ETM with 5 mM CaCl2. Chemical shift changes are observed for C-terminal residues such as R38, L37 and L34 (yellow highlighted regions). (b) 2D 13C-13C correlation spectra of low-pH ETM with CaCl2 and high-pH ETM with NaCl. (c) 2D 13C-13C correlation spectra of low-pH ETM with CaCl2 and low-pH ETM without salt. These spectra show that chemical shift changes mainly result from pH changes.

Figure S5. Additional 13C-19F REDOR spectra and water-edited spectra to determine the interhelical assembly of ETM. (a) 2D 13C-13C correlation spectrum of mixed 4-19F-Phe labeled and U-13C, 15N-labeled ETM (black). The 13C chemical shifts of most residues are similar to those of the 13C, 15N-labeled protein (red), indicating that fluorination does not significantly perturb the ETM conformation. F20/23 Cβ, F26 Cβ, and T30 Cγ2 show small chemical shift changes (blue) of 0.3-0.6 ppm. The spectra were measured at 293 K. (b) 1D 13C-19F REDOR control (S0), dephased (S), and difference (ΔS) spectra. The difference peaks indicate carbons that are in close proximity to a fluorine in a neighboring helix. The broadband REDOR spectra (left) show both sidechain and backbone 13C signals, whereas the Cα-selective REDOR spectra (right) detect only Cα signals. (c) Representative 13C-19F REDOR dephasing curves for broadband and Cα-selective 13C-19F REDOR spectra. The S/S0 values have been corrected for the isotopic dilution factor (50%) and the overlap factor. Best-fit distance curves are shown as solid lines, and lower and upper distance bounds are shown as dashed lines. (d) Water-edited 2D 15N-13Cα correlation spectra to detect well-hydrated residues. The spectra were measured at 293 K under 11.8 kHz MAS using 1H-1H mixing times of 9 ms (red) and 100 ms (blue).

Table S2. Interhelical 1H-1H distance restraints obtained from the NHHC spectra. The direction of the interhelical contact from the 15N to the 13C is indicated as 'CCW' for counter-clockwise, 'CW' for clockwise, and 'Ambig' for either of the two neighboring helices during structure calculation.

Definitions of symbols: B0 = magnetic field; Tbearing = thermocouple-reported bearing gas temperature; νMAS = MAS frequency; ns = number of scans per free induction decay; τrd = recycle delay; t1,max = maximum t1 evolution time; t1,inc = t1 increment; t2,max = maximum t2 evolution time; t2,inc = t2 increment; τdwell = dwell time in the direct dimension; τacq = maximum acquisition time in the direct dimension; τHC = 1H-13C cross polarization contact time; τCORD = 13C-13C mixing time using CORD; ν1Hacq = 1H rf field strength for decoupling during acquisition; τNC = total 15N-13C TEDOR recoupling time; τHN = 1H-15N cross polarization contact time; τ1Hexc = 1H 90° pulse length for selective excitation of water; τ1HSD = 1H-1H spin diffusion mixing time; τ13Cαsel = 180° pulse length for selective inversion of Cα resonances; τCFREDOR = total 13C-19F REDOR recoupling time; ν1HREDOR = 1H rf field strength for decoupling during 13C-19F REDOR.
for NMR structural analysis of organic and biological solids
2D and 3D 15N-13C-13C NMR chemical shift correlation spectroscopy of solids: assignment of MAS spectra of peptides
Cross polarization in the tilted frame: assignment and spectral simplification in heteronuclear spin systems
Resonance Assignment for Solid Peptides by Dipolar-Mediated 13C/15N Correlation Solid-State NMR
Structural constraints from proton-mediated rare-spin correlation spectroscopy in rotating solids
Rapid measurement of long-range distances in proteins by multidimensional (13)C-(19)F REDOR NMR under fast magic-angle spinning
Two-dimensional 19F-13C correlation NMR for 19F resonance assignment of fluorinated proteins
NMRFAM-SPARKY: enhanced software for biomolecular NMR spectroscopy
Protein backbone and sidechain torsion angles predicted from NMR chemical shifts using artificial neural networks
Nmrglue: an open source Python package for the analysis of multidimensional NMR data
Hydration and Dynamics of Full-Length Tau Amyloid Fibrils Investigated by Solid-State Nuclear Magnetic Resonance
A General Simulation Program for Solid-State NMR Spectroscopy
The Xplor-NIH NMR molecular structure determination package
NMRbox: A Resource for Biomolecular NMR Computation
HOLE: a program for the analysis of the pore dimensions of ion channel structural models