key: cord-261961-u4d0vvmq
authors: St-Germain, Jonathan R.; Astori, Audrey; Samavarchi-Tehrani, Payman; Abdouni, Hala; Macwan, Vinitha; Kim, Dae-Kyum; Knapp, Jennifer J.; Roth, Frederick P.; Gingras, Anne-Claude; Raught, Brian
title: A SARS-CoV-2 BioID-based virus-host membrane protein interactome and virus peptide compendium: new proteomics resources for COVID-19 research
date: 2020-08-28
journal: bioRxiv
DOI: 10.1101/2020.08.28.269175
sha:
doc_id: 261961
cord_uid: u4d0vvmq
Key steps of viral replication take place at host cell membranes, but the detection of membrane-associated protein-protein interactions using standard affinity-based approaches (e.g. immunoprecipitation coupled with mass spectrometry, IP-MS) is challenging. To learn more about SARS-CoV-2-host protein interactions that take place at membranes, we utilized a complementary technique, proximity-dependent biotin labeling (BioID). This approach uncovered a virus-host topology network comprising 3566 proximity interactions amongst 1010 host proteins, highlighting extensive virus protein crosstalk with (i) host protein folding and modification machinery; (ii) membrane-bound vesicles and organelles; and (iii) lipid trafficking pathways and ER-organelle membrane contact sites. The design and implementation of sensitive mass spectrometric approaches for the analysis of complex biological samples is also important for both clinical and basic research proteomics focused on the study of COVID-19. To this end, we conducted a mass spectrometry-based characterization of the SARS-CoV-2 virion and infected cell lysates, identifying 189 unique high-confidence virus tryptic peptides derived from 17 different virus proteins, to create a high quality resource for use in targeted proteomics approaches. Together, these datasets comprise a valuable resource for MS-based SARS-CoV-2 research, and identify novel virus-host protein interactions that could be targeted in COVID-19 therapeutics.
The SARS-CoV-2 ~30 kb positive strand RNA genome (Genbank MN908947.3 [1, 2]) contains two large open reading frames (ORF1a and ORF1ab) encoding polyproteins that are cleaved by viral proteases into ~16 non-structural proteins (NSPs). Thirteen smaller 3' ORFs encode the primary structural proteins of the virus, spike (S), nucleocapsid (N), membrane (M) and envelope (E), along with nine additional polypeptides of poorly understood function (Fig 1A). To enable proteomics-based approaches for the analysis of complex biological samples, we analyzed both SARS-CoV-2-infected cell lysates and mature virions, generating a high confidence virus peptide spectrum compendium. This dataset can be used, for example, to select virus peptides for targeted proteomics approaches (e.g. the identification of viral peptides in human clinical samples), or to generate peptide spectral libraries for increased sensitivity of detection.
Two SARS-CoV-2-host protein-protein interaction (PPI) mapping efforts have utilized immunoprecipitation coupled with mass spectrometry (IP-MS) of epitope-tagged viral proteins to identify >1000 putative virus-host PPIs in HEK293T [3] and A549 [4] cells. While extremely powerful for identifying stable, soluble protein complexes, IP-MS approaches are not optimal for the capture of weak or transient PPIs, or for the detection of PPIs that take place at poorly soluble intracellular locations such as membranes, where key steps in viral replication occur. To better understand how host cell functions are hijacked and subverted by SARS-CoV-2 proteins, we used proximity-dependent biotinylation (BioID [5]) as a complementary approach to map virus-host protein proximity interactions in live human cells. This dataset provides a valuable resource for better understanding SARS-CoV-2 pathogenesis, and identifies numerous previously undescribed virus-host PPIs that could represent attractive targets for therapeutic intervention.
A SARS-CoV-2 peptide compendium.
To identify high quality tryptic virus peptides for use in targeted proteomics analyses, data-dependent acquisition (DDA) mass spectrometry was conducted on the mature SARS-CoV-2 virion. The Toronto SB3 virus strain [6] was cultured in VeroE6 cells (MOI 0.1), and culture medium was collected 48 h post-infection. Virus was concentrated by centrifugation, inactivated by detergent and subjected to tryptic proteolysis. The resulting viral tryptic peptides were identified using nanoflow liquid chromatography-tandem mass spectrometry (LC-MS/MS; Fig 1A, [...]).
Together, these data confirm and expand upon previous proteomic analyses of SARS-CoV-2 virions, infected cells [4, 7-11] and patient samples [12-14], and provide a library of high quality virus peptide spectra covering 17 virus proteins that can be used for the creation of peptide spectral libraries and targeted proteomics approaches.
A SARS-CoV-2-host protein proximity interactome.
Based on standard transcript mapping algorithms and conservation with ORFs in other coronaviruses [1, 2], we created a SARS-CoV-2 open reading frame (ORF) vector set [15] (Fig 1A). Nine SARS-CoV-2 proteins are predicted to have one or more transmembrane domains (S, E, M, NSP3, NSP4, NSP6, ORF3A, ORF7A and ORF7B). To better characterize SARS-CoV-2-host membrane-associated PPIs, these virus ORFs (along with the remaining poorly understood open reading frames ORF3B, ORF6, ORF8 and ORF9B) were fused in-frame with an N-terminal BirA* (R118G) coding sequence, and the resulting fusion proteins were individually expressed in HEK 293 Flp-In T-REx cells. Using these cells, a virus-host PPI landscape was characterized using BioID (as in [16]; Supp Table 2). Applying a Bayesian false discovery rate threshold of ≤1%, 3566 high confidence proximity interactions were identified with 1010 unique human proteins (all raw data available at massive.ucsd.edu, accession #MSV000086006).
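The thresholding step described above can be illustrated with a minimal sketch. It assumes a scored bait-prey table in which every row carries a Bayesian FDR (BFDR) value; the row layout, the placeholder bait and prey names, the BFDR values and the function names `high_confidence` and `summarize` are all hypothetical and are not taken from the paper's pipeline or data.

```python
# Hypothetical scored bait-prey rows: (bait, prey, BFDR).
# Names and values are placeholders for illustration only, not real results.
rows = [
    ("BAIT_1", "PREY_A", 0.00),
    ("BAIT_1", "PREY_B", 0.01),
    ("BAIT_2", "PREY_A", 0.00),
    ("BAIT_2", "PREY_C", 0.02),
    ("BAIT_3", "PREY_D", 0.00),
    ("BAIT_3", "PREY_B", 0.05),
]


def high_confidence(rows, max_bfdr=0.01):
    """Keep bait-prey pairs whose BFDR is at or below the threshold (here 1%)."""
    return [(bait, prey) for bait, prey, bfdr in rows if bfdr <= max_bfdr]


def summarize(pairs):
    """Return (number of proximity interactions, number of unique prey proteins)."""
    unique_preys = {prey for _, prey in pairs}
    return len(pairs), len(unique_preys)


hits = high_confidence(rows)
n_interactions, n_preys = summarize(hits)
print(n_interactions, n_preys)
```

The same two counts, computed on the full scored table, correspond to the "proximity interactions" and "unique human proteins" figures reported in the text; a prey seen with several baits contributes several interactions but only one unique protein.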
A total of 412 prey polypeptides were each detected as a high confidence interactor of only a single SARS-CoV-2 bait protein in this analysis, underscoring the high degree of specificity in this virus-host proximity interaction map. Bait-bait correlation analysis (Fig 1B), based on similarity between interactomes (Jaccard Index analysis conducted in ProHits-viz [17]), revealed high levels of correspondence between the S (Spike), E, M, NSP4, NSP6, ORF7A, ORF7B, and ORF8 bait proteins. The NSP2, NSP3, ORF3A, ORF3B, ORF6, and ORF9B interactomes shared a lower degree of similarity with the other bait proteins in this set. Bait proteins with one or more predicted transmembrane domains thus largely clustered together, with the exception of ORF3A (which clusters outside the main group of putative membrane baits, even though it is predicted to possess three transmembrane helices), and ORF8 (which clusters with the putative membrane baits, but has no predicted transmembrane domain itself).
A self-organized force-directed bait-prey topology map was next generated, in which map location is determined by the number and abundance (i.e. total peptide counts) of host cell interactors (Fig 1C). This approach similarly clustered all of the baits with one or more predicted transmembrane helices, along with ORF8, in a dense "core" region of the map, indicating that these bait proteins share a large proportion of common interactors. NSP3 and ORF3A occupy regions at the edge of this dense region of the map, indicating a lower number of shared interactors with the other membrane proteins. NSP2, ORF3B and ORF9B occupy peripheral regions of the map, indicating that they share far fewer interactors with the rest of the baits analyzed here. Interestingly, ORF6 occupies a region of the map near ORF3A (these two baits were also clustered near each other in the bait-bait analysis, above).
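The two set-based summaries used above (the Jaccard index between bait interactomes, and the count of preys seen with only one bait) can be sketched as follows. This is an illustrative stand-in, not the ProHits-viz implementation; the `interactomes` dictionary, its placeholder bait and prey names, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical high-confidence interactomes (bait -> set of preys).
# Placeholder names only; these are not results from the paper.
interactomes = {
    "BAIT_1": {"PREY_A", "PREY_B", "PREY_C"},
    "BAIT_2": {"PREY_B", "PREY_C", "PREY_D"},
    "BAIT_3": {"PREY_E"},
}


def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B|; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_jaccard(interactomes):
    """Jaccard index for every unordered pair of baits, keyed by sorted bait names."""
    return {
        (x, y): jaccard(interactomes[x], interactomes[y])
        for x, y in combinations(sorted(interactomes), 2)
    }


def single_bait_preys(interactomes):
    """Preys that appear in the interactome of exactly one bait."""
    counts = Counter(prey for preys in interactomes.values() for prey in preys)
    return {prey for prey, n in counts.items() if n == 1}


scores = pairwise_jaccard(interactomes)
print(scores[("BAIT_1", "BAIT_2")])             # two shared preys out of four in the union
print(sorted(single_bait_preys(interactomes)))
```

Baits with many shared preys score close to 1 and would cluster together in a bait-bait heat map, while a bait with a largely private interactome scores near 0 against all others.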
Consistent with this location, more than half of the ORF3A interactome (39 of 73 proteins) is also present in the larger ORF6 interactome (217 proteins). This overlapping group of interactors is enriched in plasma membrane (PM) and ER proteins. Based on this observation, it will be interesting to explore similarities in ORF3A and ORF6 function.
As a whole, the virus-host interactome is significantly enriched in proteins associated with the endoplasmic reticulum (ER)/nuclear, Golgi and plasma membranes, and ER-Golgi trafficking vesicles ([...]).
Even amongst those viral proteins that appear to localize exclusively to the ER-Golgi-PM endomembrane system, specificity in virus-host interactomes was observed, likely reflecting preferences for interactions with different subsets of membrane proteins and/or localization to unique membrane lipid nanodomains. For example, both the ORF7A and ORF7B interactomes are enriched in PM solute channels, but ORF7A appears to interact uniquely with the anion exchanger SLC4A2, the taurine transporter SLC6A6, and the glycine transporter SLC6A9, while ORF7B interacts specifically with the amino acid transporter SLC1A5, the sulfate transporter SLC26A11, and the divalent metal transporter SLC39A14.
Autophagy is an important part of the innate immune response, effecting the elimination of intracellular pathogens such as viruses (virophagy), and delivering them to the lysosome, which processes pathogen components for antigen presentation [26, 27]. Many viruses have thus evolved strategies to inhibit the host autophagic machinery. Notably, however, (+)RNA viruses appear to be dependent on autophagic function for efficient replication, and hijack components of the autophagic machinery for use in membrane re-organization and the creation of replication organelles (ROs) [28]. Consistent with these observations, our data highlight multiple interactions amongst SARS-CoV-2 proteins and the ER-phagy receptors FAM134B, TEX264 and SEC24C.
A number of virus protein interactions were also detected with components of the UFMylation system (DDRGK1, CDK5RAP3, UFL1 and UFSP2), which was recently shown to play a key role in ER-phagy [10], highlighting interesting links between specific autophagy pathways and SARS-CoV-2.
The ORF9B interactome is significantly enriched in mitochondrial proteins (Supp Table 2, Supp Table 3). Amongst the high confidence ORF9B interactors is the mitochondrial antiviral signaling protein (MAVS), which acts as a hub for cell-based innate immune signaling. The cellular pattern recognition receptors (PRRs) detect pathogen-associated molecular patterns (PAMPs, e.g. viral RNAs). PAMP-bound PRRs interact with MAVS, which activates the NF-κB and Type I interferon signaling pathways [29]. Many different viruses block the host antiviral response by interfering with MAVS signaling. This may be accomplished by direct MAVS cleavage by viral proteases (a strategy used by HAV, HCV and coxsackievirus B3), or via 26S proteasome-mediated degradation, a strategy used by SARS-CoV ORF9B [30], which recruits the HECT E3 ligase ITCH/AIP4 to effect MAVS ubiquitylation. We did not detect any ubiquitin E3 ligases in the ORF9B BioID interactome, but consistent with a recent report indicating that SARS-CoV-2 ORF9B binds directly to TOMM70 [31] to block MAVS-mediated IFN signaling, we detected TOMM70 as a major component of the ORF9B interactome.
The SARS-CoV-2 ORF6 interactome is uniquely enriched in nuclear pore complex (NPC) components (Supp Table 2, Supp Table 3). SARS-CoV ORF6 was shown to inhibit NPC-mediated transport by tethering the importin proteins KPNA2 and KPNB2/TNPO1 to ER/Golgi membranes [32], which effectively blocks the import of immune signaling proteins such as STAT1 into the nucleus. SARS-CoV-2 ORF6 shares 69% identity at the amino acid level with its SARS-CoV counterpart, and similarly displays potent immune repressor function [33].
It will be interesting to determine if the SARS-CoV-2 ORF6 interaction with NPC components leads to a similar disruption of immune signaling and nuclear transport.
ORF3B interacts specifically with LAMTOR1 and LAMTOR2, components of the Ragulator complex, which is localized to the lysosomal membrane and regulates the mechanistic target of rapamycin complex 1 (mTORC1). mTOR signaling is inactivated by amino acid starvation and other types of stress, to inhibit cap-dependent translation and upregulate autophagy [34]. Many viruses have thus evolved mechanisms to maintain mTORC1 activity during infection [35]. Recent work has shown that the LAMTOR1/2 proteins may also play important roles in xenophagy [36]. Based on these observations, SARS-CoV-2 ORF3B could play an important role in regulating mTORC1 activity and/or in the disruption of antiviral immune function.
In addition to ER/Golgi proteins, SARS-CoV-2 NSP3 interacts with the cytoplasmic RNA binding proteins FXR1 and FXR2 [37]. The FXR proteins were identified as host cell components of (+)RNA Equine Encephalitis Virus (EEV) RNA replication complexes (RC) [38-40]. FXRs are recruited to the viral RC by EEV nsP3 proteins, and the FXRs are required for RC assembly. It will be interesting to determine whether the FXRs play similar roles in SARS-CoV-2 RNA replication.
We and others have previously reported that orthogonal PPI discovery approaches such as proximity-dependent biotinylation (BioID) can provide information that is highly complementary to IP-MS datasets [16, 41-44]. To this end, we applied BioID to identify proximity partners for SARS-CoV-2 proteins in the proteomics "workhorse" 293 cell system.
This mapping project significantly expands upon the SARS-CoV-2 virus-host interactome, providing a rich resource that can be mined by the scientific community for better understanding SARS-CoV-2 pathobiology, and identifying virus-host membrane protein interactions that could be targeted in COVID-19 therapeutics. The design and implementation of sensitive mass spectrometric approaches for the analysis of complex biological samples will be important for clinical and basic research proteomics focused on COVID-19. To this end, we also undertook an analysis of SARS-CoV-2 virions and infected Vero cell lysates using data-dependent acquisition tandem mass spectrometry, and identified 189 unique tryptic peptides, assigned to 17 different virus proteins. This work provides a significantly expanded SARS-CoV-2 tryptic peptide compendium for use in targeted proteomics approaches such as parallel or selected reaction monitoring (PRM/SRM), or for use in spectral library building.
BioID, mass spectrometry and data analysis were conducted exactly as in [16].
Supp Table 1. Virus peptide identification dataset, complete raw data. Data for viral preparation and infected cells are presented in individual tabs.
Table 2
Genomic characterization of the 2019 novel human-pathogenic coronavirus isolated from a patient with atypical pneumonia after visiting Wuhan
A new coronavirus associated with human respiratory disease in China
Multi-level proteomics reveals host-perturbation strategies of
A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells
Sequence, Infectivity, and Replication Kinetics of Severe Acute Respiratory Syndrome Coronavirus 2. Emerg Infect Dis
Data, reagents, assays and merits of proteomics for SARS-CoV-2 research and testing
Shortlisting SARS-CoV-2 Peptides for Targeted Studies from Experimental Data-Dependent Acquisition Tandem Mass Spectrometry Data
Shotgun proteomics analysis of SARS-CoV-2-infected cells and how it can optimize whole viral particle antigen production for vaccines
A Genome-wide ER-phagy Screen Highlights Key Roles of Mitochondrial Metabolism and ER-Resident UFMylation
Proteomics of SARS-CoV-2-infected host cells reveals therapy targets
Mass Spectrometric Identification of SARS-CoV-2 Proteins from Gargle Solution Samples of COVID-19 Patients
Mass-Spectrometric Detection of SARS-CoV-2 Virus in Scrapings of the Epithelium of the Nasopharynx of Infected Patients via Nucleocapsid N Protein
Proteotyping SARS-CoV-2 Virus from Nasopharyngeal Swabs: A Proof-of-Concept Focused on a 3 Min Mass Spectrometry Window
Global interactomics uncovers extensive organellar targeting by Zika virus
ProHits-viz: a suite of web tools for visualizing interaction proteomics data
An intramembrane chaperone complex facilitates membrane protein biogenesis
Here, there, and everywhere: The importance of ER membrane contact sites
(+)RNA viruses rewire cellular pathways to build replication organelles
Rhinovirus uses a phosphatidylinositol 4-phosphate/cholesterol countercurrent for the formation of replication compartments at the ER-Golgi interface
Fat(al) attraction: Picornaviruses Usurp Lipid Transfer at Membrane Contact Sites to Create Replication Organelles
Host Lipids in Positive-Strand RNA Virus Genome Replication
The Oxysterol-Binding Protein Cycle: Burning Off PI(4)P to Transport Cholesterol
Hepatitis C Virus Replication Depends on Endosomal Cholesterol Homeostasis
Digesting the crisis: autophagy and coronaviruses
Autophagy enhances the presentation of endogenous viral antigens on MHC class I molecules during HSV-1 infection
Manipulation of autophagy by (+) RNA viruses
Regulation of MAVS Expression and Signaling Function in the Antiviral Innate Immune Response
SARS-coronavirus open reading frame-9b suppresses innate immunity by targeting mitochondria and the MAVS/TRAF3/TRAF6 signalosome
SARS-CoV-2 Orf9b suppresses type I interferon responses by targeting TOM70
Viral subversion of nucleocytoplasmic trafficking
SARS-CoV-2 nsp13, nsp14, nsp15 and orf6 function as potent interferon antagonists
Nutrient regulation of mTORC1 at a glance
Adapting the Stress Response: Viral Subversion of the mTOR Signaling Pathway
LAMTOR2/LAMTOR1 complex is required for TAX1BP1-mediated xenophagy
Concise review: Fragile X proteins in stem cell maintenance and differentiation
Mutations in Hypervariable Domain of Venezuelan Equine Encephalitis Virus nsP3 Protein Differentially Affect Viral Replication
Hypervariable Domain of Eastern Equine Encephalitis Virus nsP3 Redundantly Utilizes Multiple Cellular Proteins for Replication Complex Assembly
New World and Old World Alphaviruses Have Evolved to Exploit Different Components of Stress Granules, FXR and G3BP Proteins, for Assembly of Viral Replication Complexes
BioID-based Identification of Skp Cullin F-box (SCF)beta-TrCP1/2 E3 Ligase Substrates
Getting to know the neighborhood: using proximity-dependent biotinylation to characterize protein complexes and map organelles
Parallel Exploration of Interaction Space by BioID and Affinity Purification Coupled to Mass Spectrometry
Proximity biotinylation and affinity purification are complementary approaches for the interactome mapping of chromatin-associated protein complexes
SAINT: probabilistic scoring of affinity purification-mass spectrometry data