key: cord-0000667-x5ardo5j authors: Pedersen, Lasse Eggers; Harndahl, Mikkel; Rasmussen, Michael; Lamberth, Kasper; Golde, William T.; Lund, Ole; Nielsen, Morten; Buus, Soren title: Porcine major histocompatibility complex (MHC) class I molecules and analysis of their peptide-binding specificities date: 2011-07-08 journal: Immunogenetics DOI: 10.1007/s00251-011-0555-3 sha: 28e58cd0eba1d560f6dbc872d7df80e0c98cfced doc_id: 667 cord_uid: x5ardo5j In all vertebrate animals, CD8(+) cytotoxic T lymphocytes (CTLs) are controlled by major histocompatibility complex class I (MHC-I) molecules. These are highly polymorphic peptide receptors selecting and presenting endogenously derived epitopes to circulating CTLs. The polymorphism of the MHC effectively individualizes the immune response of each member of the species. We have recently developed efficient methods to generate recombinant human MHC-I (also known as human leukocyte antigen class I, HLA-I) molecules, accompanying peptide-binding assays and predictors, and HLA tetramers for specific CTL staining and manipulation. This has enabled a complete mapping of all HLA-I specificities (“the Human MHC Project”). Here, we demonstrate that these approaches can be applied to other species. We systematically transferred domains of the frequently expressed swine MHC-I molecule, SLA-1*0401, onto a HLA-I molecule (HLA-A*11:01), thereby generating recombinant human/swine chimeric MHC-I molecules as well as the intact SLA-1*0401 molecule. Biochemical peptide-binding assays and positional scanning combinatorial peptide libraries were used to analyze the peptide-binding motifs of these molecules. A pan-specific predictor of peptide–MHC-I binding, NetMHCpan, which was originally developed to cover the binding specificities of all known HLA-I molecules, was successfully used to predict the specificities of the SLA-1*0401 molecule as well as the porcine/human chimeric MHC-I molecules. 
These data indicate that it is possible to extend the biochemical and bioinformatics tools of the Human MHC Project to other vertebrate species. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s00251-011-0555-3) contains supplementary material, which is available to authorized users.

Major histocompatibility complex class I (MHC-I) molecules are found in all vertebrate animals, where they play a crucial role in generating specific cellular immune responses against viruses and other intracellular pathogens. They are highly polymorphic proteins that bind 8-11 amino acid long peptides derived from the intracellular protein metabolism. The resulting heterotrimeric complexes, consisting of the MHC-I heavy chain, the monomorphic light chain beta-2 microglobulin (β2m), and specifically bound peptides, are translocated to the cell surface, where they are displayed as target structures for peptide-specific, MHC-I-restricted CTLs. If a peptide of foreign origin is detected, the T cells may become activated and kill the infected target cell.

MHC-I is extremely polymorphic. In humans, more than 3,400 different human leukocyte antigen class I (HLA-I) molecules have been registered (as of January 2011), and this number is currently growing rapidly as more efficient HLA typing techniques are employed worldwide. The polymorphism of the MHC-I molecule is concentrated in and around the peptide-binding groove, where it determines the peptide-binding specificity. Due to this polymorphism, it is highly unlikely that any two individuals will share the same set of HLA-I molecules and thereby present the same peptides and generate T cell responses of the same specificities, something that would otherwise give microorganisms a strong evolutionary chance of escape. Rather, this polymorphism can be seen as diversifying peptide presentation, thereby individualizing T cell responses and reducing the risk that escape variants of microorganisms might evolve.
In 1999, we proposed that all human MHC specificities should be mapped ("the Human MHC Project") as a preamble for the application of MHC information and technologies in humans (Buus 1999). Since then, we have developed large-scale tools that are generally applicable towards this goal: production, analysis, prediction, and validation of peptide-MHC interactions (Ferre et al. 2003; Harndahl et al. 2009; Hoof et al. 2009; Larsen et al. 2005; Lundegaard et al. 2008; Nielsen et al. 2003, 2007; Ostergaard et al. 2001; Pedersen et al. 1995; Stranzl et al. 2010; Stryhn et al. 1996), and a "one-pot, mix-and-read" HLA-I tetramer technology for specific T cell analysis (Leisner et al. 2008). Here, we demonstrate that many of these tools can be transferred to other vertebrate animals, as exemplified by an important livestock animal, the pig. We have successfully generated a recombinant swine leukocyte antigen I (SLA-I) protein, SLA-1*0401, one of the most common SLA molecules of swine (Smith et al. 2005). Using this protein, we have developed the accompanying biochemical peptide-binding assays and demonstrated that the immunoinformatics tools originally developed to cover all HLA-I molecules can, despite the evolutionary distance, be applied to SLA-I molecules. We suggest that the "human MHC project" can be extended to cover other species of interest.

All peptides were purchased from Schafer-N, Denmark (www.schafer-n.com). Briefly, they were synthesized by standard 9-fluorenylmethyloxycarbonyl (Fmoc) chemistry, purified by reversed-phase high-performance liquid chromatography (to at least 80% purity, frequently 95-99% purity), validated by mass spectrometry, and quantitated by weight. Positional scanning combinatorial peptide library (PSCPL) peptides were synthesized using standard solid-phase Fmoc chemistry on 2-chlorotrityl chloride resins.
Briefly, an equimolar mixture of 19 of the common Fmoc amino acids (excluding cysteine) was prepared for each synthesis and used for coupling in 8 positions, whereas a single type of Fmoc amino acid (including cysteine) was used in one position. This position was changed in each synthesis, starting with the N-terminus and ending with the C-terminus. In one synthesis, the amino acid pool was used in all nine positions. X denotes the random incorporation of amino acids from the mixture, whereas the single letter amino acid abbreviation is used to denote the identity of the fixed amino acid. The peptides in each synthesis were cleaved from the resin in trifluoroacetic acid/1,2-ethanedithiol/triisopropylsilane/water 95:2:1:3 v/v/v/v, precipitated in cold diethylether, and extracted with water before desalting on C18 columns, freeze drying, and weighing.

Recombinant constructs encoding chimeric and SLA-1*0401 molecules

A synthetic gene encoding a transmembrane-truncated fragment encompassing residues 1 to 275 of the human HLA-A*11:01 alpha chain followed by a FXa-BSP-HAT tag (FXa = factor Xa cleavage site comprised of the amino acid sequence IEGR, BSP = biotinylation signal peptide, HAT = histidine affinity tag for purification purposes; see Online Resource 1) had previously been generated and inserted into the pET28 expression plasmid (Novagen) (Ferre et al. 2003). Synthetic genes encoding the corresponding fragments of the SLA-1*0401 alpha chain, α1α2 and α3, respectively (Sullivan et al. 1997), were purchased from GenScript. To exchange domains and generate chimeric human/swine MHC-I gene constructs, a type II restriction endonuclease-based cloning strategy (SeamLess® Stratagene; Cat#214400, Revision#021003a), with modifications, was used. All primers were purchased HPLC-purified from Eurofins MWG Operon (Ebersberg, Germany), and all PCR amplifications were performed in a DNA Engine Dyad PCR instrument (MJ Research, MN, USA).
All constructs were validated by DNA sequencing. The following MHC-I heavy chain constructs were made: HHH, HHP, HPP, PHP, and PPP, where the first, second, and third letters indicate domains α1 (positions 1-90), α2 (positions 91-181), and α3 (positions 182-275), respectively, and H indicates that the domain is of HLA-A*11:01 origin, whereas P indicates that it is of SLA-1*0401 origin. Constructs were transformed into DH5α cells, cloned, and sequenced (ABI Prism 3100Avant, Applied Biosystems). Validated constructs of interest were transformed into an Escherichia coli production cell line, BL21(DE3), carrying the pACYC184 expression plasmid (Avidity, Denver, USA), which contains an isopropyl β-D-1-thiogalactopyranoside (IPTG)-inducible BirA gene to express biotin ligase. This leads to almost complete in vivo biotinylation of the desired product (Leisner et al. 2008). To maintain the pET28-derived plasmids, the media were supplemented with kanamycin (50 μg/ml) throughout the expression cultures. When appropriate, the media were further supplemented with chloramphenicol (20 μg/ml) to maintain the BirA-containing pACYC184 plasmid. E. coli BL21(DE3) cells transformed with appropriate plasmids were grown for 5 h at 30°C, and a 10-ml sample adjusted to OD600 = 1 was then transferred to a 2-l fed-batch fermentor (LabFors®). To induce protein expression, IPTG (1 mM) was added at OD600 ∼25, and the culture was continued for an additional 3 h at 42°C (for in vivo biotinylation of the product, the induction media were further supplemented with biotin (Sigma #B4501, 125 μg/ml)). Samples were analyzed by reducing sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) before and after IPTG induction.
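The three-letter construct naming above (HHH, HHP, HPP, PHP, PPP) maps directly onto slices of the two parental heavy chains. The following is a minimal sketch of that mapping, using the domain boundaries given in the text. The function name `build_chimera`, the `DOMAINS` table, and the placeholder sequences are illustrative assumptions; they are not the actual HLA-A*11:01 or SLA-1*0401 sequences, and the sketch does not represent the cloning procedure itself.

```python
# Domain boundaries (1-based, inclusive) as stated in the text.
DOMAINS = {"alpha1": (1, 90), "alpha2": (91, 181), "alpha3": (182, 275)}


def build_chimera(code, human_seq, pig_seq):
    """Assemble a heavy-chain sequence from a three-letter code such as 'HHP'.

    'H' takes the domain from the human parent, 'P' from the porcine parent.
    """
    if len(code) != 3 or set(code) - {"H", "P"}:
        raise ValueError("code must be three letters drawn from 'H' and 'P'")
    if len(human_seq) < 275 or len(pig_seq) < 275:
        raise ValueError("both parental sequences must cover positions 1-275")
    parents = {"H": human_seq, "P": pig_seq}
    parts = []
    for letter, (start, end) in zip(code, DOMAINS.values()):
        # Convert 1-based inclusive positions to a Python slice.
        parts.append(parents[letter][start - 1:end])
    return "".join(parts)


# Placeholder sequences (hypothetical, NOT real MHC sequences); they only
# make the origin of each domain visible in the assembled string.
human = "h" * 275
pig = "p" * 275

for code in ("HHH", "HHP", "HPP", "PHP", "PPP"):
    chimera = build_chimera(code, human, pig)
    # Show the residues flanking each domain junction (90/91 and 181/182).
    print(code, len(chimera), chimera[89:91], chimera[180:182])
```

With real parental sequences in place of the placeholders, the same slicing would yield the 275-residue extracellular segment of each construct, before addition of the C-terminal tag.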
At the end of the induction culture, protease inhibitor (PMSF, 80 μg/l) was added, cells were lysed in a cell disrupter (Constant Cell Disruptor Systems set at 2,300 bar), and the released inclusion bodies were isolated by centrifugation (Sorvall RC6, 20 min, 17,000×g). The inclusion bodies were washed twice in PBS, 0.5% NP-40 (Sigma), and 0.1% deoxycholic acid (Sigma) and extracted into urea-Tris buffer (8 M urea, 25 mM Tris, pH 8.0), and any contaminating DNA was precipitated with streptomycin sulfate (1% w/v). The dissolved MHC-I proteins were purified by Ni2+/IDA metal chelating affinity column chromatography followed by Q-Sepharose ion exchange column chromatography, hydrophobic interaction chromatography, and eventually by Superdex-200 size exclusion chromatography. Fractions containing MHC-I heavy chain molecules were identified by A280 absorbance and SDS-PAGE and pooled. Throughout purification and storage, the MHC-I heavy chain proteins were dissolved in 8 M urea to keep them denatured. Note that the MHC-I heavy chain proteins were at no time exposed to reducing conditions. This allowed purification of highly active pre-oxidized moieties as previously described (Ostergaard et al. 2001). Protein concentrations were determined by bicinchoninic acid assay. The degree of biotinylation (usually >95%) was determined by a gel-shift assay (Leisner et al. 2008). The pre-oxidized, denatured proteins were stored at −20°C in Tris-buffered 8 M urea.

Recombinant constructs encoding human and porcine beta-2 microglobulin

Recombinant human β2m was expressed and purified as described elsewhere (Ostergaard et al. 2001; Ferre et al. 2003). Using a previously reported E. coli codon-optimized gene encoding human β2m as template, a gene encoding porcine β2m was generated by multiple rounds of site-directed mutagenesis (QuikChange® Stratagene, according to the manufacturer's instructions) (Online Resource 2).
Briefly, the genes encoding human or pig β2m were N-terminally fused to a sequence encoding a histidine affinity tag (HAT) followed by a factor Xa (FXa) protease cleavage site, inserted into the pET28 vector, and expressed in inclusion bodies in E. coli. The fusion proteins were extracted into 8 M urea, purified by immobilized metal affinity chromatography (IMAC), and refolded by dilution. The fusion tags were then removed by FXa protease digestion. The liberated intact and native human or pig β2m were purified by IMAC and gel filtration chromatography, analyzed by SDS-PAGE, concentrated, and stored at −20°C until use (Fig. 1).

Purification and refolding of recombinant porcine β2m proteins

Porcine β2m was purified in the same way as human β2m (Ostergaard et al. 2001; Ferre et al. 2003). Briefly, the urea-dissolved β2m protein was purified by Ni2+/IDA metal chelating affinity column chromatography, refolded by drop-wise dilution into an excess of refolding buffer under stirring (25 mM Tris, 300 mM urea, pH 8.00), and then concentrated (VivaFlow, 10 kDa). The refolded product was purified by Ni2+/IDA metal chelating affinity column chromatography again (this time in aqueous buffer, i.e., without urea). Fractions containing HAT-pβ2m were identified by SDS-PAGE and pooled. Removal of the HAT tag was performed by cleavage with factor Xa protease (FXa), followed by renewed purification by Ni2+/IDA metal chelating affinity and Superdex200 gel filtration column chromatography. The product was concentrated by spin ultrafiltration (10 kDa), mixed 1:1 with glycerol, and stored at −20°C.

Protein samples were mixed 1:1 in SDS sample buffer (4% SDS, 17.4% glycerol, 0.003% bromophenol blue, 0.125 M Tris, 8 mM IAA (iodoacetamide)) with or without reducing agent (2-mercaptoethanol) as indicated, boiled for 3 min, spun at 20,000×g for 1 min, and loaded onto a 12% or 15% running gel with a 5% stacking gel. Gels were run at 180 V, 40 mA for 50 min.
Peptide-MHC class I interaction measured by radioassay and spun column chromatography

An HLA-A*11:01-binding peptide, KVFPYALINK (non-natural consensus sequence A3CON1), was radiolabeled with iodine (125I) using a chloramine-T procedure (Hunter and Greenwood 1962). Dose titrations of MHC-I heavy chains (HHH or HHP) were diluted into refolding buffer (Tris-maleate-PBS), mixed with β2m (human or porcine) and radiolabeled peptide, and incubated at 18°C overnight. Binding of radiolabeled peptide to MHC-I was then determined in duplicate by Sephadex™ G50 spun column gel chromatography as previously described. MHC-bound peptide eluted in the excluded volume, whereas free peptide was retained on the microcolumn. Both fractions were counted by gamma spectroscopy, and the fraction of peptide bound was calculated as excluded radioactivity divided by total radioactivity. To examine the affinity of the interaction, increasing concentrations of unlabeled competitor peptide were added. When conducted under limiting concentrations of MHC-I molecule, the concentration of competitor peptide needed to effect 50% inhibition of the interaction, the IC50, is an approximation of the affinity of the interaction between MHC-I and the competitor peptide.

Peptide-MHC class I interaction measured by an enzyme-linked immunosorbent assay

Peptide-MHC-I interaction was also measured in a modified version of a previously described enzyme-linked immunosorbent assay (ELISA) (Sylvester-Hvid et al. 2002). Briefly, denatured biotinylated recombinant MHC-I heavy chains were diluted into a renaturation buffer containing β2m and graded concentrations of the peptide to be tested and incubated at 18°C for 48 h, allowing equilibrium to be reached. We have previously demonstrated that denatured MHC molecules can fold de novo efficiently, but only in the presence of appropriate peptide.
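For the spun-column radioassay described above, the readout reduces to two small calculations: the fraction of peptide bound (excluded counts over total counts) and the competitor concentration giving 50% inhibition. The sketch below is one simple way to do this, assuming that total radioactivity is the sum of the excluded and retained fractions and that the IC50 is read off by linear interpolation on a log10 concentration axis. The function names and the example numbers are hypothetical, and the original study may well have used curve fitting instead.

```python
import math


def fraction_bound(excluded_cpm, retained_cpm):
    """Fraction of radiolabeled peptide bound: excluded counts / total counts."""
    total = excluded_cpm + retained_cpm
    if total <= 0:
        raise ValueError("total radioactivity must be positive")
    return excluded_cpm / total


def ic50_by_interpolation(concentrations, bound_fractions, uninhibited):
    """Estimate the competitor concentration that halves the uninhibited binding.

    concentrations  -- competitor concentrations in increasing order
    bound_fractions -- fraction bound measured at each concentration
    uninhibited     -- fraction bound in the absence of competitor
    Returns None when the data never cross the 50% level.
    """
    half = uninhibited / 2.0
    points = list(zip(concentrations, bound_fractions))
    for (c_lo, b_lo), (c_hi, b_hi) in zip(points, points[1:]):
        if b_lo >= half >= b_hi and b_lo != b_hi:
            # Linear interpolation between the two bracketing points,
            # carried out on a log10 concentration scale.
            t = (b_lo - half) / (b_lo - b_hi)
            log_ic50 = math.log10(c_lo) + t * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


# Hypothetical competition data (nM competitor, fraction bound).
competitor_nM = [1, 10, 100, 1000]
bound = [0.40, 0.30, 0.10, 0.02]
estimate = ic50_by_interpolation(competitor_nM, bound, uninhibited=0.40)
print(f"estimated IC50: {estimate:.1f} nM")
```

As the text notes, such an IC50 approximates the affinity only when the MHC-I concentration is limiting.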
The concentration of peptide-MHC complexes generated was measured in a quantitative sandwich ELISA (using streptavidin as capture layer and the monoclonal anti-β2m antibody, BBM1, as detection layer) and plotted against the concentration of peptide offered (Sylvester-Hvid et al. 2002). A prefolded, biotinylated FLPSDYFPSV/HLA-A*02:01 (Kast et al. 1994) complex was used as standard. Because the effective concentration of MHC (3-5 nM) used in these assays is below the equilibrium dissociation constant (KD) of most high-affinity peptide-MHC interactions, the peptide concentration leading to half-saturation of the MHC, the ED50, is a reasonable approximation of the affinity of the interaction.

The experimental strategy of PSCPL has previously been described (Stryhn et al. 1996). The construction of the sublibraries and the ELISA-driven quantitative measurements of MHC interaction are as given above. Briefly, the relative binding (RB) affinity of each PSCPL sublibrary was determined as $\mathrm{RB}(\mathrm{PSCPL}) = \mathrm{ED}_{50}(X_9)/\mathrm{ED}_{50}(\mathrm{PSCPL})$ (where ED50 is the concentration needed to half-saturate a low concentration of MHC-I molecules) and normalized so that the sum of the RB values of the 20 naturally occurring amino acids equals 20 (since peptides with a given amino acid in a given position are 20 times more frequent in the corresponding PSCPL sublibrary than in the completely random X9 library). An RB value above 2 was taken to mean that the corresponding amino acid is favored in that position, whereas an RB value below 0.5 was taken to mean that it is unfavorable (these thresholds represent the 95% confidence intervals). An anchor position (AP) value was calculated by the equation $\mathrm{AP} = \sum (\mathrm{RB} - 1)^2$, summing over the 20 amino acids in a given position. A primary anchor position is characterized by one or few amino acids being strongly preferred and many amino acids being unacceptable. We have arbitrarily defined anchor positions as having an AP value above 15.
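The RB and AP definitions above can be written out as a short calculation for one peptide position. This is a minimal sketch under one reading of the normalization step (rescaling the raw ratios so that they sum to the number of amino acids, i.e., 20); the function names and the ED50 values are hypothetical and are not data from this study.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def relative_binding(ed50_random, ed50_sublibraries):
    """Normalized RB values for one peptide position.

    ed50_random       -- ED50 of the completely random X9 library
    ed50_sublibraries -- dict: fixed amino acid -> ED50 of that sublibrary
    Raw RB = ED50(X9) / ED50(sublibrary); the values are then rescaled so
    that they sum to the number of amino acids (20).
    """
    raw = {aa: ed50_random / ed50 for aa, ed50 in ed50_sublibraries.items()}
    scale = len(raw) / sum(raw.values())
    return {aa: rb * scale for aa, rb in raw.items()}


def anchor_position_value(rb_values):
    """AP = sum over amino acids of (RB - 1)^2 for one position."""
    return sum((rb - 1.0) ** 2 for rb in rb_values.values())


# Hypothetical ED50 values (nM): one strongly preferred residue (Y),
# all other sublibraries bind worse than the random library.
ed50 = {aa: 2000.0 for aa in AMINO_ACIDS}
ed50["Y"] = 100.0

rb = relative_binding(1000.0, ed50)
ap = anchor_position_value(rb)
favored = [aa for aa, value in rb.items() if value > 2.0]
disfavored = [aa for aa, value in rb.items() if value < 0.5]
print("favored:", favored, "disfavored:", disfavored)
print("anchor position (AP > 15):", ap > 15.0)
```

Under this reading, the ED50 of the random library cancels once the values are rescaled, so the normalized RB profile depends only on the relative ED50 values of the 20 sublibraries.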
The peptide-SLA-1*0401 binding activity of each sublibrary was determined using a previously published biochemical binding assay (ELISA) (Sylvester-Hvid et al. 2002) (with the modifications described above).

Sequence logos describing the predicted binding motif for each MHC molecule were calculated as described by Rapin et al. (2010). In short, the binding affinity for a set of 1,000,000 random natural 9mer peptides was predicted using the NetMHCpan method, and the 1% strongest binding peptides were selected for construction of a position-specific scoring matrix (PSSM). The PSSM was constructed as previously described, including pseudo-count correction for low counts. Next, sequence logos were generated from the amino acid frequencies identified in the PSSM construction. For each position, the frequency of all 20 amino acids is displayed as a stack of letters. The total height of the stack represents the sequence conservation (the information content), while the individual height of the symbols relates to the relative frequency of that particular symbol at that position. Letters shown upside-down are underrepresented compared to the background (for details see Rapin et al. (2010)).

MHC distance trees were derived from correlations between predicted binding affinities. For each allelic MHC-I molecule, the binding affinity was predicted for a set of 200,000 random natural peptides using the NetMHCpan method. Next, the distance between any two alleles was defined as $D = 1 - \mathrm{PCC}$, where PCC is the Pearson correlation coefficient between the predicted affinities of the two alleles, calculated over the superset of peptides that fall within the top 10% best binding peptides for either allele. In this measure, two molecules that share a similar binding specificity will have a distance close to 0, whereas two molecules with non-overlapping binding specificities will have a distance close to 2. Using bootstrap, 100 such distance trees were generated, and branch bootstrap values and the consensus tree were calculated.
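The distance measure described above, one minus the Pearson correlation over the peptides that are top binders for either allele, can be sketched in a few lines. The code below assumes that higher scores mean stronger predicted binding and that the "superset" is the union of the two top-10% sets; the function names and the toy score lists are hypothetical, and no actual NetMHCpan output is used.

```python
import math


def top_fraction_indices(scores, fraction=0.10):
    """Indices of the best-scoring peptides (higher score = stronger binding)."""
    n_top = max(1, int(len(scores) * fraction))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(order[:n_top])


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def specificity_distance(scores_a, scores_b, fraction=0.10):
    """D = 1 - PCC over the union of the top-binding peptides of both alleles."""
    selected = sorted(
        top_fraction_indices(scores_a, fraction)
        | top_fraction_indices(scores_b, fraction)
    )
    return 1.0 - pearson(
        [scores_a[i] for i in selected], [scores_b[i] for i in selected]
    )


# Toy predicted scores for 100 peptides (hypothetical values).
allele_a = [float(i) for i in range(100)]
allele_same = list(allele_a)                 # identical specificity
allele_opposite = [99.0 - s for s in allele_a]  # inverted preference

print(specificity_distance(allele_a, allele_same))
print(specificity_distance(allele_a, allele_opposite))
```

A bootstrap over peptides (resampling the peptide set and recomputing the distance matrix) would then give the replicate trees mentioned in the text; that step is omitted here.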
We have previously generated highly active, recombinant human MHC-I (HLA-I) molecules and accompanying high-throughput assays and bioinformatics prediction resources. Here, we transfer the underlying approaches to an important domesticated livestock animal, the pig, and its MHC system, the SLAs. MHC-I molecules are composed of a unique and highly variable distal peptide-binding platform consisting of the alpha 1 (α1) and alpha 2 (α2) domains of the MHC-I heavy chain (HC) and a much more conserved proximal immunoglobulin-like membrane attaching stalk consisting of the alpha 3 (α3) domain of the HC noncovalently associated with the soluble MHC-I light chain (β2m). A priori, the establishment of recombinant SLA molecules is complicated by the lack of validated reagents. Any failure could therefore be caused either by real technical problems in generating SLA molecules, or merely by a lack of information about strong peptide binders to the SLA in question. To reduce this uncertainty, we decided to migrate from human to pig MHC-I in a step-wise manner and generate an intermediary chimeric MHC-I molecule composed of a well-known human peptide-binding platform attached to an SLA stalk, which might allow us to assess whether we could generate a functional SLA stalk consisting of SLA-1*0401 α3 HC and pig β2m. To this end, we used the α1α2 domains of the HLA-A*11:01 molecule, which we expected should be able to bind a known high-affinity HLA-A*11:01-binding peptide (KVFPYALINK). This peptide could be 125I-radiolabeled and used in a very robust peptide-binding assay testing whether the human stalk could be replaced with the corresponding SLA stalk. Once that had been successfully established, the entire SLA-1*0401 molecule would be constructed and tested.
We have previously expressed and purified the extracellular segment spanning positions 1-275 of the human HLA-A*11:01 in a denatured and pre-oxidized version that rapidly refolds and binds appropriate target peptides (Ostergaard et al. 2001; Ferre et al. 2003). Codon-optimized genes encoding the corresponding segments of SLA-1*0401 (α1α2) and SLA-1*0401 (α3) were constructed as described in the "Materials and methods" section and used to replace the HLA-A*11:01 gene segment in the above construct, generating a new construct allowing for the expression of SLA-1*0401. For the generation of HLA-A*11:01/SLA-1*0401 chimeras, the genes encoding the α1 (spanning positions 1-90), α2 (spanning positions 91-181), and/or α3 (spanning positions 182-275) domains of HLA-A*11:01 and SLA-1*0401 were exchanged using SeamLess and touch-down cloning strategies. Genes encoding the extracellular segments 1-275 of the above natural or chimeric MHC-I molecules were C-terminally fused to a biotinylation tag (as indicated for SLA-1*0401 in Online Resource 1), inserted into pET28, and expressed in inclusion bodies in E. coli (Fig. 2 shows SDS-PAGE of lysates of recombinant E. coli before and 3 h after IPTG induction). The fusion proteins were extracted into 8 M urea (without any reducing agents), purified by ion exchange, hydrophobic interaction, and gel filtration chromatography (all conducted in 8 M urea, without any reducing agents) (Fig. 3 shows SDS-PAGE of the purified SLA-1*0401 after gel filtration), concentrated, and stored in urea at −20°C.

Testing a chimeric molecule consisting of a SLA-1*0401 stalk and a HLA-A*11:01 peptide-binding platform: comparing human versus porcine β2m

To test the proximal immunoglobulin-like membrane attaching SLA stalk, we generated recombinant porcine β2m and a chimeric human/porcine MHC-I heavy chain molecule where the α1α2 domains were derived from the human HLA-A*11:01 and the α3 domain was derived from the porcine SLA-1*0401.
Since this construct contains the entire peptide-binding platform of HLA-A*11:01, we reasoned that the binding of the HLA-A*11:01-restricted peptide, KVFPYALINK, could be used as a functional readout of the refolding, activity, and assembly of the entire chimeric molecule including the porcine SLA stalk. For comparison, we tested the supportive capacity of human β2m and the folding ability of the entirely human HLA-A*11:01. A total of four combinations could therefore be tested: porcine or human β2m in combination with either HHP or HHH (where the first letter indicates the origin of the α1 domain (Human HLA-A*11:01 or Porcine SLA-1*0401), the second letter the origin of the α2 domain, and the third letter the origin of the α3 domain). A concentration titration of heavy chain was added to a fixed excess concentration (3 μM) of β2m and a fixed trace concentration (23 nM) of radiolabeled peptide. As shown in Figs. 4 and 5, the four combinations gave almost the same heavy chain dose titration, with half-saturation occurring around 1-2 nM heavy chain. Porcine β2m supported folding of the chimeric (HHP) α chain slightly better than it supported folding of the human (HHH) α chain. Human β2m supported folding of HHP and HHH equally well. Thus, a recombinant SLA stalk can fold and support peptide binding of the peptide-binding platform. These results also suggest that human β2m can support folding and peptide binding of porcine MHC-I heavy chain molecules.

Using a positional scanning combinatorial peptide library approach to perform an unbiased analysis of the specificity of SLA-1*0401 and human-pig chimeric MHC class I molecules

Using human β2m to support folding, the recombinant SLA-1*0401 and human-pig chimeric MHC-I molecules were tested for peptide binding. We have previously described how PSCPL can be used to perform an unbiased analysis of MHC-I molecules (Stryhn et al. 1996).
A PSCPL consists of 20 sublibraries for each position; in each sublibrary, one of the 20 natural amino acids has been locked in that position and all other positions contain random amino acids. By analyzing how much of each PSCPL sublibrary is needed to support MHC-I folding (see examples in Fig. 6) and comparing each sublibrary with a completely random library, the effect of any amino acid in any position can be examined and expressed as an RB value. Further, an AP value calculated as the sum of squared deviations of RB values for each position can be used to identify the most prominent anchor position (see "Materials and methods" for calculations). Thus, the specificity of a nonamer-binding MHC-I molecule can be analyzed comprehensively with 9 × 20 + 1 (the completely random library) = 181 sublibraries (Stryhn et al. 1996). Here, this approach was used to perform a complete experimental analysis of SLA-1*0401 and a limited analysis of the chimeric HPP and PHP molecules. A nonamer PSCPL analysis of SLA-1*0401 can be seen in Table 1. AP values identified positions 9, 3, and 2 (in that order of importance) as the anchor positions of SLA-1*0401. In position 9, the amino acid preferences were dominated by the large and bulky aromatic tyrosine (Y), tryptophan (W), and phenylalanine (F), all having RB values above 4 (Table 1). In the almost equally important position 3, preferences for the negatively charged amino acids glutamic acid (E) and aspartic acid (D) were observed. In the less important position 2, the most preferred amino acids were the hydrophobic amino acids valine (V), isoleucine (I), and leucine (L), followed by the polar amino acids threonine (T) and serine (S). Finally, a very limited PSCPL analysis was performed for the two chimeric human HLA-A*11:01/porcine SLA-1*0401 MHC-I molecules, HPP and PHP (Table 2). For both chimeric molecules, it could be demonstrated that position 9 is a strong anchor position.
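The library layout described above (9 × 20 fixed-position sublibraries plus one completely random library) can be enumerated directly, using X for a randomized position as in the Materials and methods. This is only an illustrative sketch; the function name `pscpl_sublibraries` and the alphabetical ordering of amino acids are assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 natural amino acids


def pscpl_sublibraries(length=9):
    """List the sublibrary patterns of a positional scanning library.

    'X' marks a randomized position; a letter marks the fixed amino acid.
    The first entry is the completely random library (X in all positions).
    """
    libraries = ["X" * length]
    for position in range(length):
        for aa in AMINO_ACIDS:
            libraries.append("X" * position + aa + "X" * (length - position - 1))
    return libraries


nonamer_libraries = pscpl_sublibraries(9)
print(len(nonamer_libraries))          # total number of sublibraries
print(nonamer_libraries[0])            # completely random library
print(nonamer_libraries[1], nonamer_libraries[-1])
```

Each pattern corresponds to one synthesis; the measured ED50 of a pattern such as `XXXXXXXXY` relative to that of `XXXXXXXXX` gives the raw RB value for tyrosine in position 9.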
The positively charged amino acids arginine (R) and lysine (K) were preferred in position 9 of the chimeric HPP molecule, similar to the position 9 specificity of the HLA-A*11:01 molecule. In contrast, the aromatic amino acid tyrosine (Y) was exclusively preferred in position 9 of the chimeric PHP molecule, similar to the position 9 specificity of the SLA-1*0401 molecule.

Using NetMHCpan to predict peptides that bind to SLA-1*0401 or to human-pig chimeric MHC class I molecules

Our recently described neural network-driven bioinformatics predictor, NetMHCpan (version 2.0), has been trained on about 88,000 peptide-binding data points representing more than 80 different MHC-I molecules (primarily HLA-A and HLA-B molecules). We have previously shown that NetMHCpan is an efficient tool to identify peptides that bind to HLA molecules where no prior data exist (Nielsen et al. 2007) and demonstrated that NetMHCpan can be extended to MHC-I molecules of other species (Hoof et al. 2009; see footnote 1). We applied NetMHCpan to our peptide repository of about 10,000 peptides, which over the past decade have been selected to scan infectious agents (e.g., SARS and influenza; Sylvester-Hvid et al. 2004; Wang et al. 2010), improve coverage of MHC-I specificities (e.g., Buus et al. 2003; Christensen et al. 2003), etc. We extracted 29 peptides as predicted binders to either the SLA-1*0401, the HPP, or the PHP human/porcine chimeric class I molecules (some of the peptides were predicted to bind to two or even all three of these molecules).
All these peptide-MHC-I combinations were tested for binding (Table 3): 13 of 14 peptides bound to the SLA-1*0401 molecule with an affinity (IC50 value) better than 500 nM (6 with an affinity better than 50 nM); all 13 peptides tested on the PHP molecule were strong binders with IC50 values below 50 nM; and 3 of 12 peptides tested on the HPP molecule bound with an affinity better than 500 nM. Of the 39 peptide-MHC-I combinations tested, 20 (51%) were found to be good binders, 9 (23%) were average binders, and 10 (26%) did not bind well (Table 3). This is in stark contrast to the 0.5% frequency of binders among randomly selected peptides (Yewdell and Bennink 1999). Next, the NetMHCpan method was used to generate PSSMs and sequence logos from the corresponding amino acid frequencies as described by Nielsen et al. (2004). For each position, the frequencies of all 20 amino acids were displayed as a stack of letters showing the sequence conservation/information content (the height of the entire stack) and the relative frequency of amino acids (the height of the individual amino acids). Figure 7 shows a specificity tree clustering of the SLA-1*0401 molecule compared to prevalent representatives of the 12 common HLA supertypes that NetMHCpan was originally intended to cover. By this token, the specificity of SLA-1*0401 most closely resembles that of HLA-A*01:01.

Footnote 1: A preliminary report of SLA-1*0401 binding was given in Hoof et al. (2009).

Table 1 legend: The normalized relative binding (RB) value indicates whether an amino acid is favored (RB>2, bold numbers) or disfavored (RB<0.5, italic numbers) in a given peptide position. The anchor position (AP) value is given by the equation $\mathrm{AP} = \sum (\mathrm{RB} - 1)^2$. The important anchor positions 2, 3, and 9 for SLA-1*0401 are underlined.
The limited PSCPL analysis of the chimeric MHC-I molecules revealed strong P9 signals with specificities that seemed to be determined by the origin of the α1 domain: the HPP chimera showed an HLA-A*11:01-like P9 specificity, whereas the PHP chimera showed a SLA-1*0401/HLA-A*01:01-like specificity. Since the NetMHCpan predictor successfully captured these chimeric specificities (see above), we reasoned that the predictor might also be used to perform in silico dissection of these specificities and used the P9 specificity as an example of such an in silico analysis. The NetMHCpan predictor considers a pseudo-sequence consisting of 34 polymorphic positions, which contain residues that are within 4.0 Å of the atoms of bound nonamer peptides (Nielsen et al. 2007). Of the 34 positions of the pseudo-sequence, 10 delineate the P9 binding pocket; however, only 3 of these, positions 74, 77, and 97, differ between SLA-1*0401 and HLA-A*11:01. To explore the effect of these three residues, we performed in silico experiments where we examined the single substitutions Y74D, G77D, and S97I (the letter before the position number indicates the SLA-1*0401 single letter residue, whereas the letter after indicates the HLA-A*11:01 residue) as well as the corresponding triple substitution (YGS-DDI). As described above, PSSMs were generated for each of the in silico molecules, followed by a specificity tree clustering (including SLA-1*0401, HLA-A*01:01, and HLA-A*11:01). Figure 8 shows this tree along with the sequence logo plots showing the predicted binding specificity of each in silico MHC-I molecule. Although the Y74D and G77D single substitutions showed some of the positively charged P9 peptide residue preference of HLA-A*11:01, they still clustered with HLA-A*01:01. In contrast, the in silico (YGS-DDI) triple substitution clustered with HLA-A*11:01.
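The in silico substitutions described above amount to editing individual positions of the pseudo-sequence before prediction. A minimal sketch of that bookkeeping follows. It represents only the three P9-pocket positions that the text reports as differing (74, 77, and 97, with the SLA-1*0401 residues Y, G, and S); the remaining pseudo-sequence positions are not given in the text and are deliberately left out, and the function name `apply_substitutions` is hypothetical. The prediction step itself (running NetMHCpan on the modified molecule) is not shown.

```python
import re


def apply_substitutions(residues, substitutions):
    """Return a copy of `residues` with substitutions such as 'Y74D' applied.

    residues      -- dict: heavy-chain position -> single-letter residue
    substitutions -- iterable of strings '<old><position><new>'
    The original residue is checked so that a mistyped code fails loudly.
    """
    mutated = dict(residues)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"cannot parse substitution {sub!r}")
        old, position, new = match.group(1), int(match.group(2)), match.group(3)
        if mutated.get(position) != old:
            raise ValueError(f"{sub!r}: position {position} is not {old}")
        mutated[position] = new
    return mutated


# SLA-1*0401 residues at the three differing P9-pocket positions (from the text).
sla_p9_pocket = {74: "Y", 77: "G", 97: "S"}

in_silico_variants = {
    "Y74D": apply_substitutions(sla_p9_pocket, ["Y74D"]),
    "G77D": apply_substitutions(sla_p9_pocket, ["G77D"]),
    "S97I": apply_substitutions(sla_p9_pocket, ["S97I"]),
    "YGS-DDI": apply_substitutions(sla_p9_pocket, ["Y74D", "G77D", "S97I"]),
}
for name, variant in in_silico_variants.items():
    print(name, "".join(variant[p] for p in sorted(variant)))
```

Each variant would then be scored against the same peptide set, and the resulting profiles clustered with the distance measure from the Materials and methods.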
This suggests that the NetMHCpan method is capable of defining the residues of the F pocket that determine the specificity of position 9.
Table note: RB and AP values are defined as described in Table 1.
We have previously suggested that the specificities of the entire human MHC-I system should be solved ("the human MHC", Buus 1999; Lauemoller et al. 2000). However, due to the extreme polymorphism of the MHCs, any attempt to address the specificity of the entire MHC system is a significant experimental undertaking. During the past decade, we have established a series of technologies to support a general solution of human MHC class I and II specificities. For MHC-I, this includes (1) a highly efficient E. coli expression system for production of recombinant human and mouse MHC-I molecules (both heavy chain and light chain (β2m) molecules) (Ostergaard et al. 2001), (2) a purification system for obtaining the highly active pre-oxidized MHC-I heavy chain species (Ferre et al. 2003), (3) a high-throughput homogeneous peptide-MHC-I binding assay for obtaining large data sets on peptide-MHC-I interactions (Harndahl et al. 2009), (4) a positional scanning combinatorial peptide library approach for a robust and unbiased analysis of the specificity of any MHC-I molecule (Stryhn et al. 1996), (5) an immunobioinformatics approach to generate predictors of the peptide-MHC-I interaction, NetMHCpan, which allows predictions to be made for any human MHC-I molecule (HLA-I), even those not yet covered by existing data sets (Hoof et al. 2009; Nielsen et al. 2007), and finally (6) a demonstration that pre-oxidized MHC-I molecules can be used to generate MHC-I tetramers in a simple "one-pot, mix-and-read" manner (Leisner et al. 2008). Here, we propose that the next goal should be to extend the overall approach to MHC-I molecules of other species of interest.
Mice and rats have been extensively studied in the past, but far fewer reagents and much less information have accrued for the MHC-I molecules of other species. Here, we have used an important livestock animal, the pig, as a model system and demonstrated that it is indeed possible to transfer the original human approach to other species. Before attempting to generate a recombinant version of the entire porcine SLA-1*0401 molecule, we grafted the more conserved membrane-proximal "stalk" (the immunoglobulin-like class I heavy chain α3 and β2m domains) of porcine SLA-1*0401 onto the peptide-binding domain of HLA-A*11:01, generating a chimeric human/porcine MHC-I molecule. This chimeric molecule retained the peptide-binding specificity of the HLA-A*11:01 molecule, and it clearly demonstrated that the recombinant porcine stalk was functional and, by inference, also properly folded. It also suggests that the peptide-binding specificity of the distal domains does not crucially depend upon the identity of the proximal stalk. Further, comparing the ability of human and porcine β2m to support MHC-I complex formation using either a human or a porcine MHC-I stalk, we demonstrated that every combination (porcine β2m/human α3, porcine β2m/porcine α3, human β2m/human α3, and human β2m/porcine α3) showed almost the same heavy chain dose titration with identical half-saturations. These results illustrate that porcine and human β2m can each support complex formation with either a porcine or a human stalk, and they suggest that the stalk is quite conserved evolutionarily. Next, we generated the entire SLA-1*0401 heavy chain and succeeded in generating complexes using human β2m as the light chain and PSCPL as peptide donors. The latter solved the a priori problem of not knowing which peptides would be needed to support proper folding of SLA-1*0401, and it did so in an unbiased manner.
Furthermore, this approach is highly efficient since it readily establishes a complete matrix representing the preference for each amino acid at each position of a nonamer peptide. The specificity of SLA-1*0401 shows two primary anchors: one in position 9 with a preference for aromatic amino acids and another in position 3 with a preference for negatively charged amino acids. In addition, SLA-1*0401 features a secondary anchor in position 2 with a preference for hydrophobic or polar amino acids. An alternative approach to the problem of identifying peptides that support folding of MHC-I molecules of hitherto unknown specificity is to use our recently developed pan-specific predictor, NetMHCpan. The successful use of this predictor to initiate peptide-binding studies was recently demonstrated for HLA-A*3001. Although originally developed to cover all HLA-A and HLA-B molecules, it has also been shown to extend to MHC-I molecules of other species (Hoof et al. 2009).
Fig. 7 Specificity tree clustering of the SLA-1*0401 molecule compared to prevalent representatives of the 12 common HLA supertypes. The distance between any two MHC molecules and the consensus tree is calculated as described in "Materials and methods". All branch points in the tree have bootstrap values of 100%. Sequence logos of the predicted binding specificity are shown for each molecule. In the logo, acidic amino acids [DE] are shown in red, basic amino acids [HKR] in blue, hydrophobic amino acids [ACFILMPVW] in black, and neutral amino acids [GNQSTY] in green. In all cases, the x-axis of the logos indicates positions one through nine of the motif, and the y-axis the information content (see "Materials and methods").
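The logo conventions described in the Fig. 7 legend can be sketched in Python as follows. The colour classes are taken from the legend. The information content is computed here with the common Shannon formulation against a uniform 20-letter background, which is an assumption: the definition in "Materials and methods" may differ, for example by using a non-uniform background. The function names and the example frequencies are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Colour classes as listed in the Fig. 7 legend.
LOGO_COLOURS = {"DE": "red", "HKR": "blue", "ACFILMPVW": "black", "GNQSTY": "green"}


def logo_colour(amino_acid):
    """Return the legend colour for a single-letter amino acid code."""
    for members, colour in LOGO_COLOURS.items():
        if amino_acid in members:
            return colour
    raise ValueError(f"unknown amino acid {amino_acid!r}")


def information_content(frequencies):
    """Bits of information at one position: log2(20) minus Shannon entropy.

    Assumption: uniform background over the 20 amino acids.
    """
    entropy = -sum(p * math.log2(p) for p in frequencies.values() if p > 0)
    return math.log2(len(AMINO_ACIDS)) - entropy


def letter_heights(frequencies):
    """Height of each letter: its frequency times the total stack height."""
    total = information_content(frequencies)
    return {aa: p * total for aa, p in frequencies.items() if p > 0}


# Made-up frequencies: an unselective position and a two-residue position.
uniform = {aa: 1 / 20 for aa in AMINO_ACIDS}
two_residue = {"F": 0.5, "Y": 0.5}

print(information_content(uniform))
print(information_content(two_residue))
print(letter_heights(two_residue), logo_colour("F"), logo_colour("Y"))
```

Under these assumptions, an unselective position carries essentially no information and draws as a flat stack, while a position restricted to a few residues draws as a tall stack, which is how anchor positions stand out in the logos.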
Here, we demonstrate that the NetMHCpan predictor is capable of extracting MHC-I sequence information across species and correctly relating this to peptide binding even in the absence of any available data for the specific query MHC-I molecule, i.e., SLA-1*0401 as well as the chimeric HPP (hα1pα2pα3) and PHP (pα1hα2pα3) molecules. It is not clear why binding of the PHP chimera was predicted more efficiently than binding of the HPP chimera. One could speculate that NetMHCpan has not captured the effect of the different positions of the pseudo-sequence equally well, and that not all positions and pockets (and, by inference, not all chimeric molecules) are therefore predicted equally well. When the NetMHCpan predictor was used to cluster SLA-1*0401 and representative molecules of 12 human HLA supertypes according to predicted peptide-binding specificities, the SLA-1*0401 specificity closely resembled that of HLA-A*01:01 (IEDB, http://www.immuneepitope.org/MHCalleleId/142, accessed March 9th 2011). This result was also obvious from an inspection of the PSCPL analysis of SLA-1*0401. The PSCPL analysis of the P9 specificity of SLA-1*0401 and the two chimeric molecules suggested that the P9 specificity was primarily determined by the α1 domain. This contention was further strengthened by a NetMHCpan-driven in silico analysis of the residues delineating the F pocket, which interacts with P9. This suggests that NetMHCpan can be used to design and interpret detailed experiments addressing the structure-function relationship of the peptide-MHC-I interaction. In the case of SLA-1*0401, NetMHCpan suggests that Y74, G77, and S97 play a prominent role in defining the P9 F pocket. Whereas NetMHCpan readily captured the P9 anchor residue of SLA-1*0401, it did not capture the P3 anchor (at least not in version 2.4). We surmise that this shortcoming is due to insufficient examples of the use of P3 anchors within the currently available peptide-MHC-I binding data.
Inspecting the pseudo-sequences of SLA-1*0401 and HLA-A*01:01 vs. HLA-A*11:01 suggests that the presence of an arginine in position 156 might explain the preference for negatively charged amino acid residues in P3. Future NetMHCpan-guided experiments could pointedly address this question, and the resulting data could complement existing data and be used to update and improve the NetMHCpan predictor. All in all, the two complementary approaches, PSCPL and NetMHCpan, agreed on the specificity of the SLA-1*0401 molecule, as well as of the two chimeric MHC-I molecules. Thus, the specificity of SLA-1*0401 appears to be well established. This specificity has successfully been used to search for foot-and-mouth disease virus (FMDV)-specific CTL epitopes in FMDV-vaccinated, SLA-1*0401-positive pigs, and the recombinant SLA-1*0401 molecules have been used to generate corresponding tetramers and stain pig CTLs (Patch et al. 2011).
Fig. 8 Comparison of specific in silico mutations of the SLA-1*0401 molecule with the two HLA molecules HLA-A*11:01 and HLA-A*01:01. The distance between any two MHC molecules and the consensus tree is calculated as described in "Materials and methods". All branch points in the tree have bootstrap values of 100%. The SLA-1*0401 mutations are indicated as Y74D, G77D, and S97I, where the letter before the position number indicates the SLA-1*0401 single-letter residue and the letter after indicates the HLA-A*11:01 residue. YGS-DDI is the corresponding triple substitution. Sequence logos are calculated and visualized as described in Fig. 7. In all cases, the x-axis of the logos indicates positions one through nine of the motif, and the y-axis the information content (see "Materials and methods").
In conclusion, we here present a set of methods that can be used to generate functional recombinant MHC-I molecules, map their specificities and identify MHC-I-restricted epitopes, and eventually generate peptide-MHC-I tetramers for validation of CTL responses. This suite of methods is not only applicable to humans, but potentially to any species of interest.
Description and prediction of peptide-MHC binding: the 'human MHC project'
Receptor-ligand interactions measured by an improved spun column chromatography technique. A high efficiency and high throughput size separation method
Sensitive quantitative predictions of peptide-MHC binding by a 'Query by Committee' artificial neural network approach
Selecting informative data for developing peptide-MHC binding predictors using a query by committee approach
Purification of correctly oxidized MHC class I heavy-chain molecules under denaturing conditions: a novel strategy exploiting disulfide assisted protein folding
High-throughput polymerase chain reaction cleanup in microtiter format
Peptide binding to HLA class I molecules: homogenous, high-throughput screening, and affinity assays
NetMHCpan, a method for MHC class I binding prediction beyond humans
Preparation of iodine-131 labelled human growth hormone of high specific activity
Role of HLA-A motifs in identification of potential CTL epitopes in human papillomavirus type 16 E6 and E7 proteins
The peptide-binding specificity of HLA-A*3001 demonstrates membership of the HLA-A3 supertype
An integrative approach to CTL epitope prediction: a combined algorithm integrating MHC class I binding, TAP transport efficiency, and proteasomal cleavage predictions
Identifying cytotoxic T cell epitopes from genomic and proteomic information: "the human MHC project"
One-pot, mix-and-read peptide-MHC tetramers
Definition of supertypes for HLA molecules using clustering of specificity matrices
NetMHC-3.0: accurate web accessible predictions of human, mouse and monkey MHC class I affinities for peptides of length 8-11
Reliable prediction of T-cell epitopes using neural networks with novel sequence representations
Improved prediction of MHC class I and class II epitopes using a novel Gibbs sampling approach
NetMHCpan, a method for quantitative predictions of peptide binding to any HLA-A and -B locus protein of known sequence
Efficient assembly of recombinant major histocompatibility complex class I molecules with preformed disulfide bonds
Induction of foot-and-mouth disease virus-specific cytotoxic T cell killing by vaccination
The interaction of beta 2-microglobulin (beta 2m) with mouse class I major histocompatibility antigens and its ability to support peptide binding. A comparison of human and mouse beta 2m
The MHC motif viewer: a visualization tool for MHC binding motifs
Peptide binding to the most frequent HLA-A class I alleles measured by quantitative molecular binding assays
Nomenclature for factors of the SLA class-I system
NetCTLpan: pan-specific MHC class I pathway epitope predictions
Peptide binding specificity of major histocompatibility complex class I resolved into an array of apparently independent subspecificities: quantitation by peptide libraries and improved prediction of binding
Analysis of polymorphism in porcine MHC class I genes: alterations in signals recognized by human cytotoxic lymphocytes
Establishment of a quantitative ELISA capable of determining peptide-MHC class I interaction
SARS CTL vaccine candidates; HLA supertype-, genome-wide scanning and biochemical validation
HLA class I binding 9mer peptides from influenza A virus induce CD4 T cell responses
Immunodominance in major histocompatibility complex class I-restricted T lymphocyte responses
Acknowledgments We thank Lise Lotte Bruun Nielsen, Anne Caroline Schmiegelow, and Iben Sara Pedersen for their expert experimental support.
This work was supported in part by the Danish Council for Independent Research, Technology and Production Sciences (274-09-0281) and by the National Institutes of Health (NIH) (HHSN266200400025C).