key: cord-336573-bpg1dg24
authors: Greenaway, Hui Yee; Kurniawan, Monica; Price, David A; Douek, Daniel C; Davenport, Miles P; Venturi, Vanessa
title: Extraction and characterization of the rhesus macaque T cell receptor β-chain genes
date: 2009-06-09
journal: Immunol Cell Biol
DOI: 10.1038/icb.2009.38
sha:
doc_id: 336573
cord_uid: bpg1dg24

Rhesus macaque models have been instrumental for the development and testing of vaccines prior to human studies and have provided fundamental insights into the determinants of immune efficacy in a variety of infectious diseases. However, the characterization of antigen-specific T cell receptor (TCR) repertoires during adaptive immune responses in these models has previously relied on human TCR gene assignments. Here, we extracted and characterized TCR β-chain (TRB) genes from the recently sequenced rhesus macaque genome that are homologous to the human TRB genes. Comparison of the rhesus macaque TRB genes with the human TRB genes revealed an average best-match similarity of 92.9%. Furthermore, we confirmed the usage of most rhesus macaque TRB genes by expressed TCRβ sequences within epitope-specific TCR repertoires. This primary description of the rhesus macaque TRB genes will provide a standardized nomenclature and enable better characterization of TCR usage in studies that utilize this species.

The rhesus macaque is widely used as a non-human primate model to study infection and immunity due to the close genetic relationship with humans (∼93% average human-macaque sequence identity1) and the homology between human and rhesus pathogen genomes2, 3. Indeed, rhesus macaques have been used to study fundamental aspects of immunology, including the development and maintenance of T cell memory4, immunodominance5 and the aging immune system6.
There have also been many studies of immune responses in rhesus macaque models of human infections such as human immunodeficiency virus (HIV)7, influenza virus8, 9, tuberculosis10, Epstein-Barr virus (EBV)11, 12, cytomegalovirus (CMV)4, 13-15, smallpox16, measles17 and severe acute respiratory syndrome (SARS)18. Furthermore, rhesus macaques have been instrumental in the design and testing of vaccines against infections such as HIV19 and smallpox16. The various roles of T lymphocytes in adaptive immune responses to infection, which include the provision of helper functions to other immune cells and cytolytic control of infected cells, require that T cell populations recognize a large variety of foreign peptides bound to major histocompatibility complex (MHC) molecules. This recognition is facilitated by a diverse repertoire of T cell receptors (TCRs). The TCR repertoires that respond to different peptide-MHC epitopes can vary greatly. Indeed, diversity estimates range from ∼10 to >1000 different TCRs responding to a specific epitope20-23. Moreover, some epitope-specific TCR repertoires can feature biased usage of TCR Vβ (TRBV) or Jβ (TRBJ) genes, or distinct patterns of amino acid usage within the third complementarity-determining region (CDR3)24. Studies of the TCR repertoire can provide valuable information about the molecular evolution of an immune response and the factors that shape clonotype selection in vivo25. Furthermore, it is becoming increasingly apparent that the clonotypic structure of an epitope-specific T cell response can have important implications for the immune control of some viral infections.
For example, one issue of current debate that has important consequences for the rational design of immunotherapeutic and vaccination strategies24, 26 is whether a restricted TCR repertoire responding to a highly variable pathogen could be associated with the emergence of viral mutants that escape T cell recognition at this epitope27-31. Many studies of T cell immunity in rhesus macaque models of infection have utilized TCR repertoire data to gain additional insights5, 14, 30, 32-43. In particular, a large number of studies have characterized the TCR repertoires of target CD4+ T cell populations or CD8+ T cell populations involved in the control of simian immunodeficiency virus (SIV) in rhesus macaques5, 30, 32-39, 41-43. Most of these studies have relied on human TCR gene homology to identify V and J gene usage. Although the rhesus macaque TCR Dβ (TRBD) and TRBJ genes have previously been sequenced44, the TRBV genes were not previously available. Here, we present the TRBV, TRBD and TRBJ genes extracted from the rhesus macaque genome1 on the basis of their homology with the human TRB genes. In addition, we demonstrate extracted TRB gene usage in expressed TCRβ sequences by using an existing database of 7218 TCRβ sequences involved in CD8+ T cell responses specific for the immunodominant Mamu-A*01-restricted SL8/TL8 (S/TTPESANL; Tat, residues 28-35) and CM9 (CTPYDINQM; Gag, residues 181-189) epitopes derived from SIV30, 45. The TRB genes extracted from the rhesus macaque genome will enable more accurate characterization of rhesus macaque TCRβ repertoires.

The human TRBV gene corresponding most precisely to each rhesus macaque TRBV gene was identified on the basis of the highest percentage match between the nucleotide sequences for the TRBV genes (i.e. V-GENE in the IMGT standardized labels). The percent similarity between the nucleotide sequences for the rhesus macaque and the best-match human TRBV genes ranged between 78.3% and 96.5%, with an average similarity of 92.2%.
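The best-match identification described above can be illustrated with a minimal Python sketch. It assumes that each macaque gene has already been pairwise aligned to every candidate human gene (for example with ClustalW) and that percent identity is counted as identical columns divided by the total alignment length, gaps included. The function names and the toy sequences are hypothetical and are not taken from the study.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same base.

    Assumption: gap columns ('-') count towards the alignment length but
    never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def best_match(alignments: dict[str, tuple[str, str]]) -> tuple[str, float]:
    """Return (human gene name, percent identity) for the highest-scoring gene.

    `alignments` maps a human gene name to the pair
    (aligned macaque sequence, aligned human sequence).
    """
    scores = {name: percent_identity(m, h) for name, (m, h) in alignments.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy alignments (invented sequences, for illustration only).
toy_alignments = {
    "TRBV6-5": ("ACGTACGT", "ACGTACGA"),
    "TRBV6-6": ("ACGTAC-T", "ACCTACGA"),
}
print(best_match(toy_alignments))
```

Because every macaque gene is scored independently, nothing in this procedure prevents several macaque genes from selecting the same human gene as their best match.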
We could not identify a one-to-one correspondence between all rhesus macaque and human TRBV genes (Figure 1). In many cases, one human TRBV gene was found to be the best match to more than one of the TRBV genes extracted from the rhesus macaque genome. For example, the human TRBV6-5 gene had the highest percent similarity of all human TRBV genes to five of the rhesus macaque TRBV genes; in contrast, the human TRBV6-6 gene was not the best match to any of the rhesus macaque TRBV genes. For five of the 72 TRBV genes, only partial sequences were available from the rhesus macaque genome (Table S1 in Supporting Information) and only two of these partial TRBV genes were incomplete at the 3' end, which would influence their use in analysis of the CDR3. The human TRBV17 subgroup, consisting of just one gene, was the only one for which no corresponding TRB gene was found in the rhesus macaque genome (using a cutoff of 75% similarity). We also compared the TRBV exons (i.e. L-PART1+V-EXON in the IMGT standardized labels) between the rhesus macaque and best-match human TRBV genes (Table 1). The percent identities between the nucleotide sequences for the rhesus macaque and human TRBV exons ranged between 72.7% and 96.5%, with an average of 92.9%. The similarities between the rhesus macaque and human TRBV exons at the amino acid sequence level ranged between 19.5% and 94.7%, with an average of 85.3%.

The two TRBD genes extracted from the rhesus macaque genome were 95.0% and 92.8% similar at the nucleotide level to the corresponding human TRBD genes (Table 2 and Rhesus_macaque_TRBD.fsa in Supporting Information). The percent similarities between the rhesus macaque and human TRBD exon (i.e. D-REGION in the IMGT standardized labels) nucleotide sequences were 84.6% and 75.0%. The rhesus macaque TRBD genes have been sequenced in a previous study44.
The TRBD1 gene extracted from the rhesus macaque genome does not differ from that reported in this previous study. A 1.2% difference was found between the TRBD2 gene reported here and that reported previously, with a single nucleotide difference occurring in the 5' spacer. Thus, there are no differences in the TRBD2 D-REGION extracted from the rhesus macaque genome compared with that reported previously44.

For each of the 14 human TRBJ genes, there was one corresponding TRBJ gene found on chromosome 3 of the rhesus macaque genome (Table 3 and Rhesus_macaque_TRBJ.fsa in Supporting Information). The percent similarities between the rhesus macaque TRBJ genes and the corresponding human TRBJ genes are shown in Table 3 (range: 92.1% to 98.7%; average: 96.1%). A comparison of the rhesus macaque and human TRBJ exons (i.e. J-REGION in the IMGT standardized labels) revealed percent similarities of nucleotide sequences ranging between 90.2% and 100%, with an average similarity of 95.4% (Table 3). The similarities between the translated TRBJ exons of the rhesus macaque and human genes ranged between 81.3% and 100%, with an average similarity of 92.3% (Table 3). We compared the TRBJ genes extracted from the rhesus macaque genome with those reported in a previous study44. The only differences found were in the TRBJ1-6 and TRBJ2-1 genes, which differed by 1.9% and 2%, respectively. A single nucleotide difference in the 20th nucleotide position of the TRBJ1-6 exon resulted in a difference of a single amino acid (i.e. the TRBJ1-6 exon from the rhesus macaque genome contained H in the 7th amino acid position instead of Y).
In the TRBJ2-1 gene, a single nucleotide difference in the 31st nucleotide position of the exon did not result in any amino acid differences between the TRBJ2-1 exon extracted from the rhesus macaque genome and that reported by Cheynier et al.44

To demonstrate the use of the TRB genes extracted from the rhesus macaque genome by expressed TCRβ sequences, we used an existing database of 7218 TCRβ sequences involved in CD8+ T cell responses specific for the immunodominant Mamu-A*01-restricted SIV-SL8/TL8 and SIV-CM9 epitopes in 20 rhesus macaques30, 45. Each of these TCRβ sequences was aligned with the TRB gene exons to determine the most likely TRBV, TRBJ and TRBD gene usage. In Table 4 and Table 5 we show the rhesus macaque TRB genes that were found to be most likely used by at least one of the TCRβ sequences. The genes used by the TCRβ sequences included 54 of the 72 TRBV genes, both TRBD genes, and 13 of the 14 TRBJ genes. The highest percent homology and longest match between each TRB gene and a TCRβ sequence are also shown. Of the 18 rhesus macaque TRBV genes not used by the TCRβ sequences, 12 either did not begin with a start codon or contained stop codons when translated (Table 1). The rhesus macaque TRBJ2-2P gene, which is homologous to the human TRBJ2-2P gene (qualified by IMGT as having an "Open Reading Frame" functionality), was the only TRBJ gene not used by the TCRβ sequences. Deviations between the rhesus macaque TRB genes and TCRβ sequences were mostly attributed to the full-length genes not being used by the TCRβ sequences, owing to nucleotides being cleaved during TCR gene recombination. However, allelic differences could also exist between the single rhesus macaque sequenced in the genome project and the 20 SIV-infected macaques from which the TCRβ sequences were obtained.
Possible allelic variants of the TRB genes used by the TCRβ sequences were not identified due to the level of uncertainty associated with distinguishing allelic variants from sequencing errors, in either the rhesus macaque genome or TCRβ sequences, when there were often small numbers of TCRβ sequences per rhesus macaque using a particular TRB gene. However, we investigated whether the nucleotide sequence variants of the TRBJ1-6 and TRBJ2-1 genes reported by Cheynier et al.44 were used in our collection of epitope-specific TCRβ sequences. The previously reported variant of the TRBJ1-6 gene was found to be used by some TCRβ sequences, suggesting that this is an allelic variant of the TRBJ1-6 gene extracted from the rhesus macaque genome. The TRBJ2-1 gene variant was not used by any of the TCRβ sequences. This TRBJ2-1 gene variant may be an allelic variant that was not present in any of the 20 rhesus macaques in which the Mamu-A*01-restricted SIV-SL8/TL8- and SIV-CM9-specific TCRβ repertoires were studied, but it is also possible that the single nucleotide difference in the TRBJ2-1 gene reported by Cheynier et al.44 is due to sequencing error.

The assembly of reference TCR gene data sets for many species has often relied on the ad hoc sourcing of different TCR genes from various studies over time. Here, we report a reference set of TRB genes extracted from the rhesus macaque genome, most of which were expressed by TCRβ sequences in our extensive database of TCRβ repertoires involved in CD8+ T cell responses to the immunodominant Mamu-A*01-restricted SL8/TL8 and CM9 epitopes derived from SIV. Although there is a high degree of similarity (93.0%) between the exons of the rhesus macaque and human TRB genes, important interspecies differences exist.
These interspecies differences are emphasized by the lack of a one-to-one correspondence between the rhesus macaque and human TRBV genes, and could potentially limit the accuracy of studies that rely on human TCR genes to characterize rhesus macaque TCR repertoires. The rhesus macaque TRB genes described herein will not only aid in the identification of the TRBV and TRBJ genes used by TCRβ sequences, but will also improve the accuracy of studies that aim to characterize the V(D)J recombination mechanisms that produce TCRβ repertoires. Indeed, several of the extracted rhesus macaque TRB genes have already been used in a study of TCRβ sequence sharing between macaques in the SIV-SL8/TL8-specific and SIV-CM9-specific CD8+ T cell responses39. This study required predictions of the potential V(D)J recombination mechanisms involved in producing the observed epitope-specific TCRβ repertoires, which were more reliable using the rhesus macaque TRB genes instead of the human TRB genes.

Rhesus macaques are frequently used to study fundamental aspects of immunology and investigate vaccine efficacy in a variety of infectious diseases. Increasing evidence, much of which has come from studies conducted with this non-human primate model, indicates that the clonotypic architecture of antigen-specific T cell populations is a fundamental determinant of immune control and disease outcome26, 45. Thus, the rhesus macaque TRB genes presented here provide a valuable tool for dissecting the molecular features of TCRβ repertoires that underlie such associations in this model.

The published rhesus macaque (Macaca mulatta) genome1 is available from the National Center for Biotechnology Information (NCBI) Rhesus Macaque Genome Resources website (http://www.ncbi.nlm.nih.gov/projects/genome/guide/rhesus_macaque/). The TRB gene locus is located on chromosome 3 (Accession number: NC_007860.1).
The rhesus macaque chromosome 3 sequence was queried against all human TRB reference genes (obtained from the NCBI Human Resources website http://www.ncbi.nlm.nih.gov/projects/genome/guide/human/) using BLAST (Basic Local Alignment Search Tool)46 to identify regions in the rhesus macaque sequence that resembled human TRB genes. Results were filtered to those with e-value ≤ 0.001, total alignment length ≥ 35% of the human reference gene, and total percent identity ≥ 75% with the human reference gene. These parameters were chosen to minimize false positive search results. Overlapping regions were merged and all regions were extended in both the 5' and 3' directions to account for regions missed in BLAST's local alignment search. Sequence alignments using ClustalW47 were then performed to compare each region of the rhesus macaque genome with each human TRB gene from the NCBI human reference set. The best human match to each macaque region was identified and then used as a guide to determine the exact length and terminal ends of the rhesus macaque TRB gene sequences, as well as intron and exon positions.

We assessed the similarity between the rhesus macaque and the NCBI human TRB reference gene sequences (or the IMGT human TRB reference gene if the NCBI reference gene sequence was partial) by identifying the human TRB gene that had the highest overall percentage identity with each rhesus macaque TRB gene using a ClustalW alignment. We encountered the following scenarios: (i) a clearly identifiable one-to-one correspondence between a rhesus macaque and a human TRB gene; (ii) a rhesus macaque TRB gene with reasonable similarity to a group of human TRB genes; and (iii) a human TRB gene with no reasonable correspondence to a rhesus macaque TRB gene. We therefore adopted the following approach to labelling the rhesus macaque TRB genes. For each rhesus macaque TRB gene, we first identified the group of human TRB genes to which it was most similar (e.g. TRBV1).
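The BLAST filtering, merging and extension steps described earlier in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: each hit is treated as a single record (the "total" alignment length and identity are not summed across several local alignments), the `Hit` class and function names are hypothetical, and the size of the 5'/3' extension (`flank`) is an assumed parameter because the text does not state it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One BLAST hit of a human TRB reference gene on macaque chromosome 3."""
    start: int        # 1-based start on the macaque chromosome
    end: int          # inclusive end on the macaque chromosome
    evalue: float
    identity: float   # percent identity with the human reference gene
    aln_length: int   # alignment length in nucleotides
    ref_length: int   # full length of the human reference gene


def passes_filters(hit, max_evalue=0.001, min_coverage=0.35, min_identity=75.0):
    """Apply the three thresholds stated in the text."""
    return (
        hit.evalue <= max_evalue
        and hit.aln_length >= min_coverage * hit.ref_length
        and hit.identity >= min_identity
    )


def merge_and_extend(hits, flank=0, chrom_length=None):
    """Filter hits, merge overlapping regions, then extend each region.

    `flank` (nucleotides added on both sides) is a hypothetical parameter.
    Regions that come to overlap only after extension are not re-merged here.
    """
    kept = sorted((h for h in hits if passes_filters(h)), key=lambda h: h.start)
    regions = []
    for h in kept:
        if regions and h.start <= regions[-1][1]:
            # Overlaps the previous region: grow it instead of starting a new one.
            regions[-1][1] = max(regions[-1][1], h.end)
        else:
            regions.append([h.start, h.end])
    extended = []
    for start, end in regions:
        new_start = max(1, start - flank)
        new_end = end + flank
        if chrom_length is not None:
            new_end = min(chrom_length, new_end)
        extended.append((new_start, new_end))
    return extended


# Invented example hits: the third fails the e-value filter, the fourth the identity filter.
example_hits = [
    Hit(100, 200, 1e-10, 90.0, 100, 250),
    Hit(180, 300, 1e-5, 80.0, 120, 300),
    Hit(500, 520, 0.01, 95.0, 20, 300),
    Hit(600, 700, 1e-20, 70.0, 100, 200),
]
print(merge_and_extend(example_hits, flank=50))
```

Extending after merging mirrors the order given in the text; the extended regions are then candidates for the ClustalW comparison against each human TRB gene.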
We then numbered all rhesus macaque TRB genes which were most similar to this same group of human TRB genes according to the order in which the TRB sequences were found in the rhesus macaque genome (e.g. TRBV1-1, TRBV1-2, etc.). The ImMunoGeneTics (IMGT)48 nomenclature for TCR genes was used throughout.

For all Mamu-A*01-restricted SIV-SL8/TL8-specific and SIV-CM9-specific TCRβ sequences, we performed a complete alignment analysis using the identified rhesus macaque TRB genes. This analysis determined for each epitope-specific TCRβ sequence the best-percentage-match TRBV, TRBD and TRBJ genes over the longest alignment length by initially aligning the TRBV gene at the 5' end of the TCRβ sequence and then aligning the TRBJ gene at the 3' end of the TCRβ sequence. A minimum percentage match of 77% over an alignment length of at least 50 nucleotides was required for alignment of the TRBV genes. For alignment of the TRBJ genes, a minimum percentage match of 70% was required over the length of the TRBJ exon. The TRBD genes were then aligned to the sequence interval between the identified TRBV and TRBJ regions. A match to a string of two or more nucleotides was considered to originate from the TRBD gene.

Exons, introns and recombination signal sequences have been included and gene families consisting of multiple genes are highlighted.

All TRBV gene sequences were aligned using ClustalW and the tree was constructed in ClustalW using the neighbour-joining method49 and bootstrapped 1000 times. Branches with bootstrap values >80% are indicated with a black dot and branch lengths are those assigned by ClustalW. The tree was visualized using the Interactive Tree of Life50 (available at http://itol.embl.de/). Note that the tree has been rotated about the mid-point of the most distant nodes to assist visualization.

Comparison of the rhesus macaque TRBV genes and their best human homologues.
The best human homologue had the highest percent identity with the rhesus macaque gene nucleotide sequence.
The alignment length is the total length across both the aligned rhesus macaque and human gene/exon sequences.
The exon amino acid sequence was translated in the frame that yielded a start codon at the 5' end of the exon. Comparisons of the exon amino acid sequences were omitted for TRBV genes in which no start codon was found. For partial rhesus macaque exons missing a portion of sequence at the 5' end, the sequences were translated in the frame in which the start codon was found in the human homologues.
The rhesus macaque gene is a partial sequence, with a missing portion of sequence at the 5' end of the gene. The percent identities between rhesus macaque and human genes and exons are calculated with the missing portion of rhesus macaque gene excluded.
The rhesus macaque gene is a partial sequence, with a missing portion of sequence at the 3' end of the gene. The percent identities between rhesus macaque and human genes and exons are calculated with the missing portion of rhesus macaque gene excluded.

Table 3 Comparison of the rhesus macaque TRBJ genes and their human homologues.
The alignment length is the total length across both the aligned rhesus macaque and human gene/exon sequences.
The TRBJ exons were translated in the frame that yielded the characteristic FGXG or LGXG motif.

The alignments were performed over the total length of the TRBD or TRBJ exon.
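The frame selection used for the TRBJ exons in the table notes above (translating in the frame that yields the FGXG or LGXG motif) can be sketched in Python as follows. The sketch assumes the standard genetic code and simply tries the three forward frames in order; the function names and the example sequence are invented for illustration and do not correspond to a real rhesus macaque TRBJ gene.

```python
import re
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# FGXG or LGXG, where X is any residue.
J_MOTIF = re.compile(r"[FL]G.G")


def translate(seq: str, frame: int) -> str:
    """Translate `seq` starting at offset `frame` (0, 1 or 2).

    Codons containing characters outside T/C/A/G are rendered as 'X';
    a trailing partial codon is ignored.
    """
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def frame_with_motif(seq: str):
    """Return (frame, protein) for the first frame containing the motif, else None."""
    for frame in range(3):
        protein = translate(seq, frame)
        if J_MOTIF.search(protein):
            return frame, protein
    return None


# Invented toy sequence: only frame 2 yields an FGXG motif.
toy_j_exon = "TGGAAGCTTTCTTTGGACAAGGCACC"
print(frame_with_motif(toy_j_exon))
```

The same approach could be adapted to the TRBV exons by searching for a start codon at the 5' end instead of the J motif, as the notes describe.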
Evolutionary and biomedical insights from the rhesus macaque genome
Complete nucleotide sequence of the rhesus lymphocryptovirus: genetic validation for an Epstein-Barr virus animal model
Complete sequence and genomic analysis of rhesus cytomegalovirus
Development and homeostasis of T cell memory in rhesus macaque
Analysis of TCRalphabeta combinations used by simian immunodeficiency virus-specific CD8+ T cells in rhesus monkeys: implications for CTL immunodominance
Dramatic increase in naive T cell turnover is linked to loss of naive T cells from old primates
Current concepts in AIDS pathogenesis: insights from the SIV/macaque model
Preclinical study of influenza virus A M2 peptide conjugate vaccines in mice, ferrets, and rhesus monkeys
Aberrant innate immune response in lethal infection of macaques with the 1918 influenza virus
High resolution radiographic and fine immunologic definition of TB disease progression in the rhesus macaque
Experimental rhesus lymphocryptovirus infection in immunosuppressed macaques: an animal model for Epstein-Barr virus pathogenesis in the immunosuppressed host
The BZLF1 homolog of an Epstein-Barr-related gamma-herpesvirus is a frequent target of the CTL response in persistently infected rhesus macaques
Experimental coinfection of rhesus macaques with rhesus cytomegalovirus and simian immunodeficiency virus: pathogenesis
Induction and Evolution of Cytomegalovirus-Specific CD4+ T Cell Clonotypes in Rhesus Macaques
Rhesus CMV: an emerging animal model for human CMV
Prolonged dominance of clonally restricted CD4(+) T cells in macaques infected with simian immunodeficiency viruses
The repertoire of cytotoxic T lymphocytes in the recognition of mutant simian immunodeficiency virus variants
Clonal focusing of epitope-specific CD8+ T lymphocytes in rhesus monkeys following vaccination and simian-human immunodeficiency virus challenge
The role of production frequency in the sharing of simian immunodeficiency virus-specific CD8+ TCRs between macaques
Contributions of CD4+, CD8+, and CD4+CD8+ T cells to skewing within the peripheral T cell receptor beta chain repertoire of healthy macaques
The disruption of macaque CD4+ T-cell repertoires during the early simian immunodeficiency virus infection
Maintenance of CD4+ T cell TCR Vbeta repertoire heterogeneity is characteristic of apathogenic SIV infection in non-human primate model of AIDS
Contribution of T-cell receptor repertoire breadth to the dominance of epitope-specific CD8+ T-lymphocyte responses
Sequence of the rhesus monkey T-cell receptor beta chain diversity and joining loci
Public clonotype usage identifies protective Gag-specific CD8+ T cell responses in SIV infection
Basic local alignment search tool
CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice
IMGT, the international ImMunoGeneTics database
The neighbor-joining method: a new method for reconstructing phylogenetic trees
Interactive Tree Of Life (iTOL): an online tool for phylogenetic tree display and annotation

We thank Dr Mark Tanaka for assistance with the phylogenetic analysis and Associate Professor Andrew Collins for helpful discussions.

Refer to Web version on PubMed Central for supplementary material.