key: cord-271359-dpa8zzc3 authors: Sapats, S. I.; Ashton, F.; Wright, P. J.; Ignjatovic, J. title: Novel Variation in the N Protein of Avian Infectious Bronchitis Virus date: 1996-12-15 journal: Virology DOI: 10.1006/viro.1996.0670 sha: doc_id: 271359 cord_uid: dpa8zzc3
Abstract
The nucleocapsid protein of coronaviruses has been considered highly conserved, showing greater than 94% conservation within strains of a given species. We determined the nucleotide sequence of the N gene and the 3′ untranslated region (UTR) of eight naturally occurring strains of IBV which differed in pathogenicity and tissue tropism. In pairwise comparisons, the deduced amino acid sequences of N of five strains Vic S, N1/62, N9/74, N2/75, and V5/90 (group I) shared 92.3–98.8% identity. The three strains N1/88, Q3/88, and V18/91 (group II) shared 85.8–89.2% identity with each other, but only 60.0–63.3% identity with viruses of group I. Amino acid substitutions, deletions, and insertions occurred throughout the N protein and involved regions previously identified as being conserved. Despite the considerable variation observed between the two virus groups, all N proteins contained a high proportion of basic residues, 80% of which were conserved in position. In addition, all strains contained approximately 30 serine residues of which 10 were conserved, the majority occurring between positions 168 and 194. As for all other coronaviruses, the region between positions 92 and 103 was highly conserved. Hence, a large number of amino acid changes can be tolerated within the N protein without affecting its integrity or functioning. The 3′ UTR immediately downstream from the N gene was highly heterogeneous with extensive deletions occurring in the group II strains.
Infectious bronchitis virus (IBV), a member of the family Coronaviridae, causes an acute highly contagious disease of chickens resulting in significant economic losses to poultry industries throughout the world. The IBV genome consists of a single strand of positive sense RNA measuring 27.6 kb in length excluding the poly(A) tail (1). The genes encoding the three major structural proteins are situated within an 8-kb region located at the 3′ end of the genome. These proteins are the spike glycoprotein (S), the membrane glycoprotein (M), and the phosphorylated nucleocapsid protein (N). The N protein plays a role in viral replication, assembly, and immunity. It interacts with leader RNA sequences facilitating viral mRNA synthesis and also binds to the viral RNA forming a helical nucleocapsid (2). The N protein of all coronaviruses is overall very basic and in IBV contains 409 amino acids with a predicted Mr of approximately 50,000 (3); it also contains a high proportion of serine residues which act as sites for phosphorylation (2).
The N protein of 27 strains of IBV isolated over a period of 60 years from diverse locations such as the U.S.A., the UK, Holland, Saudi Arabia, and Japan has been shown to be highly conserved differing by only 2 to 6% at the amino acid level (4, 5). The high level of conservation in the N protein has resulted in the widely held view that the S1 glycoprotein, which may show up to 49% variation (6), is the only relevant structural element in assessing the genetic diversity and evolutionary direction of IBV.
Immediately downstream of the N gene is the 3′ untranslated region (UTR) which is presumably important in the initiation of negative-strand RNA synthesis. The organization of the 3′ UTR differs among the coronaviruses. In porcine, canine, and feline coronaviruses this region is conserved among strains within a species and contains at least one open reading frame (ORF) (7, 8). To the present time, functional ORFs have not been detected in the 3′ UTR of IBV. However, the virulent M41 virus lacks a sequence of 184-196 nucleotides that has been detected in five other IBV strains. This sequence has been termed a hypervariable region (HVR) and was proposed to be exogenous in origin (3, 9). The region located downstream of the HVR (315 nucleotides ending at the poly(A) tail) is highly conserved in strains of IBV, probably indicative of its role in the synthesis of negative-strand RNA.
Recently we reported the isolation of three IBV strains, [...] other Australian IBV strains. Based on S1 sequences, these strains formed a distant and novel genetic group of IBV (10). In addition, the N protein of these strains failed to react with five monoclonal antibodies directed against different epitopes on the N protein. These epitopes are conserved in other Australian strains of IBV, indicating unusual changes in the N genes of N1/88, Q3/88, and V18/91 (11). We have now sequenced the N gene and 3′ UTR of eight Australian IBV strains isolated over the period of thirty years from both vaccinated and unvaccinated flocks. The results demonstrated less conservation in the N protein and 3′ UTR of IBV than previously detected.
The IBV strains and methods used for their propagation have been described (10, 11). Vic S is a commercial vaccine (Arthur Webster Pty Ltd, Castle Hill, Australia). Strain N1/62 was isolated from unvaccinated chicks in 1962, whereas N9/74, N2/75, N1/88, Q3/88, V5/90, and V18/91 were isolated from vaccinated commercial chicks between 1974 and 1991. Strains were cloned either in chicken embryo kidney cells (by plaque assay) or tracheal organ cultures (by limiting dilutions) and passaged 1-2 times in embryonated chicken eggs. All strains replicated in the trachea and strains Vic S, V5/90, N1/62, N9/74, and N2/75 also replicated in the kidneys, the latter three causing 32 to 96% mortality (10).
FIG. 1. Sequence alignment of the N protein of Australian strains of IBV. The complete deduced amino acid sequence of N of Vic S is shown. Gaps (dashes) were introduced to align the sequences.
Dots indicate residues identical to Vic S. Asterisks indicate residues conserved in all strains. The longest region of complete conservation is boxed. The serine-rich region is underlined; the T cell epitope is double underlined. Clusters of basic residues are shaded. The Clustal V program was used for all sequence alignments (17).
Viral RNA was purified using methods described (10, 12). Vic S, N1/88, and Q3/88 cDNA was synthesized using random primers, and sequences obtained from the cloned cDNA were used to design primers for amplification of the N gene of all strains by reverse transcription and polymerase chain reaction. All cDNAs were cloned into pUC series plasmids. For each virus two or more independent cDNA clones were sequenced using the Pharmacia T7 sequencing kit.
Pairwise comparisons of the nucleotide and deduced amino acid sequences of the N genes of the eight strains Vic S, V5/90, N1/62, N9/74, N2/75, N1/88, Q3/88, and V18/91 (excluding the first 17 nucleotides of V5/90, N1/62, N9/74, and N2/75) are shown in Table 1. The strains formed two distinct genotypic groups based on the level of nucleotide and amino acid identities. The first five strains in Table 1 [...] of these five strains were also similar to those of geographically distant strains isolated in the U.S.A., Europe, and Japan with which they shared 91.4-92.9% identity at the amino acid level (results not shown). This confirmed the previous observation of the tendency for conservation of the N protein over a long time (~30 years) irrespective of geographical distances and immunological pressures. Contrary to this, however, the N protein of three other strains N1/88, Q3/88, and V18/91 (group II) shared only 60.0-63.3% amino acid identity with the N proteins of strains in the first group, while sharing 85.8-89.2% amino acid identity with each other. This lack of conservation of the N protein has not been reported before for any other coronavirus.
An alignment of the deduced amino acid sequence of the N gene of Vic S with sequences of the other Australian strains is shown in Fig. 1 [...] position 337. The group II strains also contained a number of insertions and deletions relative to group I strains, the majority of which were at the amino and carboxy termini between positions 7 and 23 and 339 and 401, respectively. Overall, only 54% (221/411) of the residues were conserved in all strains, the longest region of complete conservation occurring between positions 92 and 103, corresponding to the part of the N protein previously found to have the highest degree of conservation in IBV and all other coronaviruses (13). Published sequences for the N proteins of IBV also contain a region of complete conservation between positions 242 and 296 (4). As evident from Fig. 1, the corresponding region is not conserved among the Australian strains. The precise location and role of functional domains within the N protein are not well understood (2, 4, 14). However, a T cell epitope has been identified in IBV corresponding to positions 74-81 in Fig. 1 (15). Examination of these sequences reveals that they are completely conserved in [...] 2, 3, 14). (i) All strains possessed a high proportion of basic residues (17.2-19.6%). The majority of these (80%) were conserved in position and generally clustered in three regions between positions 66 and 88, 181 and 234, and 334 and 373 (Fig. 1). This basic character probably facilitates protein-RNA interactions. (ii) The carboxy terminus contained a clustering of acidic residues located between positions 375 and 415, a region which otherwise showed very little conservation. However, the acidic residues were less conserved (66%) in position than the basic residues. (iii) All strains contained approximately 30 serine residues, their precise locations varying considerably. Only 10 were totally conserved in position. The majority of these conserved serine residues occurred between positions 168 and 194. Serine residues are potential sites for phosphorylation (2). Hence it appears that the N protein is able to tolerate some variability in the distribution of basic, acidic, and serine residues, although the basic residues were the most conserved consistent with the function of the protein as a ribonucleocapsid protein. These residues are expected to directly influence protein charge, state of phosphorylation, and secondary structure.
The phylogenetic relationship between Australian and other strains of IBV based upon the published nucleotide sequences for the N gene is shown in Fig. 2. The group I and group II Australian strains formed two clusters distinct from previously published strains. The five group I strains formed one cluster with Vic S and V5/90 being the most closely related and showing approximately the same degree of relatedness to N1/62 and N2/75, while N9/74 appeared to be the most divergent. The group II strains formed another cluster having diverged markedly from all other clusters. The phylogenetic relationships among the Australian IBV strains based on the N nucleotide sequences were similar to those based on S1 (10). This suggests that the S1 and N genes of strains in group II have evolved in parallel. Interestingly, a comparison of the Dutch strain D1466 with other European strains showed that the D1466 S1 protein had significantly diverged from the S1 of the other strains, similar to the distance between the Australian group I and II strains. However, this divergence was not matched by the N protein, which was conserved between D1466 and the other European strains (5).
FIG. 2. [...] N (3, 4, 18). The tree was constructed using the neighbor-joining method (19).
The 3′ UTR, located immediately downstream of the N gene, was sequenced for Vic S and the other Australian strains (Figs. 3a and 3b). A HVR was identified, ranging [...] Fig. 3b) than previously reported (9). In the group I strains N1/62, N9/74, and N2/75 the HVR was 214, 214, and 216 nucleotides, respectively, while for Vic [...] revealed values of identity ranging from 22.2 to 100%. Within group I the identity was 92.1-100%; within group II it was 22.2-100% identity (results not shown). The HVR of group I viruses (i) contained a high U content which was evenly distributed, and (ii) was similar to that of Beaudette, Ark99, and Gray viruses (9). Little similarity to Kb8523 and Holl52 viruses was observed (results not shown).
This HVR downstream of the N gene was originally identified in the vaccine strain Beaudette by comparison with the virulent M41 strain and considered to be an insert of exogenous origin acquired through recombination during adaptation in eggs (3, 9). However, the presence of the long HVR (153 to 216) in group I strains, which received a low number of passages in eggs, suggests that the HVR region may have been present in an ancestral strain and subsequently deleted in some strains. In addition, in contrast with the Beaudette and M41 comparison, the shorter forms (10 to 38) of the HVR in the Australian strains N1/88, Q3/88, and V18/91 (group II) are associated with a decrease in virulence. Thus there is no clear association between the length of the HVR and virulence. The remaining 294 nucleotides downstream of the HVR (from position 220 in Fig. 3b) showed considerable conservation with values of identity ranging from 84.6-100.0% in pairwise comparisons. Within group I strains, the identity was greater than 97.3%.
An ORF was detected within the 3′ UTR of N1/62 and N9/74 with the potential to encode hydrophobic proteins of 8793 Mr and 8711 Mr, respectively. A similar ORF has been identified in four other IBV strains (9). Although the initiation codons of these ORFs fit the Kozak consensus sequence (16) with an A at position −3 and a G at position +4, there are no IBV mRNA transcription motifs (CTT/GAACAA) directly upstream. Other coronaviruses such as porcine transmissible gastroenteritis virus also contain an ORF within the 3′ UTR. The latter virus directs the synthesis of a hydrophobic and membrane-associated protein of 9101 Mr (7). At present it is unknown whether the corresponding protein for IBV is synthesized in infected cells and what its function in virus replication may be. More experiments are required to detect the protein and to examine possible mechanisms for the initiation of its translation.
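The two sequence checks described for the 3′ UTR ORFs (an A at position −3 and a G at position +4 around the initiation codon, and the absence of a CTT/GAACAA transcription motif upstream) can be illustrated with a short sketch. This is not the procedure used in the study. The function names, the 30-nucleotide search window, and the example sequence are assumptions introduced here for illustration, and the motif is read as CT(T/G)AACAA.

```python
import re

# CT(T/G)AACAA, the motif quoted in the text, written as a regular expression.
TRANSCRIPTION_MOTIF = re.compile(r"CT[TG]AACAA")


def kozak_context(seq: str, atg_index: int) -> dict:
    """Report the bases at -3 and +4 relative to the A (+1) of an ATG codon."""
    seq = seq.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    # There is no position 0, so -3 is three bases before the A of ATG.
    minus3 = seq[atg_index - 3] if atg_index >= 3 else None
    # +4 is the base immediately after the ATG triplet.
    plus4 = seq[atg_index + 3] if atg_index + 3 < len(seq) else None
    return {
        "minus3": minus3,
        "plus4": plus4,
        "a_at_minus3_g_at_plus4": minus3 == "A" and plus4 == "G",
    }


def motif_upstream(seq: str, atg_index: int, window: int = 30) -> bool:
    """True if CT(T/G)AACAA occurs within `window` bases upstream of the ATG.

    The window size is an assumption; the text only says "directly upstream".
    """
    start = max(0, atg_index - window)
    return TRANSCRIPTION_MOTIF.search(seq.upper()[start:atg_index]) is not None


# Hypothetical sequence, invented for illustration only (not an IBV sequence).
example = "GGTTCAACCATGGCTTAA"
atg = example.find("ATG")
print(kozak_context(example, atg))
print(motif_upstream(example, atg))
```

On a real 3′ UTR sequence, a favourable −3/+4 context combined with no motif match upstream would correspond to the situation described above for the N1/62 and N9/74 ORFs.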