key: cord- - wmt e
authors: lee, hee-kyung; yeo, sang-geon
title: cloning and sequence analysis of the nucleocapsid gene of porcine epidemic diarrhea virus chinju
date:
journal: virus genes
doi: . /a:
sha:
doc_id:
cord_uid: wmt e

the nucleocapsid (n) gene of porcine epidemic diarrhea virus (pedv) chinju , which was previously isolated in chinju, korea, was cloned and sequenced to provide information for the development of genetically engineered diagnostic reagents. the nucleotide and deduced amino acid sequences of the chinju n gene were also analyzed by alignment with those of cv and br / . the nucleotide sequence encoding the entire n gene open reading frame (orf) of chinju was bases long and encoded a protein of amino acids with a predicted m (r) of kda. it consisted of adenine ( . %), cytosine ( . %), guanine ( . %) and thymine ( . %) residues. the chinju n orf nucleotide sequence was . % and . % homologous with those of cv and br / , respectively. the chinju n protein showed . % amino acid identity with those of br / and cv . the amino acid sequence contained seven potential sites each for threonine (t)- or serine (s)-linked phosphorylation by protein kinase c and by casein kinase ii.

porcine epidemic diarrhea virus (pedv) causes an acute infection in piglets ± weeks old; the disease is characterized by severe enteritis and diarrhea, with mortality of up to % [ , ] . pedv is a member of the genus coronavirus of the family coronaviridae [ ] . the genome consists of a single molecule of positive-sense, single-stranded rna, ± kb in size, which is transcribed into a nested set of several -coterminal subgenomic mrnas for the production of structural and nonstructural proteins [ , ] . among the structural proteins of the virion, the spike (s) glycoprotein ( ± kda) plays an important role in attachment of the virion to host receptors and in penetration into the intestinal villous cells by fusion.
the s glycoprotein also induces the production of neutralizing antibodies in the host [ ± ] and is therefore important for immunity against pedv. the nucleocapsid (n) protein ( ± kda), on the other hand, is a basic phosphoprotein associated with the genome [ , , , ] and can serve as a target for the accurate and early diagnosis of pedv infection by molecular techniques. these genes have been cloned and sequenced for the cv and br / strains [ , ] . the gene products are feasible candidates for developing genetically engineered vaccines and diagnostic reagents. since the isolation of pedv in korea was first reported in [ ] , the virus has been one of the major causes of death of suckling piglets in pig farming. park et al. [ ] cloned a dna of bases from the n gene of viral rna in swine feces, but no further studies on viral isolation and gene cloning have been reported. molecular characterization of the n gene is fundamental to the development of genetically engineered proteins for diagnostic reagents against pedv, and it still needs further elucidation. pedv infections occur frequently in korea, and development efforts should be geared toward rapid diagnosis and control of the disease. to our knowledge, nucleotide sequences of the full-length n gene of korean pedv isolates have not been reported. in the present study, a dna clone was constructed for the full-length n gene open reading frame (orf) of pedv isolated in chinju, korea. the complete nucleotide and deduced amino acid sequences of the n gene were determined and compared with those of other pedvs to provide information for the production of genetically engineered diagnostic reagents.

a strain of pedv, chinju , which was previously isolated from the intestinal tissues of piglets suffering from severe diarrhea by the virology laboratory of gyeongsang national university college of veterinary medicine, chinju, korea (data not shown), was used.
the virus was propagated in monolayers of vero cells grown in minimal essential medium (mem) containing streptomycin ( mg/ml), penicillin ( u/ml) and trypsin ( mg/ml) in a % co2 incubator at c following the methods of hofmann and wyler [ ] . when syncytium formation appeared in the vero cells after propagation of the virus, the spent mem was removed. the cells were washed with pbs (ph . ), lysed with trizol reagent (invitrogen, usa) at ml per tissue culture flask ( cm2 ), and homogenized by passing the cell lysate several times through a pipette. viral rna was extracted from the homogenate following the manufacturer's instructions and dissolved in diethyl pyrocarbonate-treated distilled water. a pair of sense and antisense primers was designed from an alignment of the nucleotide sequences of the n genes of cv and br / [ , ] from the genbank database (national center for biotechnology information, usa). the sense primer nf ( ccgagtgcggttctcacagat ) and antisense primer nr ( catagccaggataagccggtc ) were used to generate cdna for the n gene of chinju ; the relative positions of the primers are shown in fig. . first-strand cdna for the n gene was synthesized by reverse transcription (rt) using the superscript ii reverse transcriptase reagent kit (invitrogen) following the manufacturer's instructions. the viral rna was mixed with ml of pm of the antisense primer, ml of x first-strand buffer, ml of mm dntp mixture, ml of . m dtt, ml of rnase inhibitor ( u/ml) and ml of reverse transcriptase ( u/ml), and brought to ml with distilled water. the reaction mixture was incubated for min at c, and the reaction was stopped by heating for min at c. to degrade the rna template, the reaction mixture was treated with rnase h ( u) for min at c. the ds-cdna for the n gene was synthesized by polymerase chain reaction (pcr) using a reagent kit (perkin-elmer, usa).
a ml portion of the first-strand cdna template was added to ml of x pcr buffer, ml of mm mgcl2 , ml of mm dntp mixture, ml each of the pm sense and antisense primers, and ml of taq dna polymerase ( u/ml), and brought to ml with distilled water. the pcr was carried out in a thermocycler (perkin-elmer) with a program of min at c, cycles of min at c, min at c and min at c, and a final extension at c for min. the pcr products were resolved by electrophoresis in a % agarose gel. following routine gene cloning methods [ ] , the pcr-generated n gene ds-cdnas were blunt-ended with klenow enzyme ( u) and ml of . mm dntps (invitrogen) in a ml reaction volume and cloned into the smai site of ptz r plasmid dna by ligation with t4 dna ligase ( u) (invitrogen). the recombinant plasmid dnas were transformed into competent escherichia coli dh α cells by heat shock for s at c. after soc medium ( . % yeast extract, % tryptone, mm nacl, . mm kcl, mm mgcl2 , mm mgso4 , mm glucose) was added, the tube was shaken for h at rpm and c. the transformed cells were plated onto luria-bertani (lb) agar (invitrogen) containing ampicillin ( mg/ml), x-gal ( mg/ml) and isopropylthio-β-galactoside ( mg/ml) (invitrogen) and incubated overnight at c. transformed colonies were cultured overnight in lb broth with ampicillin ( mg/ml) with shaking at rpm and c, and were subjected to dna extraction by alkaline lysis, restriction enzyme digestion and electrophoresis in a % agarose gel to identify recombinant dna clones. nucleotide sequencing of the n gene-recombinant dna clones was done with a dye terminator cycle sequencing kit (perkin-elmer) on an automatic sequencer (abi prism , advanced biotechnologies, usa). the nucleotide and deduced amino acid sequences were analyzed with clustalw, version . , using data available from genbank and the european molecular biology laboratory (embl). the n gene nucleotide and amino acid sequences of chinju were compared with those of cv and br / [ ] (embl accession no. z ).
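the comparison step described above reduces, once the sequences are aligned, to counting matching columns. the following is a minimal sketch of that arithmetic, not the authors' procedure: it assumes the alignment (for example, clustalw output) has already been read into two equal-length strings with "-" for gaps, and the function names `percent_identity` and `mismatch_positions` are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gapped columns are skipped (an assumption of this sketch)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def mismatch_positions(aligned_a: str, aligned_b: str) -> list:
    """1-based alignment columns where both sequences have a residue and they differ."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return [
        i + 1
        for i, (a, b) in enumerate(zip(aligned_a.upper(), aligned_b.upper()))
        if a != "-" and b != "-" and a != b
    ]


if __name__ == "__main__":
    # Toy strings, not real PEDV sequence data.
    print(percent_identity("ATGCA-T", "ATGTAGT"))
    print(mismatch_positions("ATGCA-T", "ATGTAGT"))
```

whether gapped columns are excluded or counted as mismatches changes the reported percentage, so the convention should be stated alongside any identity value.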
the protein chemistry of the chinju amino acid sequence was analyzed using the protein statistics programs pepstats (pasteur institute, france) and predictprotein (embl).

in the synthesis of ds-cdna of the chinju n gene, a dna fragment of approximately . kb was amplified by rt-pcr using primers specific to the n gene of pedv. the dna was cloned into ptz r vector dna (fig. ) and sequenced. the nucleotide sequence encoding the entire chinju n gene was bases in length and contained a single orf. the gene had and nucleotide mismatches compared to cv and br / , respectively (fig. ) . it consisted of adenine ( . %), cytosine ( . %), guanine ( . %) and thymine ( . %) nucleotides, with a gc content of . %. the gene showed . % and . % nucleotide sequence homology to those of cv and br / , respectively. the chinju n gene encoded a protein of amino acids with a predicted m r of kda. seven potential threonine (t)- or serine (s)-linked phosphorylation sites were recognized in the protein for each of protein kinase c and casein kinase ii. the chinju n protein had amino acid mismatches compared to those of cv and br / (fig. ) and showed . % amino acid sequence identity with these strains.

bridgen et al. [ ] previously cloned a gene of nucleotides from pedv cv and br / containing a single large orf capable of encoding a amino acid protein of kda; because the predicted protein was very similar in both length and sequence to coronavirus n proteins, they identified the gene as the pedv n gene. in the present study, the n gene of pedv chinju was cloned and the cdna clones were sequenced. the resulting sequence data showed a single orf of nucleotides encoding a protein of amino acids with an m r of kda predicted by the pepstats program. the chinju n protein also had . % amino acid sequence identity with those of cv and br / [ ] , although amino acid mismatches were recognized.
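the base composition and gc content reported above are simple proportions of the orf sequence. a minimal sketch of that calculation follows; it is an illustration rather than the program the authors used, and the names `base_composition` and `gc_content` are hypothetical. ambiguous bases are ignored, which is an assumption of this sketch.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Percentage of A, C, G and T in a nucleotide sequence (U is read as T)."""
    seq = seq.upper().replace("U", "T")
    counts = Counter(base for base in seq if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def gc_content(seq: str) -> float:
    """GC content as a percentage of all unambiguous bases."""
    comp = base_composition(seq)
    return comp["G"] + comp["C"]


if __name__ == "__main__":
    # Toy sequence, not the chinju n gene.
    print(base_composition("ATGGCCTA"))
    print(gc_content("ATGGCCTA"))
```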
therefore, the chinju n protein showed the same features in its nucleotide and putative amino acid sequences as those of cv and br / , although the pedv n protein is known to have an m r of ± kda by polyacrylamide gel electrophoresis [ , ] . the pedv n protein is a phosphorylated structural protein associated with the viral genome [ , , , ] and is abundant in virus-infected cells [ ] . the appearance of n protein can therefore indicate replication of pedv and can be used for early and accurate diagnosis as long as the virus replicates in the infected cells. the chinju n protein had seven potential t- or s-linked phosphorylation sites each for protein kinase c and casein kinase ii. similarly, cv and br / [ ] contained six serine (s) residues as possible phosphorylation sites for these enzymes, although some of the s-linked phosphorylation sites differed from those of chinju . in conclusion, the full-length nucleotide sequence of the coding region of the n gene of pedv chinju was determined in the present study. the nucleotide and putative amino acid sequences of the chinju n gene were analyzed by comparison with those of other pedvs. however, we could elucidate the molecular properties of the n gene only by comparison with those of cv and br / , because the full-length nucleotide sequence of the pedv n gene has been determined only in these strains. nevertheless, the chinju n gene showed minor differences in the structural features of the putative protein compared to those of cv and br / . this information can be useful for the development of genetically engineered n protein for the rapid and accurate diagnosis of pedv infections in korea. moreover, the genetic information gained from the chinju n gene can be used for diagnostic work such as pcr and nucleic acid hybridization.
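potential phosphorylation sites of the kind discussed above are usually predicted by scanning the protein for short consensus motifs. the sketch below assumes the prosite-style patterns [st]-x-[rk] for protein kinase c and [st]-x(2)-[de] for casein kinase ii; the text does not state which patterns the authors' programs applied, so these patterns and the function name `find_sites` are assumptions for illustration only.

```python
import re

# Assumed PROSITE-style consensus motifs; the lookahead allows overlapping hits.
MOTIFS = {
    "protein kinase c": re.compile(r"(?=[ST].[RK])"),
    "casein kinase ii": re.compile(r"(?=[ST]..[DE])"),
}


def find_sites(protein: str) -> dict:
    """Return 1-based positions of the candidate S/T residue for each motif."""
    protein = protein.upper()
    return {
        name: [match.start() + 1 for match in pattern.finditer(protein)]
        for name, pattern in MOTIFS.items()
    }


if __name__ == "__main__":
    # Toy peptide, not the chinju n protein.
    print(find_sites("MSARTKSEEDSGR"))
```

a motif match only marks a candidate site; it does not show that the residue is phosphorylated in the virion or in infected cells.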
to our knowledge, this is the first published report on the full-length nucleotide sequence and molecular characteristics of the n gene of korean pedv isolates.

and br / [ ] : only the amino acids of cv and br / that mismatched the chinju sequence are included; *, translation termination; the seven potential threonine (t)- or serine (s)-linked phosphorylation sites for protein kinase c are underlined; the seven potential t- or s-linked phosphorylation sites for casein kinase ii are in italics.

diseases of swine
hagan and bruner's microbiology and infectious diseases of domestic animals
fields virology. lippincott-raven publishers
short protocols in molecular biology

this study was supported by a grant (no. - - - ) from the korea science and engineering foundation (kosef), ministry of science, korea.

key: cord- -m inmc
authors: kwon, hyuk moo; jackwood, mark w.
title: molecular cloning and sequence comparison of the s glycoprotein of the gray and jmk strains of avian infectious bronchitis virus
date:
journal: virus genes
doi: . /bf
sha:
doc_id:
cord_uid: m inmc

the nucleotide sequences of the s glycoprotein genes of the gray and jmk strains of avian infectious bronchitis virus (ibv) were determined and compared with published sequences for ibv. the ibv gray and jmk strains had % nucleotide sequence similarity. the overall nucleotide sequence similarity of the gray and jmk strains compared with other ibv strains was between . % and . %. the similarity of the predicted amino acid sequences for the s glycoproteins of the gray and jmk strains was . %. six of the differences in the amino acid sequence were found between residues and , suggesting a possible role for that region in the tissue tropisms of the viruses. the s glycoprotein of the gray and jmk strains had . %– . % amino acid similarity with the published sequences of other ibv strains.
serine instead of phenylalanine was observed in the protease cleavage site between the s1 and s2 glycoprotein subunits for the gray and jmk strains, which was similar to the published sequence for the ark and se strains. the significance of that amino acid change is not known. based on the nucleotide sequence of the gray and jmk strains, the bsmai restriction enzyme was selected by computer analysis and was used in restriction fragment length polymorphism analysis to differentiate the two strains.

present address: ohio agricultural research and development center, fahrp, the ohio state university, wooster, oh , usa

the nucleotide sequence data reported in this paper have been submitted to the genbank nucleotide sequence database and have been assigned the accession numbers gray s1 = l and jmk s1 = l .

avian infectious bronchitis virus (ibv) causes an acute, highly contagious disease of the respiratory and sometimes the urogenital tracts of chickens. infectious bronchitis (ib) is an economically important disease to the poultry industry, and outbreaks continue to occur because different ibv serotypes do not completely cross-protect ( ). the virus is the type species of the family coronaviridae, and its genome consists of one molecule of positive-sense single-stranded rna ( ) . it has three major structural proteins: a nucleocapsid protein, an integral membrane glycoprotein, and a spike (s) glycoprotein ( , ) . the s glycoprotein is cleaved into n-terminal s1 and c-terminal s2 subunits ( , ) . the s1 glycoprotein forms the distal, bulbous part of the s glycoprotein, and the s2 glycoprotein anchors the s glycoprotein to the membrane of the virion ( , ) . neutralizing, hemagglutination-inhibiting, and serotype-specific antibodies are directed against the s glycoprotein ( ) ( ) ( ) ( ) . tissue tropism has also been associated with the s glycoprotein ( ) .
the s glycoprotein gene of several serotypes of ibv has been sequenced to investigate the antigenic variation of ibv at the molecular level ( ) ( ) ( ) ( ) ( ) . an amino acid sequence comparison of the massachusetts (mass ) vaccine strain and the beaudette laboratory strain revealed that s1 had two hypervariable regions (hvrs) ( ) . antigenic and serotypic determinants of ibv are thought to be located in the hvrs ( , , ) . recently we reported on a polymerase chain reaction/restriction fragment length polymorphism (pcr/rflp) procedure to distinguish between serotypes of ibv ( ) . in that procedure three restriction enzymes (res) were used to distinguish all of the known serotypes within the united states, as well as variant viruses. only the gray and jmk strains could not be differentiated from each other. in an attempt to distinguish between the gray and jmk strains, over res were tested unsuccessfully. serology indicates that the gray and jmk strains are closely related and belong to the jmk serotype ( ) . the gray strain, however, is nephropathogenic ( , ) , whereas nephrotropism has not been reported for the jmk strain. the objectives of the present study were to clone and sequence the s glycoprotein gene of the gray and jmk strains of ibv in order to identify an re that would differentiate the two strains in the pcr/rflp serotype identification test. it is important to differentiate the two strains in a diagnostic test because the gray strain is nephropathogenic. in addition, it is useful to know the sequence of serologically similar viruses that differ in their tissue tropism. with that information we can begin to identify regions in the viral genome that may be associated with pathogenicity.

dr. jack gelb, jr. (university of delaware, newark, de) provided one gray strain ( ) , chicken embryo passage , and two jmk strains ( ) (received at different times), chicken embryo passage number i .
another gray strain ( ) , chicken embryo passage , was obtained from dr. pedro villegas (university of georgia, athens, ga). all were passaged once in embryonating chicken eggs. the viral rna was extracted and purified as previously described ( ) . briefly, sodium dodecyl sulfate (final concentration, % wt/vol) and proteinase k (final concentration, µg/ml) were added to allantoic fluid, incubated for min at °c, and extracted with acid phenol and chloroform/isoamyl alcohol. the rna solution was further purified using the rnaid kit (bio ) according to the manufacturer's recommendations, then stored at - °c until used in the reverse transcriptase (rt) reaction. the s1oligo ' and s1oligo ' primers for the rt reaction and pcr, synthesized by the university of georgia molecular genetics facility, have been described previously ( ) . the sequences of the primers and their positions relative to the s glycoprotein gene are shown in fig. . all of the reagents for the rt reaction and pcr have been described previously ( ) . reverse transcription of rna purified from allantoic fluid was done with moloney murine leukemia virus reverse transcriptase (gibcobrl) and primer s1oligo ', which is complementary to a region at the ' end of the s2 glycoprotein gene. for the pcr reaction, the primer s1oligo ', which is identical to a sequence near the ' end of the s glycoprotein gene, and units of ampli-taq dna polymerase (perkin-elmer cetus) were added to the rt reaction. pcr was performed for cycles of °c for min, °c for min, and °c for min in a twinblock thermal cycler (ericomp). the pcr products were electrophoresed ( v constant voltage) on a % agarose gel containing ethidium bromide ( . µg/ml).

cdna cloning

the s1 band, with a predicted size of approximately . kbp, was cut from an agarose gel and purified using the geneclean kit (bio ) according to the manufacturer's recommendations. the purified dna was ligated into the pcr ii (invitrogen corp.)
cloning vector, then transformed into competent escherichia coli cells (invαf', invitrogen). the white colonies carrying recombinant plasmids were selected from luria-bertani (lb) agar ( ) plates containing kanamycin ( µg/ml) and µl of mg/ml x-gal stock solution. the alkaline lysis method was used for small preparations (mini-preps) of plasmid dna. the purified plasmid dnas were digested with ecori (promega) and analyzed on a % agarose gel to determine the size of the insert. cesium chloride density gradient centrifugation was used to obtain larger amounts of plasmid dna for sequencing. denatured double-stranded cloned dna was sequenced by the dideoxy chain termination procedure using the sequenase version . kit (usb) as recommended by the manufacturer. initially, the m forward (usb) and reverse (# ) primers were used for sequencing. in addition, six other primers were synthesized to various regions within the gray strain of ibv (fig. ) . at least three clones of each strain were sequenced. nucleotide sequence data were compiled and analyzed on an ibm personal computer using the pc/gene software (intelligenetics, inc.). the s1 pcr products of the ibv gray and jmk strains were purified on an agarose gel as previously described ( ) and were digested with bsmai (neb, beverly, ma) according to the manufacturer's recommendations. the restriction fragment patterns were observed following electrophoresis ( v constant voltage) on a % agarose gel containing . µg/ml ethidium bromide.

the nucleotide sequence of the entire s1 portion of the s glycoprotein gene, including the signal sequence for the gray and jmk strains, is shown in fig. . a comparison of the amino acid sequences deduced from the nucleotide sequences of the gray and jmk strains is shown in fig. .

[fig. (continued): sequence alignment; row labels gra, jmk, bea, m, sei, ppi]

also included in figs. and is a comparison with published sequences ( , , ) . the ibv gray and jmk strains had similar s1 sequences. the gray and jmk strains differed by only % ( / ) in their nucleotide sequences. the gray and jmk strains had between . % and . % nucleotide identity with the mass , beaudette, ark , se , and pp strains. the gray and ark strains had the least similarity, and the gray and se strains had the most. the gray and jmk strains had extra nucleotides at position - (fig. ) that were not found in the nucleotide sequences of the mass and beaudette strains. the gray and jmk strains differed by . % ( / ) in their amino acid sequences. most of the differences in the amino acid sequence were found between residues and . a highly variable region containing six differences was observed between residues and . the gray and jmk strains had between . % and . % amino acid identity with the mass , beaudette, ark , se , and pp strains of ibv. a dendrogram of the amino acid alignment is presented (fig. ) . the gray and jmk strains had the least similarity to mass , and the most similarity to the se strain. like ark and se , the gray and jmk strains had a serine (residue ) instead of phenylalanine in the cleavage site of the connecting peptide between the s1 and s2 glycoproteins (fig. ) . based on a computer re analysis of the nucleotide sequence for the gray and jmk strains, the bsmai re was selected for use in the rflp analysis of the two strains.
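the computer re analysis mentioned above amounts to locating recognition sites in each sequence and comparing the predicted fragment sizes. the following is a minimal sketch under stated assumptions, not the pc/gene routine: it assumes bsmai recognizes gtctc and cuts one base downstream on the recognition strand and five bases downstream on the opposite strand (the supplier's data sheet should be checked), and it sizes fragments by the cut on the top strand of a linear pcr product. the names `cut_positions` and `fragment_lengths` are hypothetical.

```python
def cut_positions(seq, site="GTCTC", top_offset=1, bottom_offset=5):
    """Top-strand cut coordinates (number of bases to the left of each cut).

    The site and offsets are assumptions for a type IIS enzyme such as BsmAI.
    """
    seq = seq.upper()
    rc_site = site[::-1].translate(str.maketrans("ACGT", "TGCA"))
    cuts = set()
    # Site on the top strand: the cut falls downstream of the site.
    start = seq.find(site)
    while start != -1:
        cuts.add(start + len(site) + top_offset)
        start = seq.find(site, start + 1)
    # Site on the bottom strand: the top strand is cut upstream of the
    # reverse-complemented site, at the bottom-strand offset.
    if rc_site != site:
        start = seq.find(rc_site)
        while start != -1:
            cuts.add(start - bottom_offset)
            start = seq.find(rc_site, start + 1)
    return sorted(c for c in cuts if 0 < c < len(seq))


def fragment_lengths(seq, **kwargs):
    """Predicted fragment sizes for a linear molecule, largest first."""
    bounds = [0] + cut_positions(seq, **kwargs) + [len(seq)]
    return sorted((b - a for a, b in zip(bounds, bounds[1:])), reverse=True)


if __name__ == "__main__":
    # Toy sequences, not the gray or jmk s1 genes.
    print(fragment_lengths("AAAA" + "GTCTC" + "TTTTTTTT"))
    print(fragment_lengths("A" * 10 + "GAGAC" + "A" * 5))
```

two strains are distinguishable by rflp when their predicted fragment lists differ by more than the resolution of the gel, which is why the predicted pattern is confirmed by electrophoresis.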
following digestion of the pcr product with bsmai and electrophoresis, the gray and jmk strains had the expected restriction fragment patterns (fig. ) , which could be used to differentiate between them.

, and pp (ppi) ( ) s1 genes. asterisks indicate unavailable sequences. to conform to other published sequences for s1 , numbering begins after the signal sequence (boldface). dashes were introduced to align the sequences. the double-underlined sequence is a connecting peptide of the spike precursor polypeptide.

the purpose of sequencing the s1 glycoprotein genes of the gray and jmk strains of ibv was twofold. first, we wanted to identify an re for use in our pcr/rflp serotype identification test that would distinguish between those viruses. second, we wanted to add the sequences of those strains to the growing database of s1 glycoprotein sequences for strains of ibv in the united states. those data are a first step toward identifying neutralizing and serotype-specific epitopes, and regions that are involved in attachment of the virus to target cells. the s1 glycoprotein sequences of gray and jmk presented here are the first published sequences for this serogroup (designated jmk). by computer search and agarose gel electrophoresis, bsmai was found to be the best enzyme for distinguishing between the gray and jmk strains in our pcr/rflp serotype identification test. three restriction sites were observed in the jmk strain at bases (within hvr ), , and ; the gray strain had two sites at bases and . ten differences in the amino acid sequences of the s1 glycoprotein were observed between the gray and jmk strains. beaudette and mass (both massachusetts serotypes) are reported to have differences in their amino acid sequences ( ) . six of the differences between the amino acid sequences of the gray and jmk viruses were in a variable region between residues and . this corresponds to a variable region within the massachusetts serotype reported by niesters et al.
( ) between residues and . the overall differences in the amino acid sequences observed between all of the ibv strains examined herein were located between residues and and between residues and . similarly variable regions between residues and and between residues and have been reported by cavanagh et al. ( ) for closely related serotypes of ibv. our data extend this observation to include different serotypes of ibv, suggesting (as others have) that these regions may be involved in forming serotype-specific and virus-neutralizing epitopes. a protease cleavage site between the s1 and s2 glycoprotein subunits was reported to be arg-arg-phe-arg-arg for the beaudette and mass viruses ( , ) . the cleavage site of the gray and jmk strains was similar to the recently published sequence for ark and se ( ) , wherein a serine instead of a phenylalanine (residue ) was observed. although both amino acids are uncharged at physiological ph, serine has an aliphatic hydroxyl side chain, whereas phenylalanine has an aromatic side chain. the significance of this amino acid difference with regard to virulence is not known. the gray and jmk strains of ibv are the same serotype, indicating that they are very similar antigenically. however, the pathogenicity of these viruses differs because the gray strain can produce nephritis. it follows that the amino acids located between residues and may play a role in the different pathogeneses observed for these viruses. this observation is supported by cavanagh et al. ( ) , who observed an amino acid difference within the hvr region of two vaccine viruses, which may account for the differences in virulence observed for those viruses. the molecular basis for tissue tropism may become more apparent as sequences become available for other nephropathogenic strains, such as holte ( ) , australian t ( ), and one of the holland strains ( ) .
diseases of poultry

we thank the veterinary medical experiment station, university of georgia, for their support in funding these experiments.

key: cord- -h zi k g
authors: lin, chao-nan; chang, ruey-yi; su, bi-ling; chueh, ling-ling
title: full genome analysis of a novel type ii feline coronavirus ntu
date: - -
journal: virus genes
doi: . /s - - -
sha:
doc_id:
cord_uid: h zi k g

infections by type ii feline coronaviruses (fcovs) have been shown to be significantly correlated with fatal feline infectious peritonitis (fip). although nearly six decades have passed since its first emergence, different studies have shown that type ii fcov represents only a small portion of the total fcov seropositivity in cats; hence, there is very limited knowledge of the evolution of type ii fcov. to elucidate the correlation between viral emergence and fip, a local isolate (ntu ) that was derived from a cat with fip was analyzed along with other worldwide strains. containing an in-frame deletion of nucleotides in open reading frame c, the complete genome of ntu ( , nucleotides) appears to be the smallest among the known type ii feline coronaviruses. bootscan analysis revealed that ntu evolved from two crossover events between type i fcov and canine coronavirus, with recombination sites located in the rna-dependent rna polymerase and m genes. with an exchange of nearly one-third of the genome with other members of the alphacoronaviruses, the newly emerging virus could gain new antigenicity, posing a threat to cats that either have been infected with a type i virus before or have never been infected with fcov.

electronic supplementary material: the online version of this article (doi: . /s - - - ) contains supplementary material, which is available to authorized users.

feline coronaviruses (fcovs) are large, enveloped, positive-strand rna viruses with a genome of approximately , nucleotides [ ] [ ] [ ] .
the fcovs belong to the genus alphacoronavirus, family coronaviridae, order nidovirales. other members of this subgroup include canine coronavirus (ccov), transmissible gastroenteritis virus (tgev), raccoon dog cov (rdcov/gz / ), and chinese ferret badger cov (cfbcov/dm / ) [ ] . fcovs are associated with diseases that range from subclinical and/or mild enteric infections to fatal infectious peritonitis [ ] . despite the high prevalence of fcovs in feline populations around the world, only - % of seropositive cats develop feline infectious peritonitis (fip). fip is a chronic, progressive, immune-mediated disease in domestic and nondomestic felids. the typical histopathological finding of this disease is systemic perivascular necrotizing pyogranulomatous inflammation [ ] . two serotypes that differ in their growth characteristics in tissue culture and in their genetic relationship to ccov and tgev have been identified [ , ] . type ii fcov is significantly correlated with fip when compared to type i viruses [ , ] . however, unlike the ubiquitous type i fcov, type ii virus accounts for only a small percentage of the total number of fcov-seropositive cats in different studies [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] . type ii fcov is estimated to have diverged from other alphacoronaviruses in [ ] . based on partial genomic sequence analysis, type ii fcovs were suggested to result from a double recombination between type i fcov and ccov [ ] . like most rna viruses, covs mutate at a high frequency due to the high error rate of rna polymerization. in addition, a unique feature of cov genetics is the high frequency of rna recombination in the natural evolution of this virus [ ] . recombination among covs is an attribute of the genus and is thought to contribute to the emergence of new pathotypes, such as severe acute respiratory syndrome cov [ , ] , human cov nl (hcov nl ) [ ] , hcov hku [ ] , and avian infectious bronchitis virus (ibv) [ ] [ ] [ ] [ ] . 
to gain better evolutionary insight into type ii fcovs, we analyzed the complete genome of a novel type ii fcov isolate. taking our data together with data from other strains, we discuss the evolution of type ii fcov. virus and isolation of viral rna fcov ntu was isolated in from a kitten with naturally occurring fip by the co-cultivation of pleural effusion with feline fcwf- cells [ ] . after three rounds of purification by limiting dilution, the virus was propagated and titrated. all of the viruses used in this study for the sequencing of the complete genome came from a stock virus passaged six times. ntu is relatively fast-growing, induces a coronavirus-typical syncytial cytopathic effect and is a type ii fcov [ ] . eleven microliters of isolated rna was added to the premix, consisting of µl of rt buffer, . mmol dntps (geneteks bioscience, inc., taipei), pmol random primer, . mol dithiothreitol, and µl of u moloney murine leukemia virus reverse transcriptase (invitrogen, ca, usa) in a . ml reaction tube. this reaction mixture was then briefly centrifuged and incubated at °c for min, then at °c for min, and finally at °c for min. a total of primers for pcr were chosen from relatively conserved regions of the fcov genome. following reverse transcription, µl of the rt reaction mixture was added to µl of the pcr mixture, which consisted of µl of taq buffer, each primer ( pmol), dntp ( . mmol), u of taq dna polymerase (geneteks, bioscience, inc., taipei), and µl of . % depc water. the cycling program, run on an abi- thermal cycler (applied biosystems, usa), consisted of min of preheating at °c, followed by cycles of denaturation at °c for s, annealing at °c for s, and extension at °c for min with a final extension at °c for min. the viral rna termini were amplified using - and -rapid amplification of cdna ends (invitrogen, usa). 
analysis of pcr-amplified products and sequencing a total of µl of pcr products from each pcr mixture was analyzed using a % agarose gel (viogene, taipei) for electrophoresis. amplification products were visualized using uv illumination after ethidium bromide staining. the targeted dna fragments were purified (geneaid biotech, taipei) and sequenced in both directions using an auto sequencer (abi xl, usa). full-length genome sequencing of ntu was performed by single-round pcr with a set of overlapping pcr products (average size bp) that encompassed the entire genome. the complete sequences of ntu were then compared with other alphacoronaviruses and the results are summarized in table . multiple alignments of nucleic acid sequences were performed by the clustal w method using the megalign program (dnastar inc., wi, usa). phylogenetic analyses were conducted using mega, version . . similarity graphs were prepared with simplot . software [ ] . potential recombination sites were identified using the recombination detection program (rdp) [ ] . the full genomic rna sequence of fcov ntu comprises , nucleotides (nts), excluding the polyadenylated nts. sequence analysis revealed that ntu contains conserved open reading frames with an overall genome organization similar to known fcovs ( table ). the overall nucleotide composition is as follows: a, . %; c, . %; g, . %; and t, . %. the g+c content is . %. ntu possesses the putative transcription regulatory sequence (trs) motif, -cuaaac- , at the end of the leader sequence and preceding each orf ( table ) . the two strings of accessory genes identified in all of the known fcovs, i.e., orf ab and orf ab, were found in ntu as well (table ) . however, an in-frame deletion of nucleotides in orf c was identified, which resulted in a relatively short gene comprising only nts. 
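the nucleotide composition and g+c content reported above can be reproduced from any genome sequence with a few lines of code. the following is a minimal python sketch, not the authors' procedure; the function names `base_composition` and `gc_content` are hypothetical, and the sketch assumes that symbols other than a, c, g, and t (or u) are simply ignored.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Return the percentage of A, C, G and T in `seq`.

    Assumption: RNA 'U' is treated as 'T', and any other symbol
    (gaps, ambiguity codes) is ignored rather than counted.
    """
    seq = seq.upper().replace("U", "T")
    counts = Counter(base for base in seq if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def gc_content(seq: str) -> float:
    """Return the G+C content of `seq` as a percentage."""
    composition = base_composition(seq)
    return composition["G"] + composition["C"]


if __name__ == "__main__":
    # Toy sequence for illustration only; not taken from the ntu genome.
    toy = "AACCGGTT"
    print(base_composition(toy))
    print(gc_content(toy))
```

applied to a full-length genome, the four percentages sum to 100 and the g+c value is the sum of the g and c entries.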
the overall sequence comparison revealed that ntu was more closely related to the known subgroup a covs than to subgroup b covs within the alphacoronaviruses (fig. ) . nucleotide sequence similarity graphs of ntu with known type i fcovs, ccovs, and tgevs were created with the simplot software. the results showed that ntu was more closely related to type i fcovs from the end of the genome to position , and from position , to the -utr (fig. ) . genes located at the end (nsp - ) and end (the n gene through orf ) of ntu show consistently high similarity to type i fcovs, whereas from nsp through the e sequence, the similarity to canine and porcine covs varies dramatically (fig. ) . these data indicate that ntu might have arisen from recombination events between different strains of covs from species other than cats. two possible recombination sites, at approximate positions , and , , corresponding to the rna-dependent rna polymerase (rdrp) and the m gene, respectively (fig. ) , were further analyzed. phylogenetic trees were then constructed using the nucleotide sequences of the genes for putative proteins and polypeptides of alphacoronaviruses. at nsp through nsp ( supplementary fig. a ) and from the n gene (supplementary fig. b) to the orf ab gene ( supplementary fig. c) , ntu was more closely related to type i fcovs. at the nsp (rdrp) and the m gene, ntu was not clustered with any known alphacoronaviruses (fig. a, f) . from the nsp through the e gene, ntu was clustered with ccov ( fig. b-e) . taken together, these data indicated that ntu might have evolved from two recombination events with ccov, with the sites of recombination located in the rdrp and m genes. a unique feature of cov genetics is the high frequency of rna recombination both in vivo and in vitro [ ] . here, an interspecies recombination between feline and canine cov was identified in a viral strain ntu , which was isolated from the pleural effusion of a fip cat. 
this is the first time that evidence for natural recombination has been documented through the complete genome sequence analysis of type ii fcov. in , herrewegh et al., based on partial sequence analysis, first determined that type ii fcovs - and - originated from a homologous rna recombination event between type i fcov and ccov [ ] . the complete sequence of strain - was later published in [ ] . when comparing strain - , - , df- , and our ntu strain, the only four type ii fcovs that have had their full-length genomes sequenced to date, a common phenomenon was found: all four viruses arose from a double recombination event between type i fcov and coronaviruses from other species. ntu appears to have evolved from a recombination between type i feline and canine cov; however, when we aligned the genes in which recombination took place with other type ii fcovs, genome crossovers with other alphacoronaviruses were noted. when the sequences around the putative recombination sites were examined, i.e., one located in the region (strain - ) and two in the region (strain - , and - ), porcine coronavirus (tgev) was also found to contribute to the evolution of type ii fcov (in addition to ccov) (figs. b, ) . this finding is not surprising because the receptor for type ii fcov, feline aminopeptidase n, has been found to serve as a receptor for several covs, including canine, porcine, and human covs [ ] . therefore, the interspecies recombination of type i fcov with any of the above viruses might occur in nature. the recombination of type i fcov in cats living in the same household with dogs or living close to pig farms could give rise to type ii fcov. (figure legend: analysis was performed using mega software and neighbor-joining methods based on , replicates. bootstrap support values greater than are shown.) based on the analysis of the four full genomes of type ii fcov available at present (ntu , - , - , and df- ), type ii fcovs appear to retain type i fcov sequences in their and ends. 
we asked whether the genes located in these regions are indispensable for fcov replication in cats. to answer that question, the amino acid sequences of the genes retained at both ends of fcovs, ccovs, and tgevs, i.e., nsp through nsp and the n and orf genes, were further aligned and compared (table ). in contrast to the greater than % amino acid homology between different strains of fcovs, the nsp , , and , as well as the n gene and the orf a gene of fcovs, when compared with ccovs or tgevs, exhibited similarity of less than %. this finding indicates that these gene products might possess irreplaceable functions for fcov replication. this might explain why type ii fcovs found in nature harbor genomes that evolved from a double recombination. although the prevalence of type ii fcov is consistently lower ( - %) than type i virus around the world ( - %) [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] , our previous study indicates that infection with type ii fcov correlates significantly with fip when compared to type i [ ] . (figure legend: a similarity plot was constructed to identify the sequence homology between type i fcovs black, c je, and uu (gray); ccov ntu (red); and tgev purdue, m , and ts (blue). red arrows represent putative recombination regions. a similarity of . indicates regions that share % nucleotide identity. the similarity calculation was performed using the following parameters: a window size of , bp and a step size of bp for full-length sequences.) as shown in the present study, type ii fcov arises by exchanging a large genome fragment (approximately kb) of type i fcov with other members of alphacoronaviruses. the genes exchanged through this double recombination include nsp - , structural protein s (spike), and accessory protein abc. 
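the similarity plot referred to in this section scores a query genome against a reference in a window that slides along the alignment in fixed steps. the sketch below illustrates that idea in python using plain fractional identity; it is an assumption-laden simplification, since simplot itself can apply distance models that are not reproduced here, and the function name `sliding_identity` and its parameters are hypothetical.

```python
def sliding_identity(query: str, reference: str, window: int, step: int):
    """Return (window_start, identity) pairs for two aligned sequences.

    Assumptions: both sequences come from the same multiple alignment
    (equal length, gaps written as '-'); alignment columns where either
    sequence has a gap are skipped; identity is a fraction in [0, 1].
    """
    if len(query) != len(reference):
        raise ValueError("aligned sequences must have equal length")
    points = []
    for start in range(0, len(query) - window + 1, step):
        # Collect the gap-free column pairs inside the current window.
        pairs = [
            (q, r)
            for q, r in zip(query[start:start + window],
                            reference[start:start + window])
            if q != "-" and r != "-"
        ]
        if pairs:
            matches = sum(q == r for q, r in pairs)
            points.append((start, matches / len(pairs)))
    return points


if __name__ == "__main__":
    # Toy alignment: identical first half, fully divergent second half.
    query = "AAAAAAAACCCCCCCC"
    reference = "AAAAAAAAGGGGGGGG"
    for start, identity in sliding_identity(query, reference, window=8, step=4):
        print(start, identity)
```

a sharp drop in identity against one reference that coincides with a rise against another, as in the toy alignment, is the pattern that flags a putative recombination breakpoint.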
the nsp - proteins are replication proteins with functions such as helicase activity (nsp ), nucleoside triphosphatase activity (nsp ), rna -triphosphatase activity (nsp ), - exoribonuclease activity (nsp ), rna cap formation (nsp and nsp ) , and endonuclease activity (nsp ) [ ] . it has been reported that the function of the c protein might be crucial for viral replication in the gut but is dispensable for systemic fcov replication [ ] . however, s proteins play a crucial role in receptor binding and eliciting protective immunity [ ] . through the replacement of nearly one-third of the genome, the new virus might gain new antigenicity, posing a threat to cats that either have been infected with a type i virus before or never have been infected with fcov. fields virology key: cord- -bww vx authors: gopinath, m.; shaila, m. s. title: evidence for n( ) guanine methyl transferase activity encoded within the modular domain of rna-dependent rna polymerase l of a morbillivirus date: - - journal: virus genes doi: . /s - - - sha: doc_id: cord_uid: bww vx post-transcriptional modification of viral mrna is essential for the translation of viral proteins by cellular translation machinery. due to the cytoplasmic replication of paramyxoviruses, the viral-encoded rna-dependent rna polymerase (rdrp) is thought to possess all activities required for mrna capping and methylation. in the present work, using partially purified recombinant rna polymerase complex of rinderpest virus expressed in insect cells, we demonstrate the in vitro methylation of capped mrna. further, we show that a recombinant c-terminal fragment ( – aa) of l protein is capable of methylating capped mrna, suggesting that the various post-transcriptional activities of the l protein are located in independently folding domains. the presence of cap structure at the end of mrna prevents the mrna from degradation by cellular rnases and also plays an important role in the translatability of mrna [ ] . 
this di-nucleotide structure is also methylated to various extents in different organisms, and methylation of the first base at n position of guanine residue results in cap structure; methylation of the penultimate base at hydroxyl group results in cap structure. cellular mrna capping and methylation occur by an orderly series of events carried out by rna triphosphatase, guanylyl transferase, n guanine methyl transferase and -o-methyl transferase, respectively [for a detailed review see, ref. ] . paramyxoviruses constitute a group of viruses with a single-stranded negative sense rna genome that includes pathogens of humans and domestic livestock. the viral genome consists of a ~ kb long negative sense rna encapsidated by nucleocapsid protein (n-rna). transcription of viral n-rna occurs in an orderly fashion from the end of n-rna; -le-n-p-m-f-h-l-tr- . excluding the nt leader rna, all the other mrnas are capped and methylated similar to cellular mrna. viruses of this family replicate inside the cytoplasm of the infected cells and hence are not dependent on the host enzymes located in the nucleus for the post-transcriptional modification of viral mrna. during transcription, the viral mrnas are capped similar to the cellular mrna, although the extent of methylation differs among the viruses belonging to this family. the direct evidence for the belief that the large protein l of paramyxoviruses is responsible for mrna synthesis, capping and cap methylation came from the work of ogino et al. [ ] who showed that the recombinant l protein of sendai virus possesses guanine- methyl transferase activity located within the c-terminal part of l protein. multiple sequence alignment and secondary structure analysis predicted the presence of a -o-methyl transferase domain in the c-terminal domain of l protein of mononegavirales [ ] . rinderpest virus (rpv) is an important member of the morbillivirus genus, in the paramyxoviridae family. 
we have earlier shown that in rpv-infected cells, the viral mrna is capped and has both cap and cap structures [ ] . in addition, we also demonstrated the guanylyl transferase activity of l protein in vitro. in the same study, domain mapping revealed the ability of a truncated l protein (ld -aa - ) to catalyse the first step of guanylyl transferase activity, viz., formation of a covalent complex with gmp. in the present work, we present evidence for n guanine methyl transferase activity of rpv l protein and further demonstrate that this activity is localized to aa - of l protein, indicating the modular nature of the rdrp. spodoptera frugiperda (sf ) insect cells were cultured and maintained as described earlier [ ] . generation of recombinant baculoviruses expressing rpv l (full length), p and domain iii (ld , aa - ) has been described earlier [ , ] . partial purification of l-p complex from insect cells infected with recombinant baculoviruses expressing rpv l and p proteins has been reported earlier [ ] . rpv ld protein was purified from the insoluble fraction of respective baculovirus-infected cells using high salt extraction as described previously [ ] . generation of b capped mrna substrate for methyl transferase assay for in vitro methyl transferase assay, cap-labelled mrna substrate was prepared as described earlier [ ] , except that the substrate was a consensus nt sequence representing rpv viral mrnas (fig. c) . the following primer pair with a t promoter sequence was used for in vitro transcription followed by capping with vaccinia virus guanylyl transferase; for: gat cct tat agt gag tcg tct ta- , rev: -taa tac gac tca cta ta. for in vitro methyl transferase assay, the cap-labelled rna substrate ( cpm) was incubated with the indicated concentrations of enzyme source in methylation buffer containing mm hepes-koh, ph . , mm dtt, mm nacl, and u of human placental rnase inhibitor and µm of s-adenosyl methionine (sam) in a total reaction volume of µl. 
after incubation for h at °c, the total reaction mix was adjusted to mm sodium acetate ph . , mm mgcl and µg of nuclease p in a total volume of µl and incubated at °c for h. µl of the reaction products was spotted onto a pei-tlc sheet and subjected to chromatography with . m ammonium sulphate as the solvent system. cap structure analogues (gpppa, m gpppa and m gpppa m ) were run in parallel and detected by uv shadowing. in our previous study, using in vitro reconstituted transcription with purified rpv virions, the viral mrna was found to possess cap structure indicating a viral-encoded capping enzyme [ ] . further, the virion-associated capping activity was localized to l protein [ ] . sequence alignment of rpv l protein with -o-methyl transferase from other species revealed the conservation of the kdke tetrad, suggesting the presence of this motif in domain iii of rpv l protein (fig. a) . in addition, we also found the s-adenosyl methionine (sam) binding motif gxgxg within residues - conserved across the morbillivirus genus (fig. b) . considering the presence of both the kdke tetrad, responsible for -o-methyl transferase, and the gxgxg motif for sam substrate binding, domain iii could likely represent the methyl transfer module (both n -guanine and -o-methyl) of rpv l protein. this is in agreement with other studies with vsv as well as sendai virus, where the methyl transferase activity was mapped to the c-terminal half of l protein [ , ] . to investigate the role of l-p complex in viral mrna cap methylation, a b rna template representing the first b consensus sequence of all species of rpv viral mrna (aggauc) was synthesized in vitro using t rna polymerase (fig. c) . the viral rna was capped using vaccinia virus guanylyl transferase enzyme. capped rna was seen to co-migrate with xylene cyanol marker and the unused [α- p] gtp was seen near the bromophenol blue position (fig. a, lane ) while a reaction lacking the guanylyl transferase enzyme showed only the gtp (lane ). 
this was further gel purified and used as a substrate for in vitro methyl transferase assay. we have earlier partially purified transcriptionally active and capping competent l-p complex from insect cells using glycerol gradient fractionation. given the high molecular weight and oligomerization nature of l-p complex, only high-density glycerol fractions contained both l and p proteins, which is usually devoid of insect or baculoviral methyltransferase activity [ ] . to test whether rpv l protein possesses methyltransferase activity in addition to capping, we incubated b capped viral rna with partially purified l-p complex. digestion of the substrate alone with p nuclease released a product, which co-migrated with the gpppa (cap) marker (fig. b, labelled as c) . incubation of the capped rna with the partially purified l-p complex from insect cells resulted in a concentration dependent n guanine methylation of bp substrate (fig. b , marked as rl) which was not detected in a mock-purified high-density fraction from insect cells infected with non-recombinant baculovirus (fig. b, mock) . however, higher concentrations of rl led to the appearance of a slower migrating spot, likely due to the increase in glycerol concentration present in the reaction mix, leading to aberrant migration of m gpppa. (figure legend: from genbank and subjected to clustalw analysis and viewed with gsview . . the kdke tetrad is marked by asterisks. alpha helices and beta sheets are marked by bars and arrow marks, respectively. b alignment of l proteins from morbillivirus genus shows the conservation of sam binding motif gxgxg (in bold). c sequence alignment of the ends of rpv viral mrnas. only the first eight bases are shown. consensus sequence between the viral mrnas is given in bold.) 
further, to functionally validate the methyl transferase activity of domain iii (aa - , ld ) of rpv l protein, ld was purified from insect cells using metal chelate affinity chromatography as described earlier [ ] . figure c shows the purity of recombinant ld protein in the eluted fraction (lane ). ld alone was able to catalyse the n guanine methylation of a bp cap-labelled substrate in a dose-dependent manner, suggesting the presence of an n methyl transferase domain within this region (fig. d) . however, no products were observed co-migrating with m gpppa m , indicating the lack of -o-methyl transferase activity with domain iii or with l-p complex in our preparation. though l protein is believed to possess all the activities required for the post-transcriptional modification of the viral mrna, due to its size, it has been proposed to function in a modular fashion to carry out different enzymatic activities associated with viral mrna synthesis and maturation [ ] . in agreement, a putative -o-methyl transferase (mtase) motif was predicted within domain vi ( - aa) of l proteins [ ] . in another report, a structural homology-based comparison was carried out between bacterial -o-mtase, rrmj and the region spanning - aa of vsv l protein, and further mutational analysis revealed the importance of this region in viral mrna transcription as well as methylation [ ] . however, recent evidence points to the importance of regions in domain ii of vsv l protein in both cap and cap methylation [ ] . in the present study, we have shown that rpv l domain iii alone could catalyse the methylation of gpppa, which obviates the need of domain ii for cap methyl transferase activity. in support of this observation, ogino et al. [ ] have shown that a sendai virus l protein deletion mutant spanning domain iii alone (aa - ) catalyses cap methyl transferase activity, while inclusion of a portion of domain ii (aa - ) resulted in significantly higher activity. 
these results suggest that in paramyxovirus l protein (compared to rhabdoviruses), the catalytic module for cap methyl transferase activity resides in domain iii, and domain ii may have the additional role of stabilizing the enzyme or increasing the catalytic efficiency. we provide evidence for the modular nature of rpv l protein in terms of domain iii alone participating in viral mrna cap methylation. although the rpv l protein was found to possess the kdke motif, the catalytic motif for -o-methyl transferases, the generation of cap ( m gpppa m ) product could not be seen. one likely reason could be that the presence of cap is a mandatory prerequisite for rpv l protein to generate cap structures. in support of this, coronavirus nonstructural protein was found to exhibit -o-methyl transferase activity only on n gpppa substrate rna, while flavivirus ns methyl transferase can catalyse the methylation of both gpppa and n gpppa substrates [ , ] . alternatively, lack of domain ii may render domain iii catalytically inactive with respect to -o-methylation [ ] . hence, it is tempting to speculate that rpv l protein may also require specific n gpppa substrate rna to exhibit -o-methyl transferase activity, although further experiments are needed to confirm this hypothesis. 
in silico identification, structure prediction and phylogenetic analysis of the -o-ribose (cap ) methyl transferase domain in the large structural protein of ssrna negative-strand viruses; coronavirus nonstructural protein is a cap- binding enzyme possessing (nucleoside- -o)-methyltransferase activity; independent structural domains in paramyxovirus polymerase protein; structural and functional analysis of methylation and -rna sequence requirements of short capped rnas by the methyltransferase domain of dengue virus ns ; viral and cellular mrna capping: past and prospects; analysis of a structural homology model of the -o-ribose methyl transferase domain within the vesicular stomatitis virus l protein; recombinant l and p protein complex of rinderpest virus catalyses mrna synthesis in vitro; rna triphosphatase and guanylyl transferase activities are associated with the rna polymerase protein l of rinderpest virus; identification of a new region in the vesicular stomatitis virus l polymerase protein which is essential for mrna cap methylation; processing the message: structural insights into capping and decapping mrna; a unique strategy for mrna cap methylation used by vesicular stomatitis virus; sendai virus rna-dependent rna polymerase l protein catalyses cap methylation of virus-specific mrna. acknowledgments this study was supported in part by a grant-in-aid for research from the council for scientific and industrial research, new delhi, india, under the emeritus scientist scheme. key: cord- - zv sc authors: yew, tan do; bejo, mohd hair; ideris, aini; omar, abdul rahman; meng, goh yong title: base usage and dinucleotide frequency of infectious bursal disease virus date: journal: virus genes doi: . /b:viru. . .c sha: doc_id: cord_uid: zv sc base usage and dinucleotide frequency have been extensively studied in many eukaryotic organisms and bacteria, but not for viruses. 
in this paper, a comprehensive analysis of these aspects for infectious bursal disease virus (ibdv) is presented. the analysis of base usage indicated that all of the ibdv genes possess equivalent overall nucleotide distributions. however, when the base usage at each codon position was analysed by cluster analysis, the vp open reading frame (orf) formed a separate cluster isolated from the other genes. the unusual base usage of the vp orf may indicate that the gene originated through the viral “overprinting strategy”, in which a virus may create a novel gene by utilizing the unused reading frames of its existing genes. meanwhile, the gc content of the ibdv genes and the chicken's coding sequences was comparable, suggesting that the virus imitates the host to increase its translational efficiency. the analysis of dinucleotide frequency indicated that the ibdv genome had a dinucleotide bias: the frequencies of cpg and tpa were lower, and that of tpg was higher, than expected. the classical methylation pathway, a process in which cpg is converted to tpg, may explain the significant correlation between the cpg deficiency and tpg abundance. “principal component analysis of the dinucleotide frequencies” (df-pca) was used to analyse the overall dinucleotide frequencies of the ibdv genome. df-pca on the hypervariable region and polyprotein (vpx-vp -vp ) gene showed that the very virulent ibdv (vvibdv) was segregated from other strains, which meant that vvibdv had a unique dinucleotide pattern. in summary, the study of base usage and dinucleotide frequency has unravelled many overlooked genomic properties of the virus. infectious bursal disease (ibd) is an immunosuppressive disease that affects young chickens and is characterized by the destruction of the bursa of fabricius. reviews of the disease have been published elsewhere [ ] [ ] [ ] [ ] [ ] . ibd is caused by infectious bursal disease virus (ibdv), which is a double-stranded rna (dsrna) virus [ , ] . 
ibdv belongs to the genus avibirnavirus [ ] under the birnaviridae family. other genera of birnaviridae are aquabirnavirus and entomobirnavirus [ ] . the ibdv genome consists of two segments, designated as segment a and b [ , ] . the genome is enclosed within a nonenveloped icosahedral capsid approximately nm in diameter [ ] . the complete nucleotide sequence of segment a is , bp [ ] and contains two open reading frames (orfs) of , bp [ ] and bp respectively, in which the smaller orf partially overlaps at the ′ end [ ] . the large orf encodes a precursor polyprotein (nh -vpx-vp -vp -cooh), which is autoproteolytically processed by the cis-acting viral protease vp into vpx ( kda), vp ( kda), and vp ( kda) [ ] . vpx, as a precursor protein, undergoes a second independent proteolytic processing step to yield a smaller matured product known as vp [ ] . vp and vp form the viral capsid [ ] . highly conformational epitopes present in vp protein are responsible for the production of neutralizing antibody to protect the chicken from ibdv infection [ , ] . vp is the minor structural protein recognized by the nonneutralizing antibodies [ , ] and can efficiently bind to ssrna and dsrna [ ] . the small orf in segment a encodes vp protein with unknown function [ ] . vp might be important in the pathogenesis [ ] but is unessential for the viral replication and infection [ , ] . vp might also be involved in the release of viral progeny from infected cells [ ] . the vp gene overlaps the vpx gene at its th nucleotide; therefore almost all of its nucleotides are within the vpx. segment b ( , bp [ ] ) consists of a single orf that encodes vp ( kda), an rna-dependent rna polymerase [ ] [ ] [ ] with capping activities [ ] . it has been reported that birnavirus polymerases form a defined subgroup of polymerases by lacking a gdd motif [ ] . the formation of vp -vp complexes plays a critical role in ibdv replication [ ] . there are two serotypes of ibdv, namely serotype and [ , ] . 
in addition to serological classification, ibdv strains are also grouped according to their virulence (mortality and bursal lesions) [ ] . the very virulent ibdv strain (vvibdv) can cause up to % mortality and severe bursal lesions in specific-pathogen-free (spf) chickens [ , ] . the classical virulent strain (cvibdv) may cause bursal damage and mortality up to % [ ] . chickens infected by the variant strain (vaibdv) may rapidly develop bursal atrophy without the inflammation phase [ ] but the mortality caused by the vaibdv can be less than % [ , ] . the attenuated strain (atibdv) is usually derived from the attenuation of a cvibdv isolate and is typically used as a vaccine; however, despite being attenuated, it may still be capable of causing lesions in the bursa [ ] . the newly emerged atypical (ayibdv) strain, which has unusual amino acid substitutions in the vp gene, has also been documented [ ] [ ] [ ] . meanwhile, the serotype isolates are usually isolated from turkeys and are apathogenic to both chickens and turkeys [ ] . ibdv has also been classified based on its sequence characteristics such as the presence of certain restriction enzyme sites and unique amino acid residues in its vp gene [ ] [ ] [ ] . the diversity of the ibdv strains has complicated the control and prevention of ibd; for example, birds vaccinated against the cvibdv strain may not have adequate protection against other strains [ , ] . therefore, analysis of the common genomic properties of the various ibdv strains will contribute greatly towards the understanding of the virus and the subsequent control and prevention efforts. although many sequence analysis papers have been published, base usage and dinucleotide frequency of ibdv remain unexplored. by studying the base usage, it was found that the genomic gc content of flaviviruses was associated with their vector specificity [ ] . 
in thermophilic bacteria, high genomic gc content has been associated with greater genomic stability (g-c pairs bond more strongly than a-t pairs) as a result of evolutionary adaptation to the hot environment [ ]. for human immunodeficiency virus (hiv) and other lentiviruses, unknown mechanisms have driven a strong bias towards adenine nucleotides [ ]. non-random dinucleotide biases of the genome constitute a ''general design'' or genomic signature [ ] [ ] [ ] [ ]. the genomic signature reflects properties of the dna in terms of its stacking energies, modification, replication, and repair mechanisms [ ]. moreover, the genomic signature is useful for detecting pathogenicity islands in bacterial genomes [ ]. generally, cpg (or 5′-cg-3′) and tpa dinucleotides are scarce [ ] [ ] [ ]. cpg deficiency is typically associated with the classical methylation pathway, in which susceptible cpg dinucleotides are methylated and subsequently converted to tpg [ ]. tpa dinucleotides are unfavourable because ua in mrna is susceptible to rnase activity [ ]. furthermore, avoiding tpa dinucleotides might reduce the occurrence of stop codons, since two of the three stop codons (taa and tag) contain tpa. this paper unveils several fundamental characteristics of the ibdv genome. the base usage at each codon position is described. the information extracted from the base usage is then used to investigate the origin of the overlapping vp gene. comparison of the viral gc content with that of the host gives an insight into the virus-host interaction. the viral dinucleotide frequencies and their significance are also discussed. all ibdv sequences ( sequences), except the patented sequences, were downloaded from genbank release . . duplicated sequences, non-coding sequences, and sequences with unresolved/ambiguous sites were discarded.
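the filtering step just described can be sketched in python. this is only an illustration under stated assumptions, not the authors' procedure: the function name `clean_sequences` is hypothetical, and ''non-coding'' is approximated here by requiring a length that is a whole number of codons, which the text does not specify.

```python
VALID_BASES = set("ACGT")


def clean_sequences(records):
    """Keep unique, unambiguous coding sequences.

    records: iterable of (name, sequence) pairs.
    A sequence is kept if it contains only A/C/G/T (U is read as T),
    its length is a multiple of three (assumption standing in for
    "coding"), and the same sequence has not been seen before.
    """
    seen = set()
    kept = []
    for name, seq in records:
        seq = seq.upper().replace("U", "T")
        if not seq or set(seq) - VALID_BASES:
            continue  # unresolved or ambiguous sites (N, R, Y, ...)
        if len(seq) % 3 != 0:
            continue  # not a whole number of codons
        if seq in seen:
            continue  # duplicated sequence
        seen.add(seq)
        kept.append((name, seq))
    return kept
```

the first record carrying a given sequence is the one retained; a real pipeline would also need to decide which of several duplicate entries to keep.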
sequences were then sorted into eight groups according to the region of the ibdv genome they covered, namely the vp (n = ), vpx ( ), vp ( ), vp ( ), vp ( ), vp ( ), polyprotein gene ( ), and hypervariable region (hvr) ( ) groups. sequences that did not fit into any of these groups were excluded from the analysis. selected sequences were edited and aligned using bioedit software version . . [ ] and clustalx software [ ]. since most of the genbank ibdv entries did not clearly state which strain (pathotype) the isolates belonged to, strain identification was done manually by an extensive literature search rather than on the basis of molecular markers alone. among the ibdv sequences in genbank, only a few isolates had been completely sequenced; the majority had not. since the sequences were grouped by region of the ibdv genome, and since a fully sequenced isolate covers all regions of the genome, an isolate could be included in several groups simultaneously. meanwhile, for most isolates only the hvr had been sequenced, and these therefore formed part of the ''hvr group'' only. there was serotype isolate in the vp dataset, whereas there were isolates in all other datasets. in summary, regardless of the groupings, the nucleotide sequences of ibdv isolates were analyzed.
the accession numbers of these isolates were: ab , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , af , aj , aj , aj , aj , aj , aj , aj , aj , aj , aj , aj , aj , aj , aj , aj , aj , ay , ay , ay , ay , d , d , d , d , d , d , l , m , m , m , x , x , x , x , x , x , y , y , y , y , y , y , y , z and z . hosts (chicken and turkey) genomic coding sequences were obtained from codon usage database (http://www.kazusa.or.jp/codon/) genbank release . . bursal est database [ ] was referred to identify the highly expressed genes specifically found in the b-cells of the bursa of fabricius. since the database was constructed using a non-normalized cdna library, the most frequently identified chicken (callus gallus) genes will be the most abundantly (or highly) expressed genes in the bursa [ ] . in addition, highly expressed bursal genes from other sources [ ] were also included. therefore the highly expressed base usage and dinucleotide frequency genes used in the analysis were ribosomal ( sequences), heat shock (two sequences), elongation factor -a, b-actin, ig rearranged light-chain vjc, chicken germ line ig light chain, dead-box rna helicase, non-histone chromosomal protein hmg- , mhc b complex, atf , bu-la, and chbl genes. all of the sequences were downloaded from genbank and being meticulously edited, intronexcised, and analysed for the gc content. base usage and dinucleotide frequencies were calculated by using codonw . (software by john penden and available at ftp://molbiol.ox.ac.uk/ win .codonw.zip) and dambe version . . 
(by xuhua xia and available at http://web.hku.hk/~xxia/software/installation.htm). both programs were used concurrently to ensure high reproducibility. data editing and the various analyses (correlation, cluster analysis, and principal component analysis) were done using microsoft excel , statistica version , and spss version software. the overall base usage was calculated for each virus gene. in addition, base usage at the first (p ), second (p ), and third (p ) codon positions was computed. similarly, dinucleotide frequency was calculated for each of the reading frames ( : , : , : ) and as an overall measurement (at all codon positions). the dinucleotide index (dni) was computed as the ratio of the observed (o_d) to the expected (e_d) dinucleotide frequency, $\mathrm{dni} = o_d / e_d$. the expected frequency e_d of the dinucleotide at sites p and p was calculated as $e_d = p(n_x)\,p(n_y)$, where p(n_x) and p(n_y) were the proportions of the two nucleotides of the pair at the first and second of these sites, respectively. if there was no dinucleotide bias, the dni value would be 1.
base usage of serotype ibdv genes
base usage, or the relative distribution of each nucleotide (a, t, g, and c) at each codon position, was calculated for the vp , vpx, vp , vp , vp , and vp genes. subsequently, a rank of (least frequently used) to (most frequently used) was assigned to each nucleotide according to its relative base usage percentage. the base usage patterns became pronounced after shading the higher ranks (rank and ) in grey and leaving the lower ranks (rank and ) unshaded, as shown in table . generally, base usage at the codon positions (p , p , and p ) would not be equal, because the base usage of coding sequences is not random. moreover, base usage at p and p is constrained by the encoded amino acids. indeed, only % of p mutations are synonymous and all p mutations are non-synonymous [ ]; this makes the base usage at p and p inflexible.
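the constraint that the genetic code places on each codon position can be checked by direct enumeration. the sketch below is an illustration added for this purpose, not a calculation taken from the cited reference: it uses the standard genetic code, starts from the 61 sense codons, and counts every possible single-base change, with changes to a stop codon counted as non-synonymous. other counting conventions give slightly different percentages.

```python
from itertools import product

BASES = "TCAG"
# standard genetic code, codons in the order TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_counts():
    """Count synonymous single-base changes at each codon position.

    Returns {position: (synonymous, total)} over the 61 sense codons,
    trying all three alternative bases at the given position.
    """
    counts = {}
    for pos in range(3):
        syn = total = 0
        for codon, aa in CODE.items():
            if aa == "*":
                continue  # start from sense codons only
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                total += 1
                syn += CODE[mutant] == aa  # True counts as 1
        counts[pos + 1] = (syn, total)
    return counts
```

under this convention the function returns 8, 0, and 126 synonymous changes out of 183 at the first, second, and third positions, respectively, which illustrates why the third position is by far the most free to vary.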
however, p was expected to have a more variable base usage because % of p mutations are silent [ ]. referring to table , thymine (t) was the least preferred nucleotide at p . considering that all stop codons begin with t (taa, tag, and tga), avoidance of t at p is understandable as a way of preventing the unwanted occurrence of stop codons in the viral coding sequence (cds). except in the vp gene, guanine (g) was comparatively high at p . this showed the inclination of ibdv to encode aliphatic amino acids (alanine, valine, and glycine). intriguingly, the general base usage patterns at p were comparable for all ibdv genes. at p , all viral genes had g as the least used nucleotide, except vp , which had t as the least used. the deficiency of g at p might be attributable to the virus' efforts to prevent the occurrence of stop codons. unlike at p , base usage at p was more varied because any mutation at p alters the encoded amino acid. in this case, maintaining the physicochemical properties of the virus proteins, most probably through evolutionary forces, would be more important than maintaining a similar base usage. at p , all viral genes except vp and vp were devoid of t. in addition, cytosine (c) appeared to be the preferred nucleotide. the bias towards c was an interesting feature because most mutations at p are silent [ ]. base usage bias at p might confer certain selective advantages on ibdv; perhaps the bias allows the virus to match its codon usage to that of the host. if so, the virus may improve its translational efficiency, which may in turn increase its fitness. meanwhile, it has been suggested that favouring c at p would increase coding ability or new orf formation, considering that none of the stop codons contains c [ ]. however, the dearth of g in the vp gene remains to be investigated. unexpectedly, the overall (total) base usage of all ibdv genes was similar, despite some discrepancies at individual codon positions.
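the per-position base usage and ranking discussed above can be tabulated with a few lines of python. this is a minimal sketch rather than the codonw or dambe implementation; the function names are hypothetical, ranks are assumed to run from 1 (least used) to 4 (most used), and ties are broken alphabetically, a detail the text does not address.

```python
from collections import Counter


def base_usage_by_position(cds):
    """Percentages of A, C, G, T at codon positions 1, 2 and 3.

    Returns {1: {...}, 2: {...}, 3: {...}}; a trailing partial codon
    is ignored and U is read as T.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    usage = {}
    for pos in range(3):
        counts = Counter(cds[pos:usable:3])
        n = sum(counts[b] for b in "ACGT")
        usage[pos + 1] = {b: (100.0 * counts[b] / n if n else 0.0) for b in "ACGT"}
    return usage


def rank_bases(percentages):
    """Rank bases from 1 (least used) to 4 (most used)."""
    ordered = sorted(percentages, key=lambda b: (percentages[b], b))
    return {b: i + 1 for i, b in enumerate(ordered)}
```

for example, `rank_bases(base_usage_by_position(seq)[3])` gives the ranking at the third codon position of `seq`; averaging the percentages over all sequences of a gene before ranking would mirror the per-gene summary in the table.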
moreover, although physically separated, the vp gene still resembled the other genes. it was also found that c and a (adenine) were the most preferred nucleotides, whereas t was the least preferred. given that rna viruses have a high mutation rate [ ] and a short generation time, why does the virus maintain a similar base usage pattern for all its genes? perhaps this is the virus' strategy to optimise the expression of its genes. it has been shown that viruses can take advantage of codon composition to regulate their own programs of gene expression [ ] while utilizing the cellular machinery to replicate their genomes. base usage of the serotype genome was analyzed separately because only two isolates (oh and / ) were available from genbank . . results indicated that the base usage of serotype was comparable to that of serotype (data not shown). as in serotype , the vp gene of serotype had a peculiar base usage pattern. further analysis of the non-overlapping region of vp (nolvp ) ( codons, bp) revealed that although its p was also rich in c (> %), it was richest in t ( . %), which differed from the other genes. these findings agreed with a previous report [ ] in which overlapping genes showed significant bias in their base usage. to study the relationships among the virus genes, cluster analysis was performed on their nucleotide compositions. in a q-type cluster analysis, the virus genes were treated as the 'columns' (seven columns: vp , vpx, vp , vp , vp , vp , and nolvp ) and the nucleotide compositions (presented as mean percentages) at each codon position were treated as the 'attributes'. since there were three codon positions and four types of nucleotide, there were attributes: for example, the percentage of adenine at p , the percentage of guanine at p … the percentage of cytosine at p , and so forth.
squared euclidean distances were then computed and a tree was constructed using the unweighted pair-group average (upgma) amalgamation rule (fig. ). cutting the tree at a linkage distance of . , it was clear that the vp gene and its non-overlapping region formed clusters distinct from those of the other viral genes. this led us to suspect that the peculiar base usage of the vp gene was due to its origin: most likely it originated by overprinting the 'original' (or existing) viral genes. to generate a novel gene, a virus may either synthesize an entirely new nucleotide sequence or, alternatively, utilize the unused reading frames of existing genes, a process first proposed by grasse [ ], who called it ''overprinting'' [ ]. in tymoviruses, the overlapping gene arose by overprinting the ''original'' replicase gene after the virus had diverged from its sister groups from a common ancestor [ ]. in the birnaviridae family, the vp gene is found only in avibirnavirus (ibdv) and aquabirnavirus (infectious pancreatic necrosis virus or ipnv). the other genus, entomobirnavirus (drosophila x virus or dxv), has no equivalent orf overlapping the ′ terminus of vpx [ ]. for dxv, the predicted overlapping non-structural protein (believed to be a vp homolog) resides between the vp and vp genes. with regard to birnavirus evolution, the most parsimonious explanation appears to be that the polyprotein gene was the birnaviruses' ''original gene'' and that the vp gene arose after the vertebrate birnaviruses (ibdv and ipnv) and the insect birnavirus (dxv) had diverged from their common ancestor. it is unlikely that dxv initially possessed the vp gene, lost it after the divergence, and then created another new orf to replace the lost gene's function. owing to the frame shift, an overprinting gene has an unusual codon usage and encodes a new protein with physicochemically biased properties [ ].
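the cluster analysis described at the start of this passage (composition attributes per gene, squared euclidean distances, upgma amalgamation) can be sketched without any statistics package. the code below is a plain-python illustration, not the statistica or spss procedure the authors ran; the function names are hypothetical and the input is assumed to be a mapping from gene name to a list of percentages (four nucleotides at three codon positions).

```python
from itertools import combinations


def sq_euclidean(u, v):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def upgma(profiles):
    """Average-linkage (UPGMA) clustering of named composition profiles.

    profiles: {gene_name: [percentages]}.
    Returns the merges in the order they happen, each as
    (cluster_a, cluster_b, linkage_distance); clusters are tuples of names.
    """
    clusters = [(name,) for name in sorted(profiles)]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # UPGMA distance: mean of all member-to-member distances
            d = sum(sq_euclidean(profiles[x], profiles[y]) for x in a for y in b)
            d /= len(a) * len(b)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(a + b)
        merges.append((a, b, d))
    return merges
```

cutting the tree at a chosen linkage distance then amounts to keeping only the merges whose distance falls below that threshold.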
the vp protein has been shown to play a role in ibdv pathogenesis [ ] and in the release of viral progeny from infected cells [ ]. vp -defective virus exhibited a slight delay in replication [ ], but the vp gene was not essential for replication of the virus in vitro [ ] or in vivo [ , ]. simply put, the acquisition of the vp gene as a ''new gene'' by overprinting during the evolutionary history of the birnaviruses, although the gene is not essential, may have given the virus certain survival advantages that led it to retain the gene in its genome. the gc content (gc%) of many double-stranded dna (dsdna) viruses differs markedly from the gc content of the host cells they infect [ ]. to investigate whether the same phenomenon applies to ibdv (dsrna), we compared the gc% of the virus with that of the host (table ). results showed that the overall gc% of the ibdv genome was comparable to that of the chicken (gallus gallus), at around - %. interestingly, in spite of the high mutation rate of the hypervariable region, its gc% nearly matched that of the highly expressed host genes. similarly, the gc% of segment b was very close to that of the highly expressed chicken genes. meanwhile, the gc% of serotype differed more from that of turkey (meleagris gallopavo) than from that of chicken, although serotype isolates are usually isolated from turkeys. the reason for this discrepancy remains to be answered. a general pattern of gc% was observed for both virus and host: high gc% at p , low at p , and high at p . these findings suggest that the virus mimics the host gc%, particularly the gc% at p , probably in order to optimise its codon usage for translational efficiency and to continue to thrive as a successful intracellular parasite. in contrast to the dsdna viruses, the gc% of ibdv and its host were comparable. apart from the cpg islands in mammalian genomes, cpg dinucleotides are usually under-represented for two main reasons. the first is the classical methylation pathway that converts cpg to tpg [ ].
the pathway works by methylating the ′ cytosine of cpg and subsequently deaminating the -methylcytosine, which mutates cpg to tpg [ ]. second, cpg dinucleotides have the greatest thermodynamic stacking energy of all dinucleotides [ , ]; therefore, reducing their frequency might facilitate replication and transcription of nucleic acids [ ]. it was thus of interest to investigate whether the ibdv genome is also depleted of cpg dinucleotides. to study the dinucleotide frequencies of ibdv, three datasets were analysed, namely the polyprotein gene (vpx-vp -vp ), hypervariable region, and segment b sequences. the vp gene was excluded because it was highly conserved ( / isolates had identical sequences) and most of its nucleotides were embedded within the vpx gene. the null hypothesis in this study was that there was no selective pressure against cpg dinucleotides, meaning that all dinucleotide pairs had an equal chance of occurrence given the base composition. the mann-whitney u test was used to test whether the frequency of cpg dinucleotides deviated significantly from the expected proportion. results from table showed that p and p were highest in c and g, respectively. thus, if there was no dinucleotide bias, one would expect a high frequency of cpg dinucleotides at the intercodon position (p :p ). however, analysis of the three datasets showed that dinucleotide bias did occur: the observed intercodon cpg dinucleotides were significantly fewer than expected (p < . ). this clearly showed the avoidance of cpg dinucleotides in the ibdv genome. this finding accorded with karlin et al. [ ], who found that virtually all small eukaryotic viruses are deficient in cpg dinucleotides. meanwhile, the intercodon tpg dinucleotide frequency was significantly higher than expected (p < . ). further analysis of the dinucleotide frequency at all possible codon positions gave the same results: cpg was lower and tpg was higher than expected.
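the intercodon comparison above can be made concrete with a short python sketch of the dinucleotide index at codon boundaries. it is an illustration under stated assumptions and not the codonw or dambe code: the function name is hypothetical, and the base proportions are taken only over the third and first codon positions that actually take part in a boundary pair.

```python
from collections import Counter


def intercodon_dni(cds, dinucleotide="CG"):
    """Dinucleotide index (observed / expected) across codon boundaries.

    Each pair is the third base of one codon followed by the first base of
    the next codon. The expected frequency is the product of the proportion
    of the first base among those third positions and of the second base
    among those first positions. Returns None if it cannot be computed.
    """
    cds = cds.upper().replace("U", "T")
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    if len(codons) < 2:
        return None
    pairs = [codons[k][2] + codons[k + 1][0] for k in range(len(codons) - 1)]
    observed = pairs.count(dinucleotide) / len(pairs)
    third = Counter(c[2] for c in codons[:-1])
    first = Counter(c[0] for c in codons[1:])
    expected = (third[dinucleotide[0]] / len(pairs)) * (first[dinucleotide[1]] / len(pairs))
    return observed / expected if expected else None
```

a value below 1 for `"CG"` and above 1 for `"TG"` in most sequences of a dataset would correspond to the pattern reported here; the significance test itself (mann-whitney u) is not reproduced in this sketch.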
moreover, tpa dinucleotides were also found to be less frequent than expected (p < . ). the dearth of tpa could be due to the susceptibility of ua in mrna to rnase activity [ ] (but see [ ]). tpa is also less energetically stable than all other dinucleotides [ , ], which makes the nucleic acid more flexible in bending and untwisting. this explains why tata sequences at the sites of replication origin are very easy to unwind and interact readily with other molecules [ ]. hence, the restriction of tpa dinucleotides may help to avoid inappropriate binding of cellular factors to the viral nucleic acids. furthermore, given that two of the three stop codons contain tpa dinucleotides, reducing the genomic tpa dinucleotides would help to avoid unwanted mutation-derived stop codons. the relationship between cpg and tpg dinucleotides was studied further by correlation analysis. for each dinucleotide pair, the dinucleotide index (dni) was calculated as the ratio of the observed to the expected dinucleotide frequency. results indicated that the abundance of cpg dinucleotides was negatively correlated with that of tpg dinucleotides. the r-values for the segment b and polyprotein datasets were − . and − . (p < . ), respectively. the correlation for the hvr dataset (r = − . , p < . ) was, however, weaker, probably owing to its shorter sequence. we were fully aware that correlation does not imply causation, but on the basis of the facts above and our empirical results, we concluded that the deficiency of cpg probably contributed to the abundance of tpg in the ibdv genome through the conversion of methylated cpg to tpg [ ]. the vertebrate immune system has apparently evolved the ability to recognize unmethylated cpg motifs and responds with a rapid and coordinated cytokine response, leading to the induction of humoral and cell-mediated immunity [ , ]. moreover, cpg-based adjuvants have been shown to trigger protective antiviral cytotoxic t cell responses [ ].
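the correlation between cpg and tpg indices reported above can be reproduced in outline as follows. this is a sketch only: the text does not name the correlation coefficient, so pearson's r is an assumption, the function names are hypothetical, and the index is computed over all positions of a sequence using its overall base composition.

```python
from math import sqrt


def dni(seq, dinucleotide):
    """Observed/expected ratio of a dinucleotide over all positions of seq."""
    seq = seq.upper().replace("U", "T")
    n_pairs = len(seq) - 1
    if n_pairs < 1:
        return None
    observed = sum(seq[i:i + 2] == dinucleotide for i in range(n_pairs)) / n_pairs
    expected = (seq.count(dinucleotide[0]) / len(seq)) * (seq.count(dinucleotide[1]) / len(seq))
    return observed / expected if expected else None


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists.

    Returns None when either list has zero variance.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)
```

with a list of sequences `seqs`, the quantity of interest would be `pearson_r([dni(s, "CG") for s in seqs], [dni(s, "TG") for s in seqs])`, after dropping any sequence for which either index is `None`.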
therefore, we propose that by avoiding cpg dinucleotides, ibdv might be able to minimize its antigenicity and avoid an undesirable host immune response. from a different perspective, we suggest the use of a cpg-based adjuvant in killed ibd vaccines, considering the virus' avoidance of cpg dinucleotides. it has been shown that cpg oligonucleotides could be a valuable adjuvant for poultry vaccines [ ]. thus, the potential use of cpg-based adjuvants in killed ibd vaccines may be of future research interest. classifying ibdv strains is indispensable for the control and prevention of ibd. apart from pathological and serological classification, ibdv has been grouped by its sequence characteristics [ , ], where each ibdv strain has its own characteristic restriction enzyme sites [ ] and molecular markers [ ]. the dinucleotide usage (or dinucleotide patterns) of ibdv was, however, unknown, even though many sequence analysis papers on the ibdv genome had been published. in coronaviruses, analysis of dinucleotide frequency separated the viruses into two groups that roughly reflected their taxonomic origins [ ]. thus, the current study investigated whether dinucleotide patterns differ among ibdv strains, and the practicality of the ''principal component analysis of the dinucleotide frequencies'' (df-pca) approach in studying them. dni was calculated for each of the types of dinucleotide pair. since dni is a relative measure of dinucleotide frequency, pca rather than correspondence analysis was used [ ]. the concepts and principles of pca are extensively described in most multivariate analysis textbooks and are not discussed here. all the datasets (hypervariable region, polyprotein and segment b) were analysed by the df-pca approach.
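for readers who want to see the core of the df-pca idea in code, the sketch below extracts the first principal axis of a table whose rows are isolates and whose columns are dni values. it is a simplified illustration, not the statistica or spss routine used in the study: the function name is hypothetical, the covariance matrix (rather than the correlation matrix) is assumed, and only the first axis is computed, by power iteration.

```python
from math import sqrt


def first_principal_component(rows, iterations=500):
    """First principal axis of a data table (rows = isolates, columns = DNI).

    Columns are mean-centred, the covariance matrix is built explicitly and
    its leading eigenvector is found by power iteration. Needs at least two
    rows. Returns (axis, share_of_total_variance, scores).
    """
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centred = [[r[j] - means[j] for j in range(m)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(m)]
           for i in range(m)]
    # deterministic start; fails only if it is orthogonal to the true axis
    axis = [1.0 / sqrt(m)] * m
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * axis[j] for j in range(m)) for i in range(m)]
        norm = sqrt(sum(v * v for v in nxt))
        if norm == 0:
            break  # no variance along the current vector
        axis = [v / norm for v in nxt]
    eigenvalue = sum(axis[i] * sum(cov[i][j] * axis[j] for j in range(m))
                     for i in range(m))
    total = sum(cov[i][i] for i in range(m))
    share = eigenvalue / total if total else 0.0
    scores = [sum(c[j] * axis[j] for j in range(m)) for c in centred]
    return axis, share, scores
```

the `scores` give the position of each isolate along the first axis and `share` the fraction of total variance that axis explains; a second axis could be obtained by removing the first component from the centred data and repeating the procedure.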
for the hypervariable region and polyprotein datasets, three outliers, namely the australian cvibdv ( / ) and serotype (oh, / ) isolates, were excluded because of their unique sequence characteristics. results of df-pca were depicted as a plot in which the axes represent the amount of ''extracted variation'' (fig. ). in fig. a, the first two axes accounted for % ( . % + . %) of the total variance; in other words, they explained % of the total variation observed in the dinucleotide patterns of the hypervariable region. noticeably, two distinct groups were separated along the first axis: a very virulent group on the left and an attenuated group on the right. the other strains lay between these two major groups. there was no clear separation between the classical and attenuated strains, probably because many attenuated isolates originated from the attenuation of classical isolates. the bold capital v and a were the okym (vvibdv) and okymt (attenuated form of okym) isolates, respectively [ ]. interestingly, there appeared to be a subtle ''right-shift'' of okym towards the attenuated strains after the attenuation process, but not to the extent of total separation from the vvibdv cluster. while the impact of attenuation on the dinucleotide patterns of ibdv remains to be investigated, the failure of okymt to fall within the atibdv cluster indicated that df-pca was in fact influenced by the evolutionary relationships of the virus. however, there was no evidence that the isolates situated at the extreme left were the ''most virulent'' vvibdv or that the isolate at the extreme right was the ''most attenuated'' atibdv. incorrectly classified isolates could be quickly detected on the df-pca graph by their odd positions. it was found that the classifications of the zj (genbank accession no. af ) and gz (af ) isolates were inappropriate. zj was reported as a highly virulent ibdv [ ] but its position in the graph ( fig.
a and b ) seemed to be more closely related to the attenuated or classical strain than to the vvibdv strain. to examine this problem closely, the sequence of zj was analysed. neither the important vvibdv markers ( ile, ile, and ile) [ ] nor the serine-rich heptapeptide virulence marker ''swsasgs'' [ ] were present in zj . in addition, zj had his and thr , which are more closely associated with the attenuated strain than with the virulent strain [ ]. for gz (''variant strain''), the hypervariable region sequence was found to be identical to that of another attenuated strain, gz (af ), and was located at exactly the same position in the map (the circle in fig. a marks the location of both gz and gz ). sequence analysis of both isolates, with reference to the molecular markers, found that gz was grouped correctly, whereas gz should be grouped as an attenuated strain. fig. b and c show the df-pca results for the polyprotein and segment b datasets. the first two axes of the polyprotein and segment b datasets explained about % and % of the total variation, respectively. we found that df-pca on hypervariable region sequences could yield results comparable to those from the longer polyprotein gene sequences. this is probably because the molecular determinants of virulence, cell tropism, and pathogenic phenotype of ibdv all fall within the hypervariable region [ ]. meanwhile, atypical isolates (upm / and k ) were located close to the vvibdv isolates, as shown in fig. b. this was understandable because the atypical strain is considered a subset of the vvibdv strain [ ]. the vp gene had an intricate dinucleotide pattern (fig. c) in which different ibdv serotypes and strains were intermingled on the graph. intriguingly, rather than forming an isolated cluster, the serotype isolates (oh and / ) were located near the cvibdv and atibdv isolates. in addition, the vvibdv isolates (sh and habin- ), the cvibdv isolate ( / ), and the atibdv isolate (il ) had unique dinucleotide patterns and did not belong to any significant cluster.
these findings disagreed with those of islam et al. [ ], in which the vp gene of vvibdv separated distinctly from those of other strains. perhaps this was because the number of sequences used in this study (n = ) was larger than in islam's (n = ). new vvibdv isolates such as sh (ay ) and habin- (af ) were not included in the previous study. furthermore, the df-pca approach differs from the phylogenetic approach: df-pca analyses the inter-relationships of the dinucleotide pairs, whereas the phylogenetic method (specifically the distance method) calculates evolutionary distances based on a chosen substitution (or evolutionary) model. the substitution model chosen by islam and co-workers in the construction of their vp phylogenetic tree was, however, not stated in their report. from a different viewpoint, it should be remembered that ibdv is a bisegmented virus, and whether the bewildering dinucleotide patterns of the vp gene are due to inter-strain gene reassortment remains to be investigated. df-pca was not intended as a substitute for the current strain classification methods, even though it has some ability to group the ibdv strains. in this study we used df-pca to demonstrate the unique characteristics of each ibdv strain through its dinucleotide frequency. df-pca analyses the delicate inter-relationships among the dinucleotide pairs and projects the results visually in the form of a graph or ''map''. the result of a df-pca analysis is not solely dependent on percentage sequence identity, although this is an important factor. for example, although okym shared . % sequence identity with both the f and zj isolates, zj was located far from okym in comparison with f (see fig. a).
although many underlying biological properties of df-pca remain to be investigated, we believe that the results of df-pca reflect the evolutionary history of the virus, considering that each dinucleotide pair is influenced by evolutionary forces (and thus constitutes part of the genomic signature). in phylogenetic analysis, particularly with clustering algorithms, evolutionary relationships are studied by grouping the taxa into various groups or clades. with regard to ibdv, these clades usually reflect the strain of the virus; for example, very virulent isolates are grouped together but not with the variant isolates. therefore, a taxon must be either in or out of a given clade. in contrast, with df-pca the inter-relationships among the ibdv isolates are displayed visually as ''points'' on the graph rather than as distinct clusters. thus, df-pca allows for shades of grey and may promote further insight into the evolutionary history of the virus. the virus genome is packed with information and it means everything for the survival of the virus. in this study we uncovered many genomic properties of ibdv by analysing its base usage and dinucleotide frequency. we envisage that a similar approach could be adopted to study the genes of other viruses in order to understand their fundamental properties.
virus taxonomy: seventh report of the international committee on the taxonomy of viruses base usage and dinucleotide frequency nd international congress/ th vam congress and cva-australasia/oceania regional symposium vet assoc malaysia fundamental of molecular evolution evolution of living organisms
this work was generously supported by irpa grant from the malaysian government.
key: cord- - bo tux authors: ibrahim, madiha salah; watanabe, yohei; ellakany, h. f.; yamagishi, aki; sapsutthipas, sompong; toyoda, tetsuya; abd el-hamied, h.
s.; ikuta, kazuyoshi title: host-specific genetic variation of highly pathogenic avian influenza viruses (h n ) date: - - journal: virus genes doi: . /s - - -y sha: doc_id: cord_uid: bo tux the complete genome sequences of two isolates, a/chicken/egypt/cl / (cl / ) and a/duck/egypt/d br / (d br / ), of highly pathogenic avian influenza virus (hpai) h n isolated at the beginning of outbreak in egypt were determined and compared with all egyptian hpai h n sequences available in genbank. sequence analysis utilizing rna from the original tissue homogenate showed amino acid substitutions in seven of the viral segments in both samples. interestingly, these changes differed between cl / and d br / when compared with other egyptian isolates. moreover, phylogenetic analysis showed independent sub-clustering of the two viruses within the egyptian sequences, signifying a possible differential adaptation in the two hosts. further, pre-amplification analysis of h n might be necessary for accurate data interpretation and identification of the distinct factor(s) influencing the evolution of the virus in different poultry species. during november and december of preceding year and january and february of succeeding year and decline towards warmer weather. human infections are mostly linked to the peaks of the cold-season avian outbreaks. this pattern was repeatedly detected in the following years up to , further posing eradication challenges because of such long-term endemicity. in this report, we analyzed samples collected between the peak and the beginning of its decline in . analysis of h n virus taken directly from the original tissue, without prior propagation, showed molecular differences in seven viral segments that were unique to each species and also different from those reported for viruses from other seasons. in this article, we directly sequenced and analyzed the complete genome sequences of h n virus from the tissues of two host species: chicken and duck.
tissue samples were collected separately from individual dead birds of birds total for each species. the samples were collected in january and march from house-reared ducks (damanhour, el behiera governorate; n = ) and a large-scale breeder chicken farm (alexandria governorate; n = , ), respectively. all of the ducks and chickens showed severe clinical signs of classical hpai h n [ , ] with high mortality rates. in order to minimize any possible variation due to laboratory passage in either embryonating chicken eggs (eces) or the mdck cell line, total rna was extracted directly from the original clinical materials from chicken and duck samples using trizol reagent (invitrogen, japan) according to the manufacturer's instructions. rt-pcr amplification of the entire genome was performed using sets of specific primers [ ]. the pcr products were separated on . % agarose gels, and the fragments of interest were isolated from the gel using a qiaquick gel extraction kit (qiagen, japan). the purified full-length dna fragments were cloned using the mighty ta cloning kit (takara, japan) according to the manufacturer's instructions. for individual segments, ten colony-purified plasmids were sequenced by capillary electrophoresis using the applied biosystems genetic analyzer (applied biosystems, usa). sequence data were assembled using genetyx (software development co, ltd, tokyo, japan) and bioedit [ ]. the genbank/embl/ddbj accession numbers for the sequences reported in this article are ab -ab , ab -ab , ab , and ab . sequence analysis showed > % identity within each set of species-derived sequences, so representative sequences from chicken- and duck-derived viruses, cl / and d br / , respectively, were selected for further analysis. analysis of the ha genes showed a high percentage of identity (> %) with other sequences from egypt, nigeria, and the middle east.
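percent identity between two aligned sequences, as quoted above, can be computed as in the following python sketch. it is an illustration only and not the genetyx or bioedit calculation: the function name is hypothetical, and columns containing a gap in either sequence are skipped, which is one of several conventions in use.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percentage of identical residues between two aligned sequences.

    Columns where either sequence has a gap are skipped. Raises ValueError
    if the two sequences are not the same length (i.e. not aligned).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        matches += a == b  # True counts as 1
    return 100.0 * matches / compared if compared else 0.0
```

because the treatment of gaps and of terminal overhangs changes the denominator, identity values from different programs are not always directly comparable.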
several nucleotide changes (table ) were detected between cl / and d br / as well as relative to the genbank reference sequences from egypt. these changes were reflected in the amino acid sequences, with three substitutions in cl / and two in d br / . further, the two amino acid substitutions in d br / , asn in ha and ser in ha (h numbering), differed from those in cl / and the reference sequences for but were the same as in a/bar-headed-goose/qinghai/ / (genbank accession no. dq ). for and , the cl / - and d br / -specific amino acids were detected in sequences available in genbank, but they could not be correlated with species because of the low number of duck reference sequences, except the lys (h table ). the sequence of multiple basic amino acids at the cleavage site that is characteristic of highly pathogenic viruses, gerrrkkrrg, was detected in the ha of both cl / and d br / . further, the - -neuacgal avian receptor binding preference was maintained in both viruses, which expressed gln and gly (h numbering) [ , ]. in addition, the lys in the ha receptor binding site found in all clade . viruses [ ] was also detected in our viruses. the alignment of the na sequences revealed further differences between cl / and d br / (table ): cl / had one amino acid substitution while d br / had two, even though the former had nine nucleotide mutations versus four in the latter. unlike in the ha, these amino acid changes were not detected in any of the egyptian sequences available in genbank from to . moreover, the amino acid deletion in the stalk of the na protein that is dominant in genotype z [ ], from residue to , resulting in the loss of an n-linked glycosylation site upstream of the deletion, was detected in both cl / and d br / . phylogenetic analysis was performed using the mega software [ ], employing the neighbor-joining method on the basis of the full nucleotide sequences of the whole genome. estimates of phylogenies were calculated by performing bootstrap replicates.
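the bootstrap step mentioned above rests on resampling alignment columns with replacement; each resampled alignment is then used to rebuild a tree. the sketch below shows only the resampling, as an illustration rather than the mega implementation; the function name is hypothetical and the tree-building step is deliberately left out.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate of a multiple sequence alignment.

    alignment: {name: aligned_sequence}, all sequences of equal length.
    rng: a random.Random instance, so that replicates are reproducible.
    Columns are drawn with replacement and the same columns are used for
    every sequence, which keeps the alignment consistent.
    """
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}
```

the support for a branch is the proportion of replicate trees in which that branch reappears, so the number of replicates controls how finely the support values are resolved.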
phylogenetic analysis of the ha gene ( fig. ) and na (data not shown) showed that both isolates belong to clade . [ ] together with other egyptian isolates as well as the isolates from nigeria and middle east. although, cl / and d br / subclustered far from each other, they were close to the and derived egyptian sequences indicating that they have originated from endemic viruses circulating the same year with the duck viruses closer to their ancestors and sub-clustering independently. further, the cl / subclustered together with a/chicken/egypt/c br / (genbank accession no. ab ), which was directly sequenced from the clinical materials without prior amplification. the -and -derived sequences were quite far from our sequences indicating that the virus is under continuous genetic evolution in the country. pairwise sequence comparison further revealed several nucleotide changes in the np, m, and ns genes (table ) . further, the amino acid substitutions in the transmembrane region of m protein that are known as a key point for drug resistance [ ] were not detected in cl / or d br / . in addition, there was no amino acid changes associated with the amantadine or rimantadine resistance, and all the amino acids were avian-specific except for the val ile human signature in the np of both viruses [ , , ] . in addition, both isolates contained the glu mutation in the ns protein, which is a major contributor to virulence of h n viruses [ ] . furthermore, the phylogenetic analysis confirmed the separate sub-clustering of the cl / and d br / , except for the ns gene (data not shown). further sequence comparison of the genes encoding the polymerase complex, pb , pb , and pa, revealed a number of differences between cl / and d br / , which were further different from those of the genbank egyptian sequences (table ). in contrast to cl / , the d br / -derived sequences shared several nucleotides and the leu ser in the pb -f with the two human isolates from egypt (tables ). 
the glu lys mutation in pb , which is characteristic of human viruses and is associated with increased pathogenicity and host range [ , ] , was detected in both viruses. moreover, phylogenetic analysis of the polymerase genes also showed separate subclustering (data not shown). in addition, the polymerase complex appeared to be very distinct from the two nigerian lineages so and ba [ ] , but more closely related to the middle east isolates (data not shown). in our analysis, comparative genetic characterization of the eight rna segments of cl / and d br / together with - genbank egyptian reference sequences showed that both viruses had nucleotide as well as amino acid differences that appeared to be specific for each, except for the m gene, which was highly conserved. the na gene appeared to be uniquely maintained, in that the cl / and d br / -specific mutations were not detected in till egyptian reference sequences. moreover, was linked to more human infections ( cases) compared to ( cases) and ( cases) [ ] , indicating a possible host-dependent molecular adaptation and/or evolution of h n in for which the host is not yet disclosed. an increase in human infections was detected in ( cases) even though reassortment or new virus entry has not been reported yet for egypt; however, this could be a consequence of the declared endemicity of the virus in the country. on the contrary, only cases were reported for [ ] , reflecting a need to unravel the possible transfer host. even though cl / and d br / were derived from different cities, they clustered with other egyptian sequences, indicating a single origin with possibly different molecular evolution. this further supports the view that, in contrast to nigeria and as reported by cattoli et al. [ ] , egypt seems to have had a single entry of the virus, which appears to have happened in early or late . 
the maintenance of two different viruses in two different species may increase the threat to humans, especially where direct contact with different avian species is common. however, d br / carried more amino acid mutations and shared several nucleotides in the polymerase genes with the human isolate human/egypt/ / , as well as the human signature leu ser in pb -f with the isolate human/egypt/ / . this may point to the possibility that ducks could serve as a viral disseminating and/or amplifying host, being closer to the ancestors, and possibly as the potential source of human infection. ducks have been shown to have a central role in the generation and maintenance of h n viruses in china [ ] , while for egypt, duck-derived sequences are scarce compared to chicken- or human-derived ones even though ducks are extensively reared and consumed, mainly in the countryside. considering the diversity of duck susceptibility to hpai h n , an understanding of the role of ducks in the emergence and maintenance of these viruses and in viral spread to other poultry species and humans is required. the differences in the amino acids detected in our sequences relative to the genbank - available egyptian sequences could result from using the original clinical materials directly for sequence analysis without prior amplification in vivo (ece) or in vitro (mdck cells). recently, le et al. [ ] showed that the pb gene population of h n virus grown in ece or on mdck did not reflect that of the original. taken together, this seems likely to have occurred in our analysis, where mostly ece-derived viruses were used for the sequence analysis of genbank reference viruses. further, it may indicate that viral variants harbored inside the infected host may differ from those shed outside. moreover, such species-associated changes could have been independently selected during replication in individual birds and/or individual tissue (y. watanabe and m.s. 
ibrahim, unpublished results) reflecting a possible differential evolution within individual species. human infections are mainly linked to a previous contact with an unknown dead host. thus, accurate molecular analysis of h n gene assemblage in different avian hosts, mainly ducks as well as human would improve the detection of the host-associated changes having the potential for viral spread within the human population and identifying the source of infection and the mysterious host behind that. furthermore, our findings highlight an essential need for using the original clinical material as a source for viral sequence analysis to accurately understand the molecular evolution of h n in individual hosts and also to identify sequence changes that may facilitate cross species infection. fig. phylogenetic tree of the hemagglutinin (ha) segments of cl / and d br / , and other genbank egyptian reference viruses. the neighbor-joining trees based on the full-length nucleotides were generated with mega with bootstrap value. bootstrap values over % are shown at the tree nodes. the chicken and duck sequences are indicated in bold and underlined. the arrow head points to a/chicken/egypt/c br / that represents directly sequenced viral rna without prior amplification. trees are rooted to the a/chicken/egypt/r / that was isolated in december . 
the scale bar represents the distance unit between sequence pairs b virus genes ( ) : - avian influenza virus (h n ) outbreaks, kuwait epidemiological findings of outbreaks of disease caused by highly pathogenic h n avian influenza virus in poultry in egypt during characterization of an avian influenza virus h n egyptian isolate highly pathogenic avian influenza virus subtype h n in africa: a comprehensive phylogenetic analysis and molecular characterization of isolates genomic signature of human versus avian influenza a viruses properties and dissemination of h n viruses isolated during an influenza outbreak in migratory waterfowl in western china establishment of multiple sub-lineages of h n influenza virus in asia-implications for pandemic control molecular and antigenic evolution and geographical spread of h n highly pathogenic avian influenza viruses in western x-ray structures of h avian and h swine influenza virus hemagglutinins bound to avian and human receptors analogs bioedit: a user-friendly biological sequence alignment editor and analysis program for windows / /nt universal primer set for the full-length amplification of all influenza a viruses pathology, molecular biology, and pathogenesis of avian influenza a (h n ) infection in humans selection of h n influenza virus pb during replication in humans genesis of a highly pathogenic and potentially pandemic h n influenza virus in eastern asia molecular characterization of the hemagglutinin and neuraminidase genes of h n influenza a viruses isolated from poultry in vietnam from oie, update on highly pathogenic avian influenza in animals (type h and h ) a single amino acid in the pb gene of influenza a virus is a determinant of host range emergence of amantadine-resistant influenza a viruses: epidemiological study mega : molecular evolutionary genetic analysis (mega) software version . 
who, continuing progress towards a unified nomenclature system for the highly pathogenic h n avian influenza viruses who, cumulative number of confirmed human cases of avian influenza a/(h n ) reported to who acknowledgments this study was supported by the jsps postdoctoral fellowship for foreign researchers and the grant-in-aid (scientific research (b) (overseas academic research)), japanese society for the promotion of science, japan.open access this article is distributed under the terms of the creative commons attribution noncommercial license which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited. key: cord- -ca uthvd authors: jeoung, hye-young; lim, ji-ae; jeong, wooseog; oem, jae-ku; an, dong-jun title: three clusters of bovine kobuvirus isolated in korea, – date: - - journal: virus genes doi: . /s - - - sha: doc_id: cord_uid: ca uthvd fecal samples (n = ) were collected from cattle with ascertained or suspected diarrheal disease on korean farms during – . of these, samples tested positive for bovine kobuvirus. the positive samples came from cattle that exhibited diarrhea and five cattle that were non-diarrhetic. the majority of the virus-positive feces samples were from calves under month of age (n = ). nine of the cattle infected with bovine kobuvirus were confirmed to have a co-infection with other viruses including bovine rotavirus (n = ), bovine coronavirus (n = ), bovine viral diarrhea virus (n = ), and both bovine coronavirus and bovine viral diarrhea virus (n = ). a neighbor-joining tree grouped of the korean kobuvirus strains (with the exception of the kb strain) into three clusters (g , g , and g ), while strains derived from thailand and japan (except the u strain) were included in the g cluster. the results indicated that korean bovine kobuvirus has diverse lineages regardless of disease status and species. 
electronic supplementary material: the online version of this article (doi: . /s - - - ) contains supplementary material, which is available to authorized users. gyongbuk (n = ), and gyongnam (n = ). viral rna was extracted from feces using trizol ls b according to the manufacturer's instructions (invitrogen, carlsbad, ca, usa). bovine kobuvirus was detected in fecal samples using reverse transcription-polymerase chain reaction (rt-pcr) as previously described [ ] . oligonucleotide primers were designed based on the genome sequence of the u- strain (accession no. ab ) and have the following sequences: u- f (sense, -catgctcctcggtggtctca- ; nt , ) and u- r (antisense, -gtccgggtccatcacagggt- ; nt , ). together, these primers amplify a -bp region of the d protein. pcr products of size bp were visualized by electrophoresis and were cloned using the pgem-t vector system ii (promega, madison, wi, usa). the cloned genes (three per sample) were sequenced, using t and sp promoter-specific primers, with an abi prism ® xi dna sequencer (applied biosystems, foster city, ca, usa) at the macrogen institute (macrogen, seoul, korea). to investigate the relationship between kobuvirus and other bovine viruses that cause diarrhea in cattle, a screening test was conducted using primers specific for the detection of bovine rotavirus (brv) [ ] , bovine coronavirus (bcv) [ ] , and bovine viral diarrhea virus (bvdv) [ ] , as previously described. reverse transcription of the extracted rna was performed using a cdna synthesis kit (takara) and random hexanucleotide primers. the rt-pcr was run according to the following temperature-time profile: °c for min, then °c for min, followed by cycles of virus-specific conditions, as follows: brv: °c for s, °c for min and °c for min; bcv: °c for min, °c for min and °c for min; and bvdv: °c for min, °c for min and °c for min. for all viruses, the denaturation-annealing-extension cycles were followed by a final extension at °c for min. 
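for the primer pair described above, the antisense primer anneals to the sense strand, so its binding site on the published genome sequence is its reverse complement. the python sketch below illustrates that conversion. it assumes the spaces inside the primer strings in the extracted text are line-break artifacts and joins them; the function name is illustrative.

```python
# Translation table for complementing lowercase and uppercase DNA bases.
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Primer sequences as given in the text, with extraction spaces removed (assumption).
sense_primer = "catgctcctcggtggtctca"
antisense_primer = "gtccgggtccatcacagggt"

# Sequence to search for on the sense strand when locating the antisense primer site.
target_site = reverse_complement(antisense_primer)
print(target_site)
```

searching a reference genome for `sense_primer` and `target_site` on the same strand gives the two ends of the expected amplicon.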
the resulting amplicon sizes were bp for brv, bp for bcv, and bp for bvdv. the sizes were assessed by % agarose gel electrophoresis and confirmed through the sequencing and analysis of the nucleotide sequence of each amplicon. the nucleotide sequences of the korean bovine kobuviruses were compared to those of kobuvirus reference strains in the genbank database by blast. the nucleotide sequences were aligned using the clustal w . x program [ ] and.aln files were generated. the.aln files were then converted to.meg files using mega [ ] and a neighbor-joining tree was constructed (bootstrap replicates = , ) using the kimura parameter method for pairwise deletion at uniform rates. the nucleotide sequences of korean bovine kobuvirus strains were deposited in genbank as accession numbers hq -hq (table ) . in prior studies, bovine kobuvirus was detected in of ( . %) stool samples in japan [ ] , of ( . %) fecal samples in thailand [ ] , and of ( . %) fecal samples in hungary [ ] . korean bovine kobuvirus was markedly more prevalent, being detected in of the ( . %) fecal samples. furthermore, the yearly frequency of the korean positive samples was constant: n = in both and , and n = in (table ) . thirty-two of the diarrhea samples ( . %) contained kobuvirus, compared with out of non-diarrhea samples ( . %). however, this result cannot be taken as evidence of a causal relationship between kobuvirus infection and diarrhea, and such a causal relationship has been questioned in previous analyses [ , [ ] [ ] [ ] . infection by kobuvirus occurred in . % ( of ) of the korean native cattle and . % ( of ) of the holstein cattle. this result indicated that kobuvirus infection is not restricted to a single cattle species. regarding the age of infected cattle, this study clearly showed a predominance of infection in calves under the age of month (n = ), a result similar to that of a previous study [ ] . kobuvirus prevalence by geographic region was . 
% ( / ) in chungnam, % ( / ) in gyongbuk, . % ( / ) in gyonggi, . % ( / ) in chungbuk, % ( / ) in gangwon, and % ( / ) in gyongnam. in spite of these seemingly substantial differences in prevalence, the low numbers of samples did not provide sufficient statistical power to allow any conclusions regarding geographic predilection. the geographic differences can therefore be considered tenuous at this point, requiring further study with larger sample sizes. the clinical significance of any such differences remains unclear. combined infection with bovine kobuvirus and other viruses was observed in nine cattle: brv (n = ), bcv (n = ), bvdv (n = ), and bcv + bvdv (n = ). however, it is unclear whether the other viruses are directly associated with the kobuvirus infection. neighbor-joining analysis revealed that partial nucleotide sequences ( bp in length) of the d genes of bovine kobuvirus ( from korea, from japan, and from thailand), along with that of the aichi virus (as the outgroup), fell into four main lineages (g , g , g , and g ). with the exception of the u and kb strains, all of the sequences fell into one of these four lineages (fig. ) . the four lineages were supported by high bootstrap values ( - %) at the node of each branch. interestingly, the korean kobuvirus strains formed three lineages (g , g , and g ), while the japanese and six thailand strains all fell within the g lineage (fig. ) . a future analysis using a larger number of strains may be required to determine whether the u and kb strains represent the first recognized strains of one or two additional clusters. in conclusion, the findings of this study demonstrate the existence of four phylogenetic lineages of bovine kobuvirus. korean kobuvirus strains are found in three of the four lineages, with japanese and thailand strains being clustered together in the other lineage. virus taxonomy, th report of the ictv acknowledgment the authors are grateful to ms. bo-hye shin and ms. 
hyen-jung kim for their technical assistance. key: cord- -s qsjkj authors: chouljenko, vladimir n.; kousoulas, konstantin g.; lin, xiaoqing; storz, johannes title: nucleotide and predicted amino acid sequences of all genes encoded by the ′ genomic portion ( . kb) of respiratory bovine coronaviruses and comparisons among respiratory and enteric coronaviruses date: journal: virus genes doi: . /a: sha: doc_id: cord_uid: s qsjkj the ′-ends of the genomes ( bp) of two wild-type respiratory bovine coronavirus (rbcv) isolates lsu and ok were obtained by cdna sequencing. in addition, the ′-end of the genome ( ) of the wild-type enteric bovine coronavirus (ebcv) strain ly- was assembled from available sequences and by cdna sequencing of unknown genomic regions. comparative analyses of rbcv and ebcv nucleotide and deduced amino acid sequences revealed that rbcv-specific nucleotide and amino acid differences were disproportionally concentrated within the s gene and the genomic region between the s and e genes. comparisons among virulent and avirulent bcv strains revealed that virulence-specific nucleotide and amino acid changes were located within the s and e genes, and the kda open reading frame. coronaviruses are important etiological agents of human and animal diseases including respiratory infection, gastroenteritis, hepatic and neurological disorders as well as immune-mediated disease such as feline infectious peritonitis, and other persistent infections ( , ) . enteric bovine coronaviruses (ebcv) are generally associated with enteric disease of newborn calves and winter dysentery of adult cattle ( ) . recently, numerous respiratory bovine coronaviruses (rbcv) were isolated in our laboratory from cattle arriving with fever and respiratory disease in feedlots or livestock shows of different states in the usa. the cytopathogenic, cell fusion, and other phenotypic properties of these viruses were different from the known ebcv ( ). 
coronaviruses contain a single stranded, capped, and polyadenylated positive-sense (infectious) rna molecule of approximately kb length, which directs the synthesis of a nested set of subgenomic mrnas ( , ) . the h -end of the genomic rna consists of approximately . kb and contains the spike (s) glycoprotein, the hemagglutinin-esterase (he) glycoprotein, the integral membrane (m) protein, the small membrane protein (e) and the phosphorylated nucleocapsid (n) protein and a number of orfs potentially encoding non-structural proteins (n s ) ( ) . the kda non-structural protein is a phosphoprotein that accumulates in the cytoplasm of infected cells ( , ) . it is not known whether the . and . kda orfs are expressed in infected cells, while the . kda putative protein, most likely, is not translated ( ) . bcv uses n-acetyl- -o acetyl neuraminic acid as receptor determinant to initiate infection ( ) . although the he glycoprotein also has an affinity for -o-acetylated sialic acid, the s glycoprotein was identified as the major sialic acid binding protein of bcv ( ) . the s glycoprotein facilitates viral attachment to susceptible cells, causes cell fusion after cell-surface expression (fusion from within), and induces viral infectivity neutralizing antibodies ( ). porcine transmissible gastroenteritis virus (tgev) strains were isolated which exhibited respiratory tissue tropism. these viruses contained point mutations or deletions within the first aa of tgev s which were associated with reduced enteropathogenicity and loss of hemagglutinating activity ( - ). to examine the genetic basis for the phenotypic differences between rbcv and ebcv, we cloned and sequenced the h -end of the viral genomes of two virus strains rbcv-lsu- lss- - (lsu) and rbcv-ok- - (ok) that originated from louisiana and oklahoma cattle, respectively. we report here, the nucleotide and predicted amino acid sequences of all genes encoded by the h genomic portion ( . 
kb) of two wild-type rbcv strains and comparisons among respiratory and enteric coronaviruses. viruses and cell line. all rbcv and ebcv strains were propagated in the g clone of human rectal tumor cells (hrt- g) developed recently through selection and medium modulation ( ) . supernatant fluids from infected hrt- g cells were collected and viruses were purified as described ( ) . rbcv ok and lsu virus stocks were tested at the third and fourth passages, respectively. ebcv ly- (ly) virus stocks were prepared at the second passage, while the ebcv-l virus strain, derived from the ebcv-mebus strain, had been propagated times in cell cultures. strategy for cdna construction and assembly of the . kb cdna sequence representing the h -end of different bcv strains. tri reagent from the molecular research center, inc. (cincinnati, oh, usa), was used for total rna extraction. ready-to-go you-prime first-strand beads from pharmacia biotech inc. (uppsala, sweden) were used for cdna library construction. all amplifications were performed using the gene-amp pcr system (perkin-elmer, norwalk, ct, usa) with pcr reagents and amplitaq from perkin-elmer. the ta cloning kit from invitrogen inc. (san diego, ca) was used for cloning of rt-pcr products. restriction enzymes were obtained from new england biolabs (beverly, ma, usa). the h genomic end of bcv mebus strain consisting of nucleotides was assembled from available sequences deposited in genbank. the accession numbers of the cdna sequences used to assemble the ebcv mebus genome were m for s and he genes, m for the . , . , . and . kda (e) orfs, and m for the m, n genes and i orf. the assembled mebus genomic sequence did not contain the kda orf. an nt sequence containing the kda orf of bcv-quebec (accession number x ) was used for comparisons with other bcv. the s and he cdna sequences specified by ly- were previously reported ( , ) . 
a series of overlapping cdna clones representing the entire h -end of two rbcv isolates and unpublished sequences of ly- were constructed. two cdna libraries were produced, a library was made using the bcv h primer representing the h terminus of the genomic rna, and a second cdna library was produced using an oligonucleotide ( b ) to prime cdnas starting at nucleotide (counting from the h -end of the viral genome) (fig. ) . the entire nt sequence representing the h -end of the bcv genome was divided into six overlapping cdna regions. each cdna was amplified by pcr using specific primer pairs. primer pair f /bcv h amplified a cdna fragment containing the m and n genes. primer pair f / b amplified a cdna fragment containing the h -end of s, . kda, . kda, . kda, e, m and the h -end of n genes. primer pair b h /b h amplified the h -end of the spike gene, primer pair f /a h amplified a cdna that coded for the carboxy-terminal portion of the s subunit, primer pair a h / b amplified a cdna fragment that coded for the amino-terminus of s, and primer pair f / b amplified the kda and he genes. b h and a h primers were designed to contain an extra bamhi and ecori sites for cloning purposes, while a bstxi site was naturally present in the f primer. the actual primer sequences are: dna sequencing and analyses. dna sequencing was carried out with the modified dideoxynucleotide chain termination procedure ( ) overall comparisons of genes and predicted proteins specified by rbcv (lsu, ok) and ebcv (ly, mebus) strains. to establish the close evolutionary relationship between lsu and ok strains and to ascertain rbcv-specific amino acid changes (conserved in lsu and ok but different in other strains), a pairwise comparison of nucleotide and amino acid differences among bcv strains for all orfs, except for the orf coding for the rna-dependent-rna-polymerase and the kda protein was performed (table ). 
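the definition of an rbcv-specific change used above (conserved in lsu and ok but different in the other strains) can be written as a simple column scan over aligned protein sequences. the python sketch below is one possible reading of that rule, not the authors' procedure; the function name and the short toy sequences are hypothetical.

```python
def group_specific_positions(
    group: dict[str, str], others: dict[str, str]
) -> list[tuple[int, str, str]]:
    """Find positions where the group is conserved and no other strain shares its residue.

    Returns (1-based position, group residue, residues seen in the other strains).
    """
    seqs = list(group.values()) + list(others.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same aligned length")
    hits = []
    for i in range(length):
        in_group = {s[i] for s in group.values()}
        outside = {s[i] for s in others.values()}
        # Conserved within the group and absent from every other strain.
        if len(in_group) == 1 and not (in_group & outside):
            hits.append((i + 1, in_group.pop(), "".join(sorted(outside))))
    return hits


# Hypothetical aligned fragments, for illustration only.
rbcv = {"LSU": "MFLILAVS", "OK": "MFLILAVS"}
ebcv = {"LY": "MFLILTVT", "Mebus": "MFLIPTVS"}
print(group_specific_positions(rbcv, ebcv))
```

in the toy data only position 6 qualifies; positions 5 and 8 differ in a single enteric strain and would count as strain-specific rather than rbcv-specific.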
in general, the nucleotide and amino acid sequences of rbcv strains lsu and ok were more conserved to each other than to ebcv strains ly- and mebus, and they were more divergent to the mebus strain than to the ly- . specifically, the amino acid sequence of m specified by lsu and ok were identical, while they were different by one, and two aa from that of mebus and ly- , respectively. the s glycoprotein specified by lsu differed by only amino acids from that of ok, while s glycoproteins of lsu and ok differed by and amino acids in comparison with the ly and mebus s sequences, respectively. furthermore, lsu and ok sequences of the n and i orf (located within n) were more conserved to each other than to any other strain compared. most amino acid changes within he, . , . , . kda orfs, e, n and the i orf were strain-specific. he and m contained one rbcv-specific aa change, and n and i orf contained two rbcv-specific aa changes each. rbcv-specific amino acid substitutions within s. the s subunit contained most of the rbcv-specific aa substitutions and included an amino acid change within the signal sequence as well as two clusters of amino acid substitutions within the amino-terminus and the hypervariable region (fig. ) . the proteolytic cleavage site that separates s and s subunits was conserved among rbcv, ly- and mebus strains. in contrast to the high number of rbcv-specific substitutions within s , s contained only two rbcv-specific amino acid changes, an ala to ser change immediately adjacent to the proteolytic cleavage site and an asp to gly located within the heptad repeat sequence. the rbcv-g strain was isolated from a nasal sample of a calf that had diarrhea and signs of respiratory distress ( ) . the nucleotide and predicted primary structures of s and he glycoproteins specified by rbcv-g were reported previously ( , ) . 
lsu and ok had ten unique aa substitutions within s in comparison to all other bcv strains, while g , lsu, and ok shared only three aa substitutions at aa , aa , and aa (fig. ) . rbcv-specific nucleotide and amino acid substitutions within the . , . , and . kda orfs. the human coronavirus strain oc (hcv-oc ) lacks two orfs which potentially encode two nonstructural proteins of . and . kda ( ) . furthermore, the same genomic areas are deleted in three hemagglutinating encephalomyelitis virus (hev) strains of swine ( ) . the fact that respiratory hcv-oc and ebcv strains show remarkable genomic and protein similarities as well as immunological cross-reactivities, prompted us to compare the nucleotide sequences specified by the genomic region between the s and the . kda orf of ebcv, rbcv, hcv-oc , and three porcine hev strains (fig. ) . identical aa changes were also found in the kda protein of two more rbcv strains isolated from texas and arizona cattle (data not shown). genetic comparisons among different bcvs revealed substantial differences between rbcv and ebcv strains principally within the s gene and within orfs located between the s and e genes. furthermore, genetic differences between virulent and avirulent strains were identified within the s gene, the e gene and the kda orf. the salient features of genetic differences between rbcv and ebcv strains are discussed below: rbcv-specific genetic alterations in the s gene. a pairwise alignment of tgev and mhv s aa sequences revealed that the n-terminal portion of s which is deleted in the porcine respiratory coronaviruses (prcv) and hcv- e, in comparison with tgev, is the region corresponding to the mhv receptor binding-site (aa - ) ( ) . the tgev receptor-binding site is in a different location (aa - ) and aligns with the s polymorphic region of the mhv strain. recently, it was shown that only two aa changes at the n-terminus of tgev s resulted in the loss of enteric tropism ( ) . 
the s amino terminus specified by rbcv strains lsu and ok contained aa changes at aa , aa , aa , aa and aa which may affect s -mediated receptor binding. hemagglutination of chicken red blood cells (rbc) was shown to be mediated by the s glycoprotein, because purified s of the ebcv mebus strain agglutinated chicken rbc, while purified he did not ( , ) . rbcv strains lsu and ok agglutinated mouse and rat, but not adult chicken rbc ( ). therefore, aa changes within s specified by rbcv may be responsible for the inability to hemagglutinate chicken rbc. the s a virus neutralizing (vn) immunoreactive epitope (aa - ) ( ) was identical for all viral strains, except for a single aa change at aa specified by the avirulent, cell culture-adapted strain ebcv-l (fig. ) . furthermore, the s a epitopes of hcv-oc and the bcv mebus strain were identical ( ) . monoclonal antibodies (mabs) against the ebcv mebus cross-reacted with different animal and human coronaviruses ( ) . therefore, it is likely that these antibodies react with the s a epitope. the hypervariable region of the s glycoprotein contains the s b immunoreactive epitope which is the target for virus neutralizing mabs ( ) . four rbcv-specific aa substitutions at aa , aa , aa and aa were located within or proximal to this epitope. based on the observed aa changes, it can be predicted that mabs specific for this region may be able to distinguish between respiratory, enteric and vaccine bcv strains. the bcv s subunit of s induced cell fusion when it was expressed in insect cells, indicating that s contained membrane fusion domains ( ) . the hydrophobic and heptad repeat regions of s are believed to form the coiled-coil structure of the oligomeric s protein that have been associated with fusion activity. specifically, three aa changes within a predicted heptad region of the mhv s subunit were shown to be responsible for ph-dependent cell fusion ( ) . rbcv strains lsu and ok are highly fusogenic in cell culture (data not shown). 
additional experimentation is required to assess whether the aa change of ala to ser immediately after the proteolytic cleavage site and the aa change of asp to gly within the heptad repeat are responsible for the extensive cell fusion induced by rbcv. rbcv-specific genetic alterations between the s and e genes. the rbcv genomic regions between the s and e genes contained many nucleotide substitutions, deletions and insertions. hcv-oc and three porcine hev strains specified deletions within the . and . kda orfs, indicating that they are not essential for virus replication. similarly, the high number of mutations within the rbcv . and . kda orfs suggests that these orfs are not essential for virus replication in cell culture ( table ) . the nt leader of a cloned bcv defective interfering (di) rna when mapped by mutations, could be converted rapidly to the wild-type leader of a helper virus following di rna transfection into helper virus-infected cells ( ) . nucleotide substitutions mapped the crossover region to a -nucleotide segment that starts from the last nt of the leader-mrna junction sequence and extends further downstream. fig. . comparison of the predicted amino acid sequences of the bcv s glycoprotein specified by different strains. amino acids that are different in at least one strain are shown, except aa , aa and aa which are included as reference points. * indicates unique amino acid changes for each strain. boxed amino acids are common among different strains. light-gray boxes contain rbcv-specific, dark boxes contain virulent-specific, and clear boxes contain ebcv-specific aa changes. aa - is the putative signal peptide; aa - is the s a immunoreactive domain; aa - is the s b immunoreactive domain; aa - is the hypervariable region; aa - is the hydrophobic region; aa - is the heptad repeat sequence; aa - is the carboxy-terminal anchor sequence. 
the rbcv isolates lsu, ok, as well as rbcv az- - (az) and tx- - (tx) isolated from california and texas cattle, respectively (data not shown), contained a four nucleotide deletion located within this nucleotide segment (fig. ) . this deletion may alter the recombination frequency between the leader and the leader-mrna junction sequence immediately upstream of the . kda subgenomic mrna, and cause either inhibition or enhancement of the putative . kda transcription and subsequent protein expression.

genetic differences between virulent and avirulent bcv strains. the s glycoprotein contained aa substitutions which were common for all virulent strains (aa , , , , , and ). three mutations within the si portion of s caused conservative aa changes, while one non-conservative aa change of his to asp was located within the s hypervariable region. all three mutations within s caused non-conservative amino acid changes. amino acid changes within s and s may affect the structure and function of the s glycoprotein and alter the pathogenetic potential of these viruses. the kda orf specified by rbcv virulent strains lsu and ok as well as ebcv ly- contained two frame-shift mutations which resulted in a aa segment near the carboxy-terminus which was different from the corresponding amino acid sequence specified by the avirulent ebcv-quebec strain (fig. ) . a similar double frame-shift mutation was found in the kda orf specified by hcv-oc ( ) . these frame-shift mutations substantially increased the hydrophilicity of the carboxy-terminal portion of the kda protein in virulent versus avirulent strains (data not shown). sequencing of the corresponding region of the kda protein of the avirulent ebcv l and mebus strains as well as the kda of other virulent strains will be needed to substantiate these differences between virulent and avirulent strains. 
the aa substitution of gly to val in the e protein was conserved for all bcv virulent strains (f , ly- , ok, lsu), hcv-oc ( ) , and three different porcine hev strains ( ) . this mutation may affect the ability of these viruses to invade different tissues, because e is part of the virion and is expressed at infected cell-surfaces ( , ) .

we acknowledge the technical assistance of mamie burrell with cell culture and virus propagation, and galina rybachuck with sequence analysis and design of figures. this work was supported in part by usda grant - - of the national research initiative program to j.s. and k.g.k., louisiana educational support fund (leqsf) grant xrf/ - -rd-b- to j.s. and k.g.k., leqsf grant rd- - -rd-b- to k.g.k., and a grant from immtech biologics, inc., bucyrus, ks. we are indebted for support by the lsu school of veterinary medicine. this publication is identified as genelab publication #gl .

key: cord- -gmjnbnx authors: yang, limin; li, jing; bi, yuhai; xu, lei; liu, wenjun title: development and application of a reverse transcription loop-mediated isothermal amplification method for rapid detection of duck hepatitis a virus type date: - - journal: virus genes doi: . /s - - - sha: doc_id: cord_uid: gmjnbnx

we developed and evaluated a reverse transcription loop-mediated isothermal amplification (rt-lamp) assay for detecting duck hepatitis a virus type (dhav- ). the amplification could be finished in h under isothermal conditions at °c by employing a set of four primers targeting the c gene of dhav- . the rt-lamp assay showed higher sensitivity than the rt-pcr with a detection limit of . eld( ) . ml(− ) of dhav- . 
the rt-lamp assay was highly specific; no cross-reactivity was observed with samples of other related viruses, bacteria, allantoic fluid of normal chicken embryos, or the livers of uninfected ducks. thirty clinical samples were subjected to detection by rt-lamp, rt-pcr, and virus isolation, and the three methods gave completely consistent, positive results. as a simple, rapid, and accurate detection method, this rt-lamp assay has important potential applications in the clinical diagnosis of dhav- .

duck hepatitis virus type (dhv- ), a member of the family picornaviridae and genus avihepatovirus, is a single-stranded rna virus that causes an acute, highly lethal disease in young ducklings called duck hepatitis. duck hepatitis leads to severe economic losses for duck-raising farms. duck hepatitis virus includes three serotypes: dhv- , dhv- , and dhv- . dhv- is distributed widely, while dhv- and dhv- have only been reported in the uk and the usa, respectively [ ] [ ] [ ] [ ] [ ] [ ] [ ] . duck hepatitis caused by dhv- can lead to mortality up to % in young ducklings during the first week of life, thus accurate and efficient diagnosis is extremely useful to control the initial disease outbreak [ ] . recently, dhv- was renamed dhav, and dhav has three genotypes (dhav- , , and ) [ ] [ ] [ ] [ ] . dhav- is widely distributed and prevalent in china; dhav- has so far been isolated only in taiwan [ , ] , while dhav- was first isolated in south korea [ ] but is now also epidemic in mainland china. the traditional detection methods, including virus isolation and neutralization tests, are generally reliable for the diagnosis of dhv- [ ] , but they are labor intensive and time consuming, and their sensitivity is insufficient to detect extremely low viral loads. 
to address this, a virus antigen-based elisa was first established in [ ] , then a recombinant vp protein-based elisa was developed, which showed agreement with the neutralization test [ ] . nucleic acid-based assays such as rt-pcr, real-time rt-pcr, and real-time quantitative pcr were developed and showed high specificity and sensitivity [ , , [ ] [ ] [ ] . however, these assays need specialized and expensive equipment such as a thermal cycler or real-time pcr system, thus they are of limited application in rural areas. the loop-mediated isothermal amplification (lamp) assay, developed in [ ] , is a novel nucleic acid amplification method that occurs under isothermal conditions. this method employs a dna polymerase and a set of four specially designed primers that recognize a total of six distinct sequences on the target dna, which can be amplified with high specificity. lamp continues with the accumulation of copies of target in less than an hour [ ] . as a simple and efficient diagnostic technique, lamp has been used in the detection of various rna or dna viruses, such as avian leukosis virus [ , ] , barley yellow dwarf virus [ ] , swine transmissible gastroenteritis coronavirus [ ] , avian influenza virus [ ] , and foot-and-mouth disease virus [ ] . here, we report a one-step, single-tube rt-lamp assay for the rapid detection of dhav- , and we assessed its specificity and sensitivity. this method has potential applications in the early diagnosis and forecasting of dhav- .

the dhav- strain (dhav-sd, stored at the china general microbiological culture collection center, cgmcc no. ) was propagated in the allantoic cavities of -day old spf chicken embryos. the embryos that died - h post inoculation were collected. allantoic fluid was centrifuged ( , g at °c for min) and the suspension was stored at - °c until it was used for rna extraction [ ] . 
duck enteritis virus (dev), muscovy parvovirus (mpv), avian influenza virus (aiv, h n ), riemerella anatipestifer (ra), salmonella enteritidis, and escherichia coli (o ), which were maintained in our laboratory, were propagated and the nucleic acids were extracted [ , [ ] [ ] [ ] [ ] . total rna was extracted from allantoic fluid and liver samples using trizol reagent (invitrogen, carlsbad, usa) according to the manufacturer's instructions. dhav- total rna concentration was measured spectrophotometrically at a and a . this rna was stored at - °c before use.

the primers for the rt-lamp amplification of dhav- were designed based on the conserved region in the c gene (genbank accession no. jx ). primers f , b , fip, and bip were designed by means of the primer software primer explorer v (http://primerexplorer.jp/elamp . . /index.html; eiken chemical co., japan). the primer sequences are shown in table .

the rt-lamp reaction was carried out in a total µl reaction volume containing thermopol reaction buffer, u of bst dna polymerase, u amv reverse transcriptase (new england biolabs, ma, usa), mm dntp mix (newpep, beijing, china), . m betaine, mm mgso , . µM of each of the f and b primers, . µM of each of the bip and fip primers, and . µl of the target rna. the mixture was incubated at °c for h followed by min at °c. after the reaction, the amplified dna products were detected by electrophoresis on a . % agarose gel (biowest agarose, spain) followed by ethidium bromide staining under ultraviolet light [ ] .

in order to compare the sensitivity of the rt-lamp assay with other conventional assays, an rt-pcr assay was developed using two pairs of primers (for and rev; f and b ) according to an earlier report with some changes ( table ) [ ] . the rt-pcr was carried out in a µl total reaction volume using the one-step rt-pcr kit (newpep, beijing, china) with . 
µM of each of the upstream and downstream primers and µl of target rna, according to

sensitivity comparison of rt-lamp to rt-pcr

to determine the detection limit of the rt-lamp and rt-pcr assays, dhav- total rnas were extracted from the serially -fold diluted allantoic fluid, ranging from to - % egg lethal dose (eld ) per µl. this single dilution series was used as a template for the two assays. the products were detected by agarose gel electrophoresis as described above ( . % agarose, tae) [ ] .

to assess the specificity of rt-lamp, potential cross-reactions with dhav- , dev, mpv, aiv, r. anatipestifer (ra), s. enteritidis, and e. coli (o ) were examined. total rna from the allantoic fluid of normal chicken embryos and from the livers of uninfected ducks was also assayed.

to evaluate the reliability of the rt-lamp assay, clinical liver samples were collected from dhav-suspected ducks in different provinces of china, including shandong, hebei, sichuan, and beijing. rna was extracted from these samples and detected by both the rt-lamp and rt-pcr. the products were detected by agarose gel electrophoresis ( . % agarose, tae). the virus isolation method was also applied to the clinical liver samples using the method previously described [ ] .

in order to obtain more specificity and detect multiple strains of dhav- , the rt-lamp primers were designed based on a highly conserved region of the c gene of the dhav- strain. the one-step, single-tube rt-lamp assay was optimized with the selected primer set by varying the ratio of the concentrations of mgso and dntp, the reaction temperature, and time. to compare the sensitivity of the rt-lamp assay with the conventional rt-pcr, the two assays were used to detect the same rnas, which were extracted from -fold serial dilutions (from to - eld per µl) of allantoic fluid. dhav- total rna concentrations were also measured spectrophotometrically at a and a . therefore, the corresponding rna concentration range is from pg to - pg per assay. 
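the sensitivity comparison described above amounts to finding, for each assay, the lowest dose in a tenfold dilution series that still gives a positive reaction. the following is a minimal sketch of that bookkeeping in python; the starting dose, the number of dilution steps and the positive/negative gel readouts are hypothetical placeholders and are not values from this study.

```python
def tenfold_series(start_dose, steps):
    """Doses of a serial tenfold dilution, highest dose first."""
    return [start_dose / (10 ** i) for i in range(steps)]


def detection_limit(doses, positives):
    """Lowest dose that gave a positive reaction, or None if none did."""
    hits = [dose for dose, positive in zip(doses, positives) if positive]
    return min(hits) if hits else None


# Hypothetical example: six dilutions starting at an arbitrary dose of 100 units.
doses = tenfold_series(100.0, 6)

# Hypothetical gel readouts (True = band visible), one entry per dilution.
lamp = [True, True, True, True, True, False]
pcr = [True, True, True, True, False, False]

lamp_lod = detection_limit(doses, lamp)
pcr_lod = detection_limit(doses, pcr)

# Ratio of the two detection limits: how many times lower the LAMP limit is.
fold = pcr_lod / lamp_lod
print(f"LAMP limit {lamp_lod:g}, PCR limit {pcr_lod:g}, ratio {fold:g}")
```

because both assays are run on the same dilution series, the ratio of the two limits is always a power of ten in this scheme.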
the results are shown in fig. . the detection limit of the rt-lamp assay was . eld per µl, equivalent to - pg dhav- total rna per reaction, which was -fold more sensitive than the rt-pcr assay. in addition, the rt-pcr assays using the two pairs of primers had the same sensitivity.

the cross-reactivity of the dhav- rt-lamp assay was evaluated with rna from dev, mpv, aiv, ra, s. enteritidis, e. coli (o ), allantoic fluid of normal chicken embryos, and liver of uninfected duck. all these reactions were negative (fig. ) .

to evaluate the feasibility of rt-lamp for detecting dhav- in clinical specimens, clinical specimens collected over the past years were assayed by rt-lamp and rt-pcr. in parallel, virus isolation was also performed. the results showed that of the samples tested contained dhav- by virus isolation; the same clinical specimens were also positive by both rt-lamp and rt-pcr (fig. ) . the results of rt-lamp, rt-pcr, and virus isolation were % correlated.

several nucleic acid amplification techniques have been developed for the specific and sensitive detection of dhv- , including rt-pcr and real-time pcr. however, these assays require considerable operator skills, expensive equipment, and - h for amplification; thus, the application of these assays is limited in the field. compared to traditional pcr technology, lamp has several advantages. first, lamp is more specific since it requires or primers to identify or specific domains [ , ] , while pcr uses only two primers. second, lamp is more sensitive, because lamp amplification is more efficient than pcr. third, lamp does not require expensive and complex equipment; instead it can be performed using a water bath or heat block for incubation under isothermal conditions. finally, lamp is time saving: the assay can be accomplished within h, whereas the pcr technology typically requires - h [ ] . 
in addition, the lamp amplification products can be observed by the naked eye directly, as a white precipitate of magnesium pyrophosphate sometimes forms during the reaction [ ] . after a comparison of different dhav- subgroup genomes, the conserved domain c of the genome was selected as the domain for lamp primer design and was used for screening a group of primers with good amplification efficiency. the d gene, which encodes an rna-dependent rna polymerase, has also been used to design primers for detection of dhv- in earlier reports [ , , ] . given that many viruses have an rna polymerase gene, we chose the c gene as the detection marker. a one-step rt-lamp assay with high specificity and sensitivity was developed for rapid diagnosis of dhav- . it showed no cross-reaction with dev, mpv, aiv, r. anatipestifer (ra), s. enteritidis, and e. coli (o ), suggesting that this technique has high specificity to distinguish among some common avian viruses and bacteria at the nucleic acid level. the rt-lamp has a detection limit of . eld per µl, equivalent to - pg dhav- total rna per reaction, which was times more sensitive than the conventional rt-pcr. this suggests that the method is useful for the detection of low levels of dhav- and for confirming the early stages of dhav- infection, when viral titers are relatively low. the rt-lamp method was also used to detect dhav- in clinical samples. the results from the rt-lamp assay were consistent with the rt-pcr and viral isolation methods, further confirming the reliability of the rt-lamp assay. because dhav- rt-lamp is highly sensitive, simple, specific, less time consuming, and does not require expensive equipment, it is more suitable for use as a dhav- diagnostic tool in the field or rural areas than other nucleic acid-based assays. 
in summary, the dhav- rt-lamp assay we developed could be a potential diagnostic method for use in the surveillance, control, and molecular epidemiological screening of dhav- in developing countries.

acknowledgments financial support was provided by the special fund for the agro-scientific research in the public interest ( ) and the nature science foundation of china (nsfc ).

key: cord- -lkvt slp authors: barrett, john w.; sun, yunming; nazarian, steven h.; belsito, tara a.; brunetti, craig r.; mcfadden, grant title: optimization of codon usage of poxvirus genes allows for improved transient expression in mammalian cells date: journal: virus genes doi: . /s - - - sha: doc_id: cord_uid: lkvt slp

transient expression of viral genes from certain poxviruses in uninfected mammalian cells can sometimes be unexpectedly inefficient. the reasons for poor expression levels can be due to a number of features of the gene cassette, such as cryptic splice sites, polymerase ii termination sequences or motifs that lead to mrna instability. here we suggest that in some cases the problem of low protein expression in transfected mammalian cells may be due to inefficient codon usage. we have observed that for many poxvirus genes from the yatapoxvirus genus this deficiency can be overcome by synthesis of the gene with codon sequences optimized for expression in primate cells. this led us to examine codon usage across -dozen sequenced members of the poxviridae. we conclude that codon usage is surprisingly divergent across the different poxviridae genera but is much more conserved within a single genus. thus, poxviridae genera can be divided into distinct groups based on their observed codon bias. when viewed in this context, successful transient expression of transfected poxvirus genes in uninfected mammalian cells can be more accurately predicted based on codon bias. 
as a corollary, for specific poxvirus genes with less favorable codon usage, codon optimization can result in profoundly increased transient expression levels following transfection of uninfected mammalian cell lines.

our lab is interested in the dissection of poxvirus gene function, particularly those genes with a predicted immunomodulatory function [ ] . towards this goal we routinely attempt to express specific poxvirus open reading frames (orfs) from mammalian expression vectors in uninfected cells for further study in the absence of other competing viral proteins. as well, expression vectors often allow the fusion of the viral protein in-frame with epitope tags that permit detection of the fused, expressed protein. this strategy has generally been successful for the transient expression of leporipoxvirus genes; however, we have consistently experienced difficulty expressing many yatapoxvirus genes from mammalian expression vectors. to date we have cloned several dozen viral genes from both tanapox virus (tpv) and yaba monkey tumor virus (ymtv) into the expression vector pcdna . myc/his (invitrogen), and have routinely observed little or no protein expression following transfection into human or primate cells. this poor transient expression could be due to the presence of cryptic splice sites, polymerase ii termination sites or mrna instability motifs within the orf resulting in truncated, incomplete or unstable transcripts. however, another explanation is that inefficient codon usage could restrict the amount of translated product from mammalian cells [ ] [ ] [ ] . to probe this issue, we have employed the baculovirus expression system (bes) to over-express yatapoxvirus genes of interest, usually with great success [ , ] . to date, all of the yatapoxvirus genes we have cloned into acnpv are expressed efficiently. 
although the bes has numerous advantages and allows production of moderate quantities of poxvirus protein, there are still advantages to being able to transiently express a poxvirus gene in an uninfected mammalian cell. as well, we have frequently mutated any predicted cryptic splice sites without altering the encoded amino acid sequence. although such predicted splice sequences could be altered by site directed mutagenesis, we were still never able to transiently express yatapoxvirus proteins with efficiencies comparable to genes derived from the lepori- or orthopoxviruses (unpublished). however, for several yatapoxvirus genes of interest we chemically synthesized versions with codon sequences optimized for the human translation machinery. these optimized viral gene sequences were then cloned into pcdna . myc/his and shown to express at high efficiency in both human (hek ) and non-human primate cells (cos ). these results are consistent with codon optimization of genes from other viruses, including hiv and hpv [ ] [ ] [ ] . this observation led us to examine codon usage bias in members of the poxviridae family. poxviruses belong to the family poxviridae, which is divided into two sub-families: the entomopoxvirinae, which are invertebrate poxviruses and can be further subdivided into three ''types'' that are restricted to several insect families, and the chordopoxvirinae, which is subdivided into eight genera that infect vertebrates. complete genomic sequences are now available for representatives of all chordopox genera comprising over -dozen representative members (www.poxvirus.org). here we examine the codon usage profiles of these selected poxviruses and try to derive some general principles regarding the ability to predict efficiencies of translation and transient expression of poxvirus genes in mammalian cells. 
poxvirus genomes were identified from ncbi and the open reading frames saved as fasta files using the ''viewing coding regions'' option of entrez. lists of the nucleotide coding sequences were loaded into the online version of codonw ([ ] ; http://bioweb.pasteur.fr/seqanal/interfaces/codonw.html) and the effective codon number and percent gc at the third position were measured. all data were compiled into excel:mac v (microsoft), manipulations were performed and the numbers were plotted against each other. these plots indicate the codon bias on the y-axis so that the more biased (i.e. non-random) the codon usage is, the closer it will be to a value of . the more unbiased (i.e. random) the codon usage, the closer the plot shifts towards (the maximum effective number) where each codon has an equal opportunity to encode an amino acid.

transfections and immunoblotting

hek and cos cells were transfected using lipofectamine (invitrogen inc.) according to manufacturer's specifications. two micrograms of plasmid dna was transfected into each well of a six-well dish. expression was detected with anti-myc (invitrogen) at : , dilution, anti-his (qiagen) at : , or anti-gp [ ] at : , .

total rna was extracted from transfected cells at h post transfection using a qiagen rneasy mini kit (qiagen). first strand synthesis was achieved with superscript ii reverse transcriptase (invitrogen) in a µl reaction volume using oligo-(dt) as a primer. the cdna was used as a template for pcr amplification. primers used for pcr amplifying native and mutated l were 5′-cccaagcttcatggataagttactattatttagcac (forward primer, hindiii site italicized) and 5′-ccgctcgagggtttccgtcttcttcatcctcttc (reverse primer, xhoi site italicized). primers used for pcr amplifying optimized l were 5′-atg aac aaa ctg atc ctg ttc agc (forward primer) and 5′-gcc aag tct tcc tcg tcc tct tcg (reverse primer). the reaction mix was incubated for one cycle at °c for min and then cycles at °c for s, °c for min and °c for min. 
products were amplified with platinum taq (invitrogen) and resolved on % agarose.

transient expression of individual viral proteins allows for the study of the specific gene products without the complications of the background contributions from the other viral proteins. towards this goal, we have cloned several dozen yatapoxvirus genes into mammalian expression vectors to analyze their function. unfortunately, we have been unable to detect any expression of yatapoxvirus genes from mammalian expression vectors. for example, the product of the l gene from tanapox virus (t l), an inhibitor of human tumor necrosis factor (hutnf) [ ] , can be readily detected by immunoblotting from tpv-infected primate cells; however, it is not detectable when transiently expressed in uninfected primate cells (fig. a , compare lanes and ). in contrast, the same t l open reading frame is well expressed in the baculovirus expression system (fig. b, lane ) . the scenario where certain viral genes, cloned from mammalian viruses, are not expressed following transfection in mammalian cells but are well expressed from baculovirus promoters in insect cells led us to examine the problem further. processing the sequences through software (http://www.friutfly.org/cgi-bin/seq_tools/splice.pl) that searches for cryptic splice sites predicted several potential splice sites (table ). when we examined native t l transcript levels we found that the t l transcript was indeed truncated in transfected cos cells (fig. c , lane ). we first attempted to correct the cryptic splice sites by site directed mutagenesis, which solved the issue of truncated transcripts (fig. c, lane ) ; however, that did not solve the lack of protein expression (fig. a, lane ) . to overcome this problem we have synthesized several yatapoxvirus genes with codon optimized sequences that favour translation in human cells (topgene, montreal, qc). 
because poxvirus genes are normally transcribed in the cytoplasm and never encounter the nucleus, there has not been any selection pressure exerted by host cell nucleus-resident pathways. codon optimization of t l led to detection of transcripts of the correct size (fig. c , lane ), and t l protein from transient expression was now readily detectable with our antibodies (fig. a, lane ). in the case of t l, codon optimization, by adjusting the proportion of at in the third codon position to a higher proportion of gc (fig. ), resulted in the switch from undetectable to significant protein expression. such a dramatic change through codon optimization led us to examine the codon usage patterns in poxvirus family members.

fig. . above the native t l sequence, in bold text, are the nucleotides that were altered by site directed mutagenesis to alter the cryptic splice sites and correspond to the sites described in table

the twenty amino acids utilized by the universal translational machinery are encoded by codons. the redundancy of codon specificity, and the particular preference of codon selection within a given species, can be informative about its genetic structure and organization. the range of codon usage bias was therefore examined for the poxviridae. complete genomic sequences for two entomopoxvirus species and representative chordopoxvirus genomes are available in genbank (table ) . to measure the codon bias within a gene, it is first necessary to determine the actual codon usage and compare it to the possible codon options available for each amino acid. this calculation is considered the effective codon number (nc) and this statistic has been developed for comparative studies and evolutionary divergence analyses [ ] . the effective codon number estimates the average number of codons actually used; we estimated the nc for all orfs using codonw [ ] . the effective number of codons (nc) used by the poxviridae was on average . and ranged from a very biased nc of . 
(amsacta moorei entomopoxvirus) to a more random nc of . (shope fibroma virus) ( table ) . two viruses had nc values in the range of - , in the range - , between and and a single species between and . although the poxviridae, as a whole, exhibited a range of codon bias, approximately five species displayed extensive bias while the rest exhibited only minor codon usage bias.

poxvirus genera can be separated into distinct classes based on codon bias and gc content

twenty-one poxvirus genomes were compared by plotting the effective codon number (nc) against the proportion gc in the third position (gc ) (fig. ) . each plot presents the complete complement of orfs from each genome. there is wide variation in effective codon number (nc) and gc % among the species; however, several trends are apparent. generally, all the orfs within a species exhibit a similar gc % and a codon bias that results in a clustering of the orfs. the exception is the entomopoxviruses, which encode a subset of - genes that appear to deviate from the majority. while the majority of the entomopoxvirus orfs appear to be extremely at rich in the third position, these ''outliers'' have a higher gc content. the parapoxvirus, and to a lesser extent the molluscipoxvirus genomes, also exhibit a subgroup of outlier genes that deviate from the main group (fig. ). in these genera, there are genes which have a lower percent of gc (less than %) in the third position, and these orfs exhibit much less codon bias. where data are available, it also appears that members within a specific genus maintain a conserved codon bias reflected in the effective codon number. we plotted the theoretical effective codon number (line) estimated solely on gc concentration. this suggested that for most poxvirus members the actual codon bias was close to the predicted value based on gc content. based on the plots of the genomes we can group all poxviruses into one of four classes (fig. ) . 
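the comparison above between observed codon bias and the value expected from gc content alone can be sketched in a few lines of python. this is a minimal illustration, not the codonw implementation: it assumes the theoretical line is the standard expectation for nc as a function of third-position gc (nc = 2 + s + 29 / (s² + (1 − s)²), with s the gc fraction at the third position), and it computes a simple gc fraction over all codons rather than only over synonymous codons. the function names and the example sequence are hypothetical.

```python
def gc3(orf):
    """Fraction of codons in an ORF whose third base is G or C.

    Simplification: every complete codon is counted, including Met, Trp
    and the stop codon, which a synonymous-only GC3 measure would exclude.
    """
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3  # ignore any trailing partial codon
    codons = [orf[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        raise ValueError("sequence contains no complete codon")
    return sum(codon[2] in "GC" for codon in codons) / len(codons)


def expected_nc(s):
    """Expected effective codon number when bias is due only to GC3 (s)."""
    return 2.0 + s + 29.0 / (s ** 2 + (1.0 - s) ** 2)


# Hypothetical toy ORF: codons ATG AAA GCC TTT, two of four end in G or C.
s = gc3("ATGAAAGCCTTT")
print(s, expected_nc(s))
```

an orf whose measured nc falls well below `expected_nc(gc3(orf))` would be more biased than its base composition alone predicts; orfs lying near the curve are the ones described above as close to the predicted value.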
class one represents genomes with a highly biased codon usage and with a very low gc percentage in the 3rd position. this class includes the two entomopoxviruses only. we would predict that future epv sequences would also reflect this trend. class two has a more random codon usage with a % gc in the 3rd position and is exclusive to the leporipoxviruses, represented here by myxoma virus and rabbit fibroma virus, which appear to encode orfs that exhibit almost random codon usage. class three includes those genomes which are highly biased in their codon usage but, in contrast to the entomopoxviruses, are gc rich in the 3rd position (the parapoxviruses and the molluscipoxvirus). the final class is the largest and contains the majority of poxvirus genera. once more genomes are sequenced and analyzed this final group may break down into two distinct classes; for now, however, this final class includes the capripoxviruses, the single member of the suipoxvirus genus and deerpox, which is unclassified, and these genes are characterized by mild codon bias (nc avg = . ) and between % and % gc (fig. ) . the remaining members of this class exhibit a more random codon usage pattern (nc avg = . ), similar to class two; in contrast, however, the 3rd position gc% is much lower, on average between % and %. these members include all published genomes of the orthopoxviruses, avipoxviruses, and yatapoxviruses. overall, poxvirus species exhibit a range of codon usage bias; however, members within a genus have evolved a codon usage bias consistent with other members of their genus. this conservation of the codon usage appears to be gc concentration specific, rather than dependent on host requirements. for example, when we plot the percent gc for all coding regions against the gc content of the first two codon positions (gc + ) for each genome, we find a high correlation between gc in positions 1 and 2 and maintenance of gc in the 3rd position. the grouping of the members into the four defined groups is easily visualized (fig. ) . 
highly conserved genes do not share codon bias across species

unexpectedly, conservation of codon bias for orthologous genes across the multiple poxvirus genera was not observed. for example, when we examine three highly conserved genes found in all published poxviruses, including dna polymerase, p a (the major core protein) and uracil dna glycosidase, we find that the codon bias is conserved only within the particular genus (fig. ) . the entomopoxviruses, parapoxviruses and molluscipoxvirus are all highly biased in the codon usage of these three genes and this is reflected in the low effective codon number for these three groups. in contrast, the leporipoxviruses are essentially random in the codon selection and the rest of the species fall somewhere in between. therefore it appears that viral genes that are thought to have evolved from a common ancestor have further adapted to the host genetic environment in which the individual poxviruses have invaded. this is true of genes that possess a cellular homolog (dna polymerase) and those that are of strictly viral origin (major core protein).

fig. there is a strong correlation between the gc content of the third synonymous codon position (gc ) and the gc content of the first and second codon positions (gc + ). each data point represents the average values calculated for each poxvirus species. the groupings described for fig. are circled and each species is identified by an abbreviated name. the abbreviations are taken from table

fig. codon bias of conserved genes is not maintained amongst the poxvirus species. the effective codon number as an estimate of codon usage for dna polymerase, p a and uracil dna glycosidase was calculated for each species and compared within and between species
eight poxvirus members from four genera are able to establish productive infections in humans, including members of the orthopoxviruses (vaccinia, variola, cowpox, monkeypox), the yatapoxviruses (tanapox, yaba monkey tumor virus), the parapoxviruses (orf, pseudocowpox) and the molluscipoxviruses (molluscum contagiosum) [ , ]. it might be predicted that the ability to infect humans would require a codon usage profile that matches codon usage in humans, or possibly a conserved codon bias shared amongst the species able to infect humans. however, this is not borne out by analyses of the actual codon bias of these members. the genomic sequences of seven of the eight members with the ability to replicate in humans are available, and there does not seem to be any relationship between codon usage and the ability to infect humans. in fact, variola and mcv infections are both restricted to human hosts, but variola exhibits less codon usage bias (nc = . ) than does molluscum contagiosum (nc = ), and there is a dramatic difference in their gc content. the orfs of variola are generally at rich in the 3rd codon position (gc3 = %), whereas molluscum contagiosum orfs are very gc rich (gc3 = %; table ). a plot of the effective codon number against %gc for human cellular genes [ ] looks more similar to the profiles for molluscum contagiosum and orf virus (fig. ) than to that for variola virus. the profiles for the orthopoxviruses and other members of class appear most similar to effective codon plot profiles for the amoeba dictyostelium discoideum [ ]. we had assumed that low expression levels following transfection of certain yatapoxvirus genes were the result of cryptic splice sites being processed in the nucleus, leading to truncated transcripts. of the yatapox orfs we have tested, none has been adequately expressed transiently from pcdna . myc/his in mammalian cells. recently we have had three orfs synthesized to optimize codon usage for human cells.
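The idea behind resynthesizing an ORF with altered codons can be sketched in a few lines. This is a minimal illustration only, assuming a naive rule that swaps each codon for a synonymous codon ending in G or C where one exists; the text does not describe the algorithm used to design the synthesized ORFs, and real optimization would normally draw on a host codon frequency table. The names `gc3_optimize` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(orf):
    """Translate an in-frame DNA sequence; unknown codons become 'X'."""
    orf = orf.upper()
    return "".join(CODON_TABLE.get(orf[i:i + 3], "X")
                   for i in range(0, len(orf) - len(orf) % 3, 3))


def gc3_optimize(orf):
    """Recode each codon to a synonymous codon ending in G or C when possible."""
    orf = orf.upper()
    recoded = []
    for i in range(0, len(orf) - len(orf) % 3, 3):
        codon = orf[i:i + 3]
        aa = CODON_TABLE.get(codon)
        if aa is None or codon[2] in "GC":
            recoded.append(codon)  # ambiguous codon or already GC-ending
            continue
        # Naive choice: alphabetically first synonymous codon ending in G or C.
        candidates = sorted(c for c, a in CODON_TABLE.items()
                            if a == aa and c[2] in "GC")
        recoded.append(candidates[0] if candidates else codon)
    return "".join(recoded)


native = "ATGAAATTAGATACTTAA"      # hypothetical AT-rich mini-ORF
optimized = gc3_optimize(native)   # "ATGAAGCTCGACACCTAG"
```

The encoded protein is unchanged while every third position becomes G or C, which mirrors the shift from AT-rich native ORFs to GC-rich optimized ORFs reported here.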
the expression of the modified yatapox genes was dramatic. the codon optimization resulted in excellent expression levels from both transfected human (hek ) and non-human primate (cos ) cells (fig. ) . comparison between the natural codon usage and third position gc levels with the optimized orfs indicate that there are some striking differences (fig. ) . the three native orfs have mild codon bias however they are strikingly at rich in the rd position of the codon. in contrast the optimized versions of the same genes are now strongly biased and are extremely gc rich in the rd position of the codon. based on these results and our earlier work it may be possible to predict which pox genomes encode genes which would be resistant to transient expression, using common mammalian expression vectors in mammalian cells (table ). basically those genomes that are at rich, including the entomopox-, the yatapox-, the orthopox-, capripox-, and suipoxviruses would be predicted to be resistant to transient expression in human and nonhuman primate cells. in contrast we would predict that genes from parapox-and molluscipoxviruses should be well expressed in transient mammalian systems. as well that might also explain why ymtv and tpv genes are so well expressed in the baculovirus expression system. the high proportion of at in the rd position is already adapted to the insect cells environment and may reflect an evolutionary history that involves replication within an insect host. examination of all poxvirus genomes and the proportion of gc content at each position of the codon indicate that all the genomes have a decreasing proportion gc at each successive position except for the leporipoxviruses, the parapoxviruses and the molluscipoxvirus (table ). for the five species within these three genera the highest proportion of gc occurs in the rd position whereas all other species the highest gc content occurs in the first position. 
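The per-position GC proportions discussed above can be computed directly from a set of coding sequences. The following is a minimal sketch under stated assumptions: the function name `gc_by_codon_position` is hypothetical, the input is a list of in-frame ORF strings, and every codon is counted, so the third-position value is not restricted to synonymous sites.

```python
def gc_by_codon_position(orfs):
    """Return (GC1, GC2, GC3) percentages pooled over all in-frame ORFs."""
    gc = [0, 0, 0]
    total = [0, 0, 0]
    for orf in orfs:
        orf = orf.upper()
        usable = len(orf) - len(orf) % 3  # ignore a trailing partial codon
        for i in range(usable):
            pos = i % 3
            total[pos] += 1
            if orf[i] in "GC":
                gc[pos] += 1
    return tuple(100 * g / t if t else 0.0 for g, t in zip(gc, total))


# Two hypothetical mini-ORFs; the third position is the most GC rich here.
gc1, gc2, gc3 = gc_by_codon_position(["ATGGCC", "ATGAAA"])
```

Pooling counts over the complete coding complement, rather than averaging per-gene percentages, keeps long ORFs from being under-weighted.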
the relationship between genomic gc content and gc3 indicates that all pox genomes except for mv, sfv, mcv, orf and bpsv contain an overall gc content between % and % with a smaller proportion of gc at the 3rd position (table ). in contrast, the other five genomes have an overall gc content that ranges from % (mv, sfv) to % (orf, bpsv, mcv), and in each case the gc content of the 3rd position is even higher, at about % for mv and above % for mcv, orf and bpsv (table ).
codon usage bias in the poxviridae is related to gc content
the total gc content of the poxviridae genomes ranges from % (amepv) to % (orf virus) (table ). however, the gc3 % ranges from % (msepv) to a staggering % (bpsv). poxviruses contain very little non-coding dna within their genomes, and since the first two codon positions are constrained by codon specificity requirements it would be predicted that the 3rd position would exhibit the most variation. we compared overall gc content to the gc content at each position of the codon (gc1, gc2, gc3) calculated from the complete coding complement. the assumption was that the third, or synonymous, position of the codon would be under less selection pressure because of the redundancy of the amino acid coding. however, we found that the highest correlation was between overall gc content and gc (r = . ) and gc (r = . ) (fig. ). all members of the poxviridae maintained this strong correlation, and this relationship indicates that codon usage is tightly linked to individual gc content.
undetectable levels of transient gene expression of yatapoxvirus genes prompted us to examine codon usage in the family poxviridae. our results indicate that there are high-gc content (parapox- and molluscipoxviruses) and low-gc content (entomopoxviruses) poxviruses, and it is those genomes with the largest gc extremes that exhibit the largest bias in codon usage. codon usage bias in poxviruses is skewed in the direction of the overall gc content.
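The correlation coefficients quoted above relate one GC measure to another across genomes. As a minimal sketch, assuming the reported r values are ordinary Pearson correlation coefficients (the text does not say which statistic was used), the calculation looks like this; `pearson_r` is a hypothetical helper and the numbers in the usage lines are invented purely for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


# Hypothetical per-genome values (not data from this study).
overall_gc = [20.0, 40.0, 60.0]
third_position_gc = [10.0, 50.0, 90.0]
r = pearson_r(overall_gc, third_position_gc)
```

In this invented example the third-position GC changes faster than the overall GC but perfectly in step with it, so r is 1 up to floating-point rounding.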
fig. comparison of natural gene codon usage versus optimized codon usage. the effective codon numbers for three native tpv genes are compared against the effective codon numbers of the same genes following codon optimization. the lines represent best fit.
we found that optimizing for codon usage resulted in dramatic improvement of the expression signal of transiently expressed genes. in the case described here, the native sequence of tanapox t l, which is at rich and exhibits some codon bias, was resynthesized to increase the codon usage bias by increasing the gc concentration at the third codon position. the percent gc in the native form of the gene was altered from less than % to greater than % in the codon-optimized version (fig. ). because poxviruses replicate exclusively within the cytoplasm, the viral transcripts have not evolved under selection from nuclear splicing or processing, and this may explain the wide variation in gc content within the family poxviridae. the poxvirus members included in our study infect a wide range of hosts; however, they show similar trends between their genomic gc content and amino acid composition, and therefore in the codon bias employed. even members whose life cycle is restricted to infection of a single species, such as variola virus and molluscum contagiosum, which only infect humans, maintain an amino acid composition related to their own specific gc content. the observation that modification of the gc in the optimized codons led to dramatic expression is not surprising because codon usage in other virus families is also related to gc content [ ]. however, poxviruses have two distinct features that make them unique. first, they encode all the necessary transcription machinery within their virus factories in the cytoplasm and therefore do not rely on cellular components [ ].
second, although members of the poxviridae infect a wide range of hosts including insects, birds, reptiles and mammals, with a few exceptions individual poxvirus species have a narrow host range [ ]. therefore we can expect that individual poxviruses have adapted to the molecular features of their particular host. the genomes are well conserved, suggesting a common ancestor, and they have been incredibly successful. this is likely due to the fact that they do not require residency within the nucleus but rather construct their own virus factories within the cytoplasm. there is a plasticity to the codon usage found in the poxviruses that does not necessarily reflect their common evolutionary history. we have examined three conserved poxvirus orthologs that all pox members encode and that are predicted to function in a similar manner across the poxviridae: dna polymerase, p a (the major core protein) and dna uracil glycosidase. codon usage for these same genes nevertheless varies between the pox species, and the biases appear related to genomic gc content. it has been suggested that codon bias reflects the level of gene expression and/or the length of the gene [ ]; however, this does not seem to be supported in the poxviridae because the codon usage for the same orthologs differs depending on the poxviral member (fig. ). it has also been suggested that codon bias could have evolved based on host requirements. however, this does not hold for mcv, which has a gc rich genome ( . % gc, table ), and variola virus, which is more at rich ( . % gc, table ), because both replicate exclusively in human tissues. perhaps the difference in gc concentration may be explained by the cell type or tissue in which the virus is resident. mcv is found exclusively in the keratinocytes of the epidermis, while variola virus can be found throughout the body, including in the lymphatic system, respiratory system and blood [ ].
the incongruence between the %at of a poxvirus genome and the at concentration of its host genomic dna has been noted before [ ]. capripox- and parapoxviruses both infect ungulates (sheep, goats, antelopes); however, following the sequencing of . kb of capripoxvirus dna it was noted that the high at concentration ( . %) of the capripoxvirus dna did not reflect the at concentration of the evolutionary hosts [ ]. selected analyses suggested that sheep and goats had at concentrations around %. in addition, parapoxviruses, which share the same host range, have a viral genomic content of around % at [ ]. complete sequence data now confirm these earlier estimates: the parapoxvirus genomes (orf and bpsv) are . % at rich, while the capripoxvirus member (lsdv) is . % at rich (table ). unlike the situation in poxvirus genomes, analysis of the sars coronavirus and other members of the nidovirales indicated significant variation in codon usage bias among different genes within a species [ ]. it was concluded that gc composition was the primary determinant of synonymous codon usage among these virus genes but that the bias was manifested at the gene level rather than at the genome level [ ]. a study of codon usage in nucleopolyhedroviruses (npv), another group of large dsdna viruses, concluded that there was significant variation in codon usage by genes within the same virus. again this differs from what we are reporting for the poxviruses. however, the npv study was based on six genes whereas we examined the complete complement of orfs, so individual variation might be lost in the overall picture for the poxviruses. significant variation in codon usage in homologous genes encoded by different npvs was also observed, which is similar to our observations with poxviruses.
finally, there was no correlation between level of gene expression and codon bias in npv or between gene length and codon bias, and patterns of codon usage appeared to be a direct function of the gc content of the virus-encoded genes [ ]. this is consistent with our observations reported here. there are now examples from several virus families indicating that alteration of the native codons will result in dramatically improved expression. in most cases the expression problem seems to be inappropriate codon usage. native human papillomavirus (hpv)- e utilized infrequently used codons in of its amino acids and was undetectable following transient transfection; however, once the sequence was optimized for the more common codons used in mammalian genes, expression increased to -fold [ ]. expression of another hpv gene, l , hampered by a codon usage bias different from that of the host, was corrected by codon optimization, resulting in a -fold increase in expression levels [ ]. in conclusion, the members of the poxviridae have genomes with a wide range of gc content and this appears to regulate their codon usage bias. the codon bias does not seem to be related to the size of the genes or their expression level because the codon bias seems to be maintained within genomes but not between genera. optimizing codon usage has improved the transient expression of several pox genes in mammalian cells. based on the calculation of the effective codon number for all orfs from all complete genomes, we would predict that the best species to study by transient expression of native genes should come from the parapox-, mollusci- and leporipoxvirus genera. however, those poxvirus members that are resistant to transient transfection and expression in human or non-human primate cells will likely benefit from codon optimization.
baculovirus expression vectors: a laboratory manual analysis of codon usage fields virology fields virology proc. natl. acad. sci.
usa acknowledgements gm is a canada research chair in molecular virology. this research was supported by cihr and ncic. we thank t. irvine for technical assistance and d. hall for administrative support. key: cord- -nokd kmx authors: yang, guang; che, xibing; gofman, rose; ben-shalom, yossi; piestun, dan; gafny, ron; mawassi, munir; bar-joseph, moshe title: d-rna molecules associated with subisolates of the vt strain of citrus tristeza virus which induce different seedling-yellows reactions date: journal: virus genes doi: . /a: sha: doc_id: cord_uid: nokd kmx citrus tristeza virus (ctv) strains were previously catalogued as seedling-yellows (sy) and non-sy (nsy) types, according to their yellowing and stunting effects on indicator seedlings. among subisolates of the vt strain, which were selected from chronically infected alemow plants, there was a correlation between the presence of . -, . - and . -kb d-rnas, and sy and nsy reactions, respectively. similarly, plants infected with mor-t subisolates, which cause sy, contained d-rnas of . to . kb, while nsy subisolates from recovered sour orange tissue contained a major d-rna of . kb. plants harboring the . -kb d-rna were protected against challenge inoculation with a subisolate harboring the . -kb d-rna. this study suggests that the nsy reaction results either from the absence of sy gene(s) in the genomes of certain ctv strains or through the suppression of the effects of sy gene(s) by d-rnas with ′ parts larger than nt. citrus tristeza virus (ctv) ( , ) , a member of the closterovirus group and the closteroviridae family ( ± ) is an important pathogen, causing considerable economic losses to citrus industries worldwide. citrus trees infected with ctv display two main types of disease: (i) quick decline of sweet oranges (swo) (citrus sinensis l.) and of some other species grafted on the sour orange (c. aurantium) rootstock ( ) ; and (ii) stem pitting of grapefruit (c. paradisi) and pummelo (c. grandis) ( ) . 
other manifestations of infection with ctv include the seedling-yellows (sy) reaction ( ± ) which is primarily a disease of experimentally inoculated plants but which might also be encountered in the field in top-grafted plants. seedlings of sour orange, lemon (c. limon) and grapefruit become chlorotic and stunted when inoculated with ctv-sy isolates, but no symptoms are elicited when swo or mandarin (c. reticulata) is inoculated ( , ). the ctv-sy phenomenon is one of the long-standing enigmas in citrus virology. the early studies of mcclean & van der planck ( ), fraser ( ) and wallace ( ) all suggested a complex aetiology of the ctv-sy disease. there have been reports of spontaneous recovery from sy infection by sour orange plants which initially showed sy symptoms, and of the elimination of the sy causal agent by the passage of sy-inducing ctv subisolates through sy-sensitive citrus hosts such as grapefruit and sour orange ( ), which has led to the emergence of non-sy (nsy) isolates. these phenomena have given rise to the hypothesis that the ctv-sy reaction is caused by two separate components: the ctv agent, capable of autonomous replication and responsible for the quick decline and the lime reaction; and a second component, responsible for the sy reaction and able to replicate only in plants harboring the ctv component. the ctv particles contain a single-component positive-stranded genomic rna of nt for the florida isolate, t ( ) and of nt for the vt strain from israel ( ). the genomes of these ctv strains showed considerable sequence deviation within the h half, but were found to have similar organization and to encompass orfs which potentially code for at least protein products. in addition to the large replicative form (rf) rna molecule, the infected plants contain a nested set of at least nine smaller species of h -co-terminal single- and double-stranded subgenomic rnas (sgrnas). these sgrnas correspond to the h -terminal orfs ( , ).
cloning of the vt strain of ctv revealed the presence of several defective (d) rnas of various sizes, composed of the h and h termini of the genomic rna with extensive internal deletions, along with the full-length virus. the sizes of the termini varied among species, with minimal lengths of nt and nt from the h and the h termini, respectively, resulting in different sizes of d-rnas with different junction sites ( , ) . inoculation of vt on the sour orange indicator resulted in sy symptoms ( ) . later infections of sour orange seedlings by grafting with ctv-vt infected alemow budwood resulted in inconsistent sy reactions; and not all plants showed the sy symptoms. recently, we selected subisolates of two ctv strains, vt and mor-t ( ) , which differed in their sy reactions on sour orange seedlings. the present paper reports the association of d-rnas with h termini larger then nt, with vt and mor-t subisolates which do not elicit the sy reaction. d-rnas may be involved in the long-standing enigma of the complex etiology of the sy-ctv reaction. the vt strain was originally isolated in from a swo cv. valencia tree grafted on sour orange. the tree showed advanced quick-decline symptoms. inoculation of sour orange plants with the vt inoculum maintained in sour lime caused typical sy symptoms ( ) . later passages of the vt strain from sour lime and alemow plants to sour orange often resulted in inconsistent sy reactions: not all sour orange seedlings showed the sy symptoms, even when inoculum from a single alemow plant was used to infect groups of plants from a single seed source (bar-joseph, unpublished). subisolates of ctv-vt (table ) were randomly selected in from chronically infected alemow plants which had been graft inoculated several years earlier ( to ) with different passages of this strain. the vt subisolates were maintained in a propagation glasshouse with temperatures ranging between and c. 
the sy reaction was assayed by grafting chip buds from infected alemow stems onto sour orange seedlings grown in a temperature-controlled glasshouse facility with incandescent illumination to complete h of light, and two temperature regimes (tr) of / c or / c for the normal and the semi-warm tr, respectively. in both trs the high and low temperatures were maintained for and h, respectively, and the adjustment from the high daytime to the low night time level and vice versa took h. the sy reactions were recorded and weeks after inoculation, for the normal and semi-warm tr, respectively. the mor-t isolate originated from a declining minneola tangelo tree ( ). the virus was propagated in alemow and was used to inoculate a group of sour orange seedlings, some of which were inarched with the ctv-tolerant rootstock go-tou. sour orange twigs and leaves showing sy and sy recovery, respectively, were used to infect sour orange and alemow seedlings. double-stranded (ds) rnas were isolated from ± g of alemow or sour orange tissues, according to dodds and bar-joseph ( ). the rnas were separated by electrophoresis in formamide-formaldehyde denaturing, . % agarose gels prepared in mops buffer, and transferred to hybond n membranes. the hybridization probes consisted of a -bp and a -bp cdna fragment from the h and h ends of the ctv-vt genome, respectively ( ). the dna probes were either non-radioactively labeled using the gene images random prime labeling module kit from amersham or radioactively labeled with p according to mawassi et al. ( ). rna probes labeled with p-utp were synthesized with the riboprobe system-t kit (promega) according to the manufacturer's instructions, from cdna fragments of bp and bp of the ctv-vt h and h ends, respectively, cloned in pgem (promega). antibodies for elisa capture were prepared in sheep primed with recombinant ctv coat protein (rctv-cp) antigen and boosted with a partially purified ctv preparation.
the second antibodies were obtained from egg yolks of chickens immunized with rctv-cp. the elisa procedure for ctv viral antigen quantification in different tissues, which were soaked overnight in the antibody-coated elisa wells, was according to bar-joseph et al. ( ). the cdnas were prepared from dsrna templates of vt and vt , with primers p and p for the first-strand synthesis, and primers p ±p and p ±p for nested and direct pcr amplification (table ). the cdna fragments were separated by electrophoresis on % agarose gel. the bands were excised from the gel and tested with the restriction enzymes sac i and nsi i (promega). for sequence analysis we used primers p and p ; p and p ; p and p to obtain three cdna fragments located at orf ( ± ), orfs ( ± ) and orfs ( ± ), respectively. the cdna fragments were cloned into the puc /t (fermentas) and sequenced from both sides by using sequenase version from usb. sequences of at least bases were read from the h and h termini of each of the cdna fragments. the dsrnas from alemow plants infected with two mor-t subisolates, designated #a and #b for sy-recovered and sy-reacting plants, respectively, were poly-a tailed and used for first-strand cdna synthesis with primer dt v (table ) and for second-strand synthesis with primers p and p , for nested pcr amplification of the viral h and with primers p and ad for the viral h . the cdna fragments were separated by electrophoresis on % agarose gel and cloned into puc /t (fermentas). sequencing from both sides of the h fragments was performed by using sequenase version from usb, and the h sequence was determined with the aid of an automatic sequencing machine. two groups of month-old alemow seedlings were graft inoculated at heights of ± and cm, with two chip buds from alemow plants infected with vt or vt , respectively. two weeks post-infection (wpi), the plants were pruned and allowed to develop two side branches. tests for the presence of the specific d-rnas were conducted after wpi.
the plants were challenged wpi by top grafting with stems infected with the reciprocal subisolates. two lateral buds were allowed to sprout from each of the protected plants, and leaf and stem bark tissues were tested for the presence of d-rnas by northern blotting.
biological characterization of vt and mor-t subisolates
hybridization of an approximately . -kb cdna probe or riboprobe from the h end of the vt genome with dsrna extracts from alemow plants revealed the presence of the large rf, the low-molecular-weight tristeza h -corresponding rna molecules (lmt) ( ) and d-rnas. vt-subisolates ± and , and , , , , with apparently similar sy reactions, showed the presence of two types of d-rnas, of . kb and . kb, respectively. the three nsy subisolates ( , and ) showed the presence of a . -kb d-rna (fig. a and table ). the hybridization patterns of dsrnas extracted from sour orange seedlings infected with vt subisolates vt (nsy) and vt (sy) are shown in fig. b. only weak or no hybridization signals of genomic and/or defective rna could be located in bark and leaves from the sour orange plant which showed severe sy compared with those from the nsy plant. hybridization of dsrnas from alemow plants inoculated with mor-t subisolates #a (nsy) and #b (sy) showed the presence of major large (ca. . kb) and small (ca. . kb) d-rnas, respectively (fig. c). one of the sy mor-t subisolates, #c , showed only weak bands of d-rna molecules compared with the nsy subisolate #e , which showed the major d-rna of ca. . kb (fig. d, lane ). sequence analyses revealed that sy subisolate #b contained two d-rnas of and nt with junctions of their h termini located at positions and , whereas the nsy subisolate #a contained a major d-rna of nt, with the junction of the h terminus located at position (fig. b). the hybridization of the vt h probe with different vt and mor-t subisolates suggested a close relationship between their genomic rnas.
in order to examine the genomic composition of the vt (sy) and the vt (nsy) subisolates, we compared the sequences of the termini of their genomes by means of nested rt-pcr and sequencing analyses. primers p and p were used for first-strand cdna synthesis and primers p , p , p and p (table ) for amplification. the resulting cdna fragments for both subisolates gave the expected lengths for the h ( - ) and h ( ± ) ends of their genome. restriction analysis of these products with saci and nsii gave restriction fragments of identical size (not shown). sequence analyses of internal regions, at least nt in length, of three cdna fragments positioned at different regions of the vt genome (positions ± , ± and ± ) did not reveal any sequence deviation between the products obtained from the dsrnas of the vt (sy) and the vt (nsy) subisolates (not shown). the possibility of interference between two vt subisolates, vt and vt , harboring the . - and the . -kb d-rnas, respectively, was tested in alemow plants. the dsrnas from plants which had first received a protective inoculation with either the vt or the vt subisolate and were later challenged by top grafting with the reciprocal subisolate were hybridized with the h -specific probe. at weeks post challenge inoculation (wpci), the basal parts of each combination had predominantly the d-rnas of the protective isolate (not shown). later tests at wpci showed only the . -kb d-rna in the basal parts of plants protected with vt (fig. , lanes , and ). plants protected with vt showed the presence of either a conspicuous or a weak band of the challenging . -kb d-rna in addition to the . kb d-rna (fig. , lanes and , respectively). sour orange seedlings infected with alemow tissues from the interference experiments, which harbored both the . -and the . stronger elisa titers and higher dsrnas concentrations (fig. b, lane ) .
biological and molecular characterization of vt subisolates, which were randomly selected from chronically infected alemow plants, revealed the presence of eight sy and three nsy subisolates. the vt subisolates caused similar symptoms and comparable elisa reactions in alemow plants (not shown). the virus titers were considerably higher in sour orange plants infected with nsy than in those infected with sy subisolates. these differences were consistent among plants which were maintained under different trs (table ). low virus titers or the absence of virus (indicated by negative reactions on indicator plants) in sour orange leaves and roots showing severe sy symptoms, suggest the possibility that the sy isolates emit a long-distance signal for a hypersensitive reaction. a similar situation has been previously observed in mature trees infected with ctv-mor-t, where the collapse of the sweet orange/ sour orange combination often preceded the spread and redistribution of the virus towards the upper parts of the infected trees ( ) . the profound differences among the sour orange reactions to the various vt-subisolates were associated with the presence of different major d-rnas. the nsy subisolates, , and , showed the presence of a major band of . -kb d-rna, whereas the eight sy subisolates, , ± and , showed the presence of two smaller d-rnas of . and . kb, with no apparent difference in the intensity of the sy reaction to subisolates which contained either of the smaller d-rnas. infection of sour orange with tissues from alemow plants concomitantly infected with mixtures of vt and vt resulted in reactions ranging from sy to nsy, with virus titers depending on the relative concentrations of the . -and . -d-rnas in the inoculum source. previously, we showed variations in the presence of the . -, . -and . -kb d-rnas in alemow plants infected with budwood from a single vt-infected source plant ( ) . 
differences in d-rna populations might have accounted for the previously noticed inconsistencies in the sy reaction of sour orange plants infected with the vt strain (bar-joseph, unpublished). the selection of vt subisolates which show a more consistent sy reaction was correlated with the presence of a major type of d-rna (table ). one probable reason for obtaining apparently stable subisolates was their selection from chronically infected plants ( ± years after inoculation) at a time when a single type of d-rna had become dominant. (fig. ) . a nt sequence, h -gaaaactaatttatca, with no homology to other regions of the ctv genome was found at the junction site (fig. ). a different short sequence, probably of host origin, had previously been observed at the junction site of the . -kb d-rna ( ). the ctv-sy phenomenon is one of the longstanding enigmas in citrus virology. the finding that both the ctv and the ctv-sy diseases could be transferred by mechanical inoculation of preparations of ctv particles ( , ) raised the question ( ) of the dual-component theory of the causal agent of the ctv-sy disease ( ). dodds et al. ( ) noted an association between two dsrnas of about . and . kb and swo trees infected with sy subisolates. molecular characterization associated the . -kbp dsrna with the replicative subgenomic rna coding for orf ( nt) ( , , , ) and hybridization with a h -specific probe did not reveal quantitative differences in the amounts of the . kbp dsrnas from sy and nsy plants (not shown). moreover, low-molecular-weight d-rnas of . kb were located in alemow infected with nsy isolates mik-t and ach-t ( ) (not shown). ctv isolates were previously classified by a variety of criteria into subisolates which differed in host reactions, vector transmissibility and dsrna patterns ( , ± ). the variability among subisolates was considered as an indication of the high frequency of mixed ctv infections.
d-rnas were previously implicated in the variability between the dsrna patterns of parental isolates and their subisolates ( , ) and the present findings indicate a correlation between certain d-rnas and host reactions, and support a working hypothesis that the nsy reaction results either from the absence of sy gene(s) or through the suppression of their effects by d-rnas with h parts larger than nt. the genomic and d-rna fragments of the two differentially reacting vt subisolates were found to show complete sequence identity. nevertheless, the possibility that a minimal sequence deviation between other parts of their genomes is involved in these biological differences cannot at present be completely ruled out. moreover, the question of the mechanism that causes sy symptoms in sour orange tissues which contain only low concentrations of virus or d-rna remains to be answered. d-rnas have been isolated from a broad spectrum of animal viruses and, more recently, also from a large number of plant viruses (for recent reviews, see ( ) ). different d-rnas have previously been reported to have different effects on disease expression: while d-rnas of tombusviruses had attenuating effects on infection ( , ), the d-rnas associated with the turnip crinkle virus tended to increase the severity of symptoms ( ) and the d-rnas associated with broad bean mottle virus had no effect on some host plants but intensified the severity of symptoms in others ( ). the correlation between the sy reactions of sour orange seedlings and the genomic composition of the d-rnas in the alemow inoculum supports the notion that the host type is a major determinant of the biological effects of d-rnas ( ).
citrus tristeza virus, revised description. cmi/aab description of plant viruses filamentous viruses of woody plants pathogenesis and host-specificity in plant diseases encyclopedia of virology agricultural gazette indexing procedures for virus disease of citrus proc. th con. iocv. iocv.
gainesville plant diseases of international importance, diseases of fruit crops proc. th con. iocv. iocv, riverside proc. th con. iocv. iocv, riverside proc. th con. iocv. iocv, riverside proc. th con. iocv. iocv sem virol key: cord- -fwdpzv authors: zhu, ying; liu, mo; zhao, weiguang; zhang, jianlin; zhang, xue; wang, ke; gu, chunfang; wu, kailang; li, yan; zheng, congyi; xiao, gengfu; yan, huimin; zhang, jiamin; guo, deyin; tien, po; wu, jianguo title: isolation of virus from a sars patient and genome-wide analysis of genetic mutations related to pathogenesis and epidemiology from sars-cov isolates date: - - journal: virus genes doi: . /s - - - sha: doc_id: cord_uid: fwdpzv severe acute respiratory syndrome (sars) caused by sars-associated coronavirus (sars-cov) is a fatal disease. prevention of future outbreaks is essential and requires understanding the pathogenesis and evolution of the virus. we have isolated a sars-cov in china and analyzed sars-cov genomes with the aim of revealing the evolutionary trends of the virus and providing insights into its pathogenesis and the sars epidemic. a specimen from a sars patient was inoculated into cell culture. the presence of sars-cov was determined by rt-pcr and confirmed by electron microscopy. the virus was isolated and its genome sequence was determined and then analyzed by comparison with other sars-cov genomes. genetic mutations with potential implications for pathogenesis and the epidemic were characterized. this viral genome consists of , nucleotides with overall organization in agreement with that of published isolates. a total of positions were mutated on viral genomes. among them had mutations in more than three genomes. hot spots of nucleotide variations and unique trends of mutations were identified on the viral genomes. mutation rates differed from gene to gene and correlated well with periodical or geographic characteristics of the epidemic.
in november , the first case of a novel infectious disease named severe acute respiratory syndrome (sars) suddenly appeared in southern china [ ]. this illness emerged and rapidly spread to different areas of asia and then to other countries around the world with a high morbidity (about % required intensive care) and . % fatality [ ]. in march , the world health organization (who) made an unprecedented international effort by organizing world-leading laboratories to find the causative agent. this effort resulted in the declaration made simultaneously by three research groups that a new sars-associated coronavirus (sars-cov) was the pathogen of this disease [ ] [ ] [ ]. when the outbreak of sars came to an end in july , it had caused a cumulative total of cases and deaths worldwide [ ]. since the discovery of sars-cov, progress in the study of this virus has been swift, and the complete viral genome was sequenced [ ]. although the definition of a sars case still largely relied on clinical and epidemiological criteria, diagnostic tests based on the detection of viral rna and proteins have been developed [ ], along with the development of vaccines [ ]. results from both phylogenetic analysis and epidemiological studies suggested that sars-cov was of animal origin, most likely from himalayan palm civets, ferrets and raccoon dogs [ ] [ ] [ ] [ ]. as a member of the coronaviridae family, sars-cov is an enveloped, positive-stranded rna virus. it harbors coding sequences, including primary structural proteins (nucleocapsid protein n, spike protein s, membrane protein m, and small envelope protein e); non-structural proteins (x , x , x , x , x ); and a polyprotein encoded by two orfs (orf a and orf b). the polyprotein catalytically auto-processes to produce a group of proteins including proteases (plppro and clpro), rna-dependent polymerase (pol), rna helicase (hel), and proteins of unknown function [ , , ].
like other rna viruses, sars-cov has as its most striking characteristic a high rate of genetic mutation [ , [ ] [ ] [ ] [ ] [ ]. despite the fact that sars-cov can cause an atypical and fatal form of pneumonia, the genome structure, gene expression pattern, and protein profiles of the virus are similar to those of other conventional coronaviruses [ ], which are only responsible for mild respiratory tract infections in a wide range of animals including humans, pigs, cows, mice, cats, and birds [ , ]. it is possible that distinct patterns of several genes and unique variations in the sars-cov genome may contribute to its severe virulence or pathogenesis. the mechanism of sars-cov pathogenesis may involve both direct viral cytocidal effects on the target cells and immune-mediated mechanisms. potential mutability of the viral genome may pose problems in the control of future sars epidemics. in this report, we describe the isolation of a new sars-cov strain (whu) from a patient in hubei province, china during the late period of the sars outbreak. the complete genome sequence of the whu isolate was determined and compared with those of other sars-cov strains whose complete genomic sequences were available at the time of analysis. a comparative study of the genetic characterization and nucleotide variation of all known sars-cov offers insights into the functions of the viral genes and reveals the evolutionary trends of the virus. it would also provide a basis for clinical diagnosis and for the future development of potential drugs and vaccines against sars-cov infections. the sars patient was an -year-old male from jiayu county, hubei province, china. he worked in beijing while the sars outbreak was occurring. he came back to hubei province and became ill on april th, with fever and atypical pneumonia, and was admitted to hospital for isolation and treatment on may rd . veroe cells were inoculated with a specimen obtained from the sars patient.
the presence of the sars-associated coronavirus in infected cell cultures was determined by the appearance of cytopathic effects (cpe) as well as by rt-pcr amplification using primers (primer- /primer- and primer- /primer- ; table ) specific to the sars-cov. viral particles were examined under an electron microscope. viral rna was extracted from infected veroe cells following the procedures described by the manufacturer (invitrogen, carlsbad, ca). the first strand of the viral cdna was synthesized from extracted viral rna by reverse transcription using random primers provided by the manufacturer (promega, madison, wi). double-stranded dna fragments were produced by pcr amplification of the viral cdna using pairs of specific primers (primer to primer ; table ) designed to cover the entire viral genome based on the sequence of sars-cov strain hku- (accession number ay ). each of the pcr products was cloned into vector pgem-t. random clones were selected for dna sequencing analysis. sequences representing the entire viral genome were fully assembled and edited with the dnasis software programs. the nucleotide sequence of the complete genome of the sars-cov isolate (whu) was deposited in genbank (accession number ay ). the complete genome sequences of all sars-associated coronaviruses were downloaded from genbank (table ). homology searches for the dna sequences were conducted and their deduced amino acid sequences were analyzed against the public database with the blast search program provided by the national center for biotechnology information (ncbi). sequence alignment was performed using clustalw and further analyzed using bioedit. the nucleotide sequence of the entire genome of the newly identified whu strain, along with those of other sars-cov isolates released in genbank, was aligned with the clustalw software program. phylogenetic trees were created for all nucleotide sequences by neighbor-joining and parsimony methods.
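the neighbor-joining method mentioned above starts from a matrix of pairwise distances between aligned sequences. the text does not state which distance model was used, so the following is only a minimal sketch that assumes the simplest choice, the uncorrected p-distance (proportion of differing sites). the function names and the toy isolates are hypothetical and are not taken from the study.

```python
from __future__ import annotations

from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for x, y in zip(a.lower(), b.lower()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            diffs += 1
    return diffs / compared if compared else 0.0


def distance_matrix(aligned: dict[str, str]) -> dict[tuple[str, str], float]:
    """Pairwise p-distances for every unordered pair of sequence names."""
    return {
        (m, n): p_distance(aligned[m], aligned[n])
        for m, n in combinations(sorted(aligned), 2)
    }


# Hypothetical toy alignment, for illustration only.
toy = {
    "isolate_a": "atgcatgcat",
    "isolate_b": "atgcatgcaa",
    "isolate_c": "atg-ttgcaa",
}

if __name__ == "__main__":
    for pair, dist in distance_matrix(toy).items():
        print(pair, round(dist, 3))
```

a matrix of this kind is what a neighbor-joining implementation takes as input; corrected distance models would replace `p_distance` without changing the rest of the sketch.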
sequences were analyzed with reference to the trees to reveal character states relevant to phylogenetic branching. during the late period of the sars outbreak in , three patients were identified as probable sars cases in hubei province, an area of china with few sars cases. in order to study the disease caused by sars-cov, we obtained a specimen from one of the patients. seven days after inoculation of veroe cells with patient specimens, cpe appeared on the infected cells (fig. ), indicating the presence of an infectious agent. two specific amplicons were detected by rt-pcr amplification using extracted viral rna as template when two pairs of sars-cov specific primers were used, respectively (data not shown). these results implied that the presence of sars-cov in the specimen was highly likely. coronavirus-like particles were observed when we further examined infected cells under an electron microscope (data not shown). in addition, sars-cov antibodies were detected in the patient's serum. taken together, these results provided substantial evidence that this patient was infected by sars-cov, named the whu strain. after identification of the whu strain, we isolated the virus and determined the complete nucleotide sequence of its genome (accession number ay ). since this virus was the only sars-cov that has ever been isolated and sequenced from hubei province, we carried out a detailed sequence analysis of its entire genome. results from sequence analysis indicated that the genome of the whu strain consisted of , nucleotides with a two-nucleotide deletion at positions , and , .
table . accession numbers of genomic sequences of sars-associated coronaviruses released in genbank (strain, accession number): urbani, ay ; twy, ap ; tws, ap ; twk, ap ; twj, ap ; twh, ap ; cuhk-w , ay ; taiwan tc , ay ; taiwan tc , ay ; taiwan tc , ay ; twc, ay ; frankfurt , ay ; bj , ay ; bj , ay ; bj , ay ; zj , ay ; tor , ay ; tw , ay ; bjo , ay ; shanghai qxc , ay ; shanghai qxc , ay .
phylogenetic analysis was conducted with the genome sequence of the whu strain and those of all sars-cov isolates whose genomic sequence information was fully available in the public databases (table ). both the phylogenetic study and the sequence analysis indicated that the overall genome organization and predicted proteins of the whu isolate were in agreement with published studies on other sars-cov isolates (fig. ). like all sars-cov isolates, the whu strain belongs to a new group of coronavirus [ ]. however, the whu isolate with a two-nucleotide deletion was genetically distinct from most of the published sars-cov isolates, but closely related to the twc strain (fig. ). to investigate the variations of nucleotide sequences among sars coronaviruses, we performed a genome-wide analysis of genetic mutations on all sars-cov genomes. results indicated that a total of positions on the viral genomes had alternative nucleotides. among them, positions with mutations occurred on more than three viral genomes (table , fig. ) (table and fig. ). our next step was to determine whether the high mutability had any implications linked to the viral genes or their functions (fig. and table ). after further comparison and analysis of the viral sequences, we found that the polyprotein gene (orf a and orf b) had the highest variation rate among all genes. this region not only carried mutations, but also contained the second most variable positions (positions and , ). the orf b gene contains two additional positions ( , and , ) at which viruses were mutated. we also noticed that the s gene had a high mutability, with position mutated in viruses, position in , and position in . two positions with a high mutation rate were identified within the m gene. one was located at the most variable position , at which viruses were mutated. the other one was position , at which viral genomes were changed. the e gene and n gene had one mutation spot at position and , respectively.
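the genome-wide counting described above (how many genomes carry an alternative nucleotide at each position, and which positions are mutated in more than three genomes) can be sketched as follows. this is an illustrative assumption about the procedure, not the authors' actual pipeline: it treats the most common nucleotide in each alignment column as the reference state, and the function name and toy genomes are hypothetical.

```python
from collections import Counter


def variable_positions(aligned, min_genomes=1):
    """Map 1-based alignment position -> number of genomes that differ
    from the most common character in that column.

    Only positions with at least `min_genomes` differing genomes are kept.
    Gap characters are counted like any other character in this sketch.
    """
    sequences = list(aligned.values())
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("all sequences must have the same aligned length")
    result = {}
    for position, column in enumerate(zip(*sequences), start=1):
        majority_count = Counter(column).most_common(1)[0][1]
        mutated = len(column) - majority_count
        if mutated >= min_genomes:
            result[position] = mutated
    return result


# Hypothetical toy alignment of five genomes, for illustration only.
toy_genomes = {
    "g1": "atgca",
    "g2": "atgca",
    "g3": "atgta",
    "g4": "ttgta",
    "g5": "atgca",
}

if __name__ == "__main__":
    print(variable_positions(toy_genomes))                 # all variable positions
    print(variable_positions(toy_genomes, min_genomes=2))  # stricter threshold
```

under this sketch, the criterion "mutated in more than three genomes" would correspond to `min_genomes=4`; dividing each count by the number of genomes would give a per-position mutation rate.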
among five nonstructural genes, x had one mutation site at position with a mutation rate of , while the x gene had two mutation spots at positions and with a mutation rate of (fig. and table ). based on the recommendations from who [ ], all sars cases can be divided by period into early-period, mid-period, and late-period cases (table ). in this study, we divided all known viral isolates into two groups, an early-mid period group and a mid-late period group (table ). based on results from sequence analysis, we found some correlations between genetic mutations of the virus and periodical or geographic characteristics of the outbreak. several residuals ( , , , (tables ) . in addition, some genetic mutations were linked to certain geographic regions where the viruses were isolated. for instance, a high genetic mutation rate at position was mainly found in viruses isolated from taiwan. mutations at position occurred in most taiwan isolates ( %), but were not found in any isolates identified from other regions around the world. moreover, all three viral strains (fra, sod and frankfurt) isolated from europe had mutations at the same positions, , and , while the remaining isolates showed no changes in these positions (tables and ). although the sars epidemic ended after months of spreading, many important questions remain unanswered. what is the natural reservoir of sars-cov? where and how did the virus cross the barrier between its reservoir and humans to initiate reservoir-to-human transmission and subsequent human-to-human infection? it was proposed that the natural reservoir of sars-cov was of animal origin [ , , ], most likely himalayan palm civets [ ]. this was not a surprise, since many fatal human viruses including hiv and influenza virus originated by transmission from animals.
the hiv pandemic arose as a consequence of the combination of transmission of sivcpz from chimpanzees and the common practice of ''hunting and field-dressing chimpanzee'' in west central africa [ ]. similarly in southern china, where sars-cov initially emerged, people used to consume wild animal meat and some of the animals are now confirmed to carry sars-like coronavirus [ ]. another question is whether the sars outbreak will come back. at the beginning of , three sars cases were reported, indicating that sars can come back. however, the situation this year seems quite different from last year, since transmission, infection and severity of sars-cov were clearly weakened. one possible explanation is that it might be just a prelude to sars epidemics. like last year, in the early period of the sars pandemic, the virus did not show strong virulence. another possibility is that sars-cov might be truly weakened for many reasons including genetic mutations, like the influenza flua virus, which caused a disastrous outbreak in and was weakened after the pandemic that took million lives [ ]. influenza epidemics throughout the world occurred periodically between the first pandemic and the present time due to viral antigenic drift and shift. these processes also resulted in the appearance of influenza b and c virus with significant differences in genetic characteristics [ ]. it would be important to find out whether sars-cov follows epidemic rules similar to those of influenza virus, and whether sars-cov is weakening or sars will break out periodically. while these questions remain to be addressed, it is certain that sars-cov has a high mutation rate on its genome, which could in turn play significant roles in its pathogenicity and in epidemics of the disease.
table . summary of genetic mutations within genes of sars-associated coronaviruses (recoverable headers: orf a; position; mutation rate).
molecular epidemiology and genome-wide analysis of mutations among sars-cov have provided insights into some of these questions [ , [ ] [ ] [ ] [ ] [ ]. for instance, besides the geographic distribution of potential animal reservoirs, the high homology between sars-cov of humans and sars-like coronavirus of animals strongly supported the hypothesis of an animal origin of sars-cov [ ]. it is possible that some mutations on the viral genome were responsible for the transmission of sars-cov from animals to humans. in an effort to study sars-cov, we identified and sequenced a new sars-cov isolated from a patient with sars in hubei province. hubei was an area of china with few sars cases, because only a total of three patients were confirmed as probable sars cases and only one viral strain was isolated from this region. these facts prompted us to study this virus further. our sequence analysis indicated that although the overall genome organization of whu (fig. ) is in agreement with published studies on other isolates, whu carried a two-nucleotide deletion at residuals and was genetically distinct from most sars-cov isolates. these results implied that mutations occurred during the viral transmission from beijing to hubei, although we do not know at this point whether these mutations have any biological significance. it is interesting to note that although sars-cov invaded the human population only for months, its genetic information had already altered in many ways during its short journey of human transmission. individual viral genes displayed distinct patterns of genetic mutations at different times during the sars outbreak. for instance, mutability of the s gene was high during the early-mid period, but low during the mid-late period of the epidemic, which suggested that mutability of the s gene decreased as viral transmission increased.
one possible explanation for this observation is that during the early-mid period of the epidemic, the s gene, which encodes the protein that recognizes host receptors and mediates viral entry into host cells, had to change at a high frequency in order to quickly fulfill its biological roles. once viral adaptation to human cells was completed or reached equilibrium, genetic changes were less important or no longer needed. thus, the genetic information of the s gene became relatively stable during the mid-late period of the outbreak [ ]. another example is orf ab, which encodes the polyprotein of sars-cov. like the s gene, orf ab was also actively involved in genetic mutations. however, in contrast to the s gene, mutability of orf ab was low at the beginning, but high during the mid-late period of the epidemic. this observation can be explained by the fact that the virulence of sars-cov was weakened in the mid-late period. other structural genes including the e, m, and n genes were more conserved at the beginning of the outbreak, but underwent genetic changes at the end of transmission. this pattern of genetic mutation may reflect the biological roles of these structural genes in the assembly of viral particles, which in turn is crucial for the virus to withstand increasing immune pressure from the hosts. genetic analysis of non-structural genes showed that they tended to remain conserved throughout the entire process of transmission. therefore, these genes may prove to be ideal targets for the diagnosis of sars-cov, screening of antiviral drugs, and perhaps development of antiviral vaccines. patterns of genetic mutations of certain viral genes were linked to the geographic locations where the virus was isolated. mutations at positions and within the x and e genes could clearly set the taiwan isolates apart from others. thus, these two positions may be used as molecular signatures in the identification of taiwan isolates.
similar phenomena were also found in three viral strains (sod, fra, and frankfurt) isolated from europe during the mid-late period of the outbreak. these viral strains had mutations at the same positions ( , , and , ), while all isolates from other regions did not show any changes at these positions. this kind of specific mutation pattern may reflect the relatively independent geographical locations of taiwan and europe. we speculated that populations in these regions perhaps developed unique immunity due to their unique locations, for which the virus had to make specific genetic mutations in order to invade these populations. in addition, based on genome-wide mutation analysis, some viral strains isolated from beijing had a close relationship to isolates identified from southern china during the early-mid period of the outbreak. this suggests that at least these sars-cov isolates found in beijing were originally from southern china. much remains to be done in order to understand thoroughly the evolution, transmission, origin, and infection of sars-associated coronavirus. it is interesting to recognize that genome-wide mutation analysis could provide new insights into the route of viral transmission and the prediction or perhaps prevention of future sars epidemics. our study would provide a rational and hypothesis-driven approach to study these questions, develop rapid diagnostic tests, and design measures to prevent this fatal disease. in addition, a full understanding of the molecular mechanism of genetic mutations would provide insights into the plausible transmission route of sars-cov from animals to humans as well as from human to human, and into trends of change in the pathogenicity of sars-cov during its route of transmission and path of evolution. cumulative number of reported probable cases of severe acute respiratory syndrome (sars) department of communicable disease surveillance and response.
who consensus document on the epidemiology of severe acute respiratory syndrome (sars) this research was supported by the sars special grant of wuhan university. key: cord- -i t ce authors: chen, xi; yang, jinxian; yu, fusong; ge, junqing; lin, tianlong; song, tieying title: molecular characterization and phylogenetic analysis of porcine epidemic diarrhea virus (pedv) samples from field cases in fujian, china date: - - journal: virus genes doi: . /s - - -x sha: doc_id: cord_uid: i t ce outbreaks of porcine epidemic diarrhea virus (pedv) have been a major problem for the swine industry in china in recent years. in this study, we investigated the molecular diversity, phylogenetic relationships, and protein characterization of fujian field samples in comparison with other pedv reference strains. sequence analysis of the s and sm genes showed that each sample had unique characteristics, and the sample p may be differentiated from the others by the unique deletions and insertions in the sm gene. phylogenetic analysis based on the s or sm gene, which have high levels of variation, indicated that each sample was related to a specific reference strain, and this finding was consistent with the protein characterization prediction analysis. the study is useful for better understanding the prevalence of pedv and its prevention and control in fujian. porcine epidemic diarrhea (ped) is a devastating swine disease that is characterized by acute enteritis and lethal watery diarrhea, followed by dehydration, and frequently leading to high mortality in piglets [ ] [ ] [ ]. most affected farms first observed the disease in farrowing barns, followed by % mortality of newborn piglets. the disease was first reported in england in [ ], and since then, outbreaks of the disease have been reported frequently in europe and asia [ ] [ ] [ ]. since s, the disease has broken out continuously in pig farms of major cities and provinces in china, causing tremendous economic losses to the swine industry [ ].
the causative agent of ped, the porcine epidemic diarrhea virus (pedv), was first described in [ ]. then, a cell culture system was developed for pedv isolation and propagation [ ]. pedv is a member of the genus coronavirus in the family coronaviridae. the genome consists of a positive-sense, single-stranded rna, - kb in size, which is transcribed into several subgenomic mrnas and encodes structural and non-structural proteins in a conserved order [ ]. the polymerase gene, which covers % of the genome, encodes the replicase polyproteins. the genes for major structural proteins including the membrane protein (m), the phosphorylated nucleocapsid protein (n), the small membrane protein (sm), and the spike protein (s) are located downstream of the polymerase gene [ ]. the s glycoprotein makes up the large surface projections of the virion and plays an important role in the attachment of viral particles to the receptor of the host cell [ ] [ ] [ ]. thus, the s glycoprotein would be a primary target for the development of vaccines against pedv. it is also the major envelope glycoprotein of the virion, which serves as an important viral component for understanding the genetic relationships of different pedv strains and the epidemiological status of pedv in the field [ , , ]. the sm gene is the only accessory gene of pedv. accessory genes are generally maintained and their loss mainly results in attenuation of the virus in the natural host [ ]. for pedv, virulence of the virus can be reduced by altering the accessory gene region in a manner similar to tgev [ ], and its differentiation could be a marker of virus attenuation [ ] and a valuable tool for the study of the molecular epidemiology of pedv [ ]. in china, pedv was first isolated in [ ], and its prevalence has been a major problem for the swine industry in recent years, although a periodic vaccination strategy has been applied nationwide to prevent the disease [ ].
therefore, a comprehensive study is necessary to better understand the genetic relationships between different strains, and would help to identify the reason for the continuous outbreaks of pedv and to develop new strategies to control and prevent pedv infection. in this study, we investigated the molecular epidemiology and analyzed the phylogenetic relationships of fujian pedv field samples with other pedv reference strains. the study mainly focused on the s and sm genes due to their vital roles in viral function and their higher variation. portions of intestine or stool specimens were taken individually from piglets with acute enteritis and watery diarrhea on different large swine farms in fujian province in , and designated as p , p , and f , respectively. intestinal samples were homogenized with times their volume of phosphate-buffered saline (pbs). the suspensions were then vortexed and centrifuged for min at , g. the supernatants were stored at - °c until use. in order to determine the sequences of the pedv samples, primers were designed based on the sequences of reference pedv strains (table ). because the s gene is long, only part of it, i.e., s , was amplified for investigation. in brief, viral rna was extracted from the supernatants of the homogenized samples with the rnaiso plus reagent (takara, japan) according to the manufacturer's instructions. rt-pcr was conducted individually to amplify each fragment from the isolated rna using the primescript one step rt-pcr kit ver. (takara, japan) according to the manufacturer's protocol under the following conditions: reverse transcription at °c for min, denaturation at °c for min, cycles of denaturation at °c for s, annealing at °c for s, and extension at °c for min. the rt-pcr products were analyzed by . % agarose gel electrophoresis and visualized by ultraviolet illumination after ethidium bromide staining.
bands of the corresponding size of the gene were excised, and the synthesized dna was purified using a qiaquick gel extraction kit (qiagen, germany) according to the manufacturer's instructions, then sequenced by takara company. the nucleotide and deduced amino acid sequences of the s and sm genes of pedv samples were independently used for sequence alignments. the multiple sequence alignments were generated with the clustalw method by megalign . [ ]. phylogenetic trees were constructed with deduced amino acid sequences by the bootstrap neighbor-joining method. in the study, the characteristics of the deduced amino acid sequences, including pi value, antigenic peptides, hydrophobic positions, and transmembrane motif, were analyzed with the dnaman program. sequence analysis of s region the nucleotide sequences of the sl region are , bp for p , , bp for p , and , bp for f in length (accession number: jq , jq , and jq ). the sl protein of p is aa in length with a predicted mr of . kda, and the sl protein of p and f is aa in length with a predicted mr of . kda. twelve homologous sequences were found in genbank and shared a similarity of % (table ). however, mutations occurred frequently in the s gene. the alignment analysis indicated that five sequences including p , f , ch/ interestingly, most of the mutations were observed in the n-terminal region. these variations of p and f were probably due to mutation of the gene in field strains. p and dr constitute another group (group , fig. ) with specific nucleotide changes, and the mutations occurred in the middle of the s gene; interestingly, c/g and a/t were found to be interchanged (c/g↔a/t). the relationships of group and group were confirmed by their deduced amino acids. the sequences of group were found to have a long deletion at the beginning followed by a short deletion. the mutations of group were found to be a deletion at position and a substitution at position (s→f).
in terms of potential asparagine (n)-linked glycosylation sites, only sites were found in group , far fewer than in group ( for p and for dr ). unlike the result of lee et al. [ ], neither gtaaac nor a similar sequence was found upstream of the initiator atg of the s gene in any of the chinese and english (cv ) strains. the sm gene of fujian pedv field samples was sequenced (accession number: jq for p , (fig. ). p and f have and unique point mutations, respectively. however, besides the long deletion in p , only one amino acid was changed by those mutations (f→l at in f , fig. ). in addition, p has one fewer asparagine (n)-linked glycosylation site than the others. all the pedv strains including the fujian samples except the sm strain (accession number: gu ) have a conserved sequence (ctagac) at nucleotides upstream of the initiator atg. in order to analyze the phylogenetic relationships between the fujian samples and other pedv strains isolated in various regions worldwide, we constructed phylogenetic trees using the deduced amino acid sequences of s and sm, respectively (fig. ). the phylogeny based on the s glycoprotein indicated that all the strains were clustered into major groups, including one big mixed group (group ) and chinese groups (group and ). p and f formed a subgroup (subgroup ) distinct from other strains. the subgroup comprising dr and p (subgroup ) was located in group . the result was consistent with the finding from sequence analysis. quite different from the results for the s protein, phylogenetic analysis based on the sm protein fragment divided the strains into groups, one of which included p and ch/gsjiii/ (fig. b). the reason might be the deletions that occurred in p and ch/gsjiii/ . f had a close relationship with dx and formed a subgroup, while p formed another subgroup. the characterization of the s protein confirmed the results from phylogenetic analysis (table ).
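potential n-linked glycosylation sites such as those counted above are commonly predicted by scanning a protein sequence for the sequon n-x-s/t, where x is any residue except proline. the text does not say which rule or program the authors used, so the following is only a minimal sketch under that assumption; the function name and the toy peptide are hypothetical.

```python
import re

# Sequon N-X-[S/T] with X != P; the lookahead allows overlapping sequons
# (for example "NNT" inside "NNNT") to be reported individually.
_SEQUON = re.compile(r"(?=N[^P][ST])")


def n_glycosylation_sites(protein: str) -> list:
    """Return 1-based positions of asparagines that start an N-X-S/T sequon."""
    return [match.start() + 1 for match in _SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    # Hypothetical toy peptide, not a pedv sequence.
    toy_peptide = "mnvtanpsqnasnnt"
    print(n_glycosylation_sites(toy_peptide))
```

comparing the length of the returned list between strains would give the kind of site-count difference reported in the text; a sequon is only a potential site, and actual glycosylation depends on protein folding and cellular context.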
the characteristics of p and dr , except antigenic peptide number, were shown to be greatly different from those of other strains; and the strains f , p , ch/fjnd- / , cnu- - , and cv shared similar antigenic peptides, but had one for the sm protein, pi varied from about . to among the chosen strains (table ), indicating the potential variation of the protein. it was noteworthy that high identity between f and dx was indicated by identical characteristics except for one hydrophobic region. the identity between p and cv was lower than that between dx and f ; the differences involved a small pi variation, one variation in hydrophobic and transmembrane segments, and amino acid mutations at positions (table , underlined). consistent with the phylogenetic analysis, the characteristics of p and ch/gsjiii/ were similar and extremely different from those of the other strains. since the sm determines the virulence of pedv [ ], our results would benefit research on the variation of virulence of pedv in china. the diversity in s and sm was observed to be significant among different strains. although there were many mutations in this segment, the first unique characteristic was the deletion in the sm gene of ch/gsjiii/ and p . compared to ch/gsjiii/ , p was found to be more variable due to the existence of an insertion within the c-terminal domain, the unique point mutations and fewer asparagine (n)-linked glycosylation sites. the long deletion in the sm gene, which was also found in the field strain dr (accession number: jq ) and its attenuated strain (accession number: jq ) [ ], led to reduced pathogenicity and induced a protective immune response in pigs [ ]. remarkably, similar results were found in p , and no significant mutations were found in the sequences of other structural protein genes including the m, n (data not shown), and s genes; whether the mutated strain has reduced pathogenicity needs further study.
the loss of sm resulted in attenuation of the virus in the natural host. however, we found that the pedv with a long deletion in the sm gene also caused typical clinical signs of pedv infection; the pathogenic mechanism of the virus and the origin of the sm mutant strain therefore still need to be clarified. in contrast to the diversity observed in the sm gene, the s region of the fujian samples has unique mutations in common. coronaviruses have transcription regulatory sequences (trss) that include a highly conserved core sequence, -cuaaac- or a related sequence, upstream of the encoded genes [ ] . although the sequences ataaac, agaaac, and ctagac were found upstream of the initiators of the m, n (data not shown), and sm genes, respectively, the sequence gtaaac reported in the korean strains [ ] was not found upstream of the s gene of the fujian pedv samples. however, the neutralizing epitope in s, which is responsible for mediating the production of anti-viral neutralizing antibodies, was conserved. phylogenetic trees based on the protein sequences were constructed to analyse the relationship between the fujian samples and the other strains. phylogenetic analysis based on the sm protein indicated that the strain ch/gsjiii/ was relatively close to p , but distantly related to group . however, park et al. [ ] found that ch/gsjiii/ was in group , which differs from our result. the discrepancy might arise because nucleotide sequences were used in the previous study, whereas amino acid sequences were used in this study. the location of p and f in the tree based on the s protein suggested high variation among the fujian samples. dr and p were within the same subgroup. as dr was used to develop the pedv vaccine in korea [ ] , it would be interesting to know whether p can be used to develop a pedv vaccine in china. table . a tree based on amino acid sequences of s protein.
b tree based on amino acid sequences of sm protein the results of protein characterization prediction confirmed the relationship and demonstrated specific differences between the close strains obtained from sequence and phylogenetic analysis, which might be useful in further functional exploration. it was noteworthy that the unique hydrophobic region in the n-terminus of s protein of cv , cnu- - , dr , and p that might related to the variation of protein structure and function. in conclusion, the fujian pedv samples were classified into different group. both of p and f were found to have close relationship with isolated strains from china, but still have some unique characterizations. the p had highest variation and a close phylogenetic relationship with filed strain ch/gsjiii/ . the underlines indicate amino acid mutations between strains experimental infection of pigs with a new porcine enteric coronavirus, cv porcine epidemic diarrhea, in diseases of swine porcine epidemic diarrhoea virus as a cause of persistent diarrhoea in a herd of breeding and finishing pigs letter to the editor an immunoelectron microscopic and immunofluorescent study on the antigenic relationship between the coronavirus-like agent, cv , and several coronaviruses chinese-like strain of porcine epidemic diarrhea virus an outbreak of swine diarrhea of a new-type associated with coronavirus-like particles in japan molecular epidemiology of porcine epidemic diarrhea virus in china a new coronavirus-like particle associated with diarrhea in swine propagation of the virus of porcine epidemic diarrhea in cell culture the genome organization of the nidovirales: similarities and differences between arteri-, toro-, and coronaviruses the coronavirus spike protein is a class i virus fusion protein: structural and functional characterization of the fusion core complex identification of the epitope region capable of inducing neutralizing antibodies against the porcine epidemic diarrhea virus major 
receptorbinding and neutralization determinants are located within the same domain of the transmissible gastroenteritis virus (coronavirus) spike protein sequence analysis of the partial spike glycoprotein gene of porcine epidemic diarrhea viruses isolated in korea coronaviruses: structure and genome expression the group-specific murine coronavirus genes are not essential, but their deletion, by reverse genetics, is attenuating in the natural host efficacy of a transmissible gastroenteritis coronavirus with an altered orf- gene cloning and further sequence analysis of the orf gene of wild-and attenuated-type porcine epidemic diarrhea viruses porcine epidemic diarrhea molecular characterization and phylogenetic analysis of membrane protein genes of porcine epidemic diarrhea virus isolates in china the clustal_x windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools heterogeneity in spike protein genes of porcine epidemic diarrhea viruses differentiation of a vero cell adapted porcine epidemic diarrhea virus from korean field strains by restriction fragment length polymorphism analysis of orf complete genome sequences of a korean virulent porcine epidemic diarrhea virus and its attenuated counterpart complete genome sequence of transmissible gastroenteritis coronavirus pur -mad clone and evolution of the purdue virus cluster molecular characterization and phylogenetic analysis of porcine epidemic diarrhea virus (pedv) field isolates in korea cloning and further sequence analysis of the spike gene of attenuated porcine epidemic diarrhea virus dr acknowledgments this work was supported by project management for agricultural science and technology achievements transformation fund ( gb c ), science and technology major project of fujian ( nz - ), and national spark program ( ga ). we also thank hualong feedstuffs technology and development group company in fujian province for sample collection and advices. 
key: cord- -nnx nwf authors: ren, xiaofeng; li, pengchong title: development of reverse transcription loop-mediated isothermal amplification for rapid detection of porcine epidemic diarrhea virus date: - - journal: virus genes doi: . /s - - - sha: doc_id: cord_uid: nnx nwf in this study, a reverse transcription loop-mediated isothermal amplification (rt-lamp) was developed for detection of porcine epidemic diarrhea virus (pedv). six primers were designed to amplify the nucleocapsid (n) gene of pedv. the optimization, sensitivity, and specificity of the rt-lamp were investigated. the results showed that the optimal reaction condition for rt-lamp amplifying pedv n gene was achieved at °c for min. the rt-lamp assay was more sensitive than gel-based rt-pcr and enzyme-linked immunosorbent assay. it was capable of detecting pedv from clinical samples and differentiating pedv from porcine transmissible gastroenteritis virus, porcine rotavirus, porcine pseudorabies virus, porcine reproductive and respiratory syndrome virus, and avian infectious bronchitis virus. porcine epidemic diarrhea (ped) is an infectious enteric disease characterized by acute enteritis and diarrhea in pigs, and the infection is more severe in neonates [ ] . at present, ped has been a major concern in the swine industry, particularly, in the asia and europe, resulting in large economic losses [ ] [ ] [ ] . the causative agent of ped is porcine epidemic diarrhea virus (pedv), an enveloped and single-stranded rna virus that belongs to the family coronaviridae [ ] . coronavirus comprises three major viral structural proteins: spike (s, - kda), membrane (m, - kda), and nucleocapsid (n, - kda) proteins [ , ] . the s protein is a major viral antigen, binds to a cellular receptor for virus attachment to enter target cells and mediates viral attachment to target cells [ ] [ ] [ ] [ ] [ ] [ ] [ ] . 
the m protein is a trans-membrane protein [ ] and it is involved in the assembly of the viral nucleocapsid and membrane [ , ] . the n protein of coronaviruses is a phosphorylated protein that interacts with the viral genomic rna, forming a helical ribonucleoprotein [ ] . it therefore plays important roles in viral genome transcription, core formation, and virus assembly [ ] . the viral n protein is also conserved and can be used as a diagnostic target for detecting viral infection. loop-mediated isothermal amplification (lamp) is a recently developed dna amplification method [ ] . lamp uses four to six primers that recognize six to eight regions of the target dna, in conjunction with bst polymerase, an enzyme with strand-displacement activity. the synchronized dna synthesis primed by these primers maintains the specificity of the method. notably, the amplification step can be performed under isothermal conditions, resulting in the synthesis of a large amount of dna. lamp proceeds when the forward inner primer (fip) anneals to the complementary region in the target dna and initiates first-strand synthesis. next, the outer forward primer (f ) hybridizes and displaces the first strand, forming a loop structure at one end. the resulting single-stranded dna serves as a template for backward inner primer (bip)-initiated dna synthesis and subsequent outer backward primer (b )-primed strand-displacement dna synthesis. the resulting dumbbell-shaped dna stem loop structure serves as a template for subsequent hybridization between one inner primer and the loop, initiating strand-displacement dna synthesis. this cycling yields the original stem loop and a new stem loop that is twice as long as the original one. the final products are stem loop dnas with several inverted repeats of the target dna and cauliflower-like structures bearing multiple loops [ ] . at present, the lamp approach has been applied to the detection of infectious pathogens.
examples include h n avian influenza virus [ ] , hepatitis b virus [ ] , foot-and-mouth disease virus [ ] , etc. in this study, the authors developed a reverse transcription (rt)-lamp using primers directed toward the n gene of pedv. the convenience, sensitivity, and specificity of the established rt-lamp indicate its advantages and utility in detecting pedv. pedv isolate hljby, porcine transmissible gastroenteritis virus (tgev), porcine pseudorabies virus (prv), porcine rotavirus (prv), porcine reproductive and respiratory syndrome virus (prrsv) and avian infectious bronchitis virus (ibv) were propagated in susceptible cells. based on the n gene sequence of pedv (genbank accession number: gu ), a total of six primers targeting the n gene were designed using the primer explorer version (http://primerexplorer.jp/lamp . . /index.html). they include an outer pair (f , b ), an inner pair (fip, bip), and a loop pair (f-loop, b-loop). a pair of primers (named ped and ped ) was used for rt-pcr amplification of the n gene. information regarding the primer names and sequences is shown in table . pedv propagation and rna extraction pedv was propagated in african green monkey kidney (vero) cells according to a published method with modification [ ] . in brief, vero cells were cultured in dulbecco's modified eagle medium (dmem) supplemented with % newborn bovine serum (excell bio, china) in six-well plates at °c to allow the formation of a cell monolayer. the cells were washed with pbs and infected with pedv ( µl/well) at a multiplicity of infection (moi) of at °c for h in the presence of edta-free trypsin at a final concentration of µg/ml. dmem containing edta-free trypsin ( µg/ml) was then added into the wells ( . ml/well) and the culture was maintained at °c for - h. the titer of pedv was . tcid /ml.
the total rnas were extracted from the culture supernatants of pedv, tgev, prv, prrsv, and ibv using the rna extraction kit (keygen biotech, china) and the genomic dna of prv was extracted from virus-infected vero cell culture using the dna extraction kit (omega, norcross, usa) according to the manufacturer's instructions. the extracted rna was subjected to reverse transcription (rt) to synthesize the cdna using reverse primer ped and a cdna synthesis kit (haigene, china) according to the manufacturer's instructions. the reaction mixture contained rna template ( µg sterile water was used as a negative control template. the amplified dna products from the rt-lamp were analyzed by separating µl of rt-lamp reaction mixture by electrophoresis on an ethidium bromide-stained % agarose gel, where the positive reaction mixtures showed a characteristic ladder of multiple bands. the relative quantification of the dna was performed using the gel documentation system (uvitec, cambridge, uk) and determined with gel analyzer software (copyright by dr. istvan lazar) according to the manufacturer's instructions. the reaction result was also observed directly without staining because of the white precipitate of magnesium pyrophosphate or the green color produced by the intercalating dye picogreen® (invitrogen, wisconsin, usa) in positive reactions. after the amplification was completed, µl of coloring agent ( loading buffer:gene finder = : ) was added to each test tube and mixed. the test tubes were then examined visually. to determine the optimal reaction temperature, the rt-lamp reaction mixtures were incubated at , , , or °c for min. the optimal reaction time was determined by performing the rt-lamp at the optimal temperature for , , , , or min. finally, the reaction was terminated by heat inactivation at °c for min. the amplified dna products from the rt-lamp assays were visualized by agarose gel electrophoresis as above.
the concentration of pedv rna was determined using an ultra-violet photometer (type , shanghai spectrum instrument company) according to the manufacturer's instructions. tenfold serial dilutions of the rna ( µg/ml) were then used as template for rt-lamp and a conventional rt-pcr. the rt-lamp was performed as above. the rt-pcr was performed using an rt-pcr kit (haigene, china). elisa was performed to compare its sensitivity with that of rt-lamp. in brief, purified pedv particles ( . µg/µl) were serially diluted in carbonate-bicarbonate buffer ( mm na co , mm nahco , ph . ) and the viruses were coated onto elisa plates ( µl/well) at °c overnight. the next day, the plates were blocked with % non-fat dry milk in pbs- . % tween (pbst) at °c for h. subsequently, after three washes with pbst, the wells were incubated with serially diluted anti-pedv polyclonal antibody ( : dilution) or control serum from a non-immunized rabbit at °c for h. the plates were then incubated with horseradish peroxidase-conjugated goat anti-rabbit igg (boster, china; : dilution in pbst) at °c for h. after thorough washing with pbst, the wells were incubated with o-phenylenediamine dihydrochloride (opd) substrate for min. the od value was read using an elisa reader. a ratio of the od value of the anti-pedv serum positive well (p) to the od value of the control serum well (n) > was regarded as positive. twenty clinical fecal samples from piglets (approx. weeks old) with diarrhea were collected from a pig farm in heilongjiang province in . the samples were prepared as a % (w/v) suspension in pbs (ph . ) and centrifuged at g at °c for min. the supernatant was subjected to rna extraction with the above-mentioned rna extraction kit. the resulting rna was used as a template for rt-pcr and rt-lamp according to the above-mentioned protocols. at the same time, the same supernatant was used as coating antigen in elisa as above. the detection limit of the three methods was compared.
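the p/n criterion described above for reading the elisa can be written as a small helper. the python sketch below is only an illustration: the numerical cutoff is not recoverable from the text, so it is left as a required argument, and the function names and the od values in the usage note are hypothetical.

```python
def pn_ratio(od_positive_well, od_control_well):
    """Return P/N: OD of the anti-PEDV serum well divided by OD of the control serum well."""
    if od_control_well <= 0:
        raise ValueError("control well OD must be positive")
    return od_positive_well / od_control_well


def is_elisa_positive(od_positive_well, od_control_well, cutoff):
    """Call a well positive when P/N exceeds the cutoff.

    The cutoff value is an assumption supplied by the caller; the source
    text does not preserve the number the authors used.
    """
    return pn_ratio(od_positive_well, od_control_well) > cutoff
```

for example, with made-up readings, `is_elisa_positive(0.9, 0.3, cutoff=2.0)` is true and `is_elisa_positive(0.4, 0.3, cutoff=2.0)` is false.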
to analyze the specificity of the rt-lamp, pedv, tgev, prv, prv, prrsv, and ibv were used as templates and subjected to rt-lamp as above. using pedv rna and six primers targeting the pedv n gene, an rt-lamp was done at °c in a water bath for h. the resulting amplified dna products showed a characteristic ladder of multiple bands, indicating that the final products were mixtures of stem loop dnas with various stem lengths (fig. a) . in contrast, the negative control did not show the characteristic bands. the results of the rt-lamp reaction were also determined directly by visual inspection. if the reaction product is positive, the gene finder dye inserts into the double-stranded dna after the reaction and the product becomes green; otherwise, the dye does not insert into double-stranded dna and the reaction sample remains blue (fig. b) . the effect of reaction temperature and incubation time on the rt-lamp was investigated. as shown in fig. a , the dna products of the rt-lamp at different temperatures showed multiple characteristic ladder bands; however, the intensity of the dna from the reactions at °c, as determined by gel analyzer software, was stronger than that at the other reaction temperatures, and this temperature was judged to be optimal for rt-lamp amplification of the pedv n gene. the rt-lamp was then performed at °c for different time points. the results indicated that the dna products showed the highest intensity when the reaction was performed for min (fig. b) . therefore, the optimal reaction condition of the current rt-lamp for pedv was °c for min. the sensitivity of the rt-lamp assay was first compared with that of the conventional rt-pcr by amplifying tenfold serial dilutions of pedv rna templates. the detection limit of rt-pcr was . - µg/ml, which equals a virus titer of . tcid /ml, while the rt-lamp had a detection limit of . - ( . tcid /ml), indicating a much higher sensitivity than that of rt-pcr (fig. ) .
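the comparison of detection limits over a tenfold dilution series, as described above, can be sketched in python. this is a minimal illustration under stated assumptions: the function names are hypothetical, and the concentrations in the usage note are invented placeholders rather than the values measured in this study.

```python
def detection_limit(results):
    """Lowest template concentration that still amplifies.

    `results` maps concentration -> True/False (band seen or not).
    Scanning from the most concentrated template downward, the limit is
    the last positive dilution before the first negative one.
    Returns None if even the most concentrated template is negative.
    """
    limit = None
    for conc in sorted(results, reverse=True):
        if not results[conc]:
            break
        limit = conc
    return limit


def fold_sensitivity(limit_reference, limit_test):
    """How many times lower the test assay's detection limit is than the reference's."""
    return limit_reference / limit_test
```

with invented data such as `pcr = {1.0: True, 0.1: True, 0.01: True, 0.001: False}` and `lamp = {1.0: True, 0.1: True, 0.01: True, 0.001: True, 0.0001: True, 0.00001: False}`, `detection_limit` returns 0.01 and 0.0001, and `fold_sensitivity(0.01, 0.0001)` is about 100.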
after applying the same concentration of pedv particles in rt-lamp and elisa, the minimal virus template amount required for both assays was analyzed. the results showed that the detection limit of rt-lamp was - µg. in contrast, elisa had a detection limit of - µg (fig. ) . in the detection of clinical samples (table ), the rt-lamp had a sensitivity similar to that of elisa and was somewhat more sensitive than rt-pcr. to analyze the utility of the rt-lamp, several related porcine viruses (i.e., tgev, prv, and prv) and an avian coronavirus, ibv, were used as templates and included in the rt-lamp. the result indicated that no positive dna products of the rt-lamp assay were observed among these control viruses. when pedv was used as template, the positive bands were amplified as expected (fig. ) . the result demonstrated that the rt-lamp assay is specific and can be applied to distinguish pedv from other viruses. ped epidemics still occur in china, although inactivated vaccines are used in some regions. establishment of rapid, sensitive, and cost-effective diagnostic assays for detecting pedv is highly desirable. virus isolation has been a popular detection method; nevertheless, the virological diagnosis of pedv is somewhat difficult, since it was not possible until to propagate porcine epidemic diarrhea virus in cell culture [ ] . even now, the viral titer of pedv in cell culture is still low. other diagnostic methods for detecting pedv include immunohistochemistry, in situ hybridization, dot-blot hybridization, rt-pcr, and real-time rt-pcr [ ] [ ] [ ] [ ] [ ] . these methods may require either high-precision instruments or complicated procedures. therefore, they are unsuitable for detecting pedv in the field and in less well-equipped laboratories.
the rt-lamp method established in this study is a valuable alternative for detection of pedv, since this dna amplification technology has numerous advantages such as simplicity, rapidity, and low cost. the isothermal conditions required for lamp can be provided with a conventional water bath or heat block. therefore, the current method can be applied in less well-equipped laboratories and in the field for rapid detection of pedv. in general, lamp can be carried out under isothermal conditions ( - °c). in this study, the authors optimized the reaction conditions of the rt-lamp by performing the test at different temperatures and time points. subsequently, its sensitivity was compared with that of rt-pcr. the results showed that the rt-lamp specific for pedv was approx. , times more sensitive than the rt-pcr. nevertheless, it is necessary to screen other optimal primers to further compare the sensitivity of rt-lamp and rt-pcr in the future. moreover, the sensitivity of the rt-lamp and that of conventional elisa were compared using inactivated pedv as template. the former is more sensitive than the latter. two reports have pointed out that the detection limit of rt-pcr for pedv was . tcid /ml [ , ] . the detection limit of a commercially available elisa kit (jinma, shanghai) used in china was . ng/ml, which was the same as the detection limit of the rt-lamp developed in this study. this result further confirmed the sensitivity of the rt-lamp for amplifying the n gene of pedv. nonetheless, when the authors used these methods to detect pedv in clinical samples, the sensitivity of rt-lamp was somewhat higher than that of rt-pcr and similar to that of elisa. more experiments are needed to clarify this point in the future; however, the rt-lamp still has advantages including simplicity, rapidity, and convenience. to analyze the specificity of the rt-lamp for pedv, several related or unrelated viruses were used as control templates.
for example, pedv and tgev belong to the group i coronaviruses, which are closely related [ ] . ibv and prrsv belong to the group iii coronaviruses and the arteriviruses, respectively; however, both viruses belong to the order nidovirales [ , ] . the structural similarity between the n proteins of ibv and prrsv suggests that members of the coronaviridae and arteriviridae families share a mechanism of filamentous nucleocapsid formation, with suitable alterations necessary to interact specifically with their respective genomes [ , ] . prv and prv are members of the families herpesviridae and reoviridae, respectively. viruses such as pedv, tgev, prv, prv, or prrsv may cause co-infection in pigs. the results showed that the rt-lamp was successful only when pedv served as template, indicating that the established method is specific and applicable for differential diagnosis. to our knowledge, this is the first report regarding the establishment and optimization of an rt-lamp for the pedv n gene. the assay may be useful for the clinical diagnosis of pedv infection. proceedings of the international pig veterinary society congress veterinary virology the coronaviridae acknowledgments the authors acknowledge funding supported by program for new century excellent talents in heilongjiang provincial university ( -ncet- ).
key: cord- -a mjj dt authors: abid, nabil ben salem; chupin, sergei a.; bjadovskaya, olga p.; andreeva, olga g.; aouni, mahjoub; buesa, javier; baybikov, taufik z.; prokhvatilova, larisa b. title: molecular study of porcine transmissible gastroenteritis virus after serial animal passages revealed point mutations in s protein date: - - journal: virus genes doi: . /s - - - sha: doc_id: cord_uid: a mjj dt porcine respiratory coronavirus is related genetically to porcine transmissible gastroenteritis virus with a large deletion in s protein. the respiratory virus is a mutated form that may be a consequence of the gastroenteritis virus’s evolution.
intensive passages of the virus in its natural host may enhance the appearance of mutations and therefore may contribute to any attenuated form of the virus. the objective of this study was to characterize the porcine transmissible gastroenteritis virus tmk strain after passages in piglets from until . a typical experimental infection, molecular characterization, and serological analysis were also carried out to further characterize and to evaluate any significant difference between strains. the sequence analysis showed two amino acid deletions and loss of an n-glycosylation site in transmissible gastroenteritis virus s protein after passages in piglets. although these deletions were positioned at the beginning of the antigenic site b of s protein, no clinical differences were observed in piglets infected experimentally either with the native virus or the mutated one. serological tests did not show any antibody reactivity difference between the two strains. in this article, we report that the s protein deletion did not affect the virus’s pathogenicity. the variety of the virus’s evolutionary forms may be a result, not only of the multiple passages in natural hosts, but also of other factors, such as different pathogens co-infection, nutrition, immunity, and others. further studies need to be carried out to characterize the mutated strain. porcine transmissible gastroenteritis virus (tgev) is an enteropathogenic coronavirus [ , ] . it infects pigs of all ages; however, infection is most severe in newborn piglets, resulting in a fatal diarrhea [ , ] . the disease outbreaks were observed in many countries such as japan, canada, russia, and causing considerable economic damages in swine industries [ ] . the mortality of the newborn piglets can reach % [ , ] . tgev has three major structural proteins: the spike (s), the nucleoprotein (n), and the membrane (m) [ , ] . protein s is the major inducer of tgev neutralizing antibodies (abs). 
studies of tgev mutations were enhanced by the detection of porcine respiratory coronavirus (prcv) in the winter of - [ ] . this virus was closely related to tgev [ , , ] and the main difference, which was found between the two viruses, was the large deletion in the s gene ( - bp for the american strains and bp for the european strains) [ , ] . as a result, prcv had a deletion of amino acids (european strain) or amino acids (american strains) and the loss of the antigenic sites c and b in the spike s protein [ , , , , ] . furthermore, some prcv strains differ from tgev by two minor deletions [ , , ] . based on sequence and phylogenetic analysis, sanchez and collaborators proposed that the european prcv strains have been derived by a -nt deletion from an enteric tgev and the origin of prcv may be the consequence of tgev evolution by sequence deletion in the s protein [ ] . although the nature of the events responsible for the genomic diversity between prcv and tgev remains an open question, both homologous and heterologous rna recombinations, and an accumulation of point mutations were proposed to be a driving force for coronavirus evolution [ ] . unlike tgev, prcv replicated in the respiratory tract yet with low extent in the gut. this deletion changed the primary affinity of the virus from the digestive to the respiratory tract. in this respect, mutants of mouse hepatitis virus encoding a point-mutated or a truncated s protein have been shown to be neuron-attenuated for mice [ ] . prcv constitutes a good example of genotypic and phenotypic changes as a result of sequence deletion. the infection with prcv induced the production of abs which were able to neutralize both tgev and prcv. as a result, the incidence and the severity of tgev infection have been decreased dramatically since the widespread of prcv infection in swine herds. it was proposed that prcv behaved as a natural vaccine against tgev, which makes the study of its origin and evolution interesting [ ] . 
the mutation of the tgev s gene could be enhanced by intensive passages of the virus in its natural host under the influence of many environmental factors. the objective of this study was to carry out a typical experiment to compare two tgev tmk strains, one of which was cell culture adapted and maintained in the laboratory for about years, while the other was passaged in animals during the same period of time. the cell culture adapted tgev tmk strain (the collection of microorganism strains of the federal governmental institute, all-russian research institute for animal health, fgi arriah) was used in this study. it was detected in a -week-old piglet with diarrhea in a swine farm in the mid s. the strain was cell culture adapted and propagated in swine testis (st) cells as described previously [ ] . these cells were found to be permissive to porcine hemagglutinating encephalomyelitis virus (hev), swine influenza virus (siv), and porcine tgev. in addition, st cells were shown not to be permissive to porcine reproductive and respiratory syndrome virus (prrsv). st cells are mostly used for the isolation of tgev. the continuous st cell culture was started from trypsinized testicular tissue from swine fetuses, as described previously [ ] . cells were seeded in earle's balanced salt solution with . % lactalbumin hydrolysate (lah) and % pig serum. after a period of adjustment, the cells were trypsinized and sub-cultured at weekly intervals. tgev was propagated in st cells with mem medium buffered with mm hepes and . % (w/v) sodium bicarbonate, supplemented with % fetal bovine serum, % (v/v) antibiotics, and mm l-glutamine. cells were examined daily for cytopathic effects (cpes). when cpe was observed in % of the cell monolayer, cells and supernatant fluids were frozen and thawed three times to release intracellular virus into the medium. the fluid was clarified by low-speed centrifugation ( g for min). the virus was then titrated by growth of serial -fold dilutions in vero cells.
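an endpoint titration over serial dilutions of this kind is commonly reduced to a 50% endpoint with the reed and muench calculation. the python sketch below shows that calculation in its textbook form; it is not the authors' own script, the function names are hypothetical, and the well counts in the usage note are invented for illustration.

```python
def reed_muench_log10_endpoint(log10_dilutions, infected, totals):
    """Return the log10 dilution at which 50% of wells are infected (Reed-Muench)."""
    # Order rows from the most concentrated to the most dilute inoculum.
    rows = sorted(zip(log10_dilutions, infected, totals), reverse=True)
    logs = [r[0] for r in rows]
    pos = [r[1] for r in rows]
    neg = [r[2] - r[1] for r in rows]
    n = len(rows)
    # Infected wells accumulate from the most dilute row upward,
    # uninfected wells accumulate from the most concentrated row downward.
    cum_pos = [sum(pos[i:]) for i in range(n)]
    cum_neg = [sum(neg[:i + 1]) for i in range(n)]
    pct = [100.0 * p / (p + q) for p, q in zip(cum_pos, cum_neg)]
    for i in range(n - 1):
        if pct[i] >= 50.0 > pct[i + 1]:
            # Proportionate distance between the two bracketing dilutions.
            pd = (pct[i] - 50.0) / (pct[i] - pct[i + 1])
            step = logs[i] - logs[i + 1]
            return logs[i] - pd * step
    raise ValueError("50% endpoint is not bracketed by the dilution series")


def tcid50_per_ml(log10_endpoint, inoculum_ml):
    """Convert the endpoint dilution into TCID50 per ml for a given inoculum volume."""
    return 10 ** (-log10_endpoint) / inoculum_ml
```

with invented counts of 8, 6, 2, and 0 infected wells out of 8 at log10 dilutions -4, -5, -6, and -7, the cumulative percentages are 100, 80, 20, and 0, the proportionate distance is 0.5, and the endpoint is -5.5; `tcid50_per_ml(-5.5, 0.1)` then gives a titer of 10^6.5 tcid50/ml.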
the virus titer was estimated by reed and muench method [ ] , and the virus amount per ml volume was calculated. tgev tmk strain was passaged annually in -weekold piglets during the period from until . these piglets were sacrificed, and the intestinal contents were taken, clarified, and stored at - °c until use. this tgev strain was referred in this study to as ptmk . rna extraction and rt-pcr rna extraction was carried out according to the procedure of gribanov et al. [ ] with modifications using gf/f nitrocellulose glass filters (whatman, england), as described previously [ ] . the tgev genome was detected using the reverse-transcription polymerase chain reaction (rt-pcr) method, as described in the original article [ ] . rt-pcr and dna sequencing were carried out using primers s and s . the primers were located in prcv s gene region, which were able to differentiate prcv and tgev viruses. the nucleotide and the amino acid sequences of tgev tmk strains were compared with the corresponding sequences of tgev strains, available in genbank database. sequences were analyzed with the computer program mega version . [ ] . in this study, go-taq polymerase (promega, moscow, russia) was used and showed high fidelity by sequencing of more than pcr products for the detection of porcine epidemic diarrhea virus and porcine rotavirus. nevertheless, dna fragments of ptmk strain were sequenced three times to be sure of the given mutations and to exclude any spontaneous mutations introduced by the dna polymerase. to compare the pathogenicity of the two tgev strains, six -week-old piglets were used in this typical experiment, whereas four -week-old piglets were used to study the neutralizing abs production. before use, all animals were confirmed negative for coronavirus abs by a blocking enzyme linked immunosorbent assay (elisa) (ingezim corona differential, spain). 
for the first lot ( -week-old piglets): two piglets were inoculated with cell culture adapted tgev tmk strain; the next two piglets were inoculated with ptmk strain, and the remaining two piglets were used as a control group. the piglets were inoculated orally with ml of cell culture supernatant, containing tgev tmk strain or ptmk at a titer of tcid /ml. the control piglets were inoculated with uninfected cell culture supernatant. for the second lot ( -week-old piglets): one piglet was inoculated with cell culture adapted tgev tmk strain at a titer of tcid /ml, one piglet was inoculated with ptmk strain at a titer of tcid /ml, one piglet was inoculated with a mixture of the two strains, and the remaining one piglet was used as control. all piglets were housed individually and fed with a commercial sterile milk substitute. clinical signs and rectal temperature were recorded daily. the stool samples were collected after the manifestation of diarrhea. major organs, including liver, lung, kidney, spleen, and intestine were collected aseptically post mortem for virus genome detection using rt-pcr analysis. the detection of the tgev antigens in stool samples was carried out using a commercial kit (anigenrapid tge ag test kit, south korea) according to manufacturer's protocol. the elisa test is an useful diagnostic method for the differentiation between tgev and prcv viruses [ ] . there are several available tests using mabs based on the s protein epitopic differences that have been developed to differentiate tgev and prcv abs [ , , , , ] . sera were collected before infection and every day during the first days after inoculation, and then samples were taken every days. abs against tgev in serum was established using blocking elisa test (ingezim corona differential, spain). in brief, the coronavirus antigen was fixed to a solid support (polystyrene plate) [ , , ] . serum sample was added in two wells. 
after incubation, a specific peroxidaseconjugated mab against one common epitope of both coronaviruses was added to one of the wells. if the serum sample contains abs against any one of the viruses, they will not permit the binding of the labeled mab to the antigen, whereas if it does not contain specific abs, the mab will bind to the antigen on the plate. in the other well, a specific peroxidase-conjugated mab against one specific epitope of the tgev was added. the experiment was performed as described above, the different combinations of the results in both wells permitted us to know if the serum sample contains abs against tgev or against prcv, or do not contain abs against either of them. the presence of anti-tgev abs was considered confirmation that the tgev infection had succeeded. after elisa testing was complete, sera were subsequently heated at °c for min and evaluated for virus neutralizing ab against porcine coronaviruses using standard microtiter virus neutralizing assay (vn). serial -fold dilutions of sera were made in -well microtiter plates. tgev strain was added to each well (approximately tcid ). after h of incubation at °c, st cells were added, and the plates were incubated at °c in a % co atmosphere for - days. each well was then examined for cpe. the antibody titer was determined to be the dilution of sera where % of the wells were infected. positive and negative controls were included in each reaction. results were presented as the reciprocal of the highest dilution of the test sample capable of neutralizing the virus. sequence analysis of tgev ptmk strain revealed a deletion of six nucleotides in the s gene fragment. the deduced amino acid sequence of ptmk strain revealed two amino acid deletions in two positions and of the s protein and two amino acid substitutions in position (threonine by lysine) and (asparagine by threonine) resulting in a loss of the glycosylation site upstream the b antigenic site (fig. ) . 
although no amino acid deletion was detected in ptmk s protein (data not shown), the virus may have undergone other nucleotide changes elsewhere in the genome. the nucleotide sequence of the amplified s gene fragment of the mutated tgev was deposited in genbank under accession number gq . to further characterize the mutated strain, typical experimental infections using the tmk and ptmk strains were carried out. clinical signs were reproduced successfully, and the virus was recovered from the fecal samples of the infected newborn piglets. overall, the clinical signs observed in piglets infected with the cell culture adapted tmk and ptmk strains were consistent with a typical tgev infection, and no clinical difference between the strains was seen in these typical experiments. for the -week-old piglets: clinical signs included loss of appetite, vomiting, depression, and a yellowish, foul-smelling diarrhea with steatorrhea due to maldigestion h post infection (hpi). infected piglets developed hyperthermia (rectal temperature > °c) hpi. severe dehydration and death were observed days post infection (dpi). post mortem, the main pathological findings were a gas-filled stomach and intestine, coagulated transparent milk in the intestine, intestinal swelling, and congestion. the control piglets did not exhibit clinical signs consistent with tgev infection (table ) . for the -week-old piglets: clinical signs were less severe. these piglets were less susceptible to death than the newborns. the clinical signs were depression, loss of appetite, and diarrhea dpi. rectal temperature increased for the first dpi and then returned to normal levels. in contrast to the -week-old piglets, all the -week-old piglets recovered dpi. the kinetics of these clinical signs were the same for all piglets, whether infected with the tmk , ptmk , or tmk /ptmk strains. no clinical signs were seen in the non-infected piglets (controls) ( table ).
as shown in table , the genome of tgev was detected by rt-pcr in stool samples of all the -week-old piglets infected with either the tmk or the ptmk strain hpi. attempts to isolate tgev in cell culture failed before hpi. tgev antigen was detected with the immunochromatographic test strip in stool samples of the infected piglets before hpi. in the acute phase of infection, tgev neutralizing abs were not detected by the virus neutralization assay. all the infected piglets had died of the infection by dpi. no antibody response was seen in the non-infected piglets. post mortem, the tgev genome was detected in intestine, lung, kidney, spleen, and liver samples of the piglets infected with either the tgev tmk or the ptmk strain. the vn test was carried out using antisera taken from the -week-old infected piglets. blood samples were taken daily after infection to monitor the tgev ab level. as shown in table , the highest ab titer was detected during the period from to dpi, and then the titer slightly declined until dpi. although tgev abs were not detected dpi, they may have been present in serum at a low level. in general, the tgev ab titer in infected animals during the acute phase of infection is lower than the titer during the convalescent phase [ ] . the infected piglets remained healthy after the typical experiment. the infection produced adequate immunity owing to the ability of the viruses (tmk and ptmk ) to infect the intestinal tract and consequently stimulate b cells to produce immunoglobulin class a (iga) [ ] . in addition, cell-mediated immunity plays a direct role in protection and recovery from infection, and the production of abs is regulated by various cytokines derived from activated mononuclear cells during the immune response [ ] . although the -week-old piglets were used mainly to study neutralizing ab production, their stool samples were also tested for the presence of viral genome.
the same results were obtained for the two virus strains (data not shown). genetic variability has been observed for all rna viruses examined, and their potential for rapid evolution is increasingly recognized as the basis of their ubiquity and adaptability [ , ] . the molecular mechanisms underlying rna virus variation are mutation, homologous and non-homologous recombination, and genome reassortment in viruses with a segmented genome such as reoviruses. the genetic evolution of viruses is an important aspect of the epidemiology of viral diseases and sometimes causes problems in the development of successful vaccines. however, for some viruses, such evolutionary behavior may generate new variants that benefit the natural host, as in the case of tgev and prcv (to the benefit of pigs), where abs produced after primary infection with prcv may protect pigs from death after infection with tgev. the incidence and the severity of tgev infection have decreased dramatically worldwide since prcv infection became widespread. however, tgev outbreaks still occur at several prcv antibody-positive farms [ ] . such reports encourage investigators to use tgev as a model to study gastroenteritis infections. experimental studies provided further evidence that extensive virus passage in animals can generate new mutants and new variants [ ] . however, the degree of tgev mutation in infected pigs over time is not known. to address this question, we examined the genetic and antigenic changes that occurred in the tgev tmk strain using pcr and dna sequence analysis; in studies of viral rna variability, these methods can detect and characterize recombination events with extreme precision [ , , , , ] . in this study, we have focused on the nucleotide deletion that occurred in the tgev tmk strain after passage in animals over a long period of time, resulting in two amino acid deletions in the s protein.
although homologous rna recombination between the virus genomes (tmk and ptmk ) in experimentally infected piglets was not demonstrated, it cannot be excluded. it is possible that rna recombination among virus particles of the same strain occurs both naturally and under experimental conditions. our results showed no difference in the pathology caused by either virus (tmk and ptmk ) in either piglet population ( -week-old and -week-old piglets) in these typical experiments. although no clinical differences were observed in piglets infected experimentally with the native and mutated strains, the tgev s gene showed some degree of change. furthermore, the mutated strain has a modified n-glycosylation site upstream of antigenic site b. it has been shown in other studies that site b is fully dependent on glycosylation for proper folding [ ] . the loss of n-glycosylation at the beginning of the b antigenic site needs to be further investigated by in silico d protein modeling of the s protein. further investigation is needed to analyze host-pathogen interaction by studying protein-protein and protein-rna interactions, and by experimental co-infection of piglets with gastroenteritis pathogens, to better characterize the mutated tgev strain and to explore any interference phenomena. based on the results of this study, we cannot conclude that the tgev s gene mutations are responsible for changing any biological function (host-pathogen interactions) of the virus until further biochemical, structural, and proteomic analyses are carried out. nd: not detected. the coronaviridae; pathogenesis of transmissible gastroenteritis of swine; viral infections of the gastrointestinal tract; viral diarrhea of man and animals; diseases of swine; diseases of swine. we are grateful to dr. nikolay zinyakov from viral molecular diagnostics laboratory of avian diseases for sequencing the dna samples.
key: cord- -kl hoa x authors: farkas, tibor; fey, brittney; hargitt, edwin; parcells, mark; ladman, brian; murgia, maria; saif, yehia title: molecular detection of novel picornaviruses in chickens and turkeys date: - - journal: virus genes doi: . /s - - - sha: doc_id: cord_uid: kl hoa x fecal specimens, including swabs and litter extracts, collected from chickens, domestic ducks, turkeys, and canadian geese were tested using degenerate primers targeting regions encoding for conserved amino acid motifs (ygdd and dy(t/s)(r/k/g)wdst) in calicivirus rna-dependent rna polymerases. similar motifs are also present in other rna viruses. two fecal specimens and litter extracts collected from chickens and turkeys yielded rt-pcr products. blast search and phylogenetic analysis revealed that all amplicons represented picornaviruses that clustered into two major groups. four chicken and one turkey samples yielded bp amplicons with – % nucleotide identity to the recently described turkey hepatitis viruses, while and bp amplicons obtained from chicken and turkey samples represented novel picornaviruses with the closest nucleotide identity to kobuviruses ( – %) and turdiviruses ( – %). analysis of . – . kb extended genome sequences including the partial p ( c) and complete p ( a, b (vpg), c(pro), and d(pol)) regions of selected strains indicated that viruses yielding the / bp amplicons represent a putative new genus of picornaviridae. the ′-non-translated region (ntr) of the turkey hepatitis-like viruses described in this study was significantly longer ( – nt) than that of any of the other piconaviruses and included a putative short open reading frame (orf). in summary, we report the molecular detection of novel picornaviruses that appear to be endemic in both chickens and turkeys. picornaviruses are small, non-enveloped, single-stranded positive strand rna viruses with a * - kb genome. 
all known picornavirus genomes encode a single long open reading frame (orf) from which a long polyprotein is translated and cleaved by virus encoded proteases to yield the individual structural and non-structural viral proteins. the long orf has been divided into three regions: p , p , and p . the p region encodes the viral capsid proteins while the p and p regions encode proteins involved in protein processing or genome replication: a pro , b, c and a, b (vpg), c pro , d pol , respectively. picornaviruses have been described in humans and different animal species and can be the causative agents of a wide variety of diseases. the picornaviridae family currently consists of genera: enterovirus, cardiovirus, aphtovirus, hepatovirus, parechovirus, erbovirus, kobuvirus, teschovirus, sapelovirus, senecavirus, tremovirus, and avihepatovirus [ ] . several other picornaviruses including the recently described turdiviruses and turkey hepatitis viruses still await species or genus assignment. turdiviruses were discovered in tracheal and cloacal swabs obtained from dead wild birds of the genus turdus in the family turdidae [ ] . two distinct groups representing two proposed new genera (ortho-and paraturdivirus) have been described. these viruses could not be propagated in cell culture or in chicken embryos and their prevalence, host range, and disease burden are unknown. picornaviruses that were tentatively named turkey hepatitis viruses (thv) were recently discovered in liver samples collected from diseased turkey poults with turkey hepatitis. thv was also detected in bile, intestine, serum, and cloacal swabs of diseased animals and is the candidate causative agent of turkey hepatitis [ ] . based on the morphological descriptions of small round viruses in healthy and diseased avian species [ ] [ ] [ ] [ ] [ ] we initiated a study for the molecular detection of caliciviruses in avian fecal specimens. 
this study utilized a broadly reactive primer set targeting regions that encode conserved amino acid motifs present in calicivirus rna-dependent rna polymerases (rdrp) and partially also present in other viral rdrps. as part of the study, here we report the serendipitous detection of novel picornaviruses in chicken and turkey samples that included diagnostic cases with runting-stunting syndrome (rss). fecal swabs collected from broiler chickens, domestic ducks, turkeys, canadian geese in delaware, and litter extracts collected from chicken and turkey farms in north carolina were tested (table ) . twenty-eight of the chicken swabs represented diagnostic cases with rss. all of the other samples were collected from healthy animals. swabs were soaked in ml sterile pbs. litter samples were saturated with sterile pbs and washes were collected. all samples were aliquoted and stored at - °c. equal volumes of - samples from the same sample group were pooled together and viral rna was extracted by the qiaamp viral rna mini kit on a qiavac plus vacuum manifold (qiagen inc., valencia, ca), according to the manufacturer's instructions. twenty-two sample pools along with negative (deionized water) and positive (recovirus) controls were extracted at a time. the titer of the tissue culture-adapted ft recovirus strain was adjusted to pfu/ml and µl aliquots were made and stored at - °c. extracted rna was eluted in µl buffer and stored at - °c. rna from individual samples of rt-pcr positive pools was extracted as described above. rt-pcr screening and dna sequencing viral rna was amplified from µl of extracted rna template in µl reactions using the accessquick rt-pcr system (promega, madison, wi) according to the manufacturer's instructions with primers p /p as described in our previous studies [ ] [ ] [ ] . reactions were analyzed on % agarose gels in the presence of ethidium bromide.
rt-pcr products were excised from agarose gels, recovered by the wizard sv gel and pcr clean-up system and cloned into pgem-t vector (promega, madison, wi) according to the manufacturer's protocols. positive clones were identified by pcr. plasmid dna was isolated from ml cultures by the wizard plus sv miniprep dna purification system (promega, madison, wi) according to the manufacturer's instructions and sequenced using m forward and reverse primers by the chain termination method on an abi prism® dna analyzer (applied biosystems inc., foster city, ca). each sample was sequenced in both directions from two independent clones. the genome of selected picornavirus strains representing each group was amplified to the end (* , nt) with strain-specific forward primers (ctccactacctcaacactatcc for group , tgtgatgattggyggyatg for group , and atgagatggaaggaggratg for group viruses, respectively) and an oligo-dt primer. further extension of the p region, encoding a, b (vpg), c pro , and d pol proteins was achieved by primer walking using strain-specific reverse primers and degenerate primers targeting nucleotide sequences encoding the conserved amino acid motifs ddxgq (ttcatcgaygacatcggicar) in the c and gxcg (ccttcsagggyitstgygg) in the c pro regions. sequence and phylogenetic analysis blast analyses of sequences without the primers were run against ncbi databases. multiple sequence alignments of nucleotide and amino acid sequences were created using the omiga v . software (oxford molecular ltd, oxford, uk). dendrograms were constructed by the unweighted pair group method with arithmetic mean (upgma) and the neighbor-joining clustering methods of the molecular [ ] . the confidence values of the internal nodes were obtained by performing , bootstrap analyses. picornavirus sequences representing all established or proposed genera were included in the analyses (accession numbers are listed in fig. ).
predictions of -ntr secondary structure secondary structure predictions for -ntr regions were generated using the webserver for aligning non-coding rnas (war, http://genome.ku.dk/resources/war/) [ ] . war was used to generate consensus alignments and secondary structures for the terminal nt of each of the -ntrs (chk , chk , trk , trk , aichi virus, and turdivirus ), as well as the extended -portions of chk and trk viruses. war submissions are limited to nt, and the greatest homology among the -ntrs was in the -terminal * nt. the size of nt for the -consensus was chosen because this is the size of the aichi virus -ntr, the smallest among the viruses examined. for the extended -ntrs of chk and trk , the -most nt were used for secondary structure prediction. fasta files of rna alignments were uploaded to the war server using dnastar megalign software (dna-star, lasergene . , madison, wi) and consensus alignments and secondary structure predictions were generated using simultaneous rna structural prediction programs (listed at http://genome.ku.dk/resources/war/). predictions were ranked according to free energies, highest covariance scores, average bp probability, and the fraction of canonical base-pairing. structures predicted by more than one program are reported. owing to differences in relatedness, consensus structures were developed using comparisons of alignments of chk and trk , and alignments of aichi virus, turdivirus , chk , and trk strain -ntrs. tissue culture rt-pcr positive samples were centrifuged at , g for min and sterile filtered through . µm syringe filters (millipore, billerica, ma).
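the degenerate primers listed above use iupac ambiguity codes (for example y for c or t, r for a or g). the sketch below shows how such a primer can be expanded into the concrete sequences it represents. it is only an illustration: the function names are hypothetical, and treating inosine (i) as pairing with any of the four bases is a simplifying assumption rather than something stated in the text.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes, lowercase to match the primers in the text.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at", "k": "gt", "m": "ac",
    "b": "cgt", "d": "agt", "h": "act", "v": "acg", "n": "acgt",
    "i": "acgt",  # inosine: assumed here to pair with any base
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.lower():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.lower()]
    return ["".join(combo) for combo in product(*choices)]
```

for instance, `degeneracy("tgtgatgattggyggyatg")` is 4, because the primer contains two y positions with two possible bases each, and `expand` returns those four sequences.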
filtered samples were confirmed for the presence of picornaviruses by rt-pcr and sequencing and inoculated onto primary chicken embryo liver/fibroblast, lmh (chicken liver; atcc crl- ), vero (african green monkey; kidney; atcc crl- ), ma (african green monkey kidney; atcc crl- ), and llc-mk (rhesus monkey kidney; atcc ccl- ) cells at - % confluent in -well tissue culture plates. cultures were monitored daily for cpe, and harvested (medium and cells) at day post-inoculation. after two cycles of freezing and thawing cell debris was removed by centrifugation and the supernatants were passed to fresh cultures. sub-culturing was performed five times regardless of cpe. passages were evaluated for the presence of picornavirus rna by rt-pcr. pre-screening of pooled samples containing diagnostic cases of chicken specimens and chicken or turkey litter extracts yielded rt-pcr amplicons of the approximate expected size (* bp). none of the fecal samples collected from healthy broiler chickens, turkeys, domestic ducks, or canadian geese yielded amplicons with similar [ ] . none of the amplicons revealed calicivirus sequences. according to phylogenetic analysis of the deduced short d pol aa sequences, the sequences obtained in our study fell into two distinct clusters, with distances suggesting the existence of two new genera (genus and genus in this study) within picornaviridae. genus included viruses yielding and bp amplicons and genus was comprised of the thv-like viruses, yielding bp amplicons (fig. ) . analyses of the p regions and -ntr genome amplification of selected strains from each group was extended from the p region to the poly-a tail. the p region of picornaviruses encode proteins a, b (vpg), c pro (protease), and d pol (rna-dependent rna polymerase). for strains chk , chk , trk , and trk a segment (* kb) stretching from the p primer binding site of d pol to the poly-a tail was amplified. for strains chk , chk , trk , and trk a segment (* . - . 
kb) stretching from the gxcg motif of c pro to the poly-a tail was amplified. finally, for strain chk a segment (* . kb) stretching from the ddxgq motif of the c protein to the poly-a tail was amplified. phylogenetic and distance analyses of the complete d pol amino acid sequences placed genus viruses closer to turdivirus (''orthoturdivirus'') than our original analyses based on partial rdrp sequences, indicating that genus viruses may represent a highly divergent species of ''orthoturdiviruses'' rather than a new genus (fig. a) . however, analyses of the partial c and the a- c region supported their classification as a new genus ( fig. b ; table ). the -ntr sequences of genus (thv-like) viruses obtained in this study were significantly longer than any of the other picornaviruses (table ). these long -ntrs contained a short open reading frame (orf) encoding a putative protein ( - aa) (fig. ) . blast searches of these orfs did not reveal any similarity to known proteins in public databases. secondary structure predication analysis was performed on these -ntr sequences using a structural and alignment-based collection of rna structure prediction programs (webserver for aligning non-coding rnas, war) (figs. , ). the -most nt of chk and trk were predicted to form a nearly identical series of stemloops predicted by rnaforester ( fig. b ; dg = - . ), murlet (dg = - . ), and mafft-rnaalifold (dg = - . ). the -ntr of aichi virus, and nt of the ntrs of chk , trk , and turdivirus were predicted to form a common set of stem-loops predicted by mafft-rnaalifold (dg = - . ) (fig. c) . the additional * nt upstream of the -ntr stemloop structures in chk , , and trk , were similarly examined for structurally homologous stem-loops (fig. ) . this region was found to form a series of stemloops predicted by rnaforester (dg = - . ), mafft-rnaalifold (dg = - . ), and lara (dg = - . ) programs. 
the function of these stem-loops upstream of those common to the picornavirus -ntr stem-loops associated with replication, is currently unknown. after inoculation with rt-pcr positive swabs or litter extracts, no cytopathic effect (cpe) was observed in the non-human primate cell line cultures tested (llc-mk , ma , and vero) up to five blind passages. in some of the primary chicken embryo liver/fibroblast and lmh cultures, based on the previous description of calicivirus-like particles in avian species [ ] [ ] [ ] , including chickens with rss [ , ] the initial goal of our study was the molecular detection of caliciviruses in avian fecal specimens. fecal swabs collected from broiler chickens, domestic ducks, turkeys, and canadian geese in delaware, and litter extracts collected from chicken and turkey farms in north carolina were tested using p /p (table ) . with the exception of the swabs collected from rss positive chickens in delaware, all of the fecal specimens represented healthy animals. data on the health status of north carolina flocks (litter extracts) were not available. the primers (p /p ) used in this study are targeting nucleotide sequences encoding for conserved amino acid motifs (ygdd and dy(t/s)(r/k/g) wdst) in the calicivirus rdrps. however, rdrps of rna viruses because of their common evolutionary origin share several conserved motifs. indeed, using p /p for the detection of caliciviruses in several previous studies resulted in the unintentional detection of rna viruses including rotavirus, porcine kobuvirus and astrovirus [ ] [ ] [ ] . similarly, in this study novel picornaviruses were serendipitously amplified with amplicons that were indistinguishable from the calicivirus positive control by size (fig. ) . analysis of the complete d pol sequences of chk , chk , chk , trk , tr , and the thvs revealed * - nt match in the nt p binding site, with dyscfdst and dys-cfdss amino acid motifs for genus and genus viruses, respectively. 
recently, two reports describing the molecular detection of caliciviruses in avian species were published. day et al. [ ] reported a partial calicivirus sequence ( nt) identified in a metagenomic analysis of turkey gut rna virus community and wolf et al. [ ] reported the full genome sequence ( nt) of a chicken calicivirus detected in two clinically normal and one rss chicken. both of these caliciviruses are genetically related to but distinct from sapovirus and represent two putative new genera of caliciviridae. unfortunately, there is no published data on the prevalence of these avian caliciviruses and their role in disease still needs to be established. surprisingly, despite the relatively large number and diverse samples tested, caliciviruses were not detected in any of the samples including the chicken samples collected from chickens with rss. the primers used in our study could not be evaluated directly for their ability to detect the avian caliciviruses but sequence analysis of the chicken calicivirus [ ] indicated a good match for primer binding at the sites encoding for the dysgwdst and ygdd amino acid motifs. picornaviruses were detected both in chicken and turkey samples including two fecal swabs collected from chickens with rss, litter samples collected from egg layers and litter samples collected from turkey farms ( table ) . phylogenetic analyses of partial d pol sequences divided the picornaviruses into two distinct clusters (fig. ) . both clusters contained viruses detected in both chicken and turkey samples suggesting that these picornaviruses can infect both avian species. fifteen samples including the two positive swabs from chickens with rss, litter samples collected from egg layers, and litter samples collected from turkeys contained novel picornaviruses (genus ) with no closely related sequences in public databases. recently, in a metagenomic analysis of the turkey gut rna virus community day et al. 
[ ] reported the identification of rna sequences with homology to seven of the nine recognized picornavirus genera with the largest number of sequences bearing homology to kobuvirus. unfortunately, these sequences are not available from public databases for comparison with sequences obtained in our study. phylogenetic analysis of the entire p region including a, b (vpg), c pro , and d pol of chk placed genus viruses closer to ''orthoturdivirus'' than our original analysis of the partial d pol sequences (fig. ) . phylogeny of the complete d pol sequences separately indicated that genus viruses might represent a highly divergent species of ''orthoturdivirus'' (fig. a) , however, this was not supported by analyses of the partial c and the a- c regions which placed genus viruses further apart from ''orthoturdivirus'' supporting their classification as a new genus ( fig. b ; in accordance with the results of the phylogenetic distance analysis, alignments of the separate p proteins revealed that chk d pol region alone shared a higher ( %), while the a- c pro region of p and the available partial c region of p shared a lower ( and %, respectively) amino acid identity with turdivirus (table ) . ortho-and paraturdiviruses were discovered recently in tracheal and cloacal swabs obtained from dead wild birds of the genus turdus in the family turdidae [ ] . turdiviruses could not be propagated in cell culture or in chicken embryos and their prevalence, host range, and disease burden are unknown. based on our analysis, genus viruses described in this study represent a putative new picornavirus genus with the closest evolutionary roots to orthoturdivirus. since genus viruses were described in chicken and turkey samples we propose the tentative name ''gallivirus'' for the genus. for the final classification and nomenclature of genus viruses, analysis of complete genome sequences, their host range, pathogenicity, and antigenic relationships needs to be determined. 
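the amino acid identity values discussed above are computed from pairwise alignments. a minimal sketch of such a calculation is given below; the function name is hypothetical, and the choice of denominator (all alignment columns except those where both sequences have a gap) is an assumption, since alignment programs differ on this point and the text does not say which convention was used.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two rows of a pairwise alignment.

    Columns where both rows are gaps are ignored; a column with a gap in
    only one row counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared
```

as a toy example, the two eight-residue motifs dyscfdst and dyscfdss differ at one position, so `percent_identity("dyscfdst", "dyscfdss")` is 87.5.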
the remaining five picornavirus sequences (genus ) clustered separately from genus viruses and together with the recently described thvs [ ] (fig. ) . pairwise amino acid alignments of the complete d pol revealed a high ( - %) homology between trk , chk , and the thvs. the published sequences of the turkey hepatitis viruses did not include the complete and -ntrs. in this study, complete -ntr sequences of three genus viruses (chk , chk , and trk ) were obtained, revealing a significantly longer -ntr region ( - nt) than that of any other picornavirus ( table ) . alignments of the -ntrs with the partial ( and nt) -ntr regions that were published for thvs clearly separated the chicken and turkey viruses into two groups. chk and chk had a - % nucleotide identity with thv and thv d, while trk exhibited a - % identity to thv and thv d, respectively. moreover, an eight-nucleotide deletion was clearly conserved among thv and thv d and trk . the full-length -ntrs of genus viruses obtained in our study contained a putative short orf: nt ( aa) for chk and chk and nt ( aa) for trk , respectively (fig. ) . the putative short orf sequences of chk and chk had % nucleotide and % amino acid identity to each other but only % nucleotide and % amino acid identity and % amino acid similarity to the trk short orf. none of these proteins showed homology to any viral proteins available in public databases. whether these orfs encode a functional protein, and what the relevance of the unusually long -ntr of these viruses is, remains to be established in future studies. structures at the extreme -ends of picornavirus genomes define the orir ( -ntr origin of replication) and typically include stem-loops (x, y, and z) important for circularization of the genome during minus strand (antigenome) replication.
secondary and tertiary structures in the -ntr vary in overall complexity, but have been described as having trna-like folds with the ''kissing'' of stem-loops in a higher order folded pseudoknot [ , ] . in our analysis of predicted consensus rna secondary structures of the viruses reported here, complex secondary structures were identified for both the extreme -ntrs (fig. ) and the additional sequences found in chk and trk -ntrs (fig. ) . as sequences at the extreme -ends of picornaviruses, arteriviruses, and coronaviruses are important for antigenome and subsequent genome synthesis [ ] , and given the limited sequence identity among the strains examined, we used a structural alignment-based set of programs to predict common structural elements of these -ntrs. as several common structures were predicted using different algorithms, it seems likely that these structures provide the basis for future structure/function studies with respect to their role in genome replication. while the thvs were described in samples collected from turkey poults with symptoms of turkey hepatitis [ ] , in our study similar viruses were detected in litter samples collected from egg layer chickens and in a litter sample collected from turkey poults. since thvs are the proposed causative agents of turkey hepatitis, evaluation of the pathogenicity of these viruses in chickens is important. preliminary studies indicate that both chk and trk can be propagated in embryonated chicken eggs (unpublished data). efforts to adapt the picornaviruses described in this study to tissue culture were unsuccessful. virus isolation from fecal material can be difficult due to low virus load, toxicity of the material, and the abundance of diverse viral agents that often overgrow the target virus. many enteric viruses require polarized epithelial cells for replication and may have species-specific requirements. more cell lines and primary cell cultures should be evaluated in future studies.
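a putative short orf of the kind described above can be located by scanning a sequence for an atg start codon followed in frame by a stop codon. the sketch below is a generic forward-strand scan and is not the procedure used in the study; the function name, the `min_codons` threshold, and the toy sequence in the usage note are hypothetical.

```python
STOP_CODONS = {"taa", "tag", "tga"}


def find_orfs(seq, min_codons=10):
    """Return (start, end, n_codons) for atg...stop ORFs on the forward strand.

    start and end are 0-based, end-exclusive coordinates; end includes the
    stop codon, n_codons counts the start codon but not the stop codon.
    """
    seq = seq.lower().replace("u", "t")  # accept rna or dna input
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "atg":
                start = i  # first atg in this frame opens the ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, n_codons))
                start = None  # resume searching after the stop codon
    return orfs
```

for the toy sequence `"ccatggctgctgctgcttaagg"` with `min_codons=3`, the scan reports a single orf of five codons starting at position 2. any candidate found this way would still need the database comparison described in the text before being interpreted as a coding region.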
the secondary structure alignment of chk and trk -ntrs (terminal nt), the predicted stem-loop structure (below right), and derived thermodynamic and statistical values for the proposed structure (dg, avg covariation, avg bp probability and canonical bp). the structure shown was predicted by rnaforester using the webserver for aligning non-structural rnas (war, http://genome.ku.dk/resources/war/). c the alignment based on structural prediction for the -ntrs of chk , trk , aichi virus, and turdivirus, the predicted stem-loop structure (bottom right) and values generated in the derivation of this structure (bottom left). structural alignments were generated using the iupac nucleotide ambiguity system. boxed sequences in alignments b and c correlate with the boxed loops in the secondary structure predictions and are provided for reference and orientation.

in summary, we described the molecular detection of novel picornaviruses in chicken and turkey samples, including viruses that were recently suggested to be the causative agents of turkey hepatitis. these viruses represent two possible new genera of picornaviridae that appear to be endemic in both chickens and turkeys. further characterization of these viruses, including their host range and prevalence, and studies to link infection to clinical disease such as hepatitis or rss are necessary.

c the predicted stem-loop structure (below right), and derived thermodynamic and statistical values for the proposed structure (dg, avg covariation, avg bp probability, and canonical bp). boxed sequences in the alignment in b correlate with the boxed loop in the secondary structure.

acknowledgments we thank dr. carolyne price for providing the lmh cell line, bryan donnelly for providing the primary chicken embryo fibroblast cells and nicole farkas for helping with sample transport. we also thank dr.
margaret k. hostetter for her support. the infectious disease scholar fund of cchmc to t. f. was used to fund this study. key: cord- -fc f authors: zhou, ling; tang, qinghai; shi, lijun; kong, miaomiao; liang, lin; mao, qianqian; bu, bin; yao, lunguang; zhao, kai; cui, shangjin; leal, Élcio title: full-length genomic characterization and molecular evolution of canine parvovirus in china date: - - journal: virus genes doi: . /s - - -y sha: doc_id: cord_uid: fc f canine parvovirus type (cpv- ) can cause acute haemorrhagic enteritis in dogs and myocarditis in puppies. this disease has become one of the most serious infectious diseases of dogs. during in china, there were many cases of acute infectious diarrhoea in dogs. some faecal samples were negative for the cpv- antigen based on a colloidal gold test strip but were positive based on pcr, and a viral strain was isolated from one such sample. the cytopathic effect on susceptible cells and the results of the immunoperoxidase monolayer assay, pcr, and sequencing indicated that the pathogen was cpv- . the strain was named cpv-ny- , and the full-length genome was sequenced and analysed. a maximum likelihood tree was constructed using the full-length genome and all available cpv- genomes. new strains have replaced the original strain in taiwan and italy, although the cpv- a strain is still predominant there. however, cpv- a still causes many cases of acute infectious diarrhoea in dogs in china. canine parvovirus type (cpv- ) belongs to the genus protoparvovirus and the parvoviridae family and was first observed by electron microscopy in [ ] . cpv- has three antigenic variants: types a, b, and c. two antigenic variants, cpv- a and cpv- b, are now distributed worldwide [ ] . a third cpv- variant, which was initially believed to be a glu- mutant and subsequently renamed cpv- c, was detected in italy in [ ] and is now circulating there together with types a and b [ ] [ ] [ ] . 
the new type c has also been reported in vietnam by nakamura et al. [ ], who developed monoclonal antibodies specific for that type [ ]. type c has also been reported in the united states and south america [ ]. the genome of cpv- is a single-stranded dna molecule that contains two open reading frames (orf and orf ). orf is located in the region of the genome and encodes non-structural protein (ns ) and (ns ). orf is located in the right part of the genome and encodes the structural proteins vp and vp . the middle region of the genome also contains a bp sequence [ , ]. the non-structural proteins ns and ns are associated with viral replication. ns functions as a nickase and helicase and can form a covalent bond at the end of dna. structural proteins vp and vp compose the nucleocapsid of cpv- [ ]. in in china, many acute cases of infectious diarrhoea were reported in dogs. to identify the pathogen, we collected and examined faecal samples from symptomatic dogs. we report here that the disease is associated with a strain of parvovirus, and phylogenetic characterization of the causal isolate showed that isolates from china have long branch lengths; this may indicate an extensive process of accumulation of mutations and substitutions in chinese lineages. this study provides the basis for further exploration of the cpv- variation, the selection of vaccines, and the effective prevention and control of cpv- infections.

edited by william dundon. ling zhou, qinghai tang, lijun shi, and miaomiao kong have contributed equally to this work.

faeces were collected from dogs exhibiting acute haemorrhagic enteritis. the faecal samples ( g) were immersed in ml of sterile phosphate-buffered saline (pbs) with antibiotics, and the suspension was centrifuged at , g for min. the supernatant was filtered and frozen at - °c.
to determine whether cpv- was present in the samples, the filtered supernatants were added to the colloidal gold test strips using the bionote rapid test kit (korea, bio-note). the samples were mixed with buffer and then tested for the antigen at room temperature according to the manufacturer's instructions. the negative sample was prepared for pcr as follows. the pcr assay was performed in a -µl reaction volume that included . µl of template, . µl each of vp -f and vp -r (table ), . µl of dntps, . µl of kod fx neo polymerase ( u/µl) (toyobo biotechnology company, shanghai, china), . µl of pcr buffer for kod fx neo (toyobo biotechnology company, shanghai, china), and sufficient ddh o to increase the volume to µl. amplification was carried out in a pre-heated thermocycler (applied biosystems thermal cycler) as follows: one cycle at °c for min; followed by cycles at °c for s, °c for s, and °c for s; and a final extension at °c for min. amplicons were detected by electrophoresing -µl aliquots in % agarose gels in tae [ mm tris-acetate (ph . ), mm edta].

identification of cpv- with cell culture and ipma

feline kidney cell line f was obtained from the american type culture collection, usa and was used to isolate viruses from clinical samples and to observe cytopathic effects associated with viral replication. the treated samples were inoculated at a concentration of ml per cm flask onto a cell monolayer. after adsorbing for h at °c, the inoculum was removed, dmem with % foetal bovine serum was added, and the cells were again incubated at °c. cell cultures were observed daily for - days to monitor the appearance of cytopathic effects (cpe). subsequently, an ipma was conducted, and the other flask was frozen at - °c and, after three freeze-thaw cycles, submitted to further passages following the same procedure until the eventual appearance of cpe [ , ].
cloning the full-length genomic sequence of cpv-

dna and rna were extracted from the homogenized samples (faeces from infected dogs) with the tianamp virus dna/rna kit (beijing tiangen biotech company, beijing, china) according to the manufacturer's protocol. dna and rna were analysed by pcr or rt-pcr, and the presence of other potential pathogens, such as canine distemper virus (cdv), canine adenovirus (cav), canine coronavirus (ccv), and canine rotavirus (crv), was investigated. table summarizes the sequences of primers used to amplify the virus, and the pcr was conducted as previously described [ ] [ ] [ ]. these primers were the same set used in calderon et al. [ ], chaturvedi et al. [ ], and pratelli et al. [ ]. dna fragments were then cloned into the pmd -t sample vector (takara biotechnology co., ltd. japan). the same set of pcr primers was used to sequence the full-length genome of the isolate ny- . sequencing reactions were performed by a commercial corporation (huada, beijing). sequences were aligned using clustal x software version . [ ]. after the first alignment, the sequences were manually edited to maintain the reading frames using the se-al programme version . (http://evolve.zoo.ox.ac.uk/software/). the following cpv- references were used for the phylogenetic analyses: maximum likelihood trees and bootstrap values were obtained using phyml software [ ]. the hky model with a gamma distribution (Γ) was selected according to the likelihood ratio test (lrt) using jmodeltest software [ , ]. the resulting trees were visualized and edited using figtree (http://tree.bio.ed.ac.uk/software/figtree/). to determine the extent of recombination in the cpv- sequences, rdp v. software [ ], which implements a collection of methods, was used. a detailed explanation of each method implemented in the rdp programme can be found in the user's manual (http://darwin.uvigo.es/rdp/rdp.html).
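The model-selection step described above relies on a likelihood ratio test between nested substitution models. The sketch below illustrates the arithmetic only; it is not the jmodeltest implementation. The log-likelihood values in the usage note are hypothetical, the function names are invented for illustration, and the chi-square comparison ignores the boundary correction that is sometimes applied when testing a gamma shape parameter.

```python
# Chi-square critical values at alpha = 0.05, indexed by degrees of freedom.
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815}


def lrt_statistic(lnl_null, lnl_alt):
    """Return 2 * (lnL_alt - lnL_null) for two nested models."""
    return 2.0 * (lnl_alt - lnl_null)


def prefer_alternative(lnl_null, lnl_alt, extra_params):
    """True if the richer model fits significantly better at alpha = 0.05.

    extra_params is the number of additional free parameters in the
    alternative model (1 when only a gamma shape parameter is added).
    """
    return lrt_statistic(lnl_null, lnl_alt) > CHI2_CRIT_05[extra_params]
```

For example, with hypothetical log-likelihoods of -1000.0 for a model without rate heterogeneity and -995.0 for the same model with a gamma distribution, `prefer_alternative(-1000.0, -995.0, 1)` returns `True`, so the gamma-distributed model would be retained.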
the ill dogs showed acute infectious diarrhoea, and the faecal samples were bloody stools (fig. a). the colloidal gold test strip showed positive (fig. b ) and negative (fig. b ) results, while the pcr results were both positive (fig. c). one viral strain was isolated from the faeces of dogs that tested negative in the rapid test but positive using pcr. cpe was observed in f cells after three passages. cpv- -infected cell cultures showed vast regions of cell detachment (fig. ). the ipma was conducted with positive sera (fig. ). pcr amplification showed a fragment of bp corresponding to the vp gene of cpv- . pcr amplification detected the presence of cpv- but was negative for other pathogens in the samples analysed. the sequencing results showed that the vp gene of cpv-ny- is bp long with no insertions or deletions in the coding region. the results were once again positive for cpv- . three experimental animals were inoculated with ny- and exhibited clinical symptoms at days post inoculation. cpv- was re-isolated from the faeces of the experimental dogs. the whole genome was sequenced, and the cpv-ny- complete genome structure was analysed and is shown in fig. . a maximum likelihood tree was constructed using the vp sequence and full-length genomes of cpv-ny- and all available cpv- genomes. figure shows that all the cpv- sequences from china clustered in a group, and the cpv-ny- isolate is also located in the chinese clade. interestingly, located at the base of the chinese clade is one isolate from russia (jn ) sampled in . no evidence of recombination was detected in any of the sequences used to construct the above-mentioned tree. to further examine cpv- isolates, we used a larger dataset composed only of vp protein sequences. the maximum likelihood tree again showed a monophyletic cluster containing all cpv- isolates from china.
cpv-ny- belongs to the cpv- a type, and it was inferred that cpv- a is still circulating in china and is also the main agent of acute haemorrhagic enteritis in dogs. in addition, both trees showed that isolates from china have long branch lengths; this may indicate an extensive process of accumulation of mutations and substitutions in chinese lineages. since it emerged in , canine parvovirus has spread in the domestic and wild canine population, where it is continuously evolving into new adaptive viral variants. the variability and the intrinsically high mutation rate of the cpv- genome allowed the diversity of cpv- to rapidly increase as the virus spread through canine populations [ ] . because some viral variants replicate more successfully than others, the virus population changes over time [ ] . cpv- a and cpv- b are the best examples of variants with a fast spread and replacement capacity since the emergence of cpv- [ , ] . during , there were many cases of acute infectious diarrhoea in dogs. some samples tested negative for the cpv- antigen by a colloidal gold test strip, while they were positive for cpv- by pcr. one viral strain was isolated from the faeces of dogs that tested negative in the colloidal gold test strip but positive with pcr. cpv-ny- , which was described in this study, was found in dogs that exhibited acute haemorrhagic enteritis. based on the pcr results, the only virus consistently detected in the samples was cpv- . some samples tested with the serology method (colloidal gold test strip) failed to detect the virus, while pcr was able to detect cpv- , which is a valid and important objective. this may be due to differences in the test sensitivity (pcr is more sensitive than the serological test) or due to a change in the viral genome that leads to a major change in the type of the antigen (viral protein) and the failure of the antibodies to recognize the viral antigen in the serology test. 
this question needs to be addressed by future experiments. the isolated virus was then cultivated in f cells, and we next inoculated adult dogs with the isolated cpv- . these animals presented clinical conditions characteristic of cpv- infection. pcr was used to demonstrate that cpv- was the agent causing the disease in these experimental animals. cpv- is pandemic, and the frequencies of the different antigenic types of cpv- vary in different countries. in uruguay, for example, cpv- c is the major epidemic strain [ ]. in the usa and southern africa, cpv- b is the main viral type that causes most outbreaks of cpv- infection [ , ]. in the uk, both cpv- a and cpv- b are present, and germany and spain have similar frequencies of isolation [ , ]. in india, the major epidemic strains are cpv- a and cpv- c. the new strains have also replaced the original one in taiwan and italy, although the cpv- a strain is still predominant there [ , ]. however, cpv- still causes many cases of acute infectious diarrhoea in dogs in japan [ ]. in this study, molecular phylogenetic analysis of cpv-ny- and other cpv- isolates in genbank revealed that cpv-ny- is closely related to isolates s , sc - , lz , lz , and nj - , especially isolate lz . cpv- a still causes many cases of acute infectious diarrhoea in dogs in china. the sequence analysis showed that there is little variation, and this is important in choosing a vaccine to prevent this disease. given the results obtained in the current study, the continuous surveillance of cpv- in china is imperative for determining whether cpv- a will colonize and spread into new territories.

acknowledgments this work was partly supported by the agricultural science and technology innovation programme of china (astip-ias ).

conflict of interest the authors declare that they have no competing interests.
key: cord- -o j d j authors: page, kevin w.; britton, paul; boursnell, michael e. g. title: sequence analysis of the leader rna of two porcine coronaviruses: transmissible gastroenteritis virus and porcine respiratory coronavirus date: journal: virus genes doi: . /bf sha: doc_id: cord_uid: o j d j the leader rna sequence was determined for two pig coronaviruses, transmissible gastroenteritis virus (tgev) and porcine respiratory coronavirus (prcv). primer extension of a synthetic oligonucleotide complementary to the ′ end of the nucleoprotein gene of tgev was used to produce a single-stranded dna copy of the leader rna from the nucleoprotein mrna species from tgev and prcv, the sequences of which were determined by maxam and gilbert cleavage. northern blot analysis, using a synthetic oligonucleotide complementary to the leader rna, showed that the leader rna sequence was present on all of the subgenomic mrna species. the porcine coronavirus leader rna sequences were compared to each other and to published coronavirus leader rna sequences. sequence homologies and secondary structure similarities were identified that may play a role in the biological function of these rna sequences. transmissible gastroenteritis virus (tgev) and porcine respiratory coronavirus (prcv) belong to the family coronaviridae, a large group of pleomorphic enveloped viruses with a positive-stranded rna genome. tgev causes gastroenteritis in pigs, resulting in a high mortality in neonates ( ). prcv was isolated in several european countries between and ( - ), does not cause diarrhea, and has been shown to replicate in the respiratory tract with little or no clinical signs, but is very similar antigenically and serologically to tgev ( , ). virions from both viruses contain two envelope glycoproteins of relative molecular mass (m r) , (spike) and m r , - , (membrane protein) and a phosphorylated nucleoprotein of m r , .
cdna probes to the structural protein genes of tgev hybridized to the appropriate mrna species of prcv, suggesting a high degree of homology at the rna level (unpublished data). coronavirus proteins are expressed from a "nested" set of subgenomic mrnas with common ' termini but different ' extensions. the sequence of each mrna that is translated to produce viral proteins appears to correspond to the '-terminal region that is absent on the preceding smaller mrna species. it has been shown for the coronaviruses, mouse hepatitis virus (mhv) and infectious bronchitis virus (ibv), the subgenomic mrna species possess short "leader sequences" at their ' ends. these sequences are not transcribed as a contiguous mrna species, but are derived from the ' end of the genomic rna and are probably joined to the ' end of each mrna by a process of discontinuous transcription ( ) ( ) ( ) ( ) ( ) . the leader sequence appears to be produced by a mechanism termed leader-primed transcription, in which the leader rna is transcribed independently, dissociated from the template, and then binds to the template (negative-sense strand) at specific transcriptional start sites (i , ) . the mechanism appears to involve the recognition of consensus sequences identified on the genomic rna at those points corresponding to the ' ends of the subgenomic mrnas. these consensus sequences may act as a binding site for the rna polymeraseleader complex ( ) ( ) ( ) ( ) ( ) ( ) . it has been previously postulated that a heptameric sequence, actaaac ( ) ( ) ( ) , or a hexameric sequence, ctaaac ( ) ( ) ( ) , may be involved in the binding of the tgev rna polymerase leader. in this paper we describe the elucidation of the leader rna sequences from the porcine coronaviruses tgev and prcv, the first leader sequence to be described from the tgev serogroup of coronaviruses. 
comparison of the leader rnas of tgev and prcv with published leader rnas of other coronaviruses was used to identify areas of conserved sequence and potential secondary structure that may be involved in the transcription of coronavirus subgenomic mrna species. confluent cultures of a pig kidney cell line llc-pk were infected with a virulent british field isolate of tgev strain fs / or a british isolate of prcv strain / at a moi of - pfu per cell. after hr at ~ the inoculum was removed and replaced with medium containing µg/ml actinomycin d to inhibit host-cell rna synthesis ( ). after a further -hr incubation, r of [ , - h]uridine (amersham international plc, trk. , - ci/mm) was added per culture bottle and the cells were incubated for a further hr. the cells were lysed with guanidinium thiocyanate, the rna pelleted through . m cesium chloride and poly(a)-containing rna isolated by poly(u) sepharose affinity chromatography, as described previously ( ). two oligonucleotides were synthesized by the phosphoramidite method using an applied biosystem a synthesizer. one oligonucleotide, oligo (5'-tggattcatccccccaacta-3'), was complementary to the nucleoprotein gene bp downstream from the initiation atg codon ( ), as shown in fig. , and was used for primer extension. the second oligonucleotide, oligo (5'-agagatatagccacgctacactcactttac-3'), was complementary to the ' end of the leader rna (fig. ) and was used for northern blot analysis of viral mrna. gel-purified oligo ( ng) was '-end-labeled ( ) using u of t polynucleotide kinase (gibco-brl, paisley) and µci [γ- p]atp (amersham international plc, pb , ci/mm). poly(a)-containing rna ( . µg) isolated from tgev- and prcv-infected cells was resuspended in water and heated at ~ for min. a further incubation was carried out using the two mrna preparations in µl reaction volumes containing u of rnasin (promega biotec, liverpool), mm tris-hcl (ph .
), mm mgcl , mm kcl, mm -mercaptoethanol, mm dithiothreitol, mm dntps, '-end-labeled oligo ( ng), and u of amv reverse transcriptase (super-rt, anglian biotech ltd, colchester) for min at ~ . formamide dye ( % formamide, mm naoh, mm edta, . % xylene cyanol, . % bromophenol blue) was added and the mixture boiled for min and electrophoresed on a cm buffer gradient sequencing gel ( ). the wet gel was autoradiographed for hr to locate the primer-extended products, which were excised from the gel. the labeled fragments were eluted from the polyacrylamide gel and chemically cleaved ( ). samples of the cleaved products from each of the primer-extended products were electrophoresed on % polyacrylamide gels at w constant power for two different lengths of time. tgev and prcv poly(a)-containing rna was glyoxylated and separated on a % agarose gel ( ). the rna was transferred onto biodyne a membranes (pall p/n bnng r . ~m, gallenkamp) in x ssc (x ssc = . m nacl, . m trisodium citrate, ph . ) for hr and baked at ~ for hr. the membrane was boiled in mm tris-hcl ph . for min to remove glyoxal groups from the rna and prehybridized in the presence of % formamide for hr at ~ ( ). the viral mrna species were hybridized with p-labeled oligo in the presence of % formamide for hr at ~ . the membrane was washed four times in x ssc containing . % nadodso for min at room temperature and autoradiographed. following primer extension, using oligo at the ' end of the nucleoprotein gene from the porcine coronaviruses tgev and prcv, labeled fragments of approximately bases were produced and purified from gels. larger molecular weight species were also observed (data not shown) in minor amounts, presumably corresponding to read-through sequences upstream of the nucleoprotein gene primed from the larger mrna species. the nucleotide sequences of the two fragments, determined by chemical cleavage, were identical.
the resulting nucleotide sequence of the tgev leader rna sequence is shown in relation to the tgev nucleoprotein gene in fig. . the leader rna sequence diverges from the genomic sequence bp upstream of the nucleoprotein gene, corresponding to the first nucleotide of the membrane protein gene stop codon ( ), indicating a length of nucleotides of unique sequence (fig. ). the nucleotide leader sequence of tgev and prcv has a low content of g ( %) and c ( %), and a high a ( %) and t ( %) content, with % of the t residues grouped in three- to four-nucleotide motifs (fig. ). these values are similar to those observed from the tgev genome so far sequenced, except that the values for a ( . %) and t ( . %) are more similar on the genome than on the leader sequence. analysis of the tgev nucleoprotein nucleotide sequence ( ) revealed a potential rna polymerase-leader complex binding site. the site, actaaac, is seven nucleotides upstream of the nucleoprotein initiation codon and has also been found to precede all the tgev structural protein genes and two of the three potential genes shown to be at the ' end of mrna species ( ) ( ) ( ). this consensus sequence is found two nucleotides downstream of the nucleotide where the leader rna and tgev genomic sequences diverge, indicating that this sequence is involved in the leader-primed transcription of tgev mrna molecules. as can be seen from fig. , of the mrna species from the fs / strain of tgev have the sequence aactaaac, of which the '-end adenosine residue is the next base down from the divergence point. in fact, the consensus sequence at the spike/orf -orf gene junction has the sequence gaactaaac and at the nuc/orf gene junction has the sequence cgaactaaac, indicating that the region of the leader sequence ' to the homology motif, actaaac, may vary between and nucleotides depending on the tgev gene.
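The base-composition figures and the search for the actaaac and ctaaac motifs described above can be reproduced with a few lines of code. The following is a minimal sketch, not the software used by the authors; the function names are invented for illustration, and the sequence in the usage note is a toy fragment built around the cgaactaaac junction sequence quoted in the text, not the real leader.

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, C, G and T in a nucleotide sequence."""
    seq = seq.upper().replace("U", "T")
    counts = Counter(base for base in seq if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return {base: 0.0 for base in "ACGT"}
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def find_motif(seq, motif):
    """Return the 0-based start positions of every occurrence of motif."""
    seq = seq.upper().replace("U", "T")
    motif = motif.upper().replace("U", "T")
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)  # allow overlapping matches
    return positions
```

With the toy fragment `"CGAACTAAACGAAATG"`, `find_motif(fragment, "ACTAAAC")` reports the heptamer at position 3 and `find_motif(fragment, "CTAAAC")` reports the hexamer at position 4, one base further along, which mirrors how the hexamer is contained within the heptamer.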
computer analysis has also detected a homology between the leader rna sequence and the ' end of the negative strand (i.e., the reverse complement of the noncoding region at the ' end of the positive strand). this is shown in fig. . the nucleotides on the leader rna sequence, bases - , and on the negative strand, bases to counting from the first base after the poly(a) tail, have an overall homology of % and include the sequence ctaaac, which is part of the postulated tgev rna polymerase-leader complex binding site. this is very similar to the observation for ibv ( ) involving sequences present at the ' end of the ibv genome, and on the ibv leader rna sequences, with the ' end of the ibv negative strand. the homology observed included the sequence cttaac, which is part of the postulated ibv rna polymerase-leader complex binding site ct(t/g)aacaa. an oligonucleotide, oligo , was synthesised that was complementary to the ' end of the tgev and prcv leader rna sequences (fig. ). the oligonucleotide was end-labeled and used to probe tgev and prcv mrna species that were northern blotted onto biodyne membranes. as can be seen from fig. , the labeled probe hybridized to all of the tgev and prcv mrna species. the intensity of the bands corresponding to labeled probe hybridized to the spike mrna species and genomic rna was lower than that observed for the smaller mrna species, because less of these larger species was isolated from the poly(u) sepharose column used in the isolation of mrna. the fact that the probe hybridized to all of the mrna species showed that the leader rna sequence was present on the other rna molecules of tgev and both strains of prcv and was not unique to the nucleoprotein mrna species. the two porcine coronavirus leader sequences were identical, indicating that the two viruses probably use the same rna polymerase-leader complex binding site, actaaac, for the synthesis of subgenomic mrna species.
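The comparison between the leader and the end of the negative strand described above amounts to taking the reverse complement of the positive-strand noncoding region and looking for shared stretches of sequence. The sketch below shows one simple way to do this with exact k-mer matching; it is an illustration under that assumption, not the program the authors used, and both function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def shared_kmers(seq_a, seq_b, k):
    """Return the set of length-k substrings present in both sequences."""
    kmers_a = {seq_a[i:i + k] for i in range(len(seq_a) - k + 1)}
    kmers_b = {seq_b[i:i + k] for i in range(len(seq_b) - k + 1)}
    return kmers_a & kmers_b
```

Given a leader sequence and the positive-strand noncoding region, a call such as `shared_kmers(leader, reverse_complement(noncoding_region), 6)` would list any hexamers, such as ctaaac, common to the leader and the negative strand. Exact matching is stricter than the percentage homology reported in the text, which tolerates mismatches.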
the seqhp comparison program of the los alamos ( ) package was used to compare the leader rna sequences determined in this paper and those published for five other coronaviruses belonging to two different serogroups. the sequences were compared from the ' ends to the point of divergence from the genomic sequences. the percentage homologies, table , were expressed as the number of bases matched to the longer of the two sequences being compared. the homology of the leader sequences fell into three groups. leader rnas from coronaviruses belonging to different serological groups had homologies in the region of - %. serologically related viruses like human coronavirus (hcv) (strain oc ) and mhv (strains a and jhm) have about % homology. the third group involved different strains of mhv, a , and jhm, which showed a homology of %. this observation indicates that tgev and prcv, which have a homology of %, are probably different strains of the same virus or that prcv has very recently diverged from tgev. in order to identify common areas of homology, the leader rna sequences from seven coronaviruses were aligned. as can be seen from fig. , these fell into two groups. one group consists of mhv (strains a and jhm) with hcv (oc ), which have a fairly high degree of homology along their lengths. the other group consists of tgev and prcv (not shown on the diagram) with hcv ( e) and ibv, which have high homologies at their ' ends and areas of homology at their ' ends. there are good homologies towards the ' ends, involving the postulated rna polymerase-leader complex binding sites and sequences upstream of these sites, between the groups, but very little if any homology between the ' ends. ( ) and strain jhm ( ); avian, ibv strain beaudette ( , ). as seen from fig. simple alignment did not reveal very much information about the homologies of the leader rna sequences from the different coronaviruses, except at the ' ends involving the consensus sequences. 
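The percentage homologies above are defined as the number of matched bases relative to the longer of the two sequences. The sketch below applies that definition to a pair of sequences that have already been aligned with gap characters; it is not the seqhp program, and the function name and the use of `-` as the gap symbol are assumptions made for illustration.

```python
def percent_homology(aligned_a, aligned_b, gap="-"):
    """Matched bases as a percentage of the longer ungapped sequence.

    Both inputs must come from the same alignment and therefore have
    equal length; gap characters never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    a, b = aligned_a.upper(), aligned_b.upper()
    matches = sum(1 for x, y in zip(a, b) if x == y and x != gap)
    longer = max(len(a) - a.count(gap), len(b) - b.count(gap))
    if longer == 0:
        return 0.0
    return 100.0 * matches / longer
```

For the toy pair `"ACTAAAC-"` and `"ACTTAACA"`, six positions match and the longer sequence has eight bases, so the function returns 75.0. Dividing by the longer sequence means that a short leader fully contained in a longer one still scores below 100%.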
in order to identify any potential similarities in these sequences, the secondary structures of the rna sequences in fig. were analyzed. potential secondary structures of the leader rna sequences were determined using the computer program fold ( ) from the uwgcg dna analysis programs ( ). the coordinates determined by the fold program were displayed graphically using the uwgcg program squiggles. the potential secondary structures obtained were compared and, as can be seen from fig. , the overall shapes of these sequences are very similar, except for the avian coronavirus ibv. all the molecules appear to be composed of two stem-loop structures. the two mhv molecules are very similar in shape and, as seen from fig. and table , are very homologous, %, at base sequence. the secondary structures of the coronavirus leader rna sequences are probably influenced by their biological function, which results in the similarity of these potential structures. this paper presents evidence that the nucleoprotein mrna species of tgev and the closely related porcine respiratory variant of tgev, prcv, contain an identical leader rna sequence of about nucleotides. sequencing studies on tgev have shown that the heptameric sequence actaaac occurs on the genome upstream of the genes and is believed to be the binding site for the leader of the genomic rna. this mechanism has been termed leader-primed transcription and involves not only the leader rna primer, but also consensus sequences along the genome found upstream of the genes, which act as binding sites for the leader rna primer. comparison of tgev and prcv viral products has shown very little difference between the two coronaviruses, and until recently it was impossible to differentiate between the two viruses using antisera. prcv is fully neutralized by antisera prepared against tgev, and the majority of monoclonal antibodies (mabs) raised against tgev virion proteins cross-react with prcv.
however, mabs, raised against antigenic determinants of the spike protein from either the virulent british isolate fs / ( ) or the avirulent purdue strain of tgev ( ) have been identified that do not recognize prcv. these observations and the fact that the leader rna sequences from tgev and prcv are identical supports the evidence that the two viruses are very similar and that prcv may have evolved as a tgev variant. comparison of the tgev leader rna sequence with the genomic sequence upstream of the nucleoprotein indicates that the length of the unique sequence of the leader sequence is nucleotides. the point of divergence is two bases upstream of the actaaac sequence, supporting the evidence that the tgev rna polymerase-leader complex binding site is actaaac. four out of the six mrna species from the fs / strain of tgev have the sequence aactaaac, and the '-end adenosine residue is the next base down from the divergence point in the nucleoprotein mrna (fig. ) . the differences in the homologies between the leader rna and sequences upstream of the consensus sequence on the genomic rna may play a role in the levels of transcription of a particular mrna species. the mrna species of . kb has been shown to have an open reading frame at the ' end encoding a potential polypeptide of m r ( ) . this particular mrna does not have the heptameric consensus sequence but has the hexameric ctaaac sequence, and it is interesting to note that it is the least abundant tgev mrna species (observed from tgev mrna in total cell lysates). hybridization of oligo to the . -kb mrna species showed that this species does contain the tgev leader rna, confirming that it is a true mrna species, even though it is the only tgev species not to have the heptameric consensus sequence. 
comparison of the seven coronavirus leader rna sequences against each other identified three groups (table ) : non-serologically related viruses had about - % homology; serologically related viruses had about % homology; viral strains had about - % homology. however, tgev and hcv ( e) have been placed in the same serological group, but have only % homology within their leader rna sequences, suggesting that the two viruses are not particularly related. tgev and hcv ( e) have been shown to have % homology at the amino acid level within their derived nucleoprotein sequences ( ) , whereas the homology between the derived nucleoprotein amino acid sequences for different viruses within the mhv serological group are between % and % homology. this indicates that the serological grouping of coronaviruses is not a particularly useful test, as similar epitopes may exist on the viral structural proteins. comparisons of nucleic and amino acid sequences from the viruses will provide a more accurate method for grouping the viruses. it will be interesting to compare the leader sequences of bovine coronavirus (bcv), which is serologically related to hcv (oc ) and mhv (a and jhm), with feline infectious peritonitis virus (fipv) and canine coronavirus (ccv), which are serologically related to tgev, once their sequences have been determined. the large variation in sequence length and content made the alignment of the different leader sequences difficult. however, alignment of the six different coronaviruses revealed that they fell into two groups. there appears to be some conservation of short sequence motifs between the seven leader sequences. toward the ' end of the sequences, a tag motif is conserved in all the leaders, followed by a string of ts. in five out of seven of the sequences, this motif is taganntt. 
about ten nucleotides downstream of this region is a conserved ct motif, which is followed by a series of nucleotides differing in number, depending on the coronavirus, followed by the postulated rna polymerase-leader complex binding site. the largest number of nucleotides between the ct motif and the consensus sequence is found in tgev and prcv; the smallest is found in hcv ( e) and ibv. it is interesting to note that there is a five-base insert in mhv strain jhm when compared to mhv strain a , which is also present in hcv (oc ) within this region. all the mammalian coronaviruses appear to have the motif ctaaac, except hcv (oc ), which has ctaaat. recent sequence data suggest that coronaviruses fipv and bcv have actaaac as their mrna consensus sequence. upstream of the tag motif there is an act motif occurring in six out of seven sequences. toward the ' end of the leader rna sequences, the homologies are patchy and limited to short matches, occurring only between pairs of sequences. the area upstream of the consensus sequence has been suggested to be involved in the binding of nucleoprotein to the leader rna sequence at nucleotides - in mhv ( ). it was suggested that mrna species and genomic rna form a complex with the nucleoprotein by the protein binding to or near the leader sequence attached to the rna molecules ( ). secondary structure analysis of the leader rna sequences showed that all the sequences except for ibv possess a putative double stem-loop structure (fig. ). in the case of the mammalian coronaviruses, the consensus sequences and upstream regions of homology are on the second stem-loop structure, leaving the possibility that the rna-dependent rna polymerase could interact with the first stem-loop structure. the ibv consensus sequence is present on the free ' end of the single stem-loop structure, possibly leaving the single stem-loop structure to interact with the polymerase.
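motifs written with ambiguity codes, such as taganntt, or ctaaac versus ctaaat (ctaaay in iupac notation), can be searched for by translating the iupac codes into a regular expression. the following python sketch shows one way to do this; the helper names and the toy leader sequence are hypothetical, and the sequence is invented for illustration rather than taken from any of the seven leaders discussed.

```python
import re

# standard iupac nucleotide ambiguity codes expressed as regex character classes
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "[ag]", "y": "[ct]", "s": "[cg]", "w": "[at]",
    "k": "[gt]", "m": "[ac]", "b": "[cgt]", "d": "[agt]",
    "h": "[act]", "v": "[acg]", "n": "[acgt]",
}


def motif_to_pattern(motif: str) -> str:
    """translate an iupac motif such as 'taganntt' into a regex pattern string."""
    return "".join(IUPAC[base] for base in motif.lower())


def motif_hits(seq: str, motif: str) -> list[tuple[int, str]]:
    """return (0-based start, matched text) for every hit, including overlapping ones."""
    # the lookahead lets the regex engine report hits that overlap each other
    lookahead = re.compile(f"(?=({motif_to_pattern(motif)}))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(seq.lower())]


# hypothetical toy leader, for illustration only
toy_leader = "cctagaccttgactaaatgg"
print(motif_hits(toy_leader, "taganntt"))
print(motif_hits(toy_leader, "ctaaay"))
```

writing the consensus as ctaaay lets a single search cover both the ctaaac and ctaaat variants mentioned above.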
we thank miss k. mawditt, of this laboratory, for synthesizing oligos and and dr. s. f. cartwright, central veterinary laboratory, weybridge for prcv strains / and / . this work was supported by a research contract from the biomolecular engineering programme of the commission of the european communities, contract no. bap- -uk(hi).

key: cord- -ejoufvvj authors: binder, florian; reiche, sven; roman-sosa, gleyder; saathoff, marion; ryll, rené; trimpert, jakob; kunec, dusan; höper, dirk; ulrich, rainer g. title: isolation and characterization of new puumala orthohantavirus strains from germany date: - - journal: virus genes doi: . /s - - - sha: doc_id: cord_uid: ejoufvvj

orthohantaviruses are re-emerging rodent-borne pathogens distributed all over the world. here, we report the isolation of a puumala orthohantavirus (puuv) strain from bank voles caught in a highly endemic region around the city of osnabrück, north-west germany. coding and non-coding sequences of all three segments (s, m, and l) were determined from original lung tissue, after isolation and after additional passaging in veroe cells and a bank vole-derived kidney cell line. different single amino acid substitutions were observed in the rna-dependent rna polymerase (rdrp) of the two stable puuv isolates. the puuv strain from veroe cells showed a lower titer when propagated on bank vole cells compared to veroe cells. additionally, glycoprotein precursor (gpc)-derived virus-like particles of a german puuv sequence allowed the generation of monoclonal antibodies that allowed the reliable detection of the isolated puuv strain in the immunofluorescence assay. in conclusion, this is the first isolation of a puuv strain from central europe and the generation of glycoprotein-specific monoclonal antibodies for this puuv isolate.
the obtained virus isolate and gpc-specific antibodies are instrumental tools for future reservoir host studies. electronic supplementary material: the online version of this article ( . /s - - - ) contains supplementary material, which is available to authorized users. edited by detlev h. kruger.

puumala orthohantavirus (puuv) is the most important hantavirus in europe [ ]. it causes the majority of human hantavirus infections and hemorrhagic fever with renal syndrome (hfrs) cases [ ]. in central and western europe hantavirus outbreaks occur in two to five year intervals and are driven by massive increase of the bank vole (myodes glareolus) population, the reservoir of this orthohantavirus species [ ]. human hantavirus disease has been notifiable in germany since and the majority of recorded cases are due to puuv infections in southern and western parts of germany, whereas dobrava-belgrade orthohantavirus (dobv), with the striped field mouse as reservoir, causes infections in the northeastern part of germany [ ]. the characterization of the pathogenicity and identification of virulence markers are highly dependent on adequate puuv isolates. currently, the number of puuv isolates is very limited and does not represent the real diversity of puuv strains in europe. in particular, no central european puuv isolate exists [ ]. the majority of puuv isolates, and hantaviruses in general, were obtained by passaging in reservoir animals or veroe cells and are highly adapted [ ] [ ] [ ]. previous investigations indicated that veroe cell adaptation of puuv kazan strain results in the inability of the adapted strain to infect the bank vole reservoir [ ]. the recent development of bank vole-derived primary or permanent cell lines may allow the isolation of reservoir-adapted puuv strains [ ] [ ] [ ] [ ].
hantavirus proteins are usually detected in infected cells by monoclonal antibodies. nucleocapsid (n) protein-specific monoclonal antibodies have been developed against a large range of hantaviruses [ ] [ ] [ ]. in contrast, the number of glycoprotein precursor (gpc)-specific, as well as gc- and gn-specific, monoclonal antibodies is rather low [ ] [ ] [ ]. the majority of these antibodies were raised by infection of bank voles or immunization with recombinant n protein or heterologous virus-like particles (vlps). the generation of envelope protein-specific monoclonal antibodies with reactivity to virus proteins in infected cells is highly dependent on structural constraints [ ]. autologous vlps represent a useful tool to generate highly efficient immune responses against a variety of viruses and for the generation of monoclonal antibodies in particular [ ]. puuv strain astrup [ ] gpc-derived vlps were generated in this study as previously described for maporal orthohantavirus [ ]. lower saxony, north-west germany, and district osnabrück in particular, is a well-known endemic region for puuv infections [ , ]. this endemic region was again heavily affected in the hantavirus outbreak year [ ]. here, we aimed to isolate a central european puuv strain from bank voles in the district of osnabrück using standard veroe cells and the recently established carpathian lineage bank vole-derived kidney cell line (mgn- -r [ ] ). complete genome determination by shot-gun and hybrid-capture-mediated high-throughput sequencing (hts) was used to follow the potential adaptation of the puuv isolates in veroe and reservoir cell lines. finally, the reactivity of the isolates was determined with novel monoclonal antibodies raised against puuv gpc vlps.

bank voles were trapped in spring in the puuv endemic region around osnabrück following a standard snap trapping protocol [ , ]. in the field, a small piece of lung was taken for virus isolation and rt-qpcr analysis.
thereafter, carcasses were frozen, transported to the laboratory and completely dissected according to standard protocols. chest cavity lavage was collected by rinsing the chest cavity with ml phosphate-buffered saline (pbs) and investigated for the presence of puuv-reactive antibodies. the presence of hantavirus rna was analyzed in lung tissue; the results were, in part, previously published in a surveillance study [ ]. for virus isolation and further infection studies, veroe and bank vole kidney (mgn- -r; [ ] ) cells were used in parallel. virus titration was done on veroe cells only. mgn- -r cells were grown in an equal mixture of ham's f and iscove's modified dulbecco's medium (imdm) + % fetal calf serum (fcs) and passaged two times per week at a : ratio. veroe cells were passaged twice a week in minimal essential medium (mem) + % fcs at a split ratio of : . for virus isolation, × mgn- -r or veroe cells were seeded in . cm flasks one day before rodent sampling in the field. the cells were carried to trapping sites in an isolation box with heat packs (around °c constant for days with outside temperature of - °c). after collecting voles from traps, a small incision in the chest area was made and a piece of lung (pea-sized) was taken and transferred into ml dulbecco's modified eagle's medium (dmem) + % fcs + penicillin/streptomycin (ps) in a ml safe lock tube. lung tissue material was homogenized in the field by grinding it through a fine metal grid against the tube wall. the homogenized tissue material was sterile filtered ( . µm) directly onto the cells, resulting in approximately µl tissue/medium suspension per . cm flask. after - h incubation in the isolation box, ml dmem + % fcs + ps was added. upon arrival in the laboratory, flasks were incubated in a cell culture incubator at °c and % co for days until first passage. in parallel, a pinhead-sized piece of lung was taken for rna isolation in ml trizol (qiagen, hilden, germany).
after days, trypsinized cells were resuspended in ml dmem + % fcs + ps. for puuv rna screening, µl of each cell suspension was taken for rna extraction and analyzed by rt-qpcr (see below). fresh veroe cells were resuspended in ml dmem + % fcs + ps and µl were mixed : with µl of the inoculated cell suspension in a new . cm flask. afterwards, ml dmem + % fcs + ps were added and cells were incubated for days until next passage. in parallel, one uninfected flask of veroe or mgn- -r cells was passaged as a control. this procedure was continued until rt-qpcr-positive samples were detected. after first screening, only the flasks of the rt-qpcr-positive samples were further passaged. for detection of puuv nucleic acid, rna was extracted from homogenized lung tissue, or cell culture passages using qiazol lysis reagent (qiagen, hilden, germany) followed by a novel puuv s segment-specific rt-qpcr. for rt-qpcr, primers puuv-nss-s ( ′-gwnata rcy cgy cat garc- ′) and puuv-nss-as ( ′-art gct gac act gty tgt tg- ′) and the probe ( ′- -fam-crg tgg rrrt-gkacc crg atga-bhq- - ′) were used. the pcr was done according to the quantitect probe one-step rt-qpcr mix (qiagen, hilden germany) protocol and contained pmol/µl of each primer and pmol/µl probe (eurofins, hamburg, germany). the following cycler protocol was used: min of reverse transcription at °c; min initial denaturation at °c; cycles of sec at °c, sec at °c and sec at °c. for quantification of the number of rna copies/µl and sample, an in vitro transcribed rna was used. the in vitro transcription of a plasmid coding for nucleotides - of the s segment of a puuv strain from baden-wuerttemberg (binder et al., unpublished) was done according to the protocol of the manufacturer (riboprobe® in vitro transcription system t , promega gmbh, mannheim, germany). the transcribed rna was serially diluted from - to - ng/ml with rna copies/µl limit of detection (lod). 
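quantification against a serially diluted in vitro transcribed rna standard, as described above, is commonly done by fitting a linear standard curve of quantification cycle (cq) against log10 copy number. the python sketch below shows that calculation under stated assumptions: the dilution series and cq values are invented placeholders (the actual values used in the study are not given here), and the function names are hypothetical.

```python
import math


def fit_standard_curve(copies: list[float], cq: list[float]) -> tuple[float, float]:
    """least-squares fit of cq = slope * log10(copies) + intercept."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(cq) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, cq))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_cq(cq_value: float, slope: float, intercept: float) -> float:
    """invert the standard curve to estimate rna copies per µl for a sample."""
    return 10 ** ((cq_value - intercept) / slope)


def amplification_efficiency(slope: float) -> float:
    """pcr efficiency implied by the slope (1.0 corresponds to perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0


# placeholder standard series (copies/µl) and cq values, invented for illustration
standard_copies = [1e2, 1e3, 1e4, 1e5, 1e6]
standard_cq = [33.0, 29.7, 26.4, 23.1, 19.8]

slope, intercept = fit_standard_curve(standard_copies, standard_cq)
print(round(slope, 2), round(intercept, 2))
print(copies_from_cq(26.4, slope, intercept))
```

samples whose cq falls beyond the most dilute standard lie outside the fitted range and would normally be reported as below the limit of detection rather than extrapolated.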
initial tissue samples were screened for puuv rna and viral load as rna copies/µl was determined in triplicates for organs of isolated positive animals. rna from the cell culture adapted strains puuv sotkamo and tulv moravia were used as positive and negative control for the rt-qpcr, respectively. for metagenomics, we extracted rna from either a pinheadsized piece of lung tissue or µl cell culture supernatant using µl qiazol lysis reagent (qiagen, hilden, germany) in combination with rneasy mini kit (qiagen, hilden, germany). for generation of complete genomes of cell culture supernatants, a previously published workflow was used [ ] . double-stranded, non-directional cdna libraries from lung tissue for sequencing on the illumina platform were prepared from total rna using the nebnext ultra ii rna library prep kit for illumina (new england biolabs, ipswich, ma, usa). per reaction, a total of ng rna was used as an input. rna was fragmented for min and final cdna libraries were amplified by cycles of pcr to complete adapter ligation and to generate enough material for target sequence enrichment. a custom-made mybaits target capture array (arbor biosciences, ann arbor, mi, usa), containing biotinylated rna probes against all available puuv sequences deposited in ncbi genbank database (august, ), was employed to capture puuv-containing sequences from total cellular cdna sequencing libraries. the hybridization-based sequence enrichment (chemistry v ) was performed according to the manufacturer's instructions (arbor biosciences, ann arbor, mi, usa). the enriched cdna sequencing libraries were amplified with pcr cycles to produce enough dna material for hts on the illumina platform. 
the enriched cdna libraries were quantified with the nebnext library quantification kit (new england biolabs, ipswich, ma, usa), pooled in equimolar amounts, and sequenced with a cycle miseq reagent kit v (illumina, san diego, ca, usa) using paired-end sequencing ( × cycles) on a miseq sequencer (illumina, san diego, ca, usa). the resulting reads were trimmed and assembled against the known complete genome of strain astrup from the osnabrück region [ ] with geneious r . . (https://www.geneious.com). for sequences lacking the ′ and ′ ends of the m segment, rna ligation was done using t rna ligase (thermo fisher scientific, waltham, ma, usa) and subsequent in vitro transcription with a first strand cdna synthesis kit (thermo fisher scientific, waltham, ma, usa). sequences were obtained by conventional dideoxy-chain termination sequencing after pcr with primers puuv os m fwd- ′ tga ggg caa tta tta tgt aa ′ and puuv os m rev ′ cca att gta tgt ggg cat tcc ′. the obtained sequences were deposited at genbank, accession numbers mn -mn . phylogenetic trees were reconstructed with four novel and published concatenated s, m, and l coding sequences or partial s segment sequences of nucleotides length. published sequences of other hantaviruses were obtained from genbank. analysis was performed by bayesian algorithms via mrbayes v. . . (https://sourceforge.net/projects/mrbayes/files/mrbayes/) on the cipres online portal [ ]. a mixed nucleotide substitution matrix was specified in independent runs of generations. phylogenetic relations are shown as a maximum clade credibility phylogenetic tree with posterior probabilities for major nodes. for immunofluorescence assay (ifa), veroe and mgn- -r cells were inoculated with µl puuv osnabrück/v or puuv osnabrück/m supernatant in dmem + % fcs as described previously [ ]. infected cells were fixed days post infection with a : mixture of acetone and methanol for min at − °c.
after fixation cells were dried, re-hydrated with phosphate-buffered saline (pbs) and incubated with nucleocapsid (n) protein-specific antibody e [ ] diluted : in pbs for h at room temperature (rt). a secondary anti-mouse alexa fluor conjugated antibody (abcam, cambridge, uk) was used for detection of hantavirus proteins. nuclei were stained with ′, -diamidino- -phenylindole (dapi, thermo fisher scientific). for titration studies of puuv, mgn- -r and veroe cells were inoculated with µl of the puuv osnabrück/v or puuv osnabrück/m virus isolate and passaged three times as described above. supernatants of both cell lines were collected after passage three and frozen at − °c. subsequently, supernatants were serially diluted from - to - in dmem containing % fcs in a -well plate with three replicates each. a volume of µl of each dilution was added to h old cell monolayers of veroe cells in a -well plate. after incubation for days, the virus titer was calculated using ifa for puuv n protein detection as described above. titers were calculated as % tissue culture infectious dose (tcid )/ml by the spearman/kärber method [ ] and mean titers of three experiments are given. titers after isolation (passage of original lung tissue-derived sample) were used for comparison. for expression and generation of vlps in hek cells, a codon-optimized synthetic gene of the puuv gpc of the strain astrup [ ] was purchased (geneart, regensburg, germany). the gene encoding the glycoproteins was pcr amplified using primer pair o grs /o grs (aat-taaggt acc tcc aga ggc gac acc cgg aacc and aattattaag ctt tca ggg ctt gtg ttc ttt gg) and the pcr product and the acceptor vector phan- (roman-sosa, unpublished) were digested with the restriction endonucleases kpni and hindiii. the expression plasmid phan- was generated by standard molecular biology protocols. 
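the spearman/kärber titer calculation mentioned in the titration paragraph above can be sketched in python as follows. this is a minimal illustration of the standard formula, not the authors' script: the well counts, dilution range and inoculum volume in the example are invented placeholders, and only the use of three replicates per dilution follows the text.

```python
import math


def spearman_karber_log10_tcid50_per_ml(
    first_log10_dilution: float,
    log10_step: float,
    positive_wells: list[int],
    wells_per_dilution: int,
    inoculum_ml: float,
) -> float:
    """log10 tcid50/ml by the spearman-kaerber method.

    positive_wells is ordered from the least dilute to the most dilute step.
    the method assumes the series runs from all wells positive to no wells positive.
    """
    if positive_wells[0] != wells_per_dilution or positive_wells[-1] != 0:
        raise ValueError("dilution series must span 100% to 0% positive wells")
    # sum of the proportions of positive wells over all dilution steps
    proportion_sum = sum(p / wells_per_dilution for p in positive_wells)
    # log10 of the dilution at which 50% of wells would be infected
    log10_endpoint = first_log10_dilution - log10_step * (proportion_sum - 0.5)
    # convert endpoint dilution to infectious doses per ml of undiluted sample
    return -log10_endpoint - math.log10(inoculum_ml)


# placeholder readout: tenfold dilutions from 10^-1 to 10^-8, three wells each,
# and an assumed inoculum of 0.1 ml per well (all values invented for illustration)
example = spearman_karber_log10_tcid50_per_ml(
    first_log10_dilution=-1.0,
    log10_step=1.0,
    positive_wells=[3, 3, 3, 3, 2, 1, 0, 0],
    wells_per_dilution=3,
    inoculum_ml=0.1,
)
print(round(example, 2))
```

averaging the log10 titers of repeated experiments, rather than the raw titers, is one common way to report a mean; whether that was done here is not stated in the text.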
in this plasmid, the endogenous signal sequence of the puuv gn is substituted by the igg-light chain signal sequence and a double strep-tag with a glycine/serine-rich linker between the tags. then a permanently transfected hek cell line was generated upon transfection of the cells and selection in the presence of geneticin at . mg/ml. the vlps were affinity purified from the cell supernatants essentially as described [ ]. recombinant vlps were used for five immunizations, four weeks apart, of female balb/c mice. hybridoma cells producing monoclonal antibodies (mabs) were generated by standard fusion procedure [ , ] and screened using a µg/ml stock solution of vlps according to an in-house elisa protocol [ ] and buffers without tween. resulting mabs were analyzed by ifa and western blot test for their reactivity to puuv osnabrück/v , puuv sotkamo, puuv vranica and tulv moravia. veroe cells were infected with puuv osnabrück/v , puuv sotkamo, puuv vranica or tulv moravia at moi . in dmem + % fcs. cells were harvested (puuv osnabrück/v , sotkamo) or (puuv vranica, tulv moravia) days post infection in sds sample buffer ( . mm tris-hcl ph . , % sds, % glycerol, m urea, . % bromophenol blue, . % phenol red), and proteins were separated by sds page and blotted onto polyvinylidene fluoride (pvdf) membranes. after blocking, the membranes were cut into strips and incubated overnight with the antibodies e ( : ), f ( : ), b ( : ), b ( : ), h ( : ), g ( : ), b ( : ), g ( : ), g ( : ), h ( : ), h ( : ) or n protein-specific antibody e ( : ; [ ]; all diluted in pbs-tween . %) at °c. a horseradish peroxidase (hrp)-labeled secondary goat anti-mouse igg antibody diluted : in pbs-tween . % (bio-rad, hercules, ca, usa) was used for detection of hantaviral proteins. a rabbit anti-β-tubulin antibody (abcam, cambridge, uk) was used as a loading control.
investigation of chest cavity lavage samples from bank voles was done by igg elisa using recombinant puuv strain bawa n protein, as described earlier [ ]. the monoclonal antibody e was used as a positive control [ ]; chest cavity lavage of an igg elisa- and rt-pcr-negative bank vole was used as a negative control. chest cavity lavage samples with an optical density (od) value below the lower cut-off value were considered negative. when the od value was in the range between the lower and upper cut-off values defined according to our standard protocol [ ], animals were considered doubtful. when the od value was above the upper cut-off value, the samples were considered positive. positive and doubtful samples were retested a second time.

rodent trapping at five sites from april th to th, in the osnabrück region resulted in the collection of bank voles [ ]. dissection on site and inoculation of veroe and bank vole mgn- -r cells with homogenized lung samples resulted after three blind passages in four potential isolates that were detected by a novel puuv rt-qpcr (table s , fig. ). two of the potential candidates showed only low levels of puuv rna and were not able to consistently infect further passages (m , m ). quantification by rt-qpcr analysis of different tissues from these four bank voles confirmed lung tissue for most of the samples as having the highest puuv rna load, although it was detected in almost all other tissues investigated (fig. s ). rt-qpcr investigation of lung tissues of all bank voles resulted in the detection of hantavirus rna in animals (tables , s , [ ] ). puuv rna-positive animals originated from all five trapping sites. serological analysis of chest cavity lavages detected puuv n protein reactive antibodies in of bank voles (tables , s ). five additional animals, positive for puuv rna, were found to be equivocal in our serological test.
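the three-way interpretation of elisa od values described in the serology methods above (negative, doubtful, positive, with retesting of the latter two) can be written as a small python sketch. the cut-off values in the example are invented placeholders, the function names are hypothetical, and treating an od exactly equal to a cut-off as doubtful is an assumption, since the text does not specify the boundary cases.

```python
def classify_od(od: float, lower_cutoff: float, upper_cutoff: float) -> str:
    """classify an elisa od value against a lower and an upper cut-off."""
    if lower_cutoff > upper_cutoff:
        raise ValueError("lower cut-off must not exceed upper cut-off")
    if od < lower_cutoff:
        return "negative"
    if od > upper_cutoff:
        return "positive"
    # values on or between the cut-offs (boundary handling is an assumption)
    return "doubtful"


def needs_retest(result: str) -> bool:
    """positive and doubtful samples are tested a second time."""
    return result in ("positive", "doubtful")


# placeholder cut-offs and od readings, invented for illustration
lower, upper = 0.2, 0.4
for od in (0.1, 0.3, 0.5):
    result = classify_od(od, lower, upper)
    print(od, result, needs_retest(result))
```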
all antibody-positive animals were also found to be puuv rna positive, indicating a high number of persistently infected voles. fifteen additional bank voles were only positive for puuv rna, but not for anti-puuv antibodies, indicating a high number of acutely infected animals in spring in this region (table ). interestingly, three of the four potential isolates originated from seronegative bank voles (table s ). two isolates (osnabrück v and osnabrück m ) were obtained by passaging in veroe or mgn- -r cells, which reached titers of almost tcid /ml (fig. a and b, titer after isolation). shot-gun and hybrid-capture-mediated hts of both isolates resulted in the generation of complete genome sequences which are identical in sequence to the respective original strain in bank vole lung tissue except for one amino acid (aa) exchange each in the rna-dependent rna polymerase (rdrp; fig. ). the genome organization of the novel puuv isolates indicated the typical sequence elements for puuv: the small (s) segment encodes an n protein of aa residues and a putative nss protein of aa in an + overlapping reading frame, the medium (m) segment codes for the aa gpc and the large (l) segment for the rdrp of aa (see fig. , genbank accession numbers: mn -mn ). phylogenetic analysis of the concatenated s, m and l segment coding sequences grouped the novel isolates together with astrup prototype strain in sister relationship to puuv sequences from france (fig. a). the phylogenetic analysis of a partial s segment sequence of the novel isolates and representative strains of all puuv clades and subclades from germany confirmed the close relationship of the new isolates to the osnabrück hills subclade (fig. b). the puuv osnabrück m isolate was found to be contaminated by a bank vole reovirus; hts-derived sequences of the passaged reovirus (genbank accession numbers: mn -mn ) showed a strong similarity to a bank vole reovirus strain, but much lower similarity to a common vole reovirus [ ].
the non-reovirus contaminated isolate osnabrück v from veroe cells was found to have an insertion of nucleotides in the ′ non-coding region (ncr) when compared to the other isolate and the astrup reference sequence (fig. ). however, this insertion was also found in the original lung sample and therefore no cell culture-specific adaptations were observed in the ncrs of both virus isolates (fig. ). figs. and ). this passaging resulted in no further mutations (genbank accession numbers: mn -mn ). however, passaging of the virus isolate in veroe cells was accompanied by an increase in the virus titer to tcid /ml (fig. ). in contrast, the passaging of the osnabrück v strain in mgn- -r cells resulted in a decreased virus titer. as no cytopathic effect was observed, virus detection for titration in both cell lines was done by immunofluorescence assay using an n protein-specific monoclonal antibody (fig. a). eleven monoclonal antibodies were produced in this study by immunization of mice with puuv strain astrup gpc-derived vlps. evaluation of the virus isolate osnabrück v using these monoclonal antibodies resulted in typical immunofluorescence patterns in the cytoplasm (fig. ). further analysis by western blot test using a lysate of isolate osnabrück v from veroe cells suggested that the majority of anti-gpc antibodies are directed against conformational epitopes; however, some recognize linear epitopes in gc or gn (table ). subsequent evaluation of the reactivity of these monoclonal antibodies with other puuv strains and tulv strain moravia indicated some level of cross-reactivity for some of them (table ).

here, we describe the first isolation of a central european puuv strain. this strain of the central european lineage increases the available panel of puuv isolates: currently available isolates sotkamo, umea, vranica, and kazan, belong to the clades finnish, north scandinavian, most likely north scandinavian, and russian, respectively [ ].
the puuv-like hokkaido virus strain kitahiyama originates from japan [ ] . in our study, the isolation was based on an in-field dissection and inoculation of cells to prevent freeze/thaw cycles. the subsequent investigation of all bank voles indicated that three of four isolates originated from anti-puuv-seronegative voles. this finding illustrates that a serological test in the field might be misleading in selection of samples for successful virus isolation. instead, an on-site molecular assay may enhance the chance for a successful virus isolation. nevertheless, the approach used here still indicates the challenges of hantavirus isolation; only four isolates were obtained from a total of acutely infected bank voles. in addition, the determination of the complete genome sequences of two isolates including the ncrs expands our knowledge on the sequence diversity of puuv strains within the different regions of the genome. moreover, the hybrid-capture-based enrichment of puuv sequences allows a rapid determination of the complete genome and underlines the value of this workflow for hantavirus surveillance and molecular evolution studies [ ] . a phylogenetic analysis of partial s segment nucleotide sequences confirmed the previously reported subclades of puuv in germany; the novel isolates belong to the subclade osnabrück hills within the central european clade. the position within the phylogenetic tree also confirms the local evolution pattern of puuv reported before [ , ] . the observed high level of rt-qpcr-positive bank voles ( / ; %) confirms the district of osnabrück in spring as a hantavirus outbreak region [ ] . the puuv rna detection rate was similarly high at all five trapping sites of bank voles. although was identified as a hantavirus outbreak year in germany, the distribution of notified human puuv cases was not as homogeneous as in previous outbreak years [ ] . 
the passage of the puuv strains for isolation resulted in non-synonymous nucleotide exchanges in the l segment responsible for single amino acid exchanges in the rdrp (i m in m and d y in v ). the substituted amino acid residues are each very similar in their properties and, presumably, might not influence protein function. a more divergent adaptation at position s f has previously been observed for puuv strain kazan [ , ]. although in this previous study nucleotide exchanges in the ncr of the s segment were observed [ ], here we did not find relevant mutations in this region after passaging in cell culture. the v strain showed an insertion in the ′ ncr, but this insert was also found in the original lung material used for isolation. additionally, this sequence insert was found in another sequence from the same region (jn . , [ ] ). the isolate v was shown to replicate in veroe and a bank vole kidney cell line. the low titer in the bank vole mgn- -r cell line might be due to the evolutionary lineage origin of this cell line (carpathian lineage); in central europe puuv is harbored by the western evolutionary lineage with spillover to the carpathian lineage in regions with sympatric occurrence of both [ ]. in line with the assumption of an association of a puuv clade with an evolutionary bank vole lineage, the vranica puuv strain replicated in mgn- -r cells, but not in bank vole kidney cells of another evolutionary lineage [ , ]. interestingly, replication of puuv-like hokkaido virus in cells of its host, the gray red-backed vole, was comparable to puuv infection [ ]. future investigations in cell lines and animals of different bank vole lineages are required to confirm this conclusion directly. the orthoreovirus contamination of one of the puuv isolates illustrates that bank voles may harbor additional infectious agents that may influence the susceptibility to puuv infections or their outcome. of note, in bank voles several viruses have been detected, i.e., polyoma-, herpes- and hepaciviruses [ ] [ ] [ ] [ ], but also bacterial agents and endoparasites [ ] [ ] [ ]. similarly, a hantavirus isolation approach was previously hampered by the coinfection by a striped field mouse adenovirus [ ]. future investigations are needed to evaluate potential influences of coinfections in bank voles.

reactivity of novel puuv gpc-specific monoclonal antibodies with hantavirus-infected veroe cells in immunofluorescence assay (ifa). antibodies were generated by immunization of balb/c mice with gpc-derived virus-like particles of puuv strain astrup. after screening and subcloning, monoclonal antibodies were tested in ifa. veroe cells were infected with puuv osnabrück v isolate on coverslips and fixed for ifa after days. the monoclonal antibodies were administered for h at rt. detection of the specific antibody binding was done using an anti-mouse alexa fluor conjugated antibody. after staining, coverslips were mounted on glass slides for imaging.

it has been shown that hantavirus gn and gc form complex spike-shaped structures [ ] that build conformational epitopes [ , ]. therefore, we selected an immunization procedure using puuv-gpc-derived vlps, as the organization of the glycoproteins resembles the one of the virion. a panel of eleven monoclonal antibodies was produced here and all of them were reactive with the new puuv isolate in immunofluorescence assay. the staining pattern, which is reminiscent of the one of the secretory pathway organelles, i.e., the golgi apparatus and the endoplasmic reticulum, suggests that the epitopes recognized by these antibodies are already accessible during the maturation process of the proteins. interestingly, some of the monoclonal antibodies recognize linear epitopes as revealed by a western blot assay.
although preliminary results suggest that the antibodies do not neutralize the virus when tested individually, synergistic effects with a protective effect cannot be ruled out yet as shown for anti-ebola virus monoclonal antibodies [ ] . therefore, the novel antibodies represent a useful tool for further experimental, diagnostic, and therapeutic applications. in conclusion, the puuv isolate described here replicates in a bank vole cell line and its n and gpc proteins can be detected by specific monoclonal antibodies. therefore, this isolate will be useful for further studies on the virulence markers of central european puuv, its reservoir host association and the route of pathogenicity in the bank vole model. the novel gpc-specific monoclonal antibodies will enable future studies on virus entry and important domains for exposed immunogenic regions.
funding florian binder acknowledges intramural funding by the friedrich-loeffler-institut. additional funding was provided by the bundesministerium für bildung und forschung through the research network zoonotic infections (robopub consortium, fkz ki a, awarded to rgu; fkz ki h, awarded to laves) for trapping and rodent screening, the rapid project within the infect control
veroe cells were inoculated with puumala virus (puuv) osnabrück/v , puuv sotkamo, puuv vranica or tula virus (tulv) strain moravia. infected cells were fixed (puuv osnabrück/v , sotkamo) or (puuv vranica, tulv moravia) days post infection for immunofluorescence assays or collected in sample buffer for western blot analysis. after fixation or western blot transfer, novel gpc-specific mabs e , f , b , b , h , g , b , g , g , h , and h were administered. gn- and gc-reactive mabs were assigned where possible according to molecular weight of the immunoreactive bands in western blot analysis. − negative; (+) weak reactivity; + positive; ++ strongly positive
conflict of interest the authors declare that they have no competing interests.
ethical approval all animals were handled according to the applicable institutional, national and international guidelines for the care and use of animals. bank vole trapping was conducted in line with the regular pest control of the laves veterinary task-force in lower saxony, germany (department of pest control, oldenburg) according to german federal law ( § , gesetz zur verhütung und bekämpfung von infektionskrankheiten beim menschen). the immunization of mice was done in line with the general immunization program of the friedrich-loeffler-institut (landesamt für landwirtschaft, lebensmittelsicherheit und fischerei, mecklenburg-vorpommern, permit: / ).
open access this article is licensed under a creative commons attribution . international license, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the creative commons licence, and indicate if changes were made. the images or other third party material in this article are included in the article's creative commons licence, unless indicated otherwise in a credit line to the material. if material is not included in the article's creative commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. to view a copy of this licence, visit http://creativecommons.org/licenses/by/ . /.
hantavirus infections hantaviruses-globally emerging pathogens weiss s ( ) molecular and epidemiological characteristics of human puumala and dobrava-belgrade hantavirus infections coding strategy of the s and m genomic segments of a hantavirus representing a new subtype of the puumala serotype isolation of the causative agent of hantavirus pulmonary syndrome propagation of nephropathia epidemica virus in cell culture isolation and characterization of puumala hantavirus from norway: evidence for a distinct phylogenetic sublineage cell culture adaptation of puumala hantavirus changes the infectivity for its natural reservoir, clethrionomys glareolus, and leads to accumulation of mutants with altered genomic rna s segment a new permanent cell line derived from the bank vole (myodes glareolus) as cell culture model for zoonotic viruses common vole (microtus arvalis) and bank vole (myodes glareolus) derived permanent cell lines differ in their susceptibility and replication kinetics of animal and zoonotic viruses more novel hantaviruses and diversifying reservoir hosts-time for development of reservoirderived cell culture models? 
viruses isolation of hokkaido virus, genus hantavirus, using a newly established cell line derived from the kidney of the grey red-backed vole (myodes rufocanus bedfordiae) characterization of monoclonal antibodies against hantavirus nucleocapsid protein and their use for immunohistochemistry on rodent and human samples sensitive detection of hantaviruses by biotin-streptavidin enhanced immunoassays based on bank vole monoclonal antibodies novel serological tools for detection of thottapalayam virus, a soricomorpha-borne hantavirus bank vole monoclonal antibodies against puumala virus envelope glycoproteins: identification of epitopes involved in neutralization the use of chimeric virus-like particles harbouring a segment of hantavirus gc glycoprotein to generate a broadly-reactive hantavirusspecific monoclonal antibody human recombinant neutralizing antibodies against hantaan virus g protein hantavirus gn and gc envelope glycoproteins: key structural units for virus cell entry and virus assembly virus-like particles: a versatile tool for basic and applied research on emerging and reemerging viruses. 
viral nanotechnologies complete genome of a puumala virus strain from central europe protocadherin- is essential for cell entry by new world hantaviruses spatiotemporal dynamics of puumala hantavirus associated with its rodent host myodes glareolus host-associated absence of human puumala virus infections in northern and eastern germany heterogeneous puumala orthohantavirus situation in endemic regions in germany in summer aphaea/ewda species card: voles and mouses a versatile sample processing workflow for metagenomic pathogen detection a restful api for access to phylogenetic tools via the cipres science gateway beitrag zur kollektiven behandlung pharmakologischer reihenversuche antigenic and cellular localisation analysis of the severe acute respiratory syndrome coronavirus nucleocapsid protein using monoclonal antibodies indirect elisa based on hendra and nipah virus proteins for the detection of henipavirus specific antibodies in pigs phylogenetic analysis of puumala virus subtype bavaria, characterization and diagnostic use of its recombinant nucleocapsid protein isolation and complete genome characterization of novel reassortant orthoreovirus from common vole (microtus arvalis) phylogeography of puumala orthohantavirus in europe secondary contact between diverged host lineages entails ecological speciation in a european hantavirus multiple synchronous outbreaks of puumala virus adaptation of puumala hantavirus to cell culture is associated with point mutations in the coding region of the l segment and in the noncoding regions of the s segment evidence for novel hepaciviruses in rodents identification of two novel members of the tentative genus wukipolyomavirus in wild rodents identification of novel rodent herpesviruses, including the first gammaherpesvirus of mus musculus molecular detection and characterization of the first cowpox virus isolate derived from a bank vole leptospira genomospecies and sequence type prevalence in small mammal populations in 
germany high prevalence of rickettsia helvetica in wild small mammal populations in germany. ticks and tick-borne diseases occurrence of gastrointestinal parasites in small mammals from germany. vector borne zoonotic dis a novel cardiotropic murine adenovirus representing a distinct species of mastadenoviruses molecular organization and dynamics of the fusion protein gc at the hantavirus surface cooperativity enables non-neutralizing antibodies to neutralize ebolavirus
acknowledgements open access funding provided by projekt deal. the authors would like to thank sönke röhrs for help with rodent trapping, stephan drewes for help with phylogenetic analysis and sven sander and patrick zitzow for excellent technical support with generation of monoclonal antibodies and sequencing of puuv isolates. the authors thank martin beer, klaus osterrieder and nicole tischler for constant support and helpful discussions.
author contributions rgu and fb designed the study and wrote the manuscript. fb did virus isolation, infection studies, sequence analysis, phylogenetic analysis, and testing of monoclonal antibodies. sr and fb generated and screened the monoclonal antibodies. grs produced the vlps for immunization. ms and fb performed rodent trapping. dh, jt, and dk did the complete genome sequencing of puuv isolates. rr developed the puuv-specific rt-qpcr assay. all authors gave significant ideas for the presented work and were involved in writing and proof reading of the manuscript.
key: cord- -o jdt j authors: callison, scott andrew; jackwood, mark w.; hilt, deborah ann title: infectious bronchitis virus s gene sequence variability may affect s subunit specific antibody binding date: journal: virus genes doi: . /a: sha: doc_id: cord_uid: o jdt j
the s gene of several strains of infectious bronchitis virus (ibv) belonging to the arkansas, connecticut, and florida serotypes was sequenced.
phylogenetic analysis of the s gene nucleotide and deduced amino acid sequence data resulted in groups of strains that were the same as groupings observed when s sequence data was used. thus, it appears that s subunits are conserved within a serotype but not between serotypes. although the sequence differences were small, we found that only a few amino acid differences were responsible for different secondary structure predictions for the s subunit. it is likely that these changes create different interactions between the s and s subunits, which could affect the conformation of the s subunit where serotype specific epitopes are located. based on this sequence data, we hypothesize that the s subunit can affect specific antibody binding to the s subunit of the ibv spike glycoprotein. infectious bronchitis (ib) is an acute, highly transmissible, upper respiratory tract disease in chickens. clinical signs include tracheal rales, nasal exudate, coughing, and sneezing. infectious bronchitis affects both sexes and the disease may spread to the reproductive and renal systems ( ) . it is of economic importance because it can cause poor weight gain and reduced feed efficiency in broilers and a decline in egg production and egg quality in layers ( ) . infectious bronchitis virus (ibv), the causal agent of ib, is a member of the coronaviridae family. the virion is pleomorphic (diameter ± nm) and enveloped with club-shaped surface projections (spikes) on the surface of the virion. it contains a single stranded, positive-sense rna genome approximately . kb in length ( ) . the virion contains four major structural proteins: a nucleocapsid (n) protein associated with the viral rna, the integral membrane (m) glycoprotein, a small membrane (sm) protein, and the spike (s) glycoprotein. the s glycoprotein is a polypeptide of approximately amino acids. it is proteolytically cleaved after translation into two subunits, s and s ( ) .
both subunits are glycosylated with high mannose, n-linked oligosaccharides ( ) . the virion spike is thought to be an oligomeric protein composed of two polypeptides each of the s and s subunits. the two subunits associate by noncovalent forces and retain their three-dimensional shape by way of intrapeptide, but not interpeptide, disulfide bridges ( ) . the s subunits, which form the stalk portion of the spike, anchor it in the membrane, whereas the s subunits form the globular head of the spike glycoprotein ( ) . the s subunit encodes amino acids involved in the induction of neutralizing, serotype specific, and hemagglutination inhibiting antibodies ( , ) . although the s subunit of ibv has been examined extensively, the s subunit remains enigmatic. based on the highly conservative nature of the s subunit among different members of the coronavirus genus and different strains of ibv, it would appear that it plays little or no role in the induction of a host immune response ( ) . however, it has been shown for ibv that an immunodominant region localized in the n-terminal half of the s subunit can induce neutralizing, but not serotype specific antibodies ( ) . a dna-binding protein region or leucine zipper motif has also been identified in the s subunit of other coronaviruses ( ) . leucine zipper motifs are thought to be involved in transcriptional activation. furthermore, site-directed mutagenesis of the s subunit of another coronavirus, mouse hepatitis virus (mhv), inhibited the binding of a virus neutralizing monoclonal antibody to the unchanged s subunit ( ) . last, a monoclonal antibody neutralization resistant mutant was reported to have an s gene sequence identical to the parental virus, suggesting that the mutant escapes neutralization due to changes in the s gene sequence ( ) .
*corresponding author. e-mail: mjackwoo@arches.uga.edu
thus, we are interested in examining the s gene and its deduced amino acid sequence of ibv strains in an attempt to determine if it plays a role in the binding of s subunit specific antibodies to the virus. we selected four strains belonging to the arkansas serotype, ark , ark dpi, - , and gav because their s deduced amino acid sequences were very similar, %. strains - and gav were determined to be ark-"like" strains by restriction fragment length polymorphism (rflp) analysis and later confirmed by serology studies ( ) . we also selected connecticut and florida for s gene sequencing because these strains are known to share . % deduced amino acid identity for their s subunits, yet remain serologically distinct ( , ) . infectious bronchitis virus strains used in this study are listed in table . viruses were inoculated into embryonating eggs for propagation ( ) . the allantoic fluid was harvested and stored at − °c until needed. the boehringer mannheim (bm) high pure pcr template preparation kit (indianapolis, in) was used to extract viral rna from allantoic fluid per the manufacturer's directions. the s gene of the ibv strains was amplified using primers that flanked both sides of the entire s gene. the ′ pcr primer ( ′-ttgaatcattaaacagac- ′) was designated s - ′ ark, and the ′ pcr primer ( ′-gtaggtattcttacttcacgta- ′) was designated s - ′ ark. the relative primer positions using the atg start site for the beaudette strain s gene (m ) as , were to for s - ′ ark and to for s - ′ ark. the reverse transcriptase (rt) and polymerase chain reaction (pcr) were conducted as previously described ( ) . the amplicon was purified and concentrated using genelute™ spin columns (supelco, bellefonte, pa - ) and microcon™ columns (amicon, beverly, ma ), respectively.
( ) ) and hydrophobicity plots (hopp and woods ( ) and kyte and doolittle ( )) using s deduced amino acid sequence data were done with computer algorithms using macdnasis pro v . and lasergene v . .
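The hydrophobicity plots mentioned above can be illustrated with a short sketch. It is not the MacDNASIS or Lasergene implementation used in the study; it simply averages published Kyte and Doolittle hydropathy values over a sliding window. The window size of 7 and the example peptide are assumptions chosen for illustration.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=7):
    """Return the mean hydropathy of every full window along the sequence."""
    protein = protein.upper()
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD[aa] for aa in protein]  # KeyError on non-standard residues
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(protein) - window + 1)
    ]


# Hypothetical peptide for illustration only; not taken from any IBV strain.
example = "MLVTPLLLVTLLCALCS"
profile = hydropathy_profile(example, window=7)
for start, value in enumerate(profile, start=1):
    print(f"window starting at residue {start}: {value:.2f}")
```

Positive window means indicate hydrophobic stretches and negative means hydrophilic ones. A Hopp and Woods profile could be produced the same way by swapping in that scale's values.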
there was a high nucleotide similarity for the s genes from the ibv strains used in this study ( table ). the s gene sequence for the related arkansas serotype strains ark and ark dpi were identical, while - and gav were respectively . % and . % similar to both ark and ark dpi. the s gene nucleotide sequences of the florida and connecticut strains were . % similar. the deduced amino acid sequence of the s subunit was also compared ( there were few amino acid differences among all the ibv strains (fig. ) . the strains - and gav had and amino acid substitution differences, respectively, when compared with the ark and ark dpi strains. the florida and connecticut strains had only two differences between themselves, and both were nonconservative. sequence data for the s genes of other ibv strains was used to construct a phylogenetic tree for the deduced amino acid sequence of the s subunit (fig. ) . in the alignment, members of the u.s. serotypes arkansas, mass, connecticut, florida, and foreign serogroups b and c, fall into the same groupings as observed when deduced amino acid sequence data for the s subunit is used for phylogenetic analysis (fig. ) . however, the range of percent similarities was much less for the s subunit sequence than that observed for the s subunit sequence data. there were no amino acid differences within the immunodominant region of the s subunit for the arkansas serotype strains (approximately the first residues). there were also no differences between the connecticut and florida strains in the immunodominant region, however, there were differ-
hydrophobicity plots using the hopp and woods ( ) algorithm gave identical values of − . + . for each strain. however, there were differences in the predicted secondary structures using the method of chou and fasman ( ) . the predicted secondary structure of the s subunit of the ark and ark dpi strains were identical due to their identical protein sequence.
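The percent similarity values reported above come from pairwise comparison of aligned sequences. A minimal sketch of that calculation follows, assuming the two sequences are already aligned to the same length and that columns containing a gap are skipped. The function name and the gap handling are illustrative assumptions, not the procedure of the software used in the study.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of identical positions between two aligned sequences.

    Columns in which either sequence carries a gap are not counted.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical aligned fragments for illustration only.
print(percent_identity("ATGTTGGTAA", "ATGTTGGCAA"))
```

The same function works for nucleotide or deduced amino acid alignments, since it only compares characters column by column.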
the predicted secondary structure of the s subunit of the - strain differed from that of the ark and ark dpi strains due to amino acid substitutions at position (e to g) and (h to n) that resulted in the addition of two turns (fig. ) . the gav strain differed tremendously due to an amino acid substitution at position (e to g), resulting in an odd number of turns between amino acids and . the odd number of turns resulted in a flip in the middle of the predicted secondary structure. the predicted secondary structure for the s subunit of the florida and connecticut strains was remarkably different (fig. ). there were two nonconservative amino acid changes at positions and . the alanine residue at position for the connecticut strain was changed to a threonine residue for the florida strain. this resulted in the changing of some amino acid residues from helix to sheet and the addition of a turn in the predicted secondary structure of the s subunit for the florida strain. the histidine residue at position for the connecticut strains was changed to a tyrosine residue for the florida strain. this resulted in the changing of some amino acid residues from helix to sheet and the addition of a coil in the secondary structure of the s subunit for the florida strain. we analyzed six strains of ibv in the arkansas, connecticut, and florida serotypes. although s sequence data are more conserved among different strains of ibv than s sequence data, it appears that strains can be grouped into serotypes based on s gene nucleotide sequence data, as well as deduced amino acid sequence for the s subunit. this agrees with s gene phylogenetic trees for u.s. and international viruses. the only exception for grouping is between the connecticut and florida serotypes, which cannot be grouped into different serotypes using s gene or deduced amino acid sequence data, but can be separated serologically ( , ) .
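The substitutions described above (for example E to G, H to N, A to T, H to Y) can be listed and roughly classified with a short sketch. The five physicochemical groups below are a common textbook simplification and an assumption of this example; the source does not state which criterion was used to call a change nonconservative. Positions are alignment columns counted from 1.

```python
# Simplified residue classes (an assumption made for this illustration).
GROUPS = {
    "nonpolar": "GAVLIMP",
    "aromatic": "FWY",
    "polar": "STNQC",
    "basic": "KRH",
    "acidic": "DE",
}
GROUP_OF = {aa: name for name, members in GROUPS.items() for aa in members}


def list_substitutions(reference, variant):
    """Return (label, is_conservative) for each differing aligned column.

    Columns with a gap in either sequence are ignored.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    found = []
    pairs = zip(reference.upper(), variant.upper())
    for pos, (ref, var) in enumerate(pairs, start=1):
        if ref == var or ref == "-" or var == "-":
            continue
        same_group = (
            GROUP_OF.get(ref) is not None
            and GROUP_OF.get(ref) == GROUP_OF.get(var)
        )
        found.append((f"{ref}{pos}{var}", same_group))
    return found


# Hypothetical four-residue fragments, not real IBV sequence.
for label, conservative in list_substitutions("MAEH", "MTGY"):
    kind = "conservative" if conservative else "nonconservative"
    print(label, kind)
```

Under this grouping, alanine to threonine and histidine to tyrosine both cross class boundaries, which is consistent with the text describing them as nonconservative.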
based on the secondary structure predictions using the chou and fasman ( ) algorithm it appears that only a few amino acid changes in the correct location can alter the shape of the s subunit. one change in the gav s deduced amino acid sequence ( position e→g) led to a flip in the secondary structure prediction of the s subunit. the two nonconservative amino acid changes between the florida and connecticut strains led to radically different secondary structure predictions. it is plausible that these s subunit secondary structure changes could affect the tertiary structure of the s subunit, thereby creating different interactions between the s and s glycoproteins that could change the quaternary structure of the spike glycoprotein. such changes would affect antibody binding and therefore account for serologic differences between gav , - , and arkansas viruses as well as the serotype differences between the connecticut and florida strains. the s and s subunits are known to interact by noncovalent attractive forces ( ) . other research on a different coronavirus, mouse hepatitis virus, by grosse et al., showed that a single amino acid change in the s subunit could create a s subunit specific monoclonal antibody resistant mutant ( ) . this suggests that the interaction between s and s subunits may determine the shape or availability of s subunit specific epitopes. whether the s subunit is actually involved in s subunit specific antibody recognition, sterically hinders antibody from binding to the s subunit, or affects the presentation of s subunit epitopes is not known. however, from our sequence data we hypothesize that the s subunit can affect binding of s subunit specific antibody due to s gene variability and subsequent secondary structure differences.
the nucleotide sequence data reported in this paper have been submitted to the genbank nucleotide sequence data base and have been assigned the following accession numbers: arkansas , af ; arkansas dpi, af ; - , af ; gav , af ; connecticut , af ; florida , af .
the coronaviridae diseases of poultry. iowa state university press fundamental virology the coronaviridae arch of virol , ± a laboratory manual for the isolation and identification of avian pathogens secondary structure prediction of the s glycoprotein for the connecticut and florida strains of ibv using the chou and fasman proceedings from the national academy of science
key: cord- - pzjyrdf authors: lima, francisco esmaile de sales; campos, fabrício souza; kunert filho, hiran castagnino; batista, helena beatriz de carvalho ruthner; carnielli júnior, pedro; cibulski, samuel paulo; spilki, fernando rosado; roehe, paulo michel; franco, ana cláudia title: detection of alphacoronavirus in velvety free-tailed bats (molossus molossus) and brazilian free-tailed bats (tadarida brasiliensis) from urban area of southern brazil date: - - journal: virus genes doi: . /s - - -x sha: doc_id: cord_uid: pzjyrdf
a survey was carried out in search for bat coronaviruses in an urban maternity roost of about specimens of two species of insectivorous bats, molossus molossus and tadarida brasiliensis, in southern brazil. twenty-nine out of pooled fecal samples tested positive by reverse transcription-pcr contained fragments of the rna-dependent rna polymerase gene of coronavirus-related viruses. the sequences clustered along with bat alphacoronaviruses, forming a subcluster within this group. our findings point to the need for risk assessment and continued surveillance of coronavirus infections of bats in brazil. electronic supplementary material: the online version of this article (doi: . /s - - -x) contains supplementary material, which is available to authorized users.
bats (order chiroptera, suborders megachiroptera and microchiroptera) are one of the most diverse and widely distributed groups of mammals, representing * % of all known mammalian species [ ] . about a different viruses have been identified in bats of different species in asia, europe, north america and africa. therefore, such species may be natural reservoirs for a large variety of potentially zoonotic rna viruses, such as lyssaviruses, paramyxoviruses, ebola and marburg viruses as well as the recently emerged severe acute respiratory syndrome coronavirus (sars-cov) [ ] [ ] [ ] [ ] [ ] . a variety of other coronaviruses have been detected in many bat species from asia, including specimens of the genus rhinolophus, which were found to be infected with sars-like cov. phylogenetic analyses of such viruses revealed that those form a large clade within betacoronavirus genus, along with sars coronaviruses from palm civets and the sars coronaviruses recovered from humans during the outbreak [ , ] . these data suggested that the agent responsible for the - pandemic might have originated from bats. in addition, in , a new human coronavirus (hcov-emc), which has been associated to clinical disease that resembles sars, emerged in the middle east. this new virus appears to have originated from bats, raising the possibility that hcov-emc jumped species directly from bats to humans [ ] . in brazil, most studies looking for associations between bats and viruses have focused on the role for those species as reservoirs for rabies virus [ ] . however, to date, more than bat species have been detected in brazil, comprising members of the families phyllostomidae, vespertilionidae, and molossidae. it is estimated that at least bat species live in the state of rio grande do sul, southern brazil, where the predominantly sub-tropical climate seems to favor the settlement of such species [ ] . 
in view of the potential role that bats may play in the transmission of new viral infections to humans and other species, this study was set up in search for coronavirus genomes in bats from the urban area of porto alegre ( ° s; ° w), a town with about . million inhabitants and capital of the state of rio grande do sul, brazil. with that purpose, coronavirus rna was searched in feces of two species of synanthropic insectivorous bats collected in a maternity roost within the urban area of the city. a maternity roost of bats known to have direct contact with people and domestic animals was identified in the summer of in the attic of a residence in the central area of porto alegre, southern brazil. the colony was estimated to harbor about bat specimens of insectivorous bats of two species, velvety free-tailed bats (molossus molossus) and brazilian free-tailed bats (tadarida brasiliensis). speciation was confirmed by amplification and sequencing of the mitochondrial cytochrome b (cytb) gene as described [ ] . one hundred and fifty fecal samples were collected from the attic floor as follows: a plastic film was spread on the ground of the attic compartment and fresh droppings were collected with clean disposable forks in the following night. each sample consisted of five fecal droppings, which were immediately sent to the laboratory and stored at - °c. the samples were then submitted to total rna extraction with trizol (invitrogen™). cov rna screening was performed by reverse transcription-polymerase chain reaction (rt-pcr) in a total volume of μl reaction using conserved primers for the rna-dependent rna polymerase gene (forward: -ggttgggactatcctaagtgtga- and reverse: -ccatcatcagatagaatcatcata- ). this pair of primers is expected to give rise to amplicons of bp [ ] . the cycling conditions were: min at °c followed by cycles of min at °c, min at °c and min at °c, followed by a final extension time of min at °c.
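Basic properties of the two screening primers quoted above can be checked with a few lines of code. The sketch below assumes that the spaces inside the primer strings in the extracted text are artifacts and joins the fragments. The Wallace rule used for the melting temperature, 2 °C per A/T plus 4 °C per G/C, is only a rough rule of thumb and is not stated in the source.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def gc_fraction(dna):
    """Fraction of G and C bases in the sequence."""
    dna = dna.upper()
    return (dna.count("G") + dna.count("C")) / len(dna)


def wallace_tm(dna):
    """Rough melting temperature in degrees Celsius (Wallace rule)."""
    dna = dna.upper()
    at = dna.count("A") + dna.count("T")
    gc = dna.count("G") + dna.count("C")
    return 2 * at + 4 * gc


# Primer sequences as quoted in the text, with internal spaces removed.
FORWARD = "GGTTGGGACTATCCTAAGTGTGA"
REVERSE = "CCATCATCAGATAGAATCATCATA"

for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
    print(name, len(primer), "nt",
          f"GC={gc_fraction(primer):.2f}",
          f"Tm~{wallace_tm(primer)} C",
          "binds:", reverse_complement(primer))
```

The reverse complement shows the template sequence each primer anneals to, which is useful when locating the primers in a reference RdRp alignment.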
bovine coronavirus (bcov) rna was used as a positive control to optimize the assay. standard precautions were taken to avoid pcr contamination; blank controls without template were included in every set of five rt-pcr assays. five microliters of the pcr products were electrophoresed in . % agarose gels and the products visualized under uv light after staining with ethidium bromide. the amplicons obtained were cloned into pcr® . -topo® cloning kit (invitrogen) before being submitted to nucleic acid sequencing. sequencing was performed with the big dye terminator cycle sequencing ready reaction (applied biosystems, uk) in an abi-prism genetic analyzer (abi, foster city, ca), following the manufacturer's protocol. sequence analyses were performed with the blast software [ ] . nucleotide sequences were aligned and compared to human and animal cov sequences available at genbank database with the program clustalx . [ ] . alignments were optimized with the bioedit sequence alignment editor program version . . [ ] . the protocol to generate the phylogenetic trees was selected with the program modeltest . [ ] . phylogenetic analysis was carried out using mega . ; pairwise genetic distances were calculated by the tamura -parameter model and phylogenetic trees were constructed using the neighbour-joining method. bootstrap values were determined by , replicates to assess the confidence level of each branch pattern. pcr amplicons with the expected size of the targeted region were obtained from out of the ( . %) pools of bat fecal samples. the nucleotide sequences of sixteen randomly selected amplicons were determined and submitted to genbank (accession numbers kc to kc ). genetic analyses provided evidence that the viruses circulating in these two species of insectivorous bats belong to the genus alphacoronavirus. when compared with each other, all the obtained sequences showed a high nucleotide and amino acid identity ( . to % and to %, respectively) (supplemental material).
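The bootstrap step described above resamples alignment columns with replacement and recomputes distances on each replicate. The sketch below shows only that resampling idea. It uses the simple p-distance as a stand-in for the Tamura model applied in MEGA, and it does not build neighbour-joining trees. The toy alignment, sequence names, and seed are hypothetical.

```python
import random


def p_distance(seq_a, seq_b):
    """Proportion of differing sites; columns with a gap are skipped."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    return sum(a != b for a, b in pairs) / len(pairs)


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one replicate)."""
    width = len(next(iter(alignment.values())))
    columns = [rng.randrange(width) for _ in range(width)]
    return {
        name: "".join(seq[c] for c in columns)
        for name, seq in alignment.items()
    }


def bootstrap_distances(alignment, name_a, name_b, replicates=100, seed=0):
    """p-distance between two sequences across bootstrap replicates."""
    rng = random.Random(seed)
    distances = []
    for _ in range(replicates):
        replicate = bootstrap_alignment(alignment, rng)
        distances.append(p_distance(replicate[name_a], replicate[name_b]))
    return distances


# Hypothetical toy alignment; not real coronavirus sequence.
toy = {
    "bat_A": "ACGTACGTAC",
    "bat_B": "ACGTACGAAC",
    "outgroup": "TCGAACGTTC",
}
values = bootstrap_distances(toy, "bat_A", "bat_B", replicates=10)
print(min(values), max(values))
```

In a full analysis each replicate alignment would be used to rebuild a tree, and the bootstrap value of a branch is the share of replicate trees that contain it.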
the rdrp sequences examined here were distantly related (\ % nt identity) to other known alphacoronaviruses. the closest bat coronaviruses rdrp sequences found in genbank were the asian (btcov/a / ) and north american (rm-btcov and rm-batcov ) bat coronaviruses (fig. ) . the percentage of nucleotide similarity between the sequences described here and those of asian and north american coronaviruses ranged from . to %, whereas at the amino acid level, the similarity ranged from to % (data not shown). during the last two decades, several studies have shown that various important human and animal pathogens are of bat origin; these species have become targets for several surveillance studies aiming the detection of other potentially pathogenic viruses for humans and other animals. the association of these pathogens and possible disease outbreaks caused by direct or indirect contact of humans with bats stimulated the development of research activities on bat-borne viruses. in addition, the advances of molecular techniques offer opportunities for the discovery of novel dna and rna bat viruses without the need for virus isolation and bat pooled fecal samples being used as source for viruses, preventing animal manipulations [ , ] . in our study, we detected rdrp sequences of bat cov at a frequency of . % in the examined samples; such frequency is comparable to previous results obtained in similar studies from different bat species in other countries (ranging figure) . the tree was generated based on the neighbor joining method in the mega program. the nucleotide sequence of the equivalent genome fragment of sars-cov was included as outgroup (fig. ) . these results show that similar coronaviruses are found in different bat species that are distributed in geographically distant regions, suggesting a low degree of host restriction for coronavirus in those bat populations. 
in contrast to the enormous diversity of cov genomes found in old world bats [ , ] , in this study and in several others concerning the cov detection in new world bats, only alphacoronaviruses were detected [ , , , ] . based on these results, it has been hypothesized that covs found in new world bats are less diverse than those detected among old world bats [ ] . in this initial study, samples were restricted in location and variety of bat species, and we found only alphacoronaviruses. such findings do not reflect data on incidence or prevalence of such infections in bat populations. however, one cannot exclude the possibility that a greater diversity may become apparent in brazilian bats as long as larger numbers of samples from a wider spectrum of species are examined. to our knowledge, this is the first report of cov detection in feces from presumably healthy insectivorous bats in brazil. however, it is very likely that other bat species might also be infected with similar viruses. additional studies with larger numbers of bats and bat species, as well as the continued vigilance on the occurrence of viral infections in bats over the years is required to follow the evolution of bat coronaviruses in its interactions with the different bat host species. in addition, the detection of covs in brazilian bat populations in close proximity to human inhabitants may represent a risk to human health. our findings point to the need to identify the prevalence of covs in brazilian bats, to perform risk assessment studies and continued surveillance of coronavirus infections from both urban and rural environments. mammal species of the world: a taxonomic reference vector borne zoonotic dis proc. natl. acad. sci. usa bioedit: a user-friendly biological sequence alignment editor and analysis program for windows / /nt acknowledgments we would like to thank the government agencies finep, cnpq, and capes by the financial support. p.m. roehe, a.c. franco, and f.r. 
spilki are cnpq research fellows. key: cord- -ujp authors: gu, wen-yuan; li, yan; liu, bao-jing; wang, jing; yuan, guang-fu; chen, shao-jie; zuo, yu-zhu; fan, jing-hui title: short hairpin rnas targeting m and n genes reduce replication of porcine deltacoronavirus in st cells date: - - journal: virus genes doi: . /s - - -y sha: doc_id: cord_uid: ujp porcine deltacoronavirus (pdcov) is a recently identified coronavirus that causes intestinal diseases in neonatal piglets with diarrhea, vomiting, dehydration, and post-infection mortality of – %. currently, there are no effective treatments or vaccines available to control pdcov. to study the potential of rna interference (rnai) as a strategy against pdcov infection, two short hairpin rna (shrna)-expressing plasmids (pgenesil-m and pgenesil-n) that targeted the m and n genes of pdcov were constructed and transfected separately into swine testicular (st) cells, which were then infected with pdcov strain hb-bd. the potential of the plasmids to inhibit pdcov replication was evaluated by cytopathic effect, virus titers, and real-time quantitative rt-pcr assay. the cytopathogenicity assays demonstrated that pgenesil-m and pgenesil-n protected st cells against pathological changes with high specificity and efficacy. the % tissue culture infective dose showed that the pdcov titers in st cells treated with pgenesil-m and pgenesil-n were reduced . - and . -fold, respectively. real-time quantitative rt-pcr also confirmed that the amount of viral rna in cell cultures pre-transfected with pgenesil-m and pgenesil-n was reduced by . and . %, respectively. this is believed to be the first report to show that shrnas targeting the m and n genes of pdcov exert antiviral effects in vitro, which suggests that rnai is a promising new strategy against pdcov infection. porcine deltacoronavirus (pdcov) is a new member of the deltacoronavirus genus coronavirus that causes intestinal disease. 
clinical symptoms include vomiting, diarrhea, dehydration, and even death of piglets [ ] [ ] [ ] . since the first report of pdcov in hong kong in [ ] and the outbreak of pdcov in the usa in [ ] , the novel porcine coronavirus has been detected in canada, south korea, thailand, mexico, and china [ , [ ] [ ] [ ] . pdcov has become an important pathogen affecting the healthy development of the pig industry. however, there are currently no vaccines or treatments that can effectively control pdcov [ ] . pdcov is an enveloped, single-stranded, positive-sense rna virus. the entire genome contains about , nucleotides [ ] . pdcov has four major structural proteins: spike (s), envelope (e), membrane (m), and nucleocapsid (n) [ , , ] . the detailed functions of the individual pdcov proteins are unknown. according to studies on other coronaviruses, the m protein is the most abundant component of the viral envelope and plays an important role in viral assembly and budding [ , ] . the m protein can also induce production of protective antibodies [ , ] . the n protein of the coronavirus forms a helical nucleocapsid with genomic rna and protects the viral genome from external interference [ , ] . rna interference (rnai) is a process that effectively silences or inhibits the expression of a gene of interest; it is mediated by double-stranded rna (dsrna), which selectively inactivates the mrna of the target gene. (edited by zhen f. fu. wen-yuan gu, yan li and bao-jing liu have contributed equally to this work.) rnai has been successfully used in infection inhibition studies of animal viruses, such as porcine epidemic diarrhea virus (pedv) [ ] , influenza virus a [ ] , porcine transmissible gastroenteritis virus (tgev) [ ] , and porcine reproductive and respiratory syndrome virus [ ] . however, whether rnai inhibits the replication of pdcov has not been reported. rnai interferes with viral replication through short hairpin rnas (shrnas) or small interfering rnas (sirnas).
shrna is more stable, long-lasting, and more efficient than sirna. therefore, in this study, we constructed two shrnas (pgenesil-m and pgenesil-n) in a plasmid expression system that targeted the m and n genes of pdcov, and investigated the efficiency of shrna-mediated rnai of pdcov replication in vitro. st cells were cultured in dulbecco's modified eagle's medium containing % heat-inactivated fetal bovine serum (zhejiang tianhang biotechnology co. ltd., hangzhou, china) and % penicillin-streptomycin solution, and incubated in a °c environment containing % carbon dioxide. the pdcov strain hb-bd (genbank no. mf ) was propagated in st cells as previously described [ ] . after % of the cells developed a cytopathic effect (cpe) of viral infection, the culture (cells plus medium) was collected and subjected to three freeze-thaw cycles to lyse the cells. the virus titer was determined by the % tissue culture infectious dose (tcid ) as described previously [ ] . the predicted and analyzed shrnas were obtained by predicting the m and n gene shrnas of pdcov using rnai target finder software (www.ambit ion.com/techl ib/misc/ sirna ). using the blast program, the candidate shrna sequences were aligned with the pig genome and other pdcov sequences submitted to genbank, and the specific strong shrnas were selected. to ensure a similar rnai effect on different pdcov strains, two theoretically effective sequences at nucleotide positions - (m) and - (n) were selected. dsdna sequences encoding the shrnas were synthesized, with -nt ′ single-stranded overhangs complementary to bamhi and hindiii-cleaved dna at the ends. scrambled sirna sequences were designed as negative controls (nc). the sequences are shown in table . each ds-shrna-coding sequence was ligated into the bamhi and hindiii restriction sites of the shrna expression vector pgenesil- (wuhan genesil biotechnology, china) and transformed into escherichia coli competent cells. 
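the construction of a dsdna insert with overhangs compatible with bamhi- and hindiii-cleaved vector can be sketched as follows. this is only an illustration under stated assumptions: the sense-loop-antisense-terminator layout, the loop sequence, the terminator, and the example target are all hypothetical choices, not the sequences used in the study (those are given in the table referred to above). the only facts relied on are that bamhi leaves a 5' gatc overhang and hindiii leaves a 5' agct overhang.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def design_shrna_oligos(target: str,
                        loop: str = "TTCAAGAGA",    # assumed loop, hypothetical choice
                        terminator: str = "TTTTTT"  # assumed Pol III terminator
                        ) -> tuple[str, str]:
    """Return (top, bottom) oligos, both written 5'->3'.

    After annealing, the duplex carries a 5' GATC overhang (BamHI-compatible)
    on the top strand and a 5' AGCT overhang (HindIII-compatible) on the
    bottom strand.
    """
    sense = target.upper()
    core = sense + loop + reverse_complement(sense) + terminator
    top = "GATCC" + core + "A"
    bottom = "AGCTT" + reverse_complement(core) + "G"
    return top, bottom


# Hypothetical 19-nt target, for illustration only.
top, bottom = design_shrna_oligos("GCTGACCTTCAAGTACGAT")
print(top)
print(bottom)
```

in this layout the first four nucleotides of each oligo stay single-stranded after annealing, and the remaining nucleotides of the two strands are fully complementary.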
the recombinant plasmids were named pgenesil-m, pgenesil-n, and pgenesil-nc. st cells were seeded in a -well plate at × per well in μl medium without antibiotics in a conventional manner and cultured for - h at °c in a % co environment. when the monolayer reached - % confluence, transfection was started according to the manufacturer's instructions for lipogene™ plus transfection reagent (us everbright inc., san francisco, us). briefly, μg of transfection reagent was diluted with μl of opti-mem, left to stand for min at room temperature, and then mixed with μl of diluted shrna-expressing plasmid ( μg shrna-expressing plasmid diluted with μl of opti-mem). after min, the cells of each well were washed three times and overlaid with μl of the transfection mixture. the cells were incubated for h at °c, and the medium was changed. after h, the cells were infected with tcid pdcov. st cells that were infected with pdcov after undergoing the transfection procedure without added plasmid dna served as mock-transfected controls. expression vector pgenesil- encodes the enhanced green fluorescent protein (egfp). therefore, egfp expression in a cell line can be used as an indicator of transfection. in this study, we used inverted fluorescence microscopy to capture images to evaluate the cell transfection efficiency and cpe. pdcov cultures in shrna expression plasmid-transfected st cells were harvested h after virus infection. after three repeated freeze-thaw cycles, the virus was diluted from − to − and added to a -well plate with eight wells per dilution of virus. cpe was observed and recorded daily, and viral titer was measured by tcid using the reed-muench method as previously described [ , ] . st cells transfected with shrna recombinant plasmids (pgenesil-m and pgenesil-n) and a scrambled shrna recombinant plasmid (pgenesil-nc) were observed under fluorescence microscopy (fig. ) . normal st cells showed no fluorescence, while st cells transfected with recombinant plasmid showed fluorescence.
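the reed-muench endpoint calculation mentioned above can be sketched as follows. the well counts are invented for illustration (eight wells per dilution, as in the text), and the function name is hypothetical. the sketch returns the log10 of the dilution at which half of the wells would be expected to show cpe; the titer per inoculated volume is the reciprocal of that dilution.

```python
def reed_muench_log10_endpoint(log10_dilutions, infected, total_wells):
    """Log10 of the 50% endpoint dilution by the Reed-Muench method.

    log10_dilutions: e.g. [-3, -4, -5, -6, -7]
    infected: number of CPE-positive wells at each dilution
    total_wells: wells inoculated per dilution
    """
    # Order rows from most concentrated to most dilute.
    rows = sorted(zip(log10_dilutions, infected), reverse=True)
    logs = [r[0] for r in rows]
    pos = [r[1] for r in rows]
    neg = [total_wells - p for p in pos]
    n = len(rows)
    # Cumulative positives accumulate from the most dilute row upward;
    # cumulative negatives accumulate from the most concentrated row downward.
    cum_pos = [sum(pos[i:]) for i in range(n)]
    cum_neg = [sum(neg[:i + 1]) for i in range(n)]
    pct = [100.0 * cp / (cp + cn) for cp, cn in zip(cum_pos, cum_neg)]
    for i in range(n - 1):
        if pct[i] >= 50.0 > pct[i + 1]:
            # Proportionate distance between the two bracketing dilutions.
            distance = (pct[i] - 50.0) / (pct[i] - pct[i + 1])
            step = logs[i] - logs[i + 1]
            return logs[i] - distance * step
    raise ValueError("the 50% endpoint is not bracketed by the dilution series")


# Hypothetical plate readout: 8 wells per dilution.
endpoint = reed_muench_log10_endpoint([-3, -4, -5, -6, -7], [8, 8, 6, 2, 0], 8)
print(endpoint)
```

with these invented counts the cumulative percentages bracketing 50% are 80% and 20%, so the proportionate distance is 0.5 and the endpoint lies halfway (on the log scale) between the two dilutions.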
the shrna recombinant plasmids were successfully transfected into st cells, and the transfection efficiency of the two recombinant plasmids and the scrambled shrna plasmid was similar. to analyze whether shrna can prevent st cells from exhibiting cpe due to pdcov infection, recombinant plasmids pgenesil-m and pgenesil-n were transfected into st cells seeded in triplicate in -well plates. the vector pgenesil-nc expressing the non-specific shrna was used as a negative control. at h after transfection, cells were infected with pdcov at tcid , and cells were examined for cpe every day. cpe was observed h after infection and photographed (fig. ) . st cells infected with virus only or st cells transfected with a negative control plasmid (pgenesil-nc) became enlarged, round, dense granular cells, occurring individually or in clusters. we also observed signs of cell shrinkage and detachment from the monolayer. as for the cells transfected with plasmids pgenesil-n and pgenesil-m expressing specific shrnas, observation showed that the extent of cpe was reduced. to analyze the inhibition of pdcov replication by shrna, viral titers in st cells were calculated by the reed-muench method at h after viral infection (fig. ) . the results showed that the titers of pdcov were significantly different from the virus titers transfected with pgenesil-nc (p < . ), while the difference between pgenesil-nc and mock-transfected cells was not significant. pgenesil-n showed higher inhibition efficiency than that of pgenesil-m shrna. if rnai is successful, replication of pdcov is inhibited, and the amount of the corresponding m and n genes is less. we used the n gene as a standard to analyze the effect of shrna inhibition of pdcov replication. realtime quantitative rt-pcr analysis of n gene level was normalized to the corresponding β-actin in the same sample (fig. ) . the relative amount of n gene in mock cells was regarded as . 
, whereas the relative amounts of n gene in cells infected with pdcov after being transfected with pgenesil-m, pgenesil-n, and pgenesil-nc were . , . , and . , respectively. analysis of these data revealed that the amount of viral rna in samples transfected with pgenesil-m and pgenesil-n was reduced by . and . %, respectively, compared to the mock control. this suggests potent inhibition of pdcov replication triggered by sequence-specific shrnas in st cells. pdcov is a recently discovered porcine enteropathogenic coronavirus [ , [ ] [ ] [ ] . since , pdcov has emerged in many provinces, leading to significant economic losses in swine husbandry in china. however, there are presently no effective treatments or vaccines available to control pdcov [ ] . so, there is an urgency to develop an effective method for treatment of pdcov. rnai is a gene silencing mechanism at the post-transcriptional level with high specificity and can inhibit gene expression with high efficiency. therefore, rnai has been considered as an effective strategy to protect against bacterial and viral pathogens [ , ] . rnai is triggered by endogenous or exogenous - nt rna duplexes [ ] , and shrna and sirna are two commonly used rna molecules to block gene expression [ , ] . compared with sirna, the interference efficiency induced by shrna was more effective. recently, the interference efficiency of shrna in some coronaviruses has been studied. wang designed and constructed three recombinant plasmids targeting the m gene of porcine tgev. after transfection into pk cells, m gene expression was reduced by , , and % [ ] . shen et al. [ ] have constructed five shrna-expressing plasmids targeting the n, m, and s genes of pedv. pedv rna in vero cells pre-transfected with these plasmids was reduced by - . %. however, because the success rate of pdcov isolation was low [ , , ] , there are no reports on the use of shrna to inhibit replication of pdcov, which belongs to the same virus family as pedv and tgev. 
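the normalization of n gene levels to β-actin, with the mock control set as the reference, can be sketched with the widely used 2^(-ΔΔct) calculation. the source does not state which formula was applied, so this is an assumption, as is the implied amplification efficiency of about 100%; the ct values below are invented.

```python
def relative_expression(ct_target: float, ct_reference: float,
                        ct_target_ctrl: float, ct_reference_ctrl: float) -> float:
    """Relative amount of target RNA by the 2^(-ddCt) approach (assumed method).

    ct_target / ct_reference: Ct of the viral gene and of beta-actin in the sample
    ct_target_ctrl / ct_reference_ctrl: the same two Ct values in the mock control
    """
    delta_sample = ct_target - ct_reference
    delta_ctrl = ct_target_ctrl - ct_reference_ctrl
    return 2.0 ** -(delta_sample - delta_ctrl)


def percent_reduction(relative_amount: float) -> float:
    """Reduction relative to a control whose relative amount is 1."""
    return (1.0 - relative_amount) * 100.0


# Invented Ct values: the target amplifies two cycles later in the treated sample.
rel = relative_expression(ct_target=22.0, ct_reference=18.0,
                          ct_target_ctrl=20.0, ct_reference_ctrl=18.0)
print(rel, percent_reduction(rel))
```

a two-cycle delay after normalization corresponds to a quarter of the control amount, that is, a 75% reduction, under the stated efficiency assumption.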
in our previous study, pdcov strain hb-bd was successfully isolated, serially passaged in cell culture, and characterized. in this study, we designed two shrnas based on the theoretically valid sequences of the m and n genes of pdcov at nucleotide positions - (m) and - (n), and established three shrna recombinant expression plasmids to study whether shrna-mediated rnai inhibited pdcov replication in vitro. to guarantee a similar rnai effect on different pdcov strains, the two theoretically effective sequences were analyzed by blast to ensure that they had no similar sequences in the swine genome but shared % similarity with the published sequences of different pdcov strains. both shrnas inhibited pdcov replication in st cells. the interference properties were revealed by reductions in cpe formation, virus tcid titers, and viral rna copy numbers in the infected cells. the cpe and tcid assays of pgenesil-m- and pgenesil-n-transfected cells showed viral suppression h after infection, and the titers of pgenesil-m- and pgenesil-n-transfected cells were reduced . - and . -fold, respectively. the real-time quantitative rt-pcr assay showed that the viral rna copy number was reduced by . % in pgenesil-m-transfected cells and . % in pgenesil-n-transfected cells. the inhibitory efficiency of pgenesil-n was higher than that of pgenesil-m. although the interference efficiency of shrnas against pdcov in our study was lower than that of the shrnas targeting other coronaviruses [ , ] , these results indicate that rnai against pdcov mediated by shrnas can inhibit pdcov replication in vitro. a disadvantage of this transient transfection/expression system is that virus replication can be inhibited only in cells that are expressing the shrna, and cells not expressing the shrna can be infected.
therefore, one explanation for the lower interference efficiency of shrnas against pdcov in this study compared to shrnas targeting other coronaviruses in other studies could be lower transfection efficiency of the shrna expression plasmids. in conclusion, our results indicate that both shrnas plasmids targeting the m ( - ) and n ( - ) genes of pdcov genome inhibit pdcov replication in st cells with high efficiency. therefore, the two nucleotide positions are two potential targets for the inhibition of pdcov replication by rnai in vitro. however, whether the two shrnas can inhibit pdcov replication in vivo, and whether shrnas targeting other nucleotide positions of the m and n genes or other genes also inhibit pdcov replication need further research. isolation and phylogenetic analysis of porcine deltacoronavirus from pigs with diarrhoea in hebei province china full-length genome sequence of porcine deltacoronavirus strain newly emerged porcine deltacoronavirus associated with diarrhoea in swine in china: identification, prevalence and full-length genome sequence analysis discovery of seven novel mammalian and avian coronaviruses in the genus deltacoronavirus supports bat coronaviruses as the gene source of alphacoronavirus and betacoronavirus and avian coronaviruses as the gene source of gammacoronavirus and deltacoronavirus detection and genetic characterization of deltacoronavirus in pigs isolation and characterization of porcine deltacoronavirus from pigs with diarrhea in the united states functional characterization and proteomic analysis of the nucleocapsid protein of porcine deltacoronavirus occurrence and sequence analysis of porcine deltacoronaviruses in southern china complete genome characterization of korean porcine deltacoronavirus strain kor/knu - identification of a conserved linear b-cell epitope in the m protein of porcine epidemic diarrhea virus a conserved domain in the coronavirus membrane protein tail is important for virus assembly 
heterogeneity in membrane protein genes of porcine epidemic diarrhea viruses isolated in china modular organization of sars coronavirus nucleocapsid protein identification of a specific interaction between the coronavirus mouse hepatitis virus a nucleocapsid protein and packaging signal effective inhibition of porcine epidemic diarrhea virus by rna interference in vitro small interfering rna targeting m gene induces effective and long term inhibition of influenza a virus replication inhibition of porcine transmissible gastroenteritis virus infection in porcine kidney cells using short hairpin rnas targeting the membrane gene a transgenic marc- cell line of piggybac transposon-derived targeting shrna interference against porcine reproductive and respiratory syndrome virus characterization and evolution of porcine deltacoronavirus in the united states complete genome sequence of porcine deltacoronavirus isolated in thailand in porcine deltacoronavirus: overview of infection dynamics, diagnostic methods, prevalence and genetic evolution in-vitro inhibition of spring viremia of carp virus replication by rna interference targeting the rnadependent rna polymerase gene potential and development of inhaled rnai therapeutics for the treatment of pulmonary tuberculosis short hairpin rnas (shrnas) induce sequence-specific silencing in mammalian cells isolation, genomic characterization, and pathogenicity of a chinese porcine deltacoronavirus strain chn-hn- key: cord- -lf j ic authors: ten dam, edwin b.; pleij, cornelius w. a.; bosch, leendert title: rna pseudoknots: translational frameshifting and readthrough on viral rnas date: journal: virus genes doi: . /bf sha: doc_id: cord_uid: lf j ic ribosomal frameshifting on retroviral rnas has been proposed to be mediated by slippage of two adjacent trnas into the — direction at a specific heptanucleotide sequence. 
here we report a computer-aided analysis of the structure around the established or putative frameshift sites in a number of retroviral, coronaviral, toroviral, and luteoviral rnas and two dsrna yeast viruses. in almost all cases a stable hairpin was predicted four to nine nucleotides downstream of the shifty heptanucleotide. more than half of the resulting hairpin loops give rise to potential pseudoknotting with sequences downstream of this hairpin. especially in the case of the shifty heptanucleotides u uua aac and g gga aac, stable downstream pseudoknots are present. indications were also found for the presence of pseudoknots downstream of amber stop codons at readthrough sites in some retroviral rnas. translational frameshifting, although generally an abortive event during protein synthesis, is employed by various retroviruses to express the pol gene, encoding the reverse transcriptase and integrase. frameshifting occurs at a defined site in the overlap region of the gag and pol genes and results in the synthesis of a gag-pol fusion protein ( , ) . in some retroviral rnas, e.g., mouse mammary tumor virus (mmtv) rna ( , ) , a double-frameshift event takes place, leading to the expression of a third reading frame encoding a protease. the ribosome is generally shifted into the - reading frame, but a shift into the + frame has been noted in one case for the related retroviral-like transposon ty- ( , ) . studying the sequence requirements for ribosomal frameshifting during translation of rous sarcoma virus (rsv) rna, jacks and coauthors found indications for a mechanism in which simultaneous slippage occurs of two adjacent ribosome-bound trnas by one nucleotide in the ' direction at the site of frameshifting.
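the simultaneous-slippage idea can be illustrated with a small sketch. it assumes that the spaces in a heptanucleotide such as u uua aac mark the codons of the upstream (zero) reading frame and that both trnas move one nucleotide upstream; the function names are hypothetical. the check shows that, after the slip, the first two positions of each codon are unchanged, so only the third (wobble) position of each codon-anticodon pair differs.

```python
def slip_minus_one(heptamer: str):
    """Return (zero_frame_codons, shifted_codons) for a heptanucleotide X XXY YYZ."""
    h = heptamer.replace(" ", "").upper()
    if len(h) != 7:
        raise ValueError("expected exactly seven nucleotides")
    zero_frame = (h[1:4], h[4:7])   # codons read before the slip: XXY, YYZ
    shifted = (h[0:3], h[3:6])      # codons read after a one-nucleotide slip: XXX, YYY
    return zero_frame, shifted


def nonwobble_positions_kept(heptamer: str) -> bool:
    """True if both codons keep their first two nucleotides after the slip."""
    zero_frame, shifted = slip_minus_one(heptamer)
    return all(z[:2] == s[:2] for z, s in zip(zero_frame, shifted))


for site in ("U UUA AAC", "G GGA AAC", "A UGC GAU"):
    print(site, slip_minus_one(site), nonwobble_positions_kept(site))
```

the third sequence is an arbitrary non-shifty heptamer included only as a contrast.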
comparison of the sequences at the known or suspected frameshift sites in reading frame overlaps of a number of retroviral rnas revealed a consensus heptanucleotide, consisting of a run of three a, u, or g residues followed by the tetranucleotide uuua, uuuu, or aaac ( ) . interestingly, in the case of rsv rna, the presence of such a shifty heptanucleotide appeared to be insufficient; an additional nucleotides downstream of the frameshift site were also necessary for efficient frameshifting. evidence was obtained that the additional nucleotides in rsv rna harbor a stable stemloop structure. furthermore, deletion analysis revealed that, beside this stemloop structure, a downstream stretch of nucleotides is essential ( ) . the authors suggested that these nucleotides may be involved in the formation of a pseudoknot. however, for the human immunodeficiency virus (hiv-l), the stem-loop structure downstream of the frameshift site is dispensable, and efficient frameshifting can be mediated by a short sequence of nucleotides around the frameshift site only. it was suggested that "retroviruses may divide in two broad classes, one using linear 'shifty' sequences (e.g., hiv) and the other using more elaborate mechanisms based on rna secondary structure (e.g., rsv)" ( ) . in this context it seemed of interest to examine the nucleotide sequences harboring frameshift sites in various overlap regions in more detail and to search for possibly tertiary interactions. here we report the results of such a search performed with the computer. beside the stable stem regions just downstream of the suspected frameshift sites already proposed ( , , ) , we also find strong indications for pseudoknotted structures downstream of the shifty heptanucleotide in more than half of the overlap regions examined, including those present in coronaviral, some plant viral rnas, and a yeast dsrna virus. 
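the consensus described above (three a, u, or g residues followed by uuua, uuuu, or aaac) lends itself to a simple pattern scan. the sketch below reads "a run of three" as three identical residues, which is an interpretation rather than something the text spells out, and it does not check the reading frame; the function name and test sequences are hypothetical.

```python
import re

# Lookahead so that overlapping candidate sites are all reported.
SHIFTY = re.compile(r"(?=((?:AAA|UUU|GGG)(?:UUUA|UUUU|AAAC)))")


def find_shifty_sites(rna: str):
    """Return (0-based position, heptanucleotide) for each consensus match."""
    seq = rna.upper().replace("T", "U")
    return [(m.start(), m.group(1)) for m in SHIFTY.finditer(seq)]


# Toy sequences (hypothetical) containing U UUA AAC and U UUU UUA.
print(find_shifty_sites("CCGUUUAAACGGG"))
print(find_shifty_sites("AUUUUUUAG"))
```

candidate heptanucleotides such as g ggc ccc or u uuc ccc, discussed later in the text as possible second sites, fall outside this consensus and are deliberately not matched.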
during the course of this work, strong experimental evidence was reported for the presence of a pseudoknotted structure downstream of the frameshift site in infectious avian bronchitis virus (ibv) rna. this pseudoknot was shown to be essential for efficient frameshifting of the ribosome in the orfla/orflb overlap region ( ) . these authors also proposed similar pseudoknots in a number of retroviral rnas, some of which are identical to the ones resulting from our analysis. an analysis of the region downstream of the site, where in some retroviral rnas efficient readthrough of an amber stop codon occurs, also revealed potential pseudoknotted structures. the overlap regions containing established or putative frameshift sites as tabulated ( ) and that have not been discussed before were folded in secondary structures using a program developed in our laboratory by abrahams et al. (manuscript submitted for publication). this program is able to predict pseudoknotted structures involving hairpin loops, also coined h-type pseudoknots ( ) , and has been successfully applied for the prediction of a number of consecutive pseudoknots in the ' noncoding region of foot-and-mouth disease virus (fmdv) rna ( ) and of pseudoknots in various other viral rnas (pleij, unpublished observations) . stretches of about nucleotides surrounding the shifty heptanucleotide were analyzed. because only stem-loop structures downstream of the shifty sequence appear to be important ( ,l l), we have focused mainly on the sequence at the ' side of the frameshift site. some pseudoknots involving bulge loops or multibranched loops are not predicted by this program. we, therefore, searched for these structural elements by visual inspection of the sequences if the proper stem-loop structures were found, taking into account the rules that are imposed by the geometry of the rna-a double helix. the characteristics of rna pseudoknots and their prediction and detection have been reviewed ( , ) . 
to simplify the description of the various structural elements around the frameshift region we introduce the terminology given in fig. a . essential features are the heptanucleotide sequence sh, where the frameshift takes place, and the stem region sl separated from sh by the spacer sp. if pseudoknotting involves a simple hairpin loop, as depicted in fig. 1a , it is fully defined by the connecting loops ll and l and the other stem region or "tertiary interaction" s . except for sh and sp, the symbols are derived from the nomenclature previously used to describe pseudoknots ( , ) . figure b shows a schematic presentation of the structure obtained after coaxial stacking of stem segments sl and s . a characteristic feature of the relatively simple pseudoknot illustrated in fig. 1a is that the hairpin loop sequence participating in the tertiary interaction borders directly on the stem region of the hairpin. we call this type of pseudoknot h. when examining the retroviral structures we have not restricted ourselves to a search for this particular type only, but have included pseudoknots that meet the more general definition: a structural rna element formed upon basepairing of nucleotides within a loop with nucleotides outside that loop ( ) . an example of a more complicated pseudoknot is the one proposed for rsv rna (see below). potential pseudoknots downstream of frameshift sites in retroviral rnas. table presents the results of a computer-aided examination of overlap regions harboring established or putative frameshift sites ( ) . we have included a number of sites from the retro-, luteo-, corona-, and toroviral groups and two yeast viruses not discussed before. for nearly all sequences tested, the computer program predicted a very stable hairpin, starting four to nine nucleotides downstream of sh, in agreement with observations by others for rsv, mmtv, hiv-1, hiv- , and simian immunodeficiency virus (siv mac) ( , ) .
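the arrangement of sh, spacer, and downstream stem can be illustrated with a naive search for the longest perfectly paired stem that begins four to nine nucleotides after the heptanucleotide. this is not the program used by the authors and involves no energy calculation; the pairing rules (watson-crick plus g-u), the loop limits, the function names, and the toy sequence are all assumptions made for the sketch.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(five_arm: str, three_arm: str) -> bool:
    """True if the two arms (both written 5'->3') pair antiparallel without mismatch."""
    return len(five_arm) == len(three_arm) and all(
        (a, b) in PAIRS for a, b in zip(five_arm, reversed(three_arm)))


def find_downstream_hairpin(rna: str, sh_start: int,
                            min_stem: int = 4, min_loop: int = 3, max_loop: int = 12):
    """Longest perfect stem starting 4-9 nt downstream of a heptanucleotide at sh_start."""
    seq = rna.upper()
    sh_end = sh_start + 7
    best = None
    for spacer in range(4, 10):
        start = sh_end + spacer
        for loop in range(min_loop, max_loop + 1):
            max_k = (len(seq) - start - loop) // 2
            for k in range(max_k, min_stem - 1, -1):
                five = seq[start:start + k]
                three = seq[start + k + loop:start + 2 * k + loop]
                if can_pair(five, three):
                    if best is None or k > best["stem_bp"]:
                        best = {"spacer": spacer, "stem_bp": k,
                                "loop_nt": loop, "stem_start": start}
                    break  # longest stem for this spacer/loop already found
    return best


# Toy construct: heptanucleotide, 5-nt spacer, 6-bp stem, 4-nt loop, short tail.
toy = "UUUAAAC" + "CACAC" + "GGGCGC" + "AAAA" + "GCGCCC" + "ACAC"
print(find_downstream_hairpin(toy, sh_start=0))
```

a real analysis would rank candidate stems by predicted stability rather than by length alone, and would go on to test the loop for complementarity with downstream sequence.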
similar results were reported using a different rna secondary structure-predicting program ( ) . this particular stem sl was the most stable stem present in the nucleotides surrounding sh, except for the luteovirus barley yellow dwarf virus (bydv) ( ) and the transposable element gypsy ( ) . in the latter two cases, it was the second best. in mouse intracisternal a particle (mouse iap), such a hairpin is found if a g-a mismatch in s is allowed ( ) . s was neither predicted by the program nor found by eye for the transposable element . ( , ) and for the retrovirus sivagm ( ) . the latter has u uuu uua as the shifty sequence, and the absence of this hairpin is not surprising in view of the experimental results obtained for hiv-1 ( ) (see discussion). for hiv- and siv mac we have included for sl the stems as proposed earlier ( ) . interestingly, pseudoknotted structures were predicted directly by the program. a typical result obtained for the gag-pro overlap of saids retrovirus-serotype (srv- ) rna ( ) is shown in fig. a . the pseudoknot predicted here is of the h type (see terminology), which is frequently observed in the noncoding regions of a number of other viral rnas ( , ) . the size of the connecting loops ll and l meets the steric demands that result from stacking the two consecutive double-helical segments and forming a quasi-continuous helix. accordingly, the single a residue is sufficient for crossing the deep groove of this helix over bp, comparable to the pseudoknotting in the leader of the gene mrna of bacteriophage t ( ) . (table footnotes: b, for definition see terminology section and fig. ; sp, hl, ll, and l are given in number of nucleotides; sl and s in number of base pairs. c, secondary structure as proposed ( ) . d, presence of substructure in hl, ll, or l , respectively.)
folding of the corresponding sequence in the gag-pro overlap of the closely related srv- and mason-pfizer monkey virus (mpmv) rna ( , ) yields a fully identical pseudoknot (not shown), which may reflect its functional importance. the sequence conservation in all three viral rnas is absolute, however, which means that covariations in the stem regions, which generally provide support for the proposed structures, are lacking here. the gag@ overlap region of feline immunodeficiency virus (fiv) contains a similar structure ( ) . inspection of the sequence in the corresponding overlap region in another type d retrovirus, smrv-h ( ), suggests a sh, as indicated in fig. b . the sh is located at the same position as can be concluded unambiguously from a sequence alignment. its sequence is identical to that of the three other type d retroviral gagpro shs mentioned above. the length of its sp is seven nucleotides, and its hairpin shows a strong resemblance to that of the other three related retroviruses (e.g., srv- fig. a) . however, the formation of a h-type pseudoknot is not possible anymore due to the g insertion in hl and a u-to-c substitution in the complementary sequence downstream of the hairpin. surprisingly, sl now harbors a potential second sh, g ggc ccc, which in turn is separated by a sp of four nucleotides from the ideal h-type pseudoknot predicted by the program (fig. b ). this possible second frameshift site might compensate for the loss of the pseudoknot after the first site. whether g ggc ccc can function as a sh sequence obviously remains to be seen, let alone that a second frameshift site indeed is active in smrv-h rna. a similar situation was found in both siv rnas (see below). the formation of a h-type pseudoknot is also possible in equine infectious anemia virus [eiav ( ) , not shown], and another example is provided by human t-cell leukemia virus (htlv-i) rna in the pro-p overlap ( ) . in this case, both sl and s are exceptionally long. 
a stretch of ten nucleotides from the membered loop is complementary to a sequence nucleotides downstream of s . the related simian t-cell leukemia virus (stlv-i) has a g-u pair in s substituted for an a-u pair ( ) . many more substitutions are present in htlv-ii ( ) . sl appears to be completely conserved, but various substitutions are found in ll and l . the three substitutions at the ' side of hl give rise to two mismatches, thereby shortening s and possibly interrupting the coaxial stacking on sl. similar deviations from what might be called the ideal h-type pseudoknot are encountered with a number of other retroviral rnas, such as mmtv (gag-pro), bovine leukemia virus (blv) (pro-pol) ( , ) , and visna virus (visna) ( ) rna (not illustrated; see table and ref. ). a number of hairpins downstream of sh have a hl consisting of fewer than six nucleotides and in fact are not suitable for pseudoknotting [see hiv-1 rna ( , , ) , htlv-i (gag-pro) and htlv-ii (gag-pro)]. the hairpin in the transposable element gypsy contains an eight-membered hl, but no possible pseudoknotting could be detected. non-h type pseudoknots can be more difficult to identify. an example is the one proposed for rsv rna ( ) . the program predicted the same secondary structure downstream of the frameshift site as proposed ( ). we assume, however, that the bottom part of sl does not play a role in the frameshifting event, for reasons outlined in the discussion. visual inspection of the resulting hairpin revealed the potential pseudoknotting, as already described ( ) . in our view, this tertiary interaction can even be extended from to bp upon accepting the formation of a bulged u residue (not shown). the relative complexity of the structure in rsv rna is apparently not restricted to this rna, but is found in a number of other retroviral rnas with a similar hl of - nucleotides, often involved in internal hairpin formation themselves.
since the computer program is unable to predict pseudoknots harboring such multibranched loops, hairpins have to be inspected by hand. in doing so, no potential pseudoknots could be detected in the case of mpmv (pro-pol) or the related srv- (pro-pol). in the pro-pol overlap of mmtv, a stretch of eight nucleotides of the loop (agccugua) was found to be complementary to a region just downstream of the hairpin (uacaggcu). the significance of this complementarity is doubtful, however, because part of the stretch in the loop is already involved in a small hairpin (results not shown). the group of retroviral rnas having an sh consisting of the heptanucleotide u uuu uua (e.g., hiv-1) shows other complexities. long and stable hairpins with a small hl were proposed for hiv- and sivmac ( ) . we note here that in both cases the sequence agcccc, occurring in hl, is complementary to the sequence ggggcu, seven and nine nucleotides downstream of the stem, respectively ( , ) . the program, however, predicted alternative structures, probably due to the presence of a number of alternating g- and c-rich regions downstream of sh. the results obtained with sivmac and sivagm are puzzling, since the structure prediction suggested in both viruses a potential second sh, located nucleotides downstream of the u uuu uua sequence. in sivagm a second sh sequence, a aau uuu, is present in the gag gene reading frame, while in sivmac a potential sh, u uuc ccc, is found at exactly the same position. both sh sequences are followed by a stem region after three nucleotides. the hairpin found downstream of u uuc ccc in sivmac rna is reminiscent of the one present in rsv rna and in some other retroviral rnas, and a potential pseudoknot interaction with the long, single-stranded region further downstream is possible (not illustrated). we note that the situation described here for both siv viruses is analogous to the one described for smrv-h rna in fig. b .
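the two complementarities quoted above (agccugua with uacaggcu, and agcccc with ggggcu) can be checked mechanically. the sketch below tests strict watson-crick complementarity only and also shows how a loop segment's partner might be located in a downstream stretch; the helper names and the downstream toy sequence are hypothetical.

```python
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def rna_reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA string (A/C/G/U only)."""
    return rna.upper().translate(RNA_COMPLEMENT)[::-1]


def is_watson_crick_partner(loop_segment: str, downstream_segment: str) -> bool:
    """True if the two segments (both written 5'->3') pair perfectly antiparallel."""
    return rna_reverse_complement(loop_segment) == downstream_segment.upper()


def find_partner(loop_segment: str, downstream: str) -> int:
    """0-based position of the perfect partner in the downstream stretch, or -1."""
    return downstream.upper().find(rna_reverse_complement(loop_segment))


print(is_watson_crick_partner("agccugua", "uacaggcu"))
print(is_watson_crick_partner("agcccc", "ggggcu"))
# Hypothetical downstream stretch with the partner placed seven nucleotides in.
print(find_partner("agcccc", "aaaaaaaggggcuaa"))
```

a perfect match of this kind shows only that base pairing is possible; as the text notes for mmtv, the loop nucleotides may already be engaged in another structure.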
for a member of another retrovirus subfamily, human spumaretrovirus (hsrv), surprisingly, no sh could be found in the overlap region, although the arrangement of its gag and pol genes suggests a - frameshift ( ). furthermore, no stable hairpins, let alone pseudoknots like the ones found in the other overlap regions, were predicted. coronaviruses are plus-stranded rna viruses having large single-stranded rna genomes with replication strategies different from retroviruses. however, it was recently shown for the coronavirus ibv that the overlap of the two open reading frames (orf1a and orf1b) of the putative polymerase gene contains the shifty sequence u uua aac, which is followed downstream by a pseudoknotted structure ( , , ). site-directed mutagenesis clearly demonstrated that the pseudoknot is involved in the very efficient frameshifting ( - %). the program predicts essentially the same hairpin s as proposed by brierley and coauthors ( ), but we propose a slightly different tertiary basepairing, which enables a better coaxial stacking of s1 and s , and is typical for an h-type pseudoknot (fig. ). this proposal is further supported by a comparison with the possible pseudoknot in the corresponding region of the related coronavirus mouse hepatitis virus (mhv) strain a- ( ). covariations in both s1 and s already prove that the pseudoknot exists in both viral rnas. this is especially clear for s , where three of these covariations are found, including the g-a pair in mhv. note that such a g-a pair also occurs in the otherwise perfect stem s1 in ibv. moreover, the shortening of the mhv stem s1 at the top is compensated by the formation of an extra base pair (bp) in stem s . it is further remarkable that the loop of the mhv hairpin shows an insertion of the stop codon triplet uaa, just in phase with the upstream orf1a coding region. this insertion extends stem s with another or bp (see fig. ).
the single-stranded ug(u) stretch left may be just sufficient to cross the deep groove over lo- bp ( ). we note that the -nucleotide-long connecting loop can be folded internally (not shown), but this does not interfere with the pseudoknotting itself. a similar structure is predicted for a member of the torovirus group, berne virus (bev) ( ). luteoviruses are plant viruses that have single-stranded plus-sense rna genomes ( ). recently, the complete nucleotide sequences of three members of this group have been determined ( ) ( ) ( ). the putative viral rna-dependent rna polymerase gene of bydv is expressed by a - translational frameshift in the rather short overlap of nucleotides. it was proposed that the uuua just upstream of the stop codon signaled frameshifting, analogous to the phenomenon in some retroviruses and the coronavirus ibv ( ). a possible stem-loop structure starting three nucleotides downstream from the uuua sequence was also presented. we here propose that the sh is formed by the heptanucleotide g ggu uuu, followed after five nucleotides by this stem ( ). searching for possible pseudoknot formation by the is-membered hl left open several possibilities for alternative, reasonably stable secondary structures. a definitive proposal for the structure, therefore, cannot be offered. more rewarding was the analysis of the sequence of beet western yellows virus (bwyv) rna ( ). a potential candidate for an sh sequence in the right frame in the corresponding overlap region was found: g gga aac at position to . it is followed after five nucleotides by a short but stable hairpin, which can form an h-type pseudoknot (fig. ). the nucleotide sequence of the closely related potato leafroll virus (plrv) rna also has a potential sh. its position [ - in plrvwag ( ) or - in plrv sear ( ) ] is identical to that of bwyv rna, as can be concluded unambiguously from aligning both plrv sequences with that of bwyv.
its composition, however, is rather different: u uua aau in plrvwag and u uua aau/c in plrvscor. moreover, s1 in plrv is shortened by bp, but even more striking is the substitution of the u residue in hl of bwyv by a c in the wageningen plrv rna sequence. this substitution weakens the pseudoknot structure, if it exists at all. it is tempting to suggest that the pseudoknot requirement is relaxed because of the transition of sh from u uua aac to u uua aau (see discussion).

yeast viruses

l-a is a dsrna virus of saccharomyces cerevisiae. its nucleotide sequence revealed two open reading frames, orf1 and orf , overlapping by bases. orf is in the - reading frame with respect to orf1. a possible sh, g ggu uua, is present in the overlap, followed after four nucleotides by a hairpin ( ). seven nucleotides of the eight-membered hl are complementary to a stretch of nucleotides that are bases downstream of s , thus again forming a potential pseudoknot. another yeast dsrna virus, l1, has an identical structure in the overlap region ( ).

some retroviruses express their pol reading frame by suppressing an amber stop codon separating the gag and pol genes ( , ). a glutamine is inserted at this site, as shown in two cases ( , ). this very efficient suppression is caused by an intrinsic cis-acting component of the viral rna located within nucleotides around the amber stop codon in ak virus (akv) ( ). a stable hairpin with the uag codon in the loop was proposed as a secondary structure element that could play a role in the readthrough event ( ). recently, studies using site-directed mutagenesis around the gag-pol junction indicated that this stem-loop structure is important for virus activity ( ). a similar hairpin is present in moloney murine leukemia virus (mo-mlv) ( ).

[figure caption fragment: corresponding hairpins of akv, mo-mlv, and m . encircled residues indicate base changes with respect to felv. in these viruses hl is shortened by one c residue.]
however, a role of this particular hairpin in the readthrough phenomenon is difficult to reconcile with the position of the amber codon in the loop region (see also ). moreover, the hairpin appears not to be conserved in m baboon endogenous virus (m ) and feline leukemia virus (felv, results not shown), nor in spleen necrosis virus ( ). the computer program predicted other stable hairpins around the amber codon of felv and m , which were not conserved in the other rnas either. however, there is a structure motif that is conserved among all four viruses. we here note that one of the most stable hairpins possible in the entire felv genome occurs just downstream of the amber stop codon ( ). this hairpin is capable of forming a pseudoknot. the loop contains a long stretch of only cs at its ' side, which can form a very stable s with six g residues nucleotides downstream of the hairpin (fig. a). we emphasize the strong resemblance of this potential pseudoknot to some of those present in viral rnas showing translational frameshifting (see above). moreover, the distance from the uag stop codon to s1 (eight nucleotides) reveals another striking resemblance (see also discussion). comparison of the felv sequence with those of three other retroviral rnas, having established or putative suppressed amber codons ( , , ), gives support to the proposed pseudoknot, though not in a decisive manner (fig. b). note again the single a residue in l1 of akv and m , and also of mo-mlv, if c indeed is a g residue, as was reported recently ( ). the nucleotide sequence of felv ( ) in fact points to frameshifting as the mechanism of pol expression, because the gag and pol open reading frames are overlapping by five nucleotides, with pol in the + reading frame with respect to gag. however, no signal for frameshifting can be found around the overlap region.
no similarity with the nucleotide sequence involved in + frameshifting in yeast ty elements ( , ) is present upstream of the stop codon. also, no putative sh can be found. removing of the consecutive c residues downstream of the amber codon enables a better alignment with the mo-mlv sequence and would give this region of the felv genome an organization similar to that of the other three type-c retroviruses ( , ). this, and the presence of the pseudoknot in all four retroviruses discussed above, suggest that felv expresses its pol gene by a readthrough mechanism.

in this paper we have presented a computer-aided examination of the secondary structure and the potential pseudoknotting of the rna region downstream of putative or established ribosomal frameshift sites of various viral rnas. this search was inspired by the suggestion that a pseudoknot is involved in the frameshift event in the gag-pol overlap of rsv rna ( ). our data, which include viral sequences not tabulated before, indicate that of the overlap regions studied here harbor potential pseudoknots. these pseudoknots are always found four to seven nucleotides downstream of a heptanucleotide sequence, where the translational frameshifting was demonstrated or supposed to take place. there are only three exceptions (eiav, fiv, and sivmac), where sp is nine, eight, and three nucleotides long, respectively. some of the pseudoknotted structures found were of the same type as described previously for noncoding regions of viral rnas, in which the stretch of nucleotides from hl basepairing with a complementary region outside this hairpin borders immediately on s1, enabling coaxial stacking of the two stem segments (pseudoknots of the h type, compare fig. ). it is noteworthy that a stable hairpin downstream of the sh sequence was predicted for all overlap regions included in table , except for the retrotransposon . and sivagm.
the failure to find a hairpin or pseudoknot in the overlap of the latter is consistent with the finding that the stable hairpin in hiv-i rna is dispensable for frameshifting ( ). these authors proposed that two broad classes of retroviral rnas exist, differing in their mechanism of frameshifting: one class using a short linear shifty sequence (as in hiv-i) and the other using rna secondary structure for efficient frameshifting (e.g., rsv and ibv). in principle, the second class may be divided into two subclasses: one harboring a hairpin, the other a pseudoknot. our results suggest that a substantial, if not a major, part of the viral rnas listed in table use a pseudoknotted structure for optimal shifting. a similar conclusion was recently presented by brierley and coauthors ( ), who reported that out of sequences examined appeared to contain the potential for pseudoknot formation. these authors also provided strong experimental evidence that in ibv the pseudoknotted structure indeed is necessary for efficient frameshifting. it will be interesting to know if the same holds true for the majority of the viral rnas analyzed here. the question is, of course, what is it that makes a pseudoknot so suitable for inducing efficient frameshifting? we assume that it is not merely the formation of a structure more stable than a hairpin alone, because we were unable to find a correlation between the calculated stability of stem s1 ( ) and the presence of a potential pseudoknot (results not shown). the number of base pairs in s1 was not found to be critical either. two structural features distinguish an rna pseudoknot from a classical rna hairpin: the two connecting loops l1 and l , of which the bases point into the deep groove and away from the shallow groove, respectively ( ), and the quasi-continuity of the double helix. which of these features induces the ribosome to shift into another reading frame remains to be established, however.
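The organization summarized above (a shifty heptanucleotide, a spacer sp of typically four to seven nucleotides, then the first stem) lends itself to a simple scan. The sketch below is an illustration under stated assumptions, not the structure-prediction program used in this study: the motif list contains only heptanucleotides quoted earlier in the text, the function names are hypothetical, the reading frame is ignored, and the stem start must be supplied from a separate structure prediction.

```python
# Heptanucleotides quoted in the text as established or candidate shifty sites.
SHIFTY_HEPTAMERS = (
    "UUUAAAC", "GGGAAAC", "GGGUUUU", "UUUUUUA",
    "GGGUUUA", "UUUAAAU", "AAAUUUU", "UUUCCCC",
)


def find_shifty_sites(rna, motifs=SHIFTY_HEPTAMERS):
    """Return (0-based start, motif) for every listed heptanucleotide in the sequence.

    Assumption: DNA input is accepted and T is read as U; the reading frame is
    NOT checked, so hits must still be compared with the upstream frame.
    """
    rna = rna.upper().replace("T", "U")
    hits = []
    for start in range(len(rna) - 6):
        window = rna[start:start + 7]
        if window in motifs:
            hits.append((start, window))
    return hits


def spacer_length(site_start, stem_start):
    """Number of nucleotides between the end of the heptanucleotide and the stem."""
    return stem_start - (site_start + 7)


def spacer_in_usual_range(sp, low=4, high=7):
    """True if sp falls in the four-to-seven nucleotide range reported in the text."""
    return low <= sp <= high


if __name__ == "__main__":
    # Toy sequence (invented for illustration, not a viral sequence).
    toy = "CCUUUAAACGGGUACGGCCGC"
    for start, motif in find_shifty_sites(toy):
        sp = spacer_length(start, stem_start=15)  # stem start assumed, not predicted
        print(start, motif, sp, spacer_in_usual_range(sp))
```

Such a scan only nominates candidates; the exceptions named in the text (sp of nine, eight, and three nucleotides) would be flagged as out of range rather than excluded.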
another factor contributing to the extent of frameshifting could be the length of the spacer region, which varied between four and seven nucleotides. for ibv, changing sp from six to three or nine nucleotides, respectively, reduced or abolished frameshifting ( ). spacing between sh and the structure involved in frameshifting thus appears to be critical. in this respect it is striking that the distance between the amber stop codon and the pseudoknot at readthrough sites is almost equal to sp in frameshifting. the stem s1 as originally proposed for rsv rna is in fact an exception, in that it starts immediately downstream of sh, forming a -bp stem, including a bulged c residue ( ). we have chosen to disrupt bp up to the bulged c residue, which leaves an sp of six nucleotides. the latter value is in the range of that of all the viral rnas (see table ). moreover, mutations in this six-nucleotide stretch did not alter the frameshift efficiency ( ), which argues against the importance of the bottom part of the stem proposed. no correlation between the sp size and the presence of a pseudoknot became apparent in the present comparison, however. a comparison of the sequences of the sh heptanucleotides is more suggestive (see table ). two sequences stand out in overlaps harboring a pseudoknot: g gga aac and u uua aac. it is tempting to suggest, therefore, that the sequence aaac, where the trna bound to the ribosomal a site is shifting, has to be followed by an elaborate rna structure. the first three nucleotides of the sh heptanucleotide, however, play an additional role, as can be concluded from the group with the a aaa aac sequence, which has some members for which potential pseudoknotting could not be established. an important factor to consider further may be the presence of c and g residues in the sh heptanucleotide, leading to more stable codon-anticodon interactions.
in this case a longer stalling of the ribosome may be needed to increase the chance of the slippage event. such a longer stalling may be achieved by an extra structural feature downstream of sh. in this respect, it is interesting to see that pseudoknotted structures may be involved as well in the efficient readthrough of an amber codon. it is conceivable that a common basis for both mechanisms is the need for a stalling of the translating ribosome, which is provided by pseudoknotted structures for reasons we do not yet know. if such a common basis is present, one can predict that a few changes in the nucleotide sequence around sh or the amber codon, respectively, could easily change a frameshifting viral rna into one suppressing an amber codon and vice versa. however, first more information is needed about the actual requirement for amber stop-codon suppression of a downstream stem-loop structure or pseudoknot. the same holds true for a large number of viral rnas having overlapping reading frames in which frameshifting occurs, despite the presently available data on rsv, hiv-1, and ibv.

we would like to thank marianne huisman, mike mayo, wayne gerlach, peter bredenbeek, and willy spaan for communicating data prior to publication; jan pieter abrahams for making available the computer program; and jan van duin for stimulating discussions and reading the manuscript.

key: cord- - kistim authors: song, daesub; park, bongkyun title: porcine epidemic diarrhoea virus: a comprehensive review of molecular epidemiology, diagnosis, and vaccines date: - - journal: virus genes doi: .
/s - - - sha: doc_id: cord_uid: kistim
the porcine epidemic diarrhoea virus (pedv), a member of the coronaviridae family, causes acute diarrhoea and dehydration in pigs. although it was first identified in europe, it has become increasingly problematic in many asian countries, including korea, china, japan, the philippines, and thailand. the economic impact of pedv is substantial, given that it results in significant morbidity and mortality in neonatal piglets and is associated with increased costs related to vaccination and disinfection. recently, progress has been made in understanding the molecular epidemiology of pedv, thereby leading to the development of new vaccines. in the current review, we first describe the molecular and genetic characteristics of pedv. then we discuss its molecular epidemiology and diagnosis, what vaccines are available, and how pedv can be treated.

porcine epidemic diarrhoea (ped), which was first observed among english feeder and fattening pigs in [ ], is a devastating enteric disease that manifests as sporadic outbreaks during the winter, causing losses on breeding farms. characterised by watery diarrhoea, ped resembles transmissible gastroenteritis (tge), but has less of an effect on suckling pigs (< - to -week-old); this is what first allowed ped to be distinguished from infection with the tge virus and other recognized enteropathogenic agents. as it spread through europe, the disease was named 'epidemic viral diarrhoea (evd)'. unlike the earlier outbreaks, which were seen in fattening pigs, a different type of evd caused acute diarrhoea in pigs of all ages in . this type of evd was classified as evd type [ ], different from the previously recognized type [ ]. evd type was shown in [ , ] to be caused by a coronavirus-like agent, in experimental infections with the isolate cv , which caused enteropathogenic infection in both piglets [ ] and fattening swine. from then on the disease was called 'porcine epidemic diarrhoea (ped)' [ ].
both transmissible gastroenteritis virus (tgev) and porcine epidemic diarrhoea virus (pedv) are classified into group of the genus coronavirus. pedv ranges in diameter from to nm (mean diameter: nm), including its projections. as in many particles with a tendency towards a round shape, pedv contains a centrally located electron-opaque body; it also possesses widely spaced club-shaped projections measuring - nm in length. the internal structure of the virus remains unknown. pedv is sensitive to ether and chloroform and has a density in sucrose of . g/ml. the virus possesses a glycosylated peplomer (spike, s) protein, pol1 (p ), envelope (e), glycosylated membrane (m) protein, and an unglycosylated rna-binding nucleocapsid (n) protein [ ]. cell culture-adapted pedv loses its infectivity when heated to ≥ °c for min, but is moderately stable at °c; further, the virus is stable between ph . and . at °c and between ph . and . at °c [ ]. pedv shows no haemagglutinating activity [ ]. pedv can be propagated by oral inoculation of piglets, after which, during the early stages of diarrhoea, it is found in the tissues and contents of the small intestine [ ]. vero (african green monkey kidney) cells support the serial propagation of pedv under laboratory conditions; however, growth of the virus depends on the presence of trypsin in the cell culture medium. cytopathic effects consist of vacuolation and formation of syncytia.

during the s and s, ped was prevalent throughout europe, in countries such as belgium, england, germany, france, the netherlands, and switzerland (table ). ped is currently a source of concern in asia, where outbreaks are often more acute and severe than those observed in europe. in this respect, and in their high mortality rates, these resemble tgev outbreaks. for example, japanese outbreaks between september and june resulted in , deaths, with mortality ranging from to % in suckling pigs.
during these epidemics, adult pigs showed only temporary decreases in appetite and milk production [ ]. another ped epidemic occurred in the winter of , during which , of , infant farrow-to-finish piglets died after experiencing diarrhoea. between january and december , . % of viral enteric cases in infant pigs surveyed in korea were attributable to pedv, rather than tgev. the vast majority of outbreaks ( %) involved piglets < -day-old [ ]. the lesions caused by pedv in the small intestine of piglets were similar to those of tgev. lesions are confined to the small intestine, which is distended with yellow fluid (fig. ). ped outbreaks also occurred in thailand from to . most of the affected farms reported that the disease first occurred in farrowing barns; % of newborn piglets were subsequently lost. between august and july , . % of , enteric cases across korean provinces were diagnosed as ped [ ]; further, a korean abattoir serosurvey found pedv seroprevalences of . - % (mean of %) in samples from pigs from seven provinces. cumulatively, these results suggest that the virus had become endemic in some areas [ ] (table ). however, recent outbreaks seem to be concentrated in certain countries where the pork industry is prominent, such as the philippines, south korea and china.

pedv is an enveloped virus possessing an approximately kb, positive-sense, single-stranded rna genome with a cap and a polyadenylated tail [ , ]. the genome comprises a untranslated region (utr), a utr, and at least seven open reading frames (orfs) that encode structural proteins [spike (s), envelope (e), membrane (m), and nucleocapsid (n)] and three non-structural proteins (replicases a and b, and orf ); these are arranged on the genome in the order . the polymerase gene consists of large orfs, a and b, that cover two-thirds of the genome and encode the non-structural replicase polyproteins (replicases a and b).
genes for the major structural proteins s ( - kda), e ( kda), m ( kda), and n ( kda) are located downstream of the polymerase gene [ , , ]. the orf gene, which is an accessory gene, is located between the structural genes. it encodes an accessory protein, the number and sequence of which varies among different coronaviruses [ ].

[figure caption fragment: week of age died from severe watery diarrhoea after showing signs of dehydration. after the acute outbreak, piglets were anorectic, depressed, vomiting, and producing watery faeces that did not contain any signs of blood. necropsies of deceased piglets from the kimpo outbreak uncovered gross lesions in the small intestines, which were typically fluidic, distended, and yellow, containing a mass of curdled, undigested milk. atrophy of the villi caused the walls of the small intestines to become thin and almost transparent]

the pedv s protein is a type i glycoprotein composed of , amino acids (aa). it contains a signal peptide ( - aa), neutralising epitopes ( - , - , - , and , - , aa), a transmembrane domain ( , - , aa), and a short cytoplasmic domain. the s protein can also be divided into s ( - aa) and s ( - , aa) domains based on its homology with s proteins of other coronaviruses [ ] [ ] [ ] [ ] [ ] [ ]. like other coronavirus s proteins, the pedv s protein is a glycoprotein peplomer (surface antigen) on the viral surface, where it plays a pivotal role in regulating interactions with specific host cell receptor glycoproteins to mediate viral entry, and in stimulating the induction of neutralising antibodies in the natural host [ , - , , ]. moreover, it is associated with growth adaptation in vitro and attenuation of virulence in vivo [ , ]. thus, the s glycoprotein would be a primary target for the development of effective vaccines against pedv.
additional studies of this structure are essential for understanding the genetic relationships between, and diversity of, pedv isolates, the epidemiological status of pedv in the field, and the association between genetic mutations and viral function [ ] [ ] [ ] [ ] [ ]. aminopeptidase n has been reported to be the receptor for tgev, human coronavirus e (hcov- e) and feline coronavirus (fecov), all of which, like pedv, belong to the group i coronaviruses [ ]. the pedv m protein, the most abundant envelope component, is a triple-spanning structural membrane glycoprotein with a short amino-terminal domain on the outside of the virus and a long carboxy-terminal domain on the inside [ ]. the m protein not only plays an important role in the viral assembly process [ , ] but also induces antibodies that neutralise the virus in the presence of complement [ , ]. the m protein may play a role in α-interferon (α-ifn) induction [ ]. coexpression of m and e proteins allowed the formation of pseudoparticles, which exhibited interferogenic activity similar to that of complete virions [ ]. additional work on the m glycoprotein should increase our understanding of the genetic relationships between, and the diversity of, pedv isolates and of the epidemic situation of pedv in the field [ ] [ ] [ ] [ ] [ ] [ ]. the n protein, which binds to virion rna and provides a structural basis for the helical nucleocapsid, is a basic phosphoprotein associated with the genome [ , , , ]. as such, it can be used as the target for the accurate and early diagnosis of pedv infection. it has been suggested that n protein epitopes may be important for induction of cell-mediated immunity (cmi) [ ]. whereas the genes encoding the structural proteins have been thoroughly investigated for most coronaviruses, little is known about the functions of the accessory proteins, which are not generally required for virus replication in cultured cells [ ] [ ] [ ] [ ].
on the contrary, their expression might lead to decreases in viral fitness in vitro, and mutants with inactivated accessory genes are easily selected during serial passage through cell cultures [ ] [ ] [ ] [ ]. in general, accessory genes are maintained in field strains [ , ], and their loss mainly results in attenuation in the natural host [ ] [ ] [ ]. in the case of pedv, the only accessory gene is orf , which is thought to influence virulence; cell culture adaptation has been used to alter the orf gene in order to reduce virulence [ ], as has been done for tgev [ ]. differences in the orf gene between highly cell-adapted viruses and field viruses could be a marker of adaptation to cell culture and attenuation of the virus [ , , ]. thus, variation in the orf gene could be a valuable tool in molecular epidemiology studies of pedv [ , , , ].

genetic and phylogenetic analyses based on the s, m, and orf genes have been used to determine the relatedness of pedv isolates, both within korea and among various countries in which pedv has surfaced. research on part of the s gene, and on the entire m gene, has suggested that pedvs can be separated into three groups (g , g , g ), which have three subgroups (g - , g - , g - ) [ ]. according to analysis of the partial s genes, the g pedvs had . - % nucleotide sequence similarities with each other, and they had . - . and . - . % sequence identities with the g and g pedvs, respectively. the g pedvs had . - . % similarities with each other, and they had . - . % similarities with the g pedvs [ ]. these results reflect the existence of genetic diversity among the korean pedv isolates (fig. ). the majority of the korean pedv isolates are closely related to chinese strains [ ]. the chinese pedv clade also contains all strains isolated from several outbreaks of pedv that have occurred in thailand since late .
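The within-group and between-group nucleotide similarities quoted above are pairwise percent identities over aligned gene sequences. As a minimal sketch of that calculation, assuming the sequences have already been aligned to equal length and that columns containing a gap in either sequence are skipped (published analyses may treat gaps differently), one could write the following; the function name and the toy sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption for this sketch: columns where either sequence has a gap are
    excluded from both the numerator and the denominator.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned fragments (invented for illustration, not pedv sequences).
    print(percent_identity("ATGCATGC", "ATGCATGA"))
    print(percent_identity("ATG-CA", "ATGACA"))
```

Applying this to every pair of isolates gives the identity matrix from which ranges such as those reported for the g groups are read off; the grouping itself comes from the phylogenetic tree, not from identity values alone.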
these classifications have been based on the phylogenetic relationship of the s genes, and support the results of park et al. [ ]. recently, an analysis of a phylogenetic tree based on the full s gene [ ] indicated that all pedvs can be separated into clusters, and that korean field isolates are more closely related to each other. in , an analysis of the m gene of pedvs isolated from the faeces of chinese piglets indicated that the isolates compose a separate cluster with chinese strain js- - [ ]. these results suggested that there may be a new prevailing pedv genotype in china [ ]. phylogenetic relationships of complete m gene nucleotide sequences indicate that recent thai pedv isolates are closely related to isolates from china [ ]. likewise, most korean pedv isolates have been found to be closely related to chinese strains [ ], and belong to the third of pedv groups containing all pedv isolates [ ].

[figure caption: relationships among pedvs isolated from various countries based on the partial s gene including the epitope region. the phylogenetic tree was constructed using the neighbor-joining method in mega version . with pairwise distances [ ]. bootstrap values (based on , replicates) for each node are given if > %. the scale bar indicates nucleotide substitutions per site. an asterisk marks a pedv isolate whose sequence in the genbank database is shorter than those of the other reference strains. pedvs isolated from various countries are marked with different colors: europe (black), korea (blue), china (red), japan (olive green), thailand (green) and vietnam (purple).]

investigations of the orf gene have revealed the reemergence of pedv in immunised swine herds since early [ ]. orf genes have been used to divide chinese field strains and pedv reference strains into groups; further, chinese field strains appear to be closely related to korean strains, but genetically different from pedv vaccine strains.
another report revealed that pedv has caused enteric disease with devastating impact since its first identification in korea in , and that recent, prevalent korean pedv field isolates are closely related to chinese field strains but differ genetically from european strains and vaccine strains [ ].

a diagnosis of ped cannot be made on the basis of clinical signs and histopathological lesions alone [ ] [ ] [ ] [ ]. because the signs resemble those caused by other agents of diarrhoea, differential diagnosis is necessary to identify pedv in the laboratory [ , ]. many techniques have been used for the detection of pedv, including immunofluorescence (if) tests, immunohistochemical techniques, direct electron microscopy, and enzyme-linked immunosorbent assays (elisa). however, these techniques are time-consuming and low in sensitivity and specificity [ ]. kim et al. [ ] compared three techniques (rt-pcr, immunohistochemistry and in situ hybridization) for the detection of pedv. they concluded that although rt-pcr identified the presence of pedv more frequently than the other methods, when only formalin-fixed tissues are submitted, immunohistochemistry and in situ hybridization would be useful methods for the detection of pedv antigen and nucleic acid. the pedv leader sequence was used to develop a reverse transcriptase polymerase chain reaction (rt-pcr) diagnostic technique [ , ] that has successfully been used to detect both laboratory and field isolates [ , ]. m gene-derived primers can be used in an rt-pcr system to obtain pedv-specific fragments [ ], and duplex rt-pcr has been used to differentiate between tgev and pedv [ ]. the past few years have seen several useful modifications of the basic rt-pcr method.
for instance, it is possible to estimate the potential transmission of pedv by comparing viral shedding load with a standard internal control dna curve [ ], as well as to perform multiplex rt-pcr to detect pedv in the presence of various viruses [ ], a technique that is particularly useful for rapid, sensitive, and cost-effective diagnosis of acute swine viral gastroenteritis. the commercial dual priming oligonucleotide (dpo) system (seegene, seoul, korea) was also developed for the rapid differential detection of pedv. it employs a single tube -step multiplex rt-pcr with two separate primer segments to block non-specific priming [ ]. another useful reverse transcription-based diagnostic tool is rt loop-mediated isothermal amplification (rt-lamp). this assay, which uses - primers that recognize - regions of target dna, is more sensitive than gel-based rt-pcr and elisa, in large part because it produces a greater quantity of dna [ ]. immunochromatographic assay kits can be used at farms in order to detect pedv s proteins with % sensitivity and % specificity. this technique is less accurate than rt-pcr, but allows diagnosis within min. thus, it is particularly effective for quickly determining quarantine or slaughter policies in the field. in korea in particular, the endemic status of ped has led to several commercialised pedv detection systems, including conventional duplex rt-pcr (intron biotechnology, inc, korea), real-time rt-pcr (kogenebiotech, korea), dpo-based multiplex rt-pcr (seegene, seoul, korea), and immunochromatography (bionote, korea). recently, a protein-based elisa was developed to detect pedv. in this technique, a polyclonal antibody is produced by immunising rabbits with purified pedv m protein after expression of the m gene in escherichia coli. if analysis with anti-pedv-m antibody is then able to distinguish pedv-infected cells from those infected with other enteric viruses [ ].
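The sensitivity and specificity figures quoted for the field kits are computed by comparing kit results with a reference test on the same samples. The sketch below shows that arithmetic under the assumption that rt-pcr serves as the reference; the function name is hypothetical and the counts in the example are invented purely for illustration, not taken from any study cited here.

```python
def sensitivity_specificity(tp, fp, fn, tn):
    """Return (sensitivity, specificity) as fractions from a 2x2 comparison.

    tp: kit positive, reference positive    fp: kit positive, reference negative
    fn: kit negative, reference positive    tn: kit negative, reference negative
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one reference-positive and one reference-negative sample")
    sensitivity = tp / (tp + fn)  # share of true infections the kit detects
    specificity = tn / (tn + fp)  # share of uninfected samples the kit clears
    return sensitivity, specificity


if __name__ == "__main__":
    # Invented counts for illustration only.
    sens, spec = sensitivity_specificity(tp=45, fp=2, fn=5, tn=48)
    print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```

Both values depend on the reference test being treated as correct, which is one reason a kit can be described as less accurate than rt-pcr while still being useful for rapid decisions in the field.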
elisa blocking and indirect if have been used to detect pedv antibodies at and - days postinoculation, respectively [ ] . for all tests, the second (convalescent) serum sample should be collected and examined no sooner than - weeks after the onset of diarrhoea. pedv antibodies, detected by the elisa-blocking and if-blocking tests, have been found to persist for at least year. due to the special features of the porcine mucosal immune system, the presence of serum antibodies against gastroenteric pathogens is not always correlated with protection; rather, detection of these antibodies only proves that individuals had contact with infectious microorganisms [ ] [ ] [ ] . additionally, ha et al. [ ] recently reported that colostrum iga concentration is a better marker of protection from pedv infection than serum neutralising (sn) titre from serum samples; however, sn titres may still be useful in determining herd infection status [ ] . until they are -to -day old, piglets are protected against pedv by specific igg antibodies from the colostrum and milk of immune sows [ ] ; the length of immunity depends on the titre of the mother. after antigenic sensitisation in the gut, iga immunocytes migrate to the mammary gland, where they localise and secrete iga antibodies into colostrum and milk. this 'gut-mammary' immunologic axis is an important concept in designing optimal vaccines to provide effective lactogenic immunity [ ] . pigs that regularly suckle the immune mother are constantly inoculating their lumens with milk-bound iga antibodies, a process that confers passive immunity. igg accounts for more than % of colostrum immunoglobulin content. however, iga is more effective at neutralising orally infectious pathogens than either igg or igm because it is more resistant to proteolytic degradation in the intestinal tract and has a higher virus neutralising ability than igg and igm [ ] . 
therefore, only passive transfer of iga from an immunised mother effectively induces immune responses in suckling piglets [ ] . however, these antibodies do not protect against intestinal infection with pedv. several pedv vaccines, which differ in their genomic sequence, mode of delivery, and efficacy, have been developed. a cell culture adaptation of the cv strain had a strikingly different genomic sequence [ ] , was associated with much lower virulence in newborn caesarean-derived piglets, and caused much less severe histopathological changes. however, in europe, the disease caused by pedv was not of sufficient economic importance to justify vaccine development. vaccine development was therefore pursued mainly in asian countries, where pedv outbreaks have been severe enough to increase the mortality of newborn piglets. an alternative vaccine for suckling piglets may be an attenuated form of the virus derived from serial passage (passage level: ) of pedv [ ] . in japan, a commercial attenuated virus vaccine of cell culture-adapted pedv (p- v) has been administered to sows since . although these vaccines were considered efficacious, not all sows developed solid lactogenic immunity [ ] . oral vaccination with attenuated pedv dr (passage level: ) has recently been shown to be more efficacious than injectable vaccine. further, this vaccine candidate remained safe even after three back passages in piglets [ ] . piglet mortality can be reduced by orally inoculating pregnant sows with the dr strain. the viral strain was licensed and has been used as an oral vaccine in south korea since (patent no. ). the oral vaccine was also registered and commercialised in the philippines in . despite the documented benefits of the dr vaccine, it does not significantly alter the duration of virus shedding, an indication of immune protection [ , ] , in challenged piglets. 
shorter periods of virus shedding, as well as reduced severity and duration of diarrhoea in piglets, result from higher titres of serum antibodies; complete protection from pedv infection prevents shedding after exposure to viral challenge [ ] . oral immunisation with highly attenuated pedv confers partial protection against virulent challenge in conventional pigs, a result that is related to inoculation dose. at low doses of the attenuated pedv, % of pigs were protected against pedv challenge, but this proportion increased to % when pigs were inoculated with a dose times higher [ ] . however, viral shedding may be difficult to measure accurately, as it varies depending on viral strain and the sensitivity of the detection tool [ ] . therefore, the development of next-generation vaccines should focus on several criteria, including the factors that reduce virus shedding in piglets and the details of mucosal immunity to pedv. information on pedv mucosal immunity has been limited. de arriba et al. used the enzyme-linked immunospot (elispot) technique to characterise the isotype-specific antibody-secreting cells in mucosal and systemic-associated lymphoid tissues in pigs inoculated with pedv. after infection with pedv, levels of antibody-secreting cells (ascs) in the gut were similar to those observed in response to tgev and rotavirus infection; igg ascs were more prevalent than iga ascs. in pedv-infected pigs, a limited number of igm ascs were detected at post-infection day (pid) , and memory b cells appeared at pid in mesenteric lymph nodes, spleen, and blood. finally, the authors noted correlations between protection and both serum isotype-specific antibody and asc response in gut-associated lymphoid tissues and blood on the challenge day [ ] [ ] [ ] . 
there have also been reports of immune responses induced by transgenic plants and lactic acid bacteria that express pedv antigens [ , , ] . transgenic tobacco plants expressing the s protein region that corresponds to the neutralising epitope of pedv were fed to mice to test whether they induced an immune response. the efficacy of orally administered transgenic carrot and lettuce carrying the antigen gene was also tested after codon optimization and application of viral expression systems [ ] . in mice, the induced antibodies had neutralising activity against pedv. no neutralising antibodies were detected in either mice or pigs given mucosal immunizations with recombinant lactobacillus casei expressing pedv n (nucleoprotein) on its surface. however, this treatment elicited high levels of mucosal iga and circulating igg immune responses against the pedv n protein. before this vaccine can be commercialised, further studies are needed; for instance, it will be necessary to understand discrepancies between test results of the first lab-scale vaccine and large-scale pilot vaccines. research into this and other potential vaccines should be made a priority, as pedv-mediated diarrhoea causes significant economic losses in the swine industry. however, there is also a potential drawback to the use of live-attenuated vaccines. recently, a survey conducted in china indicated close phylogenetic relationships between a chinese pedv field strain (ch/gsjiii/ ) and two vaccine strains, suggesting that live vaccines can evolve into more infectious forms in the field [ ] . during the european outbreak of pedv, pregnant sows were deliberately exposed to the intestinal contents of dead infected pigs, thus artificially stimulating lactogenic immunity and, hopefully, shortening the duration of outbreaks on farms [ ] . however, several complications arose from this treatment. 
because the intestinal contents did not have homogeneous titres of pedv, the induction of immunity, including solid lactogenic immunity, could not be reliably expected. in addition, diseases may be spread via contamination with other viral agents, such as prrsv and pcv . immunoprophylactic agents may also be used to treat pedv. for instance, anti-pedv chicken egg yolk immunoglobulin (igy) and colostrum from immunized cows have been found to increase survival rates of virally challenged piglets [ , ] . mouse monoclonal single-chain variable fragment (scfv) antibodies to neutralised pedv, which can be expressed in e. coli, are as potent as the parental antibodies and block pedv infection of target cells in vitro [ ] . thus, it is possible that recombinant e. coli cells expressing scfv can be used as prophylactic agents against pedv infection. epidermal growth factor (egf), which stimulates the proliferation of intestinal crypt epithelial cells and promotes recovery from atrophic enteritis in pedv-infected piglets [ ] , has also been proposed as a potential novel therapy to promote intestinal villous recovery in piglets with pedv infections; it may also be useful in other species with viral atrophic enteritis. drawbacks of this treatment include its high price and questionable safety. key: cord- -y xvh hs authors: yamanaka, miles; crisp, tracey; brown, rhonda; dale, beverly title: nucleotide sequence of the inter-structural gene region of feline infectious peritonitis virus date: journal: virus genes doi: . 
/a: sha: doc_id: cord_uid: y xvh hs the sequence of the region located between the s and m glycoprotein genes of the - strain of feline infectious peritonitis virus (fipv) is presented. the inter-structural gene region encodes open reading frames (orfs), termed orfs a, b and , with nucleotide sequences conforming to the minimum conserved transcription signal upstream of each. an additional orf, x, partially overlaps the ′ end of orf a. the fipv interstructural gene region is identical in length when compared to the insavc- strain of canine coronavirus (ccv) but differs from various strains of transmissible gastroenteritis virus (tgev) by the presence of deletions and insertions. the sizes of orf a and are conserved in fipv, tgev and ccv. however, as with ccv, the fipv orf b is truncated in comparison with tgev. feline infectious peritonitis is a disease characterized by immunopathology and caused by a coronavirus. in fipv-infected cells, viral mrnas have been detected ( ) . two of these originate from a region of the fipv genome lying between the genes encoding s and m. this inter-structural gene region has been examined in the related coronaviruses transmissible gastroenteritis virus (tgev) and canine coronavirus (ccv) ( ± ). the arrangement of the open reading frames (orfs) in the inter-structural gene region has been described for the fipv genome ( , ), but detailed sequence has not been presented. here, we report on the sequence of this part of the fipv genome. we screened an fipv - cdna library with oligonucleotide probes derived from the published s sequence ( ) and isolated clones containing the interstructural gene region. the sequence of one of these cdnas (genbank accession number af ) was analyzed in detail. the overall organization of the fipv interstructural gene region is similar to that of ccv and tgev. 
three orfs encoding polypeptides of , and residues are present in the fipv sequence; these have been designated orf a, b and , respectively, in ccv and tgev (orf is also known as the small membrane gene because it encodes a polypeptide that is similar in sequence to an infectious bronchitis virus membrane protein). an additional orf of residues partially overlaps the ′ end of fipv orf a. this orf is present in ccv but absent from tgev, and has been called orf x ( ) . upstream of the fipv orf a and orf is the nucleotide sequence ctaaac that is the minimum conserved transcription signal found in other fipv, tgev and ccv genes ( , ) . a related sequence, ctaaat, is present upstream of fipv orf b. for these orfs, the distance between the transcription signal and the start of translation is conserved in the coronaviruses. the inter-structural gene regions of fipv and the insavc- strain of ccv are identical in size. the sequence identity of the two regions is . % at the nucleotide level. a base deletion, located bases upstream of orf a, and a base insertion, starting bases upstream of orf b, are present in the fipv interstructural gene region relative to the purdue strain of tgev. different strains of tgev also show variation in these same regions, emphasizing the sequence flexibility in this part of the coronavirus genome. the sequence identity between fipv - and the purdue strain of tgev in the inter-structural gene region is . %. the products of fipv orf a and orf are identical in length to the corresponding polypeptides of ccv and tgev (purdue strain), with amino acid similarities of . % and . %, respectively, between fipv and ccv, and . % and . % between fipv and tgev. in contrast, an amber codon limits fipv orf b to only residues while orf b of the purdue strain of tgev extends amino acids ( ) . 
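the transcription-signal observation above can be illustrated with a minimal python sketch. it scans a sequence for the conserved motif ctaaac and the related ctaaat, and reports how far downstream the next atg lies. the function name `find_trs_sites` and the toy sequence are hypothetical and chosen only for illustration; the toy string is not the fipv sequence, and the authors' actual analysis tools are not described in the text.

```python
import re

def find_trs_sites(seq, motifs=("ctaaac", "ctaaat")):
    """Return (motif, motif_start, distance_to_next_atg) for every motif hit.

    The distance is counted from the base after the motif to the first base
    of the next 'atg' downstream; it is None when no atg follows.
    """
    seq = seq.lower()
    hits = []
    for motif in motifs:
        for match in re.finditer(motif, seq):
            atg = seq.find("atg", match.end())
            distance = None if atg == -1 else atg - match.end()
            hits.append((motif, match.start(), distance))
    # report hits in genome order rather than grouped by motif
    return sorted(hits, key=lambda hit: hit[1])

# made-up toy sequence (an assumption, not real viral sequence)
toy = "ggctaaacttaatggcccctaaatgaaaatgtt"
for motif, start, distance in find_trs_sites(toy):
    print(motif, start, distance)
```

comparing the reported distances for the same orf across fipv, ccv and tgev sequences would be one simple way to check the conservation described in the text.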
although the fipv orf b is shorter, the region distal to the amber codon is similar in nucleotide sequence and identical in length to the remaining portion of the tgev orf b sequence. this is consistent with the idea that a base substitution has created a premature stop codon in the fipv orf b coding region. for ccv, orf b is also limited to residues. differences in expression and primary sequence of orf b occur in various tgev strains, and orf b is truncated in fipv and ccv. this indicates that orf b is not absolutely required for virus growth. sequence determination of virus passaged in cats will help to answer questions about the requirement for orf b expression in the virus life cycle ( ) . the coronaviridae we thank our colleagues lloyd chavez and bill acree at fort dodge laboratories for supplying the virus stock and for stimulating discussions. this work was supported by fort dodge laboratories and scios, inc. key: cord- -vq ckqtd authors: lee, meong-hun; jeoung, hye-young; park, hye-ran; lim, ji-ae; song, jae-young; an, dong-jun title: phylogenetic analysis of porcine astrovirus in domestic pigs and wild boars in south korea date: - - journal: virus genes doi: . /s - - - sha: doc_id: cord_uid: vq ckqtd porcine astrovirus (pastv) belongs to genetically divergent lineages within the genus mamastrovirus. in this study, / ( . %) domestic pig and / ( . %) wild boar fecal samples tested in south korea were positive for pastv. positive samples were mainly from pigs under weeks old. bayesian inference (bi) tree analysis for rna-dependent rna polymerase (rdrp) and capsid (orf ) gene sequences, including mamastrovirus and avastrovirus, revealed a relatively geographically divergent lineage. the pastvs of hungary and america belong to lineage pastv ; those of japan belong to pastv ; and those of canada belong to pastv , , , and , but not to . this study revealed that the pastvs of korea belong predominantly to lineage pastv and secondarily to pastv . 
it was also observed that pastv infections are widespread in south korea, in both domestic pigs and wild boars, regardless of disease state. their association with enteric diseases is not well documented, with the exception of turkey and mink astrovirus infections [ ] . the family astroviridae is separated into two genera. viruses of the genus mamastrovirus infect mammals, and those of the genus avastrovirus infect birds [ ] . avastroviruses include duck astrovirus (dastv- ), turkey astrovirus and (tastv- and tastv- ), and avian nephritis virus (anv) [ ] . mamastroviruses appear to have a broad host range, including human [ ] , sheep [ ] , cow [ ] , pig [ ] , dog [ ] , cat [ ] , red deer [ ] , mouse [ ] , mink [ ] , bat [ ] , cheetah [ ] , brown rat [ ] , roe deer [ ] , sea lion and dolphin [ ] , and rabbit [ ] . porcine astrovirus (pastv) was first detected by em in the feces of a diarrheic piglet [ ] and was later isolated in culture [ ] . molecular characterization of the capsid (orf ) gene from this isolate followed some years later [ ] . since then, research groups have successfully used pcr approaches to investigate the presence and diversity of pastv [ ] [ ] [ ] . pastv has been detected in several countries, including south africa [ ] , the czech republic [ ] , hungary [ ] , canada [ ] , and colombia [ ] . in south korea, studies on astrovirus have been limited to its detection in human infections, and no attempt has yet been made to determine the extent of astrovirus infection in the country's pig population. the aim of this study was therefore to investigate the genetic groups of korean pastv in domestic pigs and wild boars and to determine the incidence of co-infection with other porcine enteric viruses. 
a total of fecal samples from domestic pigs ( piglets under weeks old, weaned pigs, growing-finishing pigs, and sows over year old) were collected from six pig farms with good breeding facilities in four provinces of south korea from january to june . of these samples, were from diarrheic and were from non-diarrheic pigs. a total of fecal samples from wild boars over year old were collected from wildlife areas in five provinces of south korea during the hunting season from december to january . of these samples, were from diarrheic and were from non-diarrheic boars. viral rna was extracted from the feces using trizol ls according to the manufacturer's instructions. pastv was detected in fecal specimens by rt-pcr, as previously described [ ] , with primers specific for the rdrp and orf regions of pastv (pastv-f: -tgacattttgtggatttacagtt- and pastv-r: -cacccagggctgacca- ). the rt-pcr resulted in the amplification of a -nt-long fragment at an annealing temperature of °c. products of the expected size were cloned with the pgem-t vector system ii (promega, cat. no. a , usa). the cloned gene was sequenced with t and sp sequencing primers on an abi prism xi dna sequencer (applied biosystems, foster city, ca, usa) at the macrogen institute (macrogen, seoul, korea). the sequences of all the positive samples for pastv were submitted to genbank under accession numbers jq -jq . the astroviruses used in this study are listed in table along with their genbank accession numbers. to investigate the relationship between astroviruses and other economically important viral diseases that cause diarrhea in piglets in asia, screening tests were conducted to detect porcine epidemic diarrhea virus (pedv), transmissible gastroenteritis virus (tgev), and porcine group a rotavirus (gar), as previously described [ ] . 
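as a rough sanity check on primers such as the pastv pair quoted above, the following python sketch computes primer length, the reverse complement, and an approximate melting temperature using the wallace rule (2 °c per a or t, 4 °c per g or c), which is only a coarse estimate for short oligonucleotides. the primer strings are taken from the text with the extraction spaces removed, which is an assumption about the intended sequences; the function names are hypothetical, and the annealing temperature actually used in the study is not recoverable from the text.

```python
# complement table for unambiguous dna bases (lowercase, as in the text)
_COMPLEMENT = str.maketrans("acgt", "tgca")

def reverse_complement(primer):
    """Return the reverse complement of a dna primer (a, c, g, t only)."""
    return primer.lower().translate(_COMPLEMENT)[::-1]

def wallace_tm(primer):
    """Approximate melting temperature in degrees c by the wallace rule."""
    p = primer.lower()
    at = p.count("a") + p.count("t")
    gc = p.count("g") + p.count("c")
    return 2 * at + 4 * gc

# sequences as printed in the text, with spaces removed (assumption)
primers = {
    "pastv-f": "tgacattttgtggatttacagtt",
    "pastv-r": "cacccagggctgacca",
}

for name, seq in primers.items():
    print(name, len(seq), wallace_tm(seq), reverse_complement(seq))
```

nearest-neighbour thermodynamic models give better estimates than the wallace rule; the rule is used here only because it needs no parameters beyond base counts.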
the primer pairs used in this study were p (ttctgagtcacgaacagcca, - ) and p (catatgcagcctgctctgaa, - ) for the s gene of pedv, t (gtggttttggtyrtaaatgc, - ) and t (cactaaccaacgtggarcta, - ) for the s gene of tgev, and rot (aaagatgctagggacaaaattg, - ) and rot (ttcagattgtggagctattcca, - ) for the segment region of group a rotavirus. the expected product sizes of the multiplex rt-pcr were bp for tgev, bp for pedv, and bp for rotavirus, which could be differentiated by agarose gel electrophoresis. of the domestic pig fecal samples tested, were positive for pastv. the prevalence of pastv in weaned pigs ( . %, / ) was higher than that in suckling piglets ( . %, / ) and in growing-finishing pigs ( . %, / ) ( table ). only one wild boar, from gyunggi province, tested positive for pastv ( . %, / ). the low prevalence of pastv in wild boars might be because older animals (over year old) are usually less susceptible to infection and wild pigs are generally more resistant to many diseases than domesticated ones. the percentage of samples that were pastv-positive differed among the six pig farms: chungchong a, . % ( / ); chungchong b, . % ( / ); kangwon, . % ( / ); gyunggi a, . % ( / ); gyunggi b, . % ( / ), and gyungsang, % ( / ). the low or absent incidence of pastv in gyunggi a (growing-finishing pigs) and gyungsang (sows) can likewise be attributed to the lower susceptibility of adult pigs to infection. the proportions of non-diarrheic and diarrheic pig fecal samples were . % ( / ) and . % ( / ), respectively. these results suggest that pastv is widespread in south korea regardless of the disease status (with or without clinical manifestations) of pigs. although astroviruses are highly prevalent in young pigs and are mostly present in diarrheic pigs, pastv is also a common finding in the fecal samples of apparently healthy pigs [ , ] . 
the clinical significance of pastv infection remains to be clarified. the clinical symptoms of diarrhea are frequently reported to be associated with rotavirus, coronavirus, and calicivirus-like infections in piglets [ , , , , ] . although pedv and tgev infections were not identified in any of the fecal samples, porcine gar infection was identified in . % ( / ) of the samples collected from suckling pigs under weeks old (table ) . furthermore, coinfection with pastv and gar was observed in two cases (one in diarrheic and one in non-diarrheic pig fecal samples). however, it is not yet clear whether gar is directly associated with astrovirus infection in pigs. all astrovirus sequences were aligned initially with the clustalx . program [ ] . the nucleotide sequences were translated, and the nucleotide and amino acid sequence identities among the astrovirus strains were calculated with bioedit . [ ] . bayesian trees were generated with mrbayes . . [ , ] using best-fit models selected with prottest . [ ] for the amino acid alignments. markov chain monte carlo (mcmc) analyses were run for , , generations for each amino acid sequence. bayesian posterior probabilities were estimated by mrbayes . . on the basis of a % majority rule consensus of the trees. for each analysis, a chicken astrovirus (nc_ ) was specified as the outgroup and a graphic output was produced with treeview . . [ ] . the best models for the rdrp and orf amino acid sequences, obtained with prottest . according to the akaike information criterion (aic), were wag + i + g and wag + g + f, respectively. bi trees for the rdrp (fig. ) and orf (fig. ) amino acid sequences revealed the presence of five unrelated pastv lineages. 
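the model-selection step described above rests on the akaike information criterion, aic = 2k - 2 ln l, where k is the number of free parameters and ln l is the maximised log-likelihood; the model with the lowest aic is preferred. the python sketch below shows the calculation only. the log-likelihoods and parameter counts are invented placeholder values, not results from this study, and the function names are hypothetical; prottest itself also counts branch lengths and other parameters that are omitted here.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

def best_model(candidates):
    """Pick the model name with the lowest aic.

    candidates maps a model name to (log_likelihood, n_params).
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))

# hypothetical values for illustration only (not from the paper)
candidates = {
    "wag": (-5230.0, 0),
    "wag+g": (-5100.0, 1),
    "wag+i+g": (-5095.0, 2),
}

for name, (lnl, k) in candidates.items():
    print(name, aic(lnl, k))
print("selected:", best_model(candidates))
```

the penalty term 2k is what keeps a richer model from being selected unless its likelihood gain is large enough to justify the extra parameters.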
the first lineage (pastv ) contained the poastv - strain (hm ) from a canadian pig and two porcine strains (y and ab ) derived from japanese pigs. interestingly, rat astrovirus (hm ) and porcine poastv cc (jn ) formed two different lineages on the bayesian trees for the rdrp and orf amino acid sequences (figs. , ) . astrovirus strains contained in group (g ), (g ), and (g ) on the two bayesian trees showed similar topologies. however, astrovirus strains in group (g ) on the bayesian tree for the orf amino acid sequence were divided into g and group (g ) on the bayesian tree for the rdrp amino acid sequence (figs. , ) . a strain isolated from a hungarian wild boar in [ ] belonged to pastv or group (g ), which also contained pastk derived from korean wild boar. a previous study suggested that the number of pastv lineages extends to a total of five, all of which most likely represent distinct species of different origins [ ] . however, as astv research data accumulate from countries around the world, future studies could reveal further genetic lineages. in this study, the porcine astrovirus strains appeared to be phylogenetically related not only to prototypical human astroviruses (as was already known) but also to the recently discovered novel human strains. this finding suggests the existence of multiple cross-species transmission events between these hosts and other animal species. several recent studies have shown that bat astroviruses form multiple independent lineages [ , ] . bat astrovirus strains in this study also showed independent lineages; specifically, the ld (fj ) strain had a close relationship with astrovirus strains of human, sheep, mink, and sea lion (figs. , ) . a previous study suggested that porcine astvs have played an active role in the evolution and ecology of the astroviridae [ ] . 
recent studies have shown evidence of multiple recombination events between distinct pastv strains and between pastv and human astrovirus (hastv) in the variable region of orf [ ] , as well as interspecies recombination between porcine and deer astroviruses [ ] . a study of the molecular epidemiology and genetic diversity of human astrovirus in south korea from to revealed genotype to be the most prevalent, accounting for . % of strains, followed by genotypes ( . %), ( . %), ( . %), ( . %), and ( . %) [ ] . this finding suggests that little interspecies (between human and pig) transmission has occurred until now in south korea. in conclusion, this study extends current knowledge of pastv in wild boar and domestic pig. more extensive studies of pastvs in wildlife are needed to further elucidate their potential role in the epidemiology of astrovirus infection in the domestic pig population. in addition, continuous surveillance of pastv prevalence in both hosts will provide a wider understanding of possible cross-species transmission, in particular transmission to humans. (figure legend: korean pastvs are shown in bold print, and strains isolated from korean and hungarian wild boar are marked with a star and an arrow, respectively. the numbers above the nodes represent posterior probabilities.) acknowledgments we are grateful to mr. min-heg lim and ms. sa-ra choi for technical assistance. key: cord- -yqysedne authors: ducatez, mariette f.; liais, etienne; croville, guillaume; guérin, jean-luc title: full genome sequence of guinea fowl coronavirus associated with fulminating disease date: - - journal: virus genes doi: . /s - - -z sha: doc_id: cord_uid: yqysedne guinea fowl coronavirus (gfcov), a recently characterized avian coronavirus, was identified from outbreaks of fulminating disease (peracute enteritis) in guinea fowl in france. 
the full-length genomic sequence was determined to better understand its genetic relationship with avian coronaviruses. the full-length coding genome sequence was , nucleotides long with open reading frames and no hemagglutinin–esterase gene: a genome organization identical to that of turkey coronavirus [ ′ untranslated region (utr)—replicase (orfs a, ab)—spike (s) protein—orf (orfs a, b)—small envelop (e or c) protein—membrane (m) protein—orf (orfs b, c, a, b)—nucleocapsid (n) protein (orfs n and b)— ′ utr]. this is the first complete genome sequence of a gfcov and confirms that the new virus belongs to group gammacoronaviruses. electronic supplementary material: the online version of this article (doi: . /s - - -z) contains supplementary material, which is available to authorized users. coronaviruses (covs) are enveloped viruses with positivesense, non-segmented rna genomes of - kb. covs infect a wide range of hosts causing various degrees of morbidity and mortality. group i covs (alphacoronaviruses) contain viruses that infect not only humans (hcov- e and hcov-nl ) but also cats and dogs (with feline cov and canine cov, respectively), or pigs (with the porcine transmissible gastroenteritis virus, tgev for example). similarly, group ii covs (betacoronaviruses) may infect humans (examples: hcov-oc , hcov-hku , severe acute respiratory syndrome (sars)-related covs or the recently emerged mers-cov), horses (with ecov), or cattle (with bcov). in contrast, group iii covs (gammacoronaviruses) primarily infect birds: chickens, peafowl, and partridges harbour infectious bronchitis virus (ibv) while turkeys have turkey cov (tcov) and guinea fowl may be infected with guinea fowl cov (gfcov). gammacoronavirus strains have however been isolated from a whale and a wild felid [ ] . group iv covs (deltacoronaviruses) have been detected in birds (with bucov, mucov, spcov, etc.), or pigs (with porcine deltacoronavirus) [ ] . 
interestingly covs of the groups i, ii, and iv have been detected in chiroptera (bats), thought to be the reservoir of covs [ , ] . in the present study, we focused on a new member of the group iii covs, gfcov, and aimed at sequencing its full genome to better understand its molecular relationship with gammacoronaviruses. to determine the full genome of gammacov/guinea fowl/ france/s/ (gfcov/fr/ ), we first analysed the data generated on a miseq illumina platform as previously described [ ] . briefly, pooled intestinal contents of experimentally infected guinea poults were clarified, ultracentrifuged, and treated with nucleases to concentrate encapsidated viral material. rna was extracted, and a random rt-pcr was performed to generate unbiased pcr products of about bp [ , ] . the sequences generated that matched with avian covs sequences, as determined using gaas software [ ] , were extracted for further analysis and visualized using integrative genomics viewer (igv) with the closest blast hit as reference genome: tcov mg (accession number: eu ) [ ] . primers were designed based on the known sequence data to amplify missing genome fragments by pcr. sanger sequencing was then performed with pcr primers. the full genome sequence was submitted to embl and was attributed the following accession number: [ln ]. sequence analysis was carried out using bioedit version . . . [ ] , muscle for the alignment [ ] , and mega version . for the phylogeny [ ] . the gfcov-generated sequences were assembled into one contiguous coding sequence of , nucleotides. the entire genome had a gc content of . %, identical to the turkey coronavirus (tcov) mg genome [ ] . 
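the gc content reported above is a simple base-composition statistic. a minimal python sketch of the calculation follows; the function name `gc_content` is hypothetical, the choice to ignore ambiguous bases such as n is an assumption, and the example strings are toy inputs rather than the gfcov genome.

```python
def gc_content(seq):
    """Percentage of g and c among unambiguous bases (a, c, g, t/u)."""
    # treat rna input like dna so that u is counted as t
    seq = seq.upper().replace("U", "T")
    counted = [base for base in seq if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(base in "GC" for base in counted)
    return 100.0 * gc / len(counted)

# toy inputs (assumptions, not real genome data)
print(gc_content("atgc"))
print(gc_content("ggnncc"))
```

applying the same function to two genomes, for example the gfcov and tcov sequences, would give the kind of side-by-side comparison made in the text.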
gfcov and tcov genomes have the same organization: (i) a untranslated region (utr), (ii) two large slightly overlapping orfs coding for the replicase: a and ab, (iii) gene coding for the spike (s) protein, (iv) orf (orfs a, b), (v) gene coding for the small envelop (e or c) protein, (vi) gene coding for the membrane (m) protein, (vii) orf ( b and c, a, b), (viii) genes coding for the nucleocapsid (n) protein (orfs n and b), and (ix) utr ( table ) . the multiprotein on single orfs is generated by alternative translation. while the role of avian coronavirus (ibv) structural proteins is known: binding to rna, nucleocapsid formation and role in cell-mediated immunity for n; virus budding site determination, role in virus particle assembly and in interferon-induction, interaction with viral nucleocapsid for m; association with viral envelop, role in virus particle assembly and putatively in apoptosis for e; binding to cellular receptors, induction of fusion between viral and cellular membranes, induction of neutralizing antibodies and role in cell-mediated immunity for s; little is known on the function of non-structural proteins. it has mainly been shown that they are not essential for virus replication in vitro but likely help the virus replicate in vivo [ , ] . the proteins a, b, b, a, and n were of the same size. sizes of other proteins varied, but within the range observed previously between different tcov strains. interestingly, gfcov/fr/ harboured a shorter small envelop protein than its tcov counterparts (table ) . further studies are warranted to understand the impact of avian covs protein sizes in the biology of the viruses. phylogenetic analysis on the full genome of gfcov/fr/ showed it clearly clustered with north american tcov strains (fig. a , supported by a high bootstrap value of ), as it was observed previously for the s gene [ ] . the genetic distance between gfcov/fr/ and tcov ranged between . and . 
%, while genetic distances between gfcov/fr/ and representative ibv strains were larger and varied between . and . % (supplementary table) . a simplot analysis comparing the gfcov/fr/ full genome to its closest tcov and ibv blast hits showed that the three genomes are highly similar throughout the genome ( - % similarity, with no significantly higher identity of gfcov/fr/ with tcov or ibv genomes), except for the s gene (fig. b) . the gfcov s gene was indeed more closely related to tcov s than to ibv s genes, but it was also more distinct from both viruses in the s gene than in the rest of its genome (< % identity with ibv and - % identity with tcov s genes, fig. b) , suggesting a recombination event, as was hypothesized for the origin of tcov [ ] . a parallel evolution from a common ancestor, with a much higher substitution rate in the s gene than in the rest of the genome, cannot however be ruled out at this stage. the present study showed that gfcov/fr/ harbours a genome organization very similar to that of tcov strains. in addition, and again like tcov, gfcov/fr/ likely originated from a recombination event between an ibv-like (or tcov-like) virus that would have given most of its genome and a so far unknown cov that would have contributed its spike gene. despite the similarity of their genomes and their enteric tropism, tcovs often cause mild clinical signs while gfcovs are usually associated with extremely high mortalities in their host, suggesting strikingly different host-virus interactions. further studies are ongoing to understand the host range of gfcov/fr/ and its determinants of pathogenicity. 
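the similarity-plot idea used above can be approximated with a sliding-window percent identity over a pairwise alignment. the python sketch below is a simplified stand-in, not the simplot program itself, which offers distance models and other options not reproduced here. the function name, window size, step size, and the toy aligned sequences are all assumptions for illustration.

```python
def window_identity(aligned_a, aligned_b, window=200, step=20):
    """Percent identity in sliding windows along two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped. Returns a
    list of (window_start, percent_identity); identity is None when a
    window contains only gapped columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    results = []
    for start in range(0, len(aligned_a) - window + 1, step):
        columns = [
            (x, y)
            for x, y in zip(aligned_a[start:start + window],
                            aligned_b[start:start + window])
            if x != "-" and y != "-"
        ]
        if columns:
            matches = sum(x == y for x, y in columns)
            identity = 100.0 * matches / len(columns)
        else:
            identity = None
        results.append((start, identity))
    return results

# toy alignment (assumption): identical first half, divergent second half
query = "AAAAAAAAAA"
reference = "AAAAATTTTT"
for start, identity in window_identity(query, reference, window=5, step=5):
    print(start, identity)
```

a drop in identity confined to one region, as reported for the s gene, is the pattern that would prompt a recombination hypothesis; it does not by itself distinguish recombination from locally accelerated evolution.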
acknowledgments this work was supported by the 'epicorem' grant of the agence nationale de la recherche (anr), by the french
key: cord- -i tkdgy authors: suo, siqingaowa; wang, xue; zarlenga, dante; bu, ri-e; ren, yudong; ren, xiaofeng title: phage display for identifying peptides that bind the spike protein of transmissible gastroenteritis virus and possess diagnostic potential date: - - journal: virus genes doi: . /s - - - sha: doc_id: cord_uid: i tkdgy the spike (s) protein of porcine transmissible gastroenteritis virus (tgev) is located within the viral envelope and is the only structural protein that possesses epitopes capable of inducing virus-neutralizing antibodies. among the four n-terminal antigenic sites a, b, c, and d, site a and to a lesser extent site d (s-ad) induce key neutralizing antibodies. recently, we expressed s-ad (rs-ad) in recombinant form. in the current study, we used the rs-ad as an immobilized target to identify peptides from a phage-display library with application for diagnosis. among the phages selected that specifically bound to rs-ad, the phage bearing the peptide tlnmhlfpfhtg bound with the highest affinity and was subsequently used to develop a phage-based elisa for tgev. when compared with conventional antibody-based elisa, phage-mediated elisa was more sensitive; however, it did not perform better than semi-quantitative rt-pcr, though phage-mediated elisa was quicker and easier to set up. transmissible gastroenteritis virus (tgev) is a member of the coronaviridae family and is a major cause of enteric disease in pigs where it threatens swine production and triggers substantial economic losses in the industry [ ] [ ] [ ] [ ]. its genome is composed of positive-stranded rna approximately . -kb in length. 
the virus consists of four structural proteins: envelope (e), membrane (m), spike (s), and nucleocapsid (n) proteins [ , , ]. non-structural proteins, which comprise two-thirds of the 5′-proximal end, are encoded by open reading frames a and ab as well as the replicase. in contrast, the 3′ end of the genome encodes both non-structural and structural proteins ( -s- a- b-e-m-n- - ) [ ]. the s protein, which induces neutralizing antibodies, is important in the initiation of infection [ ] [ ] [ ] and has been further delineated into four antigenic sites a, b, c, and d, which are located within the n-terminal region of the s protein [ ]. among these, only site a and to a lesser extent site d (herein defined as s-ad) are involved in eliciting neutralizing antibodies. recent work demonstrated that recombinant s-ad (rs-ad) was able to induce antibodies capable of neutralizing tgev infection in vitro [ ]. attenuated or inactivated tgev vaccines are less than optimal because they are capable of reverting back to virulent phenotypes and generally do not prevent viral shedding. therefore, effective diagnostic tests have become important in virus management and control. phage display is a proven technology for identifying small peptide ligands that can bind specific target proteins [ ] [ ] [ ] [ ]. it has been utilized in antibody engineering [ ], drug discovery [ ], vaccine development [ ], and molecular diagnosis. in virology, phage display has been used to identify peptides that interact with several viruses such as bovine rotavirus [ ], adenovirus type [ ], andes virus [ ], sin nombre virus [ ], coronavirus [ ], and avian h n virus [ ]. herein, we use similar technology and advance previous work by using the rs-ad as an immobilized target to select phages with diagnostic potential for tgev from a peptide display library. 
our results indicate that phages bearing peptide ligands that bind rs-ad can be used to develop a phage-mediated elisa with high sensitivity and specificity to distinguish tgev from other common swine viruses.
biopanning
swine testis (st) cells were purchased from atcc and used to propagate tgev strain pur -mad [ ]. the rs-ad was produced and purified as described elsewhere [ ]. a -mer phage-display library was purchased from new england biolabs for panning according to published protocols [ , , ] using the rs-ad as a target at a concentration of µg/well. the -well plates coated with rs-ad were initially incubated with the phage library ( . pfu/ml; µl/well) suspended in tbst ( mm tris-hcl [ph . ], mm nacl, . % tween- ) for min. subsequent pannings , , and were performed using incrementally higher concentrations of tween- . the phage titers of the input, output (elution), and amplified phages were determined as defined by the manufacturer. indirect elisa was used to assess the phages that remained after four rounds of biopanning. either tgev ( . mg/ml) or rs-ad ( µg/well) in . m nahco ph . was used to coat -well plates at °c for h. the next day, the plates were blocked with % bovine serum albumin (bsa) in tbs (tbsb) for h, washed ( ) with tbst, and then incubated with phage ( . pfu/ml in . m nahco , ph . ; µg/well) for h at °c. the plates were again washed with tbst, then incubated for h at °c with rabbit anti-m antibody ( : in tbsb; abcam), followed by horseradish peroxidase (hrp)-conjugated goat anti-rabbit igg antibody (garp; : in tbsb, sigma). the od nm was determined in triplicate as previously described [ ]. ten phages with the highest affinity for binding rs-ad as determined by elisa were amplified, precipitated with polyethylene glycol-nacl, and then used for dna extraction according to the manufacturer's instructions (new england biolabs). 
amplification of the genes encoding the exogenous peptides was performed using sense ( -tcacctcgaaagcaagctga) and anti-sense ( -ccctcatagttagcgtaacg) m primers followed by dna sequencing [ , ]. the pcr conditions were as follows: °c for min, cycles of °c for s, °c for s, °c for s, and a final extension at °c for min. to compare the sensitivities of phage-mediated elisa to antibody-mediated elisa, tgev serially diluted in . m nahco (ph . ) was coated onto duplicate elisa plates overnight at °c followed by blocking with % skim milk for h at rt. the selected phages or unbound phage complexes (negative control) diluted in pbs ( . pfu/ml) were added to one set of plates, followed by anti-m antibody ( : in pbs + bsa). to the second set of duplicate plates, rabbit anti-tgev polyclonal antiserum serially diluted in pbs + bsa, and normal rabbit serum were added as the primary and control antibodies, respectively. after incubating both sets of plates for h at °c followed by extensive washing, garp ( : ) was added as described above. the od values were read on all plates; od ratios, defined as od (sample-negative standard) (p)/od (positive control-negative standard) (n), > were judged as positive. all experiments were performed in triplicate. the tcid50 of tgev was determined using the reed-muench method, and tgev was adjusted to . mg/ml in pbs. total rna was extracted from µl of virus (fastgene, china) and dissolved in µl of sterile water. reverse transcription was performed in µl using µl of rna ( ng/µl), oligo dt as primer, and m-mlv reverse transcriptase as recommended by the manufacturer (takara, china). the resulting cdna ( µl) was used as a template for pcr in µl which included . µl of easy taq polymerase (takara, china), µl of dntp ( . mm), pcr buffer ( µl), and . µl each of sense ( -cttagtagtaatattttgcatac) and antisense ( -tatagcagatgatagaattaaca) primers. amplification conditions were as follows: °c for min, then cycles of °c s, . 
°c s, and °c s followed by a final extension at °c for min. the amplified fragment was confirmed by dna sequencing. phage specificity was evaluated against the following panel of porcine viruses: tgev (strain hr/dn ) [ ], porcine epidemic diarrhea virus (pedv; strain hljby) [ ], porcine reproductive and respiratory syndrome virus (prrsv; strain jilintn ) [ ], porcine circovirus type ii (pcv ; strain pcv -ljr) [ ], porcine parvovirus (ppv; strain ppv ) [ ], porcine pseudorabies virus (prv; strain kaplan) [ , ], and porcine rotavirus (prov; isolate dn ) [ ]. all viruses were initially coated at µg/ml then serially diluted in . m nahco (ph . ) and subjected to phage-elisa as described above. average od values were obtained from three independent experiments. data were collated and the mean ± sd values were determined. arithmetic means were compared between treatment groups using anova (spss . ; spss inc., chicago, illinois, usa) followed by duncan's multiple-range test. values of p < . and p < . were defined as statistically significant ("*") or highly significant ("**"), respectively. in this study, we used phage display to select -mer peptides that bind rs-ad [ ] and that may be useful for diagnosing tgev infections. after four rounds of panning, rs-ad-specific phages increased from . in the first round to . in the fourth round (table ). following the last screen, we selected phage clones from the original that bound both rs-ad and tgev. this subset was characterized by elisa with respect to their binding efficiencies (fig. ). pcr amplification and sequencing indicated that nine distinct -mer peptides were identified among the phages that were selected (table ). in contrast to previous reports [ , , ], these peptides exhibited substantial sequence diversity in the number of peptides that bound to rs-ad. it is not known if this relates to the length of the target protein or to changes made in the panning process to enhance binding specificity. 
as shown in fig. , we selected four (phtgev-sad- , phtgev-sad / , phtgev-sad , phtgev-sad ) of the ten phages with the highest binding affinity to tgev for further testing. the lowest detectable quantity of tgev for the above defined phages was . , . , . , and . mg, respectively, suggesting that phtgev-sad was the most sensitive when used in a phage-based elisa. binding directly to tgev was uncharacteristically better than binding to the rs-ad used in the selection process (figs. , ). this is likely attributable to more complete folding of the native protein or to better accessibility of the binding epitope in the native form. the minimum quantity of tgev required for detection via antibody-based elisa was . µg (p/n value > ) (fig. ), whereas the minimum quantity of tgev required for phtgev-sad -based elisa was . µg. this is consistent with the phage-mediated elisa being more sensitive than conventional antibody elisa. a number of elisa-based assays have been developed over the years for detecting tgev, many of which have been directed at differentiating tgev from prcv-infected animals. among the earlier ones, sestak et al. [ ] targeted the s glycoprotein of tgev in a competition elisa where recombinant s protein was coated onto plates and used to capture host antibodies. using a monoclonal antibody to epitope d, which is specific for tgev, the investigators were able to differentiate the infectious agents. liu et al. [ ] cloned and expressed the nucleoprotein (n) to develop an elisa. compared to the virus neutralization assay, they demonstrated % sensitivity and specificity; however, they did not characterize or address the lower level of sensitivity in vitro or in vivo. in , elia et al. [ ] used the recombinant s protein to develop an elisa to assess swine-like tgev coronaviruses in canine hosts. given the novelty of the virus, they were unable to compare it to other assays currently in use. zou et al. 
[ ] use techniques similar to those developed here, i.e., peptide display, to target the m protein of tgev in developing an elisa-based diagnostic test. in this case, the sensitivity of the elisa exceeded that when the phage-mediated elisa and antibody elisa were compared to rt-pcr which targeted a -base pair fragment of the s gene, the rt-pcr was most sensitive of all assays tested. this is not unexpected given the higher sensitivity of pcr assays in general. pcr amplification was positive using cdna equivalents of . µg of tgev (data not shown). real-time pcr and/or nested pcr would clearly have generated even more sensitive results. in addition, phages expressing peptides that bind to tgev s-ad did not bind to other selected viruses (fig. ).
table sequences of tgev rs-ad peptides. predicted amino acid sequences were generated for ten selected phages
in summary, we identified peptides that specifically bind to tgev and can form the basis of new diagnostic tests where the sensitivity of phtgev-sad was . µg of tgev. this sensitivity fared quite well when compared to the antibody-mediated elisa which had a sensitivity of . µg but fell short of the sensitivity of rt-pcr; however, phtgev-sad- provides a quicker and less costly alternative to rt-pcr. the authors declare no conflicts of interest.
key: cord- -os kyvvf authors: wang, li; dai, xianjin; song, han; yuan, peng; yang, zhou; dong, wei; song, zhenhui title: inhibition of porcine transmissible gastroenteritis virus infection in porcine kidney cells using short hairpin rnas targeting the membrane gene date: - - journal: virus genes doi: . /s - - - sha: doc_id: cord_uid: os kyvvf the membrane (m) protein is the most abundant component of the porcine transmissible gastroenteritis virus (tgev) particle. 
to explore the possibility of using rna interference (rnai) as a strategy against tgev infection, three plasmids (prnat- , prnat- , and prnat- ) expressing short hairpin rnas were designed to target three different coding regions of the m gene of tgev. the plasmids were constructed and transiently transfected into porcine kidney cells (pk- ) to determine whether these constructs inhibited tgev production. the analysis of cytopathic effects demonstrated that prnat- and prnat- could protect pk- cells against pathological changes specifically and efficiently. additionally, indirect immunofluorescence and 50% tissue culture infectious dose (tcid50) assays showed that prnat- and prnat- effectively inhibited the multiplication of the virus at the protein level. quantitative real-time pcr further confirmed that the amounts of viral rnas in cell cultures pre-transfected with the three plasmids were reduced by , , and %, respectively. this is the first report of rnai targeting the m gene of tgev. our results could promote studies of the specific function of viral genes associated with tgev infection and might provide a theoretical basis for potential therapeutic applications. transmissible gastroenteritis coronavirus (tgev) is a positive-sense rna virus, which is a member of a large family of enveloped viruses [ ]. pigs of any age and breed can be infected. in particular, suckling piglets at about weeks old are the most susceptible, showing mortality rates up to %, which results in large economic losses in swine-producing areas worldwide [ , ]. however, the pathogenic mechanism of tgev remains unclear [ ]. at present, several vaccines to prevent tge are available; however, their efficacies are variable. attenuated tgev vaccines carry the risk of reverting to the virulent form and might even induce an adverse reaction, and inactivated viruses are not sufficiently protective in pigs [ , ]. 
moreover, newborn piglets can suffer from gastroenteritis within h post-infection, and death can occur in - days [ ], whereas current vaccines cannot provide complete protection in the first days after inoculation. thus, it is necessary to develop novel, highly effective, and rapid-acting antivirals to resist tgev infection [ ]. rna interference (rnai) is a precise gene silencing method that uses double-stranded rna (dsrna) molecules comprising - nucleotides (nt). rnai in the form of small interfering rnas (sirnas) or short hairpin rnas (shrnas) has been studied for its interference with virus replication [ , ]. recent research suggests that the replication of various viruses, including many coronaviruses, could be inhibited effectively in vitro and in vivo [ ] [ ] [ ] [ ] [ ] [ ]. therefore, it might be possible to disrupt the replication of tgev in cell culture using shrnas targeting the m gene of tgev. tgev is a positive-sense, ssrna virus with a . kb genome that contains a leader sequence at the 5′ end and a poly (a) tail at the 3′ end, and that encodes four structural proteins [spike (s), membrane (m), nucleocapsid (n), and envelope (e)] and five non-structural proteins [ ] [ ] [ ]. the s protein is a major membrane glycoprotein that plays important roles in inducing a protective immune response, and in virus attachment, membrane fusion, and viral pathogenicity [ ] [ ] [ ]. the n protein, together with the genomic rna, forms the viral nucleocapsid [ ]. the e protein regulates virion assembly and release [ ]. the m protein is the most abundant component of the coronavirus particle [ ] and differs from other viral proteins in terms of its structure, processing, and intracellular transport [ ]. the expression of the m and e proteins might be sufficient to trigger the formation of virus-like particles (vlps). 
in addition, m is highly conserved among different strains, and our previous studies proved that the expression of the m protein alone using a baculovirus expression system could lead to the formation of vlps, as observed under a transmission electron microscope, which further confirmed that the m protein of tgev is a decisive protein for the proliferation of viral proteins [ ]. as one of the important structural proteins of tgev particles, the m protein is exposed on the viral internal core [ ] and associates with the golgi complex in the cell, which suggests that the m protein plays a mechanistic role at the site of virus assembly and budding [ ], and that m is an indispensable component for the replication of virus particles in host cells. in this study, we constructed three shrnas in a plasmid expression system that targeted the m gene and investigated whether shrna-mediated rna interference could inhibit tgev infection of pk- cells.
virus and cells
tgev strain cq was isolated from sick piglets with symptoms of diarrhea [ ] and stored in our laboratory. pk- cells were grown in high glucose dulbecco's modified eagle medium (dmem) supplemented with % fetal bovine serum (gibco, usa), iu of penicillin, and streptomycin per ml, at °c in a % co2 atmosphere incubator. according to the general principles and guidelines for the design of rna interference, the sequence of the m gene of tgev cq was analyzed using ambion's online sirna target design tool to choose the three best target sequences (http://www.ambion.com/techlib/misc/sirna_finder.html). three theoretically effective sequences at nucleotide positions - (rnat- ), - (rnat- ), and - (rnat- ) were selected. the sequences were analyzed by blast to ensure that they did not have any similar sequences in the swine genome, but shared % similarity with the published sequences of different tgev strains. these three sequences are listed in table . 
all the sequences were arranged in the following alignment: bamhi + sense + loop + antisense + termination + hindiii. we designed the double-stranded oligo dna hairpin structures to target the m gene after annealing. all the shrna-expressing plasmids were diluted with tris-edta buffer to a final concentration of µg/µl. the annealing reaction system ( µl) comprised µl of shrna sense template, µl of antisense template, and µl of ddh2o. the mixture was heated to °c for min, cooled to °c for s, and then incubated at °c. the annealed shrna dna sequences (rnat- , rnat- , and rnat- ) and shrna expression vector, prnat-u . /neo (ribobio, china), were then double digested with bamhi and hindiii, and inserted into bamhi-hindiii digested prnat-u . /neo to yield prnat- , prnat- , and prnat- , respectively. after transformation of escherichia coli dh a competent cells to obtain the recombinant plasmids, the positive clones were identified by pcr and sequencing analysis. the enhanced green fluorescence protein fusion gene in the plasmids was used as a reporter during the transfection efficiency analysis. one day before transfection, pk- cells were seeded into six-well plates and incubated for h at °c in a % co2 atmosphere without antibiotics. when the cells reached - % confluence, they were washed with . m pbs (ph . ) three times and overlaid with transfection complexes containing . µg of prnat- , prnat- , prnat- , or prnat-nc in µl of dmem medium mixed with lipofectamine tm (invitrogen, usa), according to the manufacturer's instructions. the transfection complexes were completely removed after incubating for h, and the medium was replaced with % fbs containing µg/ml g . after maintenance for d in selection media, resistant cell clones were selected, cultured, and infected with . moi of tgev per well in six-well plates. non-transfected cells were used as a control. 
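the template layout described above (bamhi + sense + loop + antisense + termination + hindiii) can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the authors' design procedure: the target sequence is hypothetical (the sequences from the table are not reproduced here), and the default loop and the run of thymidines used as terminator are common illustrative choices rather than values confirmed by the text. ggatcc and aagctt are the standard recognition sites of bamhi and hindiii; an actual cloning oligo may carry only the corresponding overhangs.

```python
# Sketch of an shRNA template laid out as:
# BamHI + sense + loop + antisense + termination + HindIII.
# Loop, terminator and target sequence are illustrative assumptions.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def build_shrna_template(
    sense: str,
    loop: str = "TTCAAGAGA",          # assumed loop, not taken from the text
    terminator: str = "TTTTTT",       # assumed run of T's as termination signal
    five_prime_site: str = "GGATCC",  # BamHI recognition site
    three_prime_site: str = "AAGCTT", # HindIII recognition site
) -> str:
    """Concatenate the elements of the hairpin template in the stated order."""
    sense = sense.upper()
    if not sense or set(sense) - set("ACGT"):
        raise ValueError("sense sequence must contain only A, C, G and T")
    antisense = reverse_complement(sense)
    return (five_prime_site + sense + loop + antisense
            + terminator + three_prime_site)


if __name__ == "__main__":
    hypothetical_target = "GCTAGTCAAGGTCTACGTA"  # placeholder, not from the table
    print(build_shrna_template(hypothetical_target))
```

because the antisense arm is derived from the sense arm, only the target sequence has to be supplied; after transcription the two arms pair to form the stem of the hairpin.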
cell transfection efficiency and cpe images were captured under an inverted fluorescence/phase-contrast microscope (nikon, japan). shrna-transfected cells were collected h after viral infection, subjected to three freeze-thaw cycles, serially diluted tenfold from - to - , and added to -well plates. each dilution was added to eight wells. the tcid50 was calculated using the reed and muench method. to quantify the effect of shrna on viral replication at h post viral infection, total rna was extracted from pk- cells using the rnaiso plus (invitrogen, usa) reagent, according to the manufacturer's instructions, and reverse transcribed into cdna using the goscript tm reverse transcription system (promega, usa), also according to the manufacturer's instructions. quantitative real-time pcr (qpcr) analysis was performed to amplify the m gene using the cdna as the template and the β-actin gene as the internal standard.
western blotting
pk- cells were transfected with prnat-nc, prnat- , prnat- , or prnat- and infected with tgev. cells as well as virus particles were lysed in phosphate buffered saline (pbs), and the total proteins were separated using % sodium dodecyl sulfate-polyacrylamide gel electrophoresis (sds-page) and transferred onto a polyvinylidene difluoride membrane. the membranes were incubated with rabbit anti-tgev polyclonal primary antibodies ( : dilution, °c, overnight), washed, and then incubated with hrp-goat-anti-rabbit secondary antibody ( : dilution, room temperature, h).
effects of shrna transfection
pk- cells ( cells per well) were plated in six-well plates and transfected with shrna recombinant plasmids (prnat- , prnat- , or prnat- ) and empty plasmid (prnat-nc), separately, for h, before being examined by fluorescence and phase-contrast microscopy. 
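the reed and muench endpoint calculation named in the titration step above (tenfold dilutions, eight wells per dilution) can be sketched as follows. this is a minimal illustration, not the authors' worksheet: the function name and the well counts are hypothetical, and the result is the log10 of the dilution at which 50% of wells would be infected; converting it to a titer per ml additionally requires the inoculum volume, which is not given here.

```python
# Reed-Muench 50% endpoint from serial dilution data.
# Well counts used in the example are invented for illustration.

def reed_muench_log10_endpoint(log10_dilutions, infected, wells_per_dilution):
    """Return the log10 dilution giving 50% infected wells.

    log10_dilutions: ordered from most concentrated to most dilute,
                     e.g. [-1, -2, -3, ...]
    infected:        number of CPE-positive wells at each dilution
    """
    if len(log10_dilutions) != len(infected):
        raise ValueError("one infected-well count is needed per dilution")
    uninfected = [wells_per_dilution - i for i in infected]
    n = len(infected)
    # Infected wells accumulate from the most dilute end upward,
    # uninfected wells from the most concentrated end downward.
    cum_infected = [sum(infected[k:]) for k in range(n)]
    cum_uninfected = [sum(uninfected[:k + 1]) for k in range(n)]
    percent = [100.0 * ci / (ci + cu)
               for ci, cu in zip(cum_infected, cum_uninfected)]
    for k in range(n - 1):
        if percent[k] >= 50.0 > percent[k + 1]:
            # Proportionate distance between the two bracketing dilutions.
            distance = (percent[k] - 50.0) / (percent[k] - percent[k + 1])
            step = log10_dilutions[k] - log10_dilutions[k + 1]
            return log10_dilutions[k] - distance * step
    raise ValueError("the 50% endpoint is not bracketed by the dilutions")


if __name__ == "__main__":
    dilutions = [-1, -2, -3, -4, -5, -6, -7, -8]
    positive_wells = [8, 8, 8, 8, 6, 3, 1, 0]  # hypothetical plate readout
    endpoint = reed_muench_log10_endpoint(dilutions, positive_wells, 8)
    print(f"log10 endpoint dilution: {endpoint:.2f}")
```

the cumulative sums pool information across neighbouring dilutions, which is why the method gives a usable estimate even with only eight wells per dilution.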
the gfp gene expressed the green fluorescent protein from the cmv promoter, and more green fluorescence under blue-light excitation was observed in cells containing the empty plasmid (prnat-nc) compared with cells transfected with the recombinant plasmids (prnat- , prnat- , prnat- ). the normal pk- cells showed no fluorescence (fig. ). the results showed that shrna recombinants were transfected into pk- cells successfully and that stably transfected cell lines were created. the transfection efficiencies were similar among the three recombinant plasmids, while that of the empty plasmid was higher. to study the tgev-induced cpe, pk- cells were infected with tgev at . moi. the virus-infected cells (mock control) and cells harboring the empty plasmid (prnat-nc) exhibited obvious morphological changes at h post-infection, including cell shrinkage, rounding, and detachment, in contrast to the non-infected cells (normal) that remained firmly attached to the plate and maintained their shape. as shown in fig. , the normal group grew well; however, the cells harboring the shrna-expressing plasmids prnat- and prnat- showed small patches of cpe, such as rounding, shrinking, and morphological changes of the cells, as well as shedding from the brim of the wells. interestingly, the cells harboring recombinant prnat- and prnat- were mostly capable of resisting the cpe as shown by the observation that the cells attached well and had reduced areas of cpe, which contrasted with the large area of severe cpe in the cells harboring prnat- . these results indicated that shrna-expressing plasmids prnat- and prnat- inhibited tgev-induced cpe to a certain degree and could relieve the specific cytopathic effect compared with the controls. to investigate the inhibition of tgev replication by the shrnas, virus titers in pk- cells were calculated by the reed-muench method. figure shows that the titers of tgev reached . , . , and . 
tcid50/ml at h post-infection in cells harboring prnat- , prnat- , and prnat- , respectively. the titers at h post-infection corresponded to . -, . -, and . -fold reductions, respectively, compared with that of prnat-nc. the tgev titer was . tcid50/ml in cells receiving no plasmid (mock) transfection, which was higher than the titer of . tcid50/ml in cells pretransfected with prnat-nc. there was a significant difference between prnat- and prnat-nc (p < . ), as well as between prnat- and prnat-nc (p < . ). in contrast, there was no significant difference between prnat- and prnat-nc. these data indicated that prnat- and prnat- resisted tgev infection by reducing the levels of progeny virus production significantly in pk- cells. in addition, prnat- and prnat- showed partial virus infection inhibition, with prnat- being the least effective shrna. the expression levels of the m gene in pk- cells treated with different interfering plasmids were examined using qpcr. figure shows the cellular expression of the m gene. when the cells were transfected with prnat- , the expression of the m gene decreased by % compared with the cells transfected with prnat-nc. when the cells were transfected with prnat- or prnat- , the expression of the m gene decreased by and %, respectively, compared with cells transfected with prnat-nc. the results showed that prnat- and prnat- have a certain inhibitory effect on the proliferation of tgev in pk- cells, which is caused by degradation of the viral rna. to further investigate the levels of viral proteins in cells transfected with shrna plasmids and infected with tgev, the levels of viral proteins were assessed using western blotting. equal amounts of cell lysates from tgev-infected and mock-infected pk- cells at h were examined using positive anti-tgev serum. 
figure shows that the amount of viral protein recovered from cells transfected with prnat- or prnat- was reduced, while the amount of viral protein recovered from cells transfected with prnat- was similar to that recovered from cells without an interfering plasmid, which was consistent with the qpcr analysis. rnai has been used widely to silence target genes in mammalian and human cells [ ] [ ] [ ]. rnai can regulate specific gene expression and is closely related to the inhibition of virus replication. rnai has excellent prospects for compensating for the shortcomings of traditional antiviral vaccines and related inhibitors. rnai has emerged as a potentially important therapeutic antiviral strategy [ , [ ] [ ] [ ]. recently, several kinds of animal viruses, such as porcine reproductive and respiratory syndrome virus [ , ], newcastle disease virus [ ], classical swine fever virus [ ], porcine circovirus [ ], infectious bursal disease virus [ ], and porcine hemagglutinating encephalomyelitis virus [ ] have been silenced effectively, and most of these viruses are rna viruses. tgev is a porcine coronavirus with an rna genome; therefore, it should also be sensitive to rnai [ ]. several studies have reported the application of rnai against tgev replication. effective suppression of tgev infection in swine testicular (st) cells was achieved using dna-based vectors expressing sirnas or shrnas targeting the rna-dependent rna polymerase gene of tgev [ , ]. he et al. reported the effective inhibition of tgev infection in st cells or pk- cells using dna-based vectors expressing an shrna targeting the transcription of tgev gene (a non-structural gene) [ , ]. however, there is no report showing that sirna/shrna targeting the m gene of a coronavirus could efficiently inhibit viral infection. in this study, we constructed three shrna plasmid expression systems to target the m gene and investigated whether shrna-mediated rna interference could inhibit tgev infection in pk- cells. 
our results demonstrated that the infection of tgev in cell culture could be disrupted by shrnas targeting the m gene of tgev: two of the three shrnas generated from the m gene of tgev blocked viral infection efficiently. the cpe and tcid50 assays revealed that cells transfected with prnat- , prnat- , and prnat- , harboring three sequence-specific shrnas, could trigger inhibition of tgev infection at h post-infection; prnat- in particular showed marked suppression. western blotting and qpcr analyses further confirmed that the efficient inhibition of viral infection was caused by viral rna degradation. however, the qpcr analysis showed that transfection with prnat- and prnat- inhibited viral infection by the equivalent of %. the qpcr analysis and western blotting assays also demonstrated that, compared with the mock control, the amount of viral rnas in the prnat- group decreased only slightly, which suggested an inefficient inhibitory effect and possibly indicated that the prnat- sequence results in non-specific inhibition or in 'off-target' effects. overall, the variability of viral suppression could be related to the following two aspects. one is that the regulation of rna transcription and protein expression is a very complex process, and represents the combined effect of various factors. the other possible explanation is the difference in the sensitivity and accuracy between tcid50 and qpcr. qpcr is highly sensitive in detecting the suppression effect of rna interference. in addition to the potent inhibition shown by two sequence-specific shrnas, the tcid50 and qpcr analyses also demonstrated that, compared with the mock control, the amount of viral rnas in the negative control prnat-nc cells also decreased slightly, which suggested a non-specific effect on tgev replication in pk- cells. similarly, other researchers have discussed an 'off-target' effect induced by sirna or shrna in their reports. lu et al. 
[ ] found that the non-specific effect was positively related to the concentration of the shrnas. overall, compared with the low-efficiency inhibition and 'off-target' effects of prnat- , the other two sequence-specific shrnas exhibited the potential to silence tgev rnas. in conclusion, our results indicated that shrnas targeting the m gene in tgev genome could effectively block infection of tgev in pk- cells. this finding showed that shrnas could represent a potential novel tool against tgev infection. these results also provided an insight into the inhibition of tgev infection by targeting the m gene. taken together, the present data and the known advantages of shrna technology suggest that shrna represents a candidate agent for tgev therapeutic applications.
key: cord- -c ki n authors: koba, ryota; suzuki, satori; sato, go; sato, shingo; suzuki, kazuo; maruyama, soichi; tohya, yukinobu title: identification and characterization of a novel bat polyomavirus in japan date: - - journal: virus genes doi: . /s - - - sha: doc_id: cord_uid: c ki n a novel polyomavirus (pyv) was identified in the intestinal contents of japanese eastern bent-wing bats (miniopterus fuliginosus) via metagenomic analysis. we subsequently sequenced the full genome of the virus, which has been tentatively named miniopterus fuliginosus polyomavirus (mfpyv). the nucleotide sequence identity of the genome with those of other bat pyvs was less than %. phylogenetic analysis revealed that mfpyv belonged to the same cluster as pyvs detected in miniopterus schreibersii. this study has identified the presence of a novel pyv in japanese bats and provided genetic information about the virus. electronic supplementary material: the online version of this article ( . /s - - - ) contains supplementary material, which is available to authorized users. 
bats are considered the natural reservoirs of a variety of zoonotic rna viruses, such as ebola viruses, marburg viruses, and severe acute respiratory syndrome coronavirus [ ] [ ] [ ] . several dna viruses, including adenoviruses, herpesviruses, and polyomaviruses (pyvs), have also been detected [ ] [ ] [ ] . however, the pathogenic and zoonotic roles of these dna viruses have not been clarified. pyvs are small double-stranded dna viruses with a circular genome of approximately kbp. the viral genome consists of three regions: regulatory, early, and late regions. the regulatory region is responsible for transcription from both the early and late promoters and the initiation of viral dna synthesis. the early region contains genes encoding the large t antigen (tag) and small t antigen (tag). the late region encodes the structural proteins vp , vp , and vp [ ] . although pyv diversity in bat populations in north, central, and south america, africa, indonesia, and new zealand has been investigated in previous studies [ , [ ] [ ] [ ] [ ] [ ] , the prevalence and genetic diversity of pyvs in japanese bats remain unclear. the aims of this study were to (i) determine the presence of pyvs in japanese bats, (ii) characterize the genomic structure of bat pyvs, and (iii) analyze the evolutionary relationships between the bat pyv detected in this study and other known bat pyvs. eighteen bats (miniopterus fuliginosus) were collected in wakayama prefecture, japan. the pooled intestinal contents obtained from each bat (approximately g/body) were prepared as a % suspension in sterilized phosphate-buffered saline (pbs). a total of , , reads and contigs were obtained from the pooled sample from bats. to identify homologous sequences, the obtained genomic data were analyzed via a blastn search using the dna data bank of japan database in accordance with a previously reported method [ , ] . virus-related sequences were identified in contigs.
of these, contigs contained pyv-like sequences with high identities. other contigs contained the sequences of eel river basin pequenovirus, montastraea cavernosa colony-associated virus, and grapevine-associated totivirus- . to determine the complete viral genome of these pyv-like sequences, pcr was performed using la taq dna polymerase (takara bio, otsu, japan) in accordance with the manufacturer's instructions. specific pcr primers were designed on the basis of the sequences obtained from the contigs. the primer sequences were as follows: pyv-f (sense, ′-aag ttt gca gta gtc ttt gaa gat gtg aag ggt c- ′), pyv-r (antisense, ′-cac tcc tgg gct ttc ctg ctc ata ttt atg ca- ′), pyv-f (sense, ′-cat aaa cag ggt caa acc ac- ′), and pyv-r (antisense, ′-aag cac tcc acc aaa gga aa- ′). dna extracted from the pooled sample of bat intestinal contents was used as the template. the pcr products were visualized via electrophoresis on a % agarose gel stained with sybr safe (life technologies, carlsbad, ca, usa). the full genomic dna could be amplified by two independent pcr using the aforementioned primers. the amplified dna was cloned by inserting the pcr product into the pcr . topo vector (life technologies) in accordance with the manufacturer's instructions. the obtained sequences were analyzed using the bigdye terminator v . cycle sequencing kit (life technologies), and nucleotide sequences were assembled using atgc computer software (genetyx corporation). a homology search was performed using ncbi blast. the genome of miniopterus fuliginosus polyomavirus (mfpyv) has a length of bp (accession number: lc ). the genome organization includes an early region coding for tag and tag on one strand and a late region encoding the capsid proteins vp , vp , and vp on the opposite strand. a noncoding regulatory region (nccr) was located between the start of the early region and that of the late region, in line with previous findings for bat pyvs (fig. 
a and supplementary table ) [ ] [ ] [ ] [ ] . interestingly, open-reading frames encoding vp and vp of mfpyv did not overlap with that of vp . the stop codons of vp and vp are located at base positions - , whereas the start codon of vp is located at base positions - . a single nucleotide (guanine) at separates the vp / and vp regions ( fig. a and supplementary fig. ). therefore, genomic composition of mfpyv is genetically different from those of typical pyvs in terms of non-overlapping vp regions. figure b and c present the phylogenetic trees of vp and tag of mfpyv and other bat pyvs constructed using neighbor-joining analysis. based on phylogenetic analyses of the vp and tag amino acid sequences, both regions of mfpyv are closely related to those of other miniopterus pyvs and group b bat pyvs (fig. b and c) . vp is a major pyv structural protein that is indispensable for entry of the virus into host cells [ ] . mfpyv vp displayed less than % nucleotide sequence identity with other bat pyvs (supplementary table ). tag is a multifunctional protein that plays important roles in viral dna replication and the regulation of viral and cellular gene expression [ ] [ ] [ ] . the predicted mfpyv tag exhibited low similarity (< %) with those of other pyvs (supplementary table ). mfpyv tag sequences contained features known to be conserved in tags of other bat pyvs, including the highly conserved dnaj domain (hpdkgg), a retinoblastoma (rb)-binding motif (lycne), and several functional motifs (supplementary fig. ). according to a previous report, these elements work together to bind rb and interrupt its interaction with the e f transcription factor to promote viral replication and cell cycle progression [ ] . tag is generated via alternative splicing of the early mrna transcript [ , ] . in the early region of the mfpyv genome, conserved predicted splice donor sites are located at base positions - (cct gag /gta agg ) and - (ttt cag /gtc ttc ) (fig. a) . 
in the deduced nccr of the mfpyv genome, several conserved elements were identified, including several copies of the consensus tag binding site gaggc and its reverse complement gcctc (supplementary fig. ). these elements are likely to comprise the core of the replication origin [ ] . comparison of the full-length genome sequence of mfpyv with those of other bat pyvs revealed that mfpyv is most closely related to the ky strain with % nucleotide sequence identity (supplementary table ). according to the polyomaviridae study group of the international committee on taxonomy of viruses, a novel pyv species should have < % nucleotide sequence identity to other known pyvs [ ] . mfpyv exhibited less than % nucleotide sequence identity to the known reference pyvs, including previously reported bat pyvs. in line with the nomenclature of the other bat pyvs, we propose the name mfpyv for the newly discovered virus. for virus isolation, we attempted to propagate the mfpyv strain using the tb -lu cell line derived from the lungs of the free-tailed bat tadarida brasiliensis (atcc #ccl- ). however, a cytopathic effect was not observed in the cells following serial passage of the cultures. viral dna replication was also not detected in the cells and supernatant collected at each passage. additional research is needed to identify efficient cell culture systems for bat pyvs in order to elucidate the viral infection/replication mechanisms and their pathogenicity. in conclusion, we detected a novel pyv genome sequence in japanese bats. further epidemiological investigations are needed to determine the extent of pyv genetic variation in various bat species in japan.
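the search for the tag binding pentanucleotide gaggc and its reverse complement gcctc in the nccr, mentioned above, can be illustrated with a short sketch. this is not the authors' pipeline; the function names and the toy sequence are assumptions made for illustration, and the real nccr sequence would have to be taken from the deposited genome.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (a, c, g, t only)."""
    complement = {"a": "t", "c": "g", "g": "c", "t": "a"}
    return "".join(complement[base] for base in reversed(seq.lower()))


def find_motif_sites(seq: str, motif: str = "gaggc"):
    """List (1-based position, strand) for every motif occurrence.

    '+' marks the motif itself, '-' marks its reverse complement
    (i.e. the motif read on the opposite strand).
    """
    seq = seq.lower()
    motif = motif.lower()
    rc = reverse_complement(motif)
    sites = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            sites.append((i + 1, "+"))
        elif window == rc:
            sites.append((i + 1, "-"))
    return sites


# Hypothetical toy sequence, NOT the mfpyv nccr.
toy_nccr = "ttgaggcaagcctctt"
print(find_motif_sites(toy_nccr))  # [(3, '+'), (10, '-')]
```

scanning both orientations in one pass keeps the positions on a single coordinate system, which makes it easier to compare the spacing of sites with that reported for other pyv replication origins.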
bats: important reservoir hosts of emerging viruses from sars to mers: years of research on highly pathogenic human coronaviruses large serological survey showing cocirculation of ebola and marburg viruses in gabonese bat populations, and a high seroprevalence of both viruses in rousettus aegyptiacus first detection of adenovirus in the vampire bat (desmodus rotundus) in brazil a novel bat herpesvirus encodes homologues of major histocompatibility complex classes i and ii, c-type lectin, and a unique family of immunerelated genes discovery of diverse polyomaviruses in bats and the evolutionary history of the polyomaviridae taxonomical developments in the family polyomaviridae detection of polyoma and corona viruses in bats of canada novel polyomaviruses in south american bats and their relationship to other members of the family polyomaviridae detection of novel polyomaviruses in fruit bats in indonesia genomic characterization of two novel polyomaviruses in brazilian insectivorous bats discovery of novel virus sequences in an isolated and threatened bat species, the new zealand lesser short-tailed bat (mystacina tuberculata) evaluation of rapid and simple techniques for the enrichment of viruses prior to metagenomic virus discovery the fecal virome of pigs on high-density farm sitespecific binding of wild-type p to cellular dna is inhibited by sv t antigen and mutant p sv large t antigen targets multiple cellular pathways to elicit cellular transformation cellular transformation by sv large t antigen: interaction with host proteins the molecular chaperone activity of simian virus large t antigen is required to disrupt rb-e f family complexes by an atp-dependent mechanism rna processing in the polyoma virus life cycle sequences flanking the pentanucleotide t-antigen binding sites in the polyomavirus core origin help determine selectivity of dna replication publisher's note springer nature remains neutral with regard to jurisdictional claims in published maps and 
institutional affiliations key: cord- -umvk dc authors: lee, dana n.; angiel, meagan title: two novel adenoviruses found in cave myotis bats (myotis velifer) in oklahoma date: - - journal: virus genes doi: . /s - - - sha: doc_id: cord_uid: umvk dc bats are carriers of potentially zoonotic viruses, therefore it is crucial to identify viruses currently found in bats to better understand how they are maintained in bat populations and evaluate risks for transmission to other species. adenoviruses have been previously detected in bats throughout the world, but sampling is still limited. in this study, pooled-guano samples were collected from a cave roost of myotis velifer in oklahoma. a portion of the dna polymerase gene from adenoviridae was amplified successfully in m. velifer samples; however, dna sequence was obtained from only of these m. velifer samples. one was collected in october , one in march , and in july . the october and march samples contained viral dna that was . % different from each other but % different than the novel viral sequence found in the july samples. phylogenetic analysis of these fragments confirmed our isolates were from the genus mastadenovirus and had genetic diversity ranging from to % when compared to other bat adenoviruses. bats make up % of all mammals, and they are the second richest mammalian order in respect to number of species [ , ] . in recent years, bats have emerged as a rich source of novel viruses [ , ] . they have been found to host more zoonotic viruses per species than rodents [ ] , and even documented to harbor viruses from two different viral families simultaneously [ ] . viruses in bats can switch hosts to other bat species [ ] and they are known to carry pathogenic viruses that can infect humans such as rabies, lyssaviruses, nipah and hendra viruses, ebola, and sars coronavirus [ , ] . 
however, in most cases bats serve as reservoirs for viruses with immunological tolerance and without transmission to other humans [ , ] . consequently, it is important to first identify viruses housed in bats in order to better understand the ecology of bat-borne viruses, how they are maintained in bat populations, and then evaluate risks for host transmission to other species. adenoviruses (advs) are double stranded dna viruses found in vertebrate hosts of many different species [ , ] . the family adenoviridae consists of five genera [ ] with members in the genus mastadenovirus infecting mammals [ ] . advs are widespread in the human population and cause a variety of usually minor symptoms, such as respiratory illnesses, conjunctivitis, and gastroenteritis [ ] . generally, these viruses are host-specific [ ] and thought to have low zoonotic risk [ ] ; however, chen et al. [ ] discovered a novel adenovirus (tmadv) with the ability to infect both monkeys and humans. since bats are known reservoirs of numerous viruses and cross-species transmission has been documented for an adv, it will be useful to know which advs bats carry. adv strains have been found in more than species of bats across their global distribution [ , , , [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] with seven species proposed by the international committee on taxonomy of viruses [ ] . these studies represent a start at investigating bat advs, but there is a need for additional studies considering there are over species of bats [ ] and few north american bats have been investigated. in this study, myotis velifer guano samples were tested for the presence of advs. we expected to find advs in m. velifer because that genus had the most advs in a study on bat species in china [ ] . guano was collected from m. velifer individuals in washita bat cave (washita co, ok). 
samples were collected either from a plastic tarp left lying overnight at the entrance of the cave (in march or july) or after bats were captured and placed in a sterile cup for h (in october). bats were handled following guidelines from sikes et al. [ ] , and white nose syndrome decontamination protocols were followed [ ] . regardless of the method of collection, four guano pellets were pooled in µl of rnalater® and stored at − °c. we obtained a total of guano pellets, which provided pooled samples for analysis. dna extraction was carried out with the qiaamp® dna mini kit (qiagen) following the manufacturer's protocol with minor modifications. nested pcr of the partial adenoviridae dna polymerase gene was carried out on each sample following li et al. [ ] using primers pol-f ( ′ cagcck-ckgtt rtg yag ggt ′) and pol-r ( ′ gchacc aty agc tcc aactc ′). the cycling profile consisted of °c for min, cycles of °c for s, °c for s, °c for s, and then a final extension of °c for min. the second round of amplification used µl of first-round pcr product as template, primers pol-nf ( ′ ggg ctc rtt rgt cca gca ′) and pol-nr ( ′ tay gac atc tgy ggc atg ta ′), and the same cycling steps. positive (human adenovirus d dna from american type culture collection) and negative controls were used for each pcr. positive pcr products were purified with the wizard® sv gel and pcr clean-up system (promega). species verification of bats with positive adv samples was performed using nested pcr with primers sff_ f ( ′ gthachgcy cay gchtty gta ataat ′) and sff_ r ( ′ ctc cwg crtgdgcw agr tttcc ′) from [ ] and thermocycler steps consisting of °c for min, cycles of °c for s, °c for s, °c for s, and a final extension of °c for min to amplify a region of the cytochrome c oxidase gene that is highly diagnostic among bats.
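the primers listed above contain iupac ambiguity codes (r, y, k, h, w, d), so each primer is really a mixture of oligonucleotides. the sketch below, which is an illustration and not part of the published method, counts and enumerates the concrete sequences behind a degenerate primer. the helper names are hypothetical; the two example primers are pol-nf and pol-nr as printed in the text.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at", "k": "gt", "m": "ac",
    "b": "cgt", "d": "agt", "h": "act", "v": "acg", "n": "acgt",
}


def clean(primer: str) -> str:
    """Lower-case the primer and drop the spaces used for triplet grouping."""
    return primer.lower().replace(" ", "")


def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides represented by the primer."""
    count = 1
    for base in clean(primer):
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list:
    """Enumerate every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in clean(primer))
    return ["".join(combo) for combo in product(*choices)]


pol_nf = "ggg ctc rtt rgt cca gca"      # second-round forward primer from the text
pol_nr = "tay gac atc tgy ggc atg ta"   # second-round reverse primer from the text
print(degeneracy(pol_nf), degeneracy(pol_nr))  # 4 4
print(expand(pol_nf))
```

a low degeneracy such as this keeps the effective concentration of each matching oligonucleotide high, which is one reason nested primers are often designed with only a few ambiguous positions.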
sanger sequencing of positive samples for bat and adv identification was performed by oklahoma medical research foundation, and fragments were aligned and manually edited in geneious v. . . [ ] . adv sequences ( ) isolated from other bat species, turkey, canine, bovine, and human advs a, b, c, and d on genbank were added to the final alignment. model of sequence evolution, maximum likelihood analysis, and uncorrected p nucleotide distance were performed in mega v. . . using all sites, including gaps, and bootstrap replicates [ ] . there was at least one positive sample for each collection date, but only of positive samples had viral dna quantities necessary for successful sequencing. the alignment of our dna sequences with only the recognized viral species followed a hasegawa-kishino-yano model of evolution with a gamma distribution of . . the maximum likelihood analysis indicated that the advs were of the mastadenovirus genus and our proposed advs form separate clusters in distinct clades (fig. ) . when all sequences from genbank were included in the alignment the model of nucleotide evolution was the general time reversible model with a gamma distribution of . and invariant sites. in this analysis (not shown), the adv sequences did not form clusters according to their host family. this suggests transmission between host species is more common than coevolution with the host. myotis velifer samples from october (guano ) and march (guano ) were only . % different from each other, while they were ~ % different from the sequences extracted from july samples ( guano , , , ) . guano was identical to guano and , and it was only nucleotide different than guano . dna sequences from guano , , and have been deposited in genbank (accession mn -mn ). we do recognize the sequenced fragment is short ( basepairs) and only provides preliminary viral classification. advs species are designated if amino acid sequence is > % for the dna polymerase gene. 
based on this criterion, we suggest guano and guano are different strains of the same adv species and are further referred to as cave myotis adv - and cave myotis adv - . dna sequence from guano sample , , , and represent a separate adenovirus species and are further referred to as cave myotis adv . there were non-synonymous mutations and synonymous mutations between cave myotis adv - and adv - . cave myotis adv - and adv - are most similar to gu isolated from myotis horsfieldii [ ] with genetic differences of . % and . %, respectively, and most different from hq isolated from rousettus leschenaultii [ ] with genetic differences of . and . %, respectively. cave myotis adv is most similar to mf , - , - isolated from pipistrellus pygmaeus [ ] with a genetic difference of . %. cave myotis adv is most different from kc , isolated from pteropus giganteus [ ] and hq isolated from r. leschenaultii [ ] by . %. these two new advs are ≥ % different than any currently recognized bat mastadenovirus a-g (table ) . when adv sequences were compared to other advs from bats, genetic diversity ranged from to %. this study demonstrates that there is great genetic diversity of dna viruses within the same species of bats found in the same location, which is relatively uncommon for other vertebrate viruses [ ] . we sampled caves during seasons and found greater prevalence of viral dna in m. velifer guano during summer (july; / samples = %) than spring (march; / samples = %) or autumn (october; / samples = %). this is the highest percentage of positive adv samples detected in bats to date from a single sampling period. drexler et al. [ ] collected guano samples from m. myotis in germany during may, june, and july for years, and the highest percentage of positive samples for sample date ( . %, / samples) was collected in may and the same frequency in july. it is likely our high percentage of positive samples from one sample date is because m. 
velifer give birth to their young in may-june [ ] . in summer there are many young bats present with weaker immune systems and a greater risk of lactating females sharing viruses with their young. drexler et al. [ ] found a significant increase in prevalence of coronaviruses one month after parturition during summer months, but adv detection was not significantly higher in any particular month within the summer. little work has been done to investigate advs in north american species of bats; however, these studies [ , ] and ours highlight the importance of identifying viruses housed in bats to better understand viral evolution and how viruses are maintained in bat colonies, and to evaluate risks for host transmission to other species. li et al. [ ] found their novel bat adv (btadv-tjm) was capable of infecting several mammalian cells from different species, including humans, which indicates that bat advs possibly have a wide host range. they also suggest some bat advs have similar amino acid sequences for structural proteins to those in human advs and a high gc content, which suggests that bat advs might be an ideal vector for gene therapy and vaccine delivery in humans [ ] . future studies should include sequencing the entire viral genomes and isolating the viruses to test possible transfections in other species to better characterize the viruses discovered here.
mammal species of the world: a taxonomic and geographic reference how many species of mammals are there? optimizing viral discovery in bats a comparative analysis of viral richness and viral sharing in cave-roosting bats a comparison of bats and rodents as reservoirs of zoonotic viruses: are bats special?
short report: molecular detection of adenoviruses, rhabdoviruses, and paramyxoviruses in bats from kenya bats: important reservoir hosts of emerging viruses host range, prevalence, and genetic diversity of adenoviruses in bats bats are 'special" reservoirs for emerging zoonotic pathogens genome analysis of bat adenovirus : indications of interspecies transmission family adenoviridae novel bat adenoviruses with low g+c content shed new light on the evolution of adenoviruses molecular evolution of adenoviruses do nonhuman primate or bat adenoviruses pose a risk for human health? cross-species transmission of a novel adenovirus associated with a fulminant pneumonia outbreak in a new world monkey colony isolation of novel adenovirus from fruit bat new adenovirus in bats novel adenoviruses and herpesviruses detected in bats bat guano virome: predominance of dietary viruses from insects and plants plus novel mammalian viruses amplification of emerging viruses in a bat colony isolation of a novel adenovirus from rousettus leschenaultia bats from india detection of adenoviruses in the northern hungarian bat fauna genetic diversity of adenoviruses in bats of china a strategy to estimate unknown viral diversity in mammals metagenomic study of the viruses of african straw-coloured fruit bats: detection of a chiropteran poxvirus and isolation of a novel adenovirus first detection of adenovirus in the vampire bat (desmodus rotundus) in brazil random sampling of the central european bat fauna reveals the existence of numerous hitherto unknown adenoviruses novel bat adenoviruses with an extremely large e gene evolution and cryo-electron microscopy capsid structure of a north american bat adenovirus and its relationship to other mastadenoviruses novel coronaviruses, astroviruses, adenoviruses and circoviruses in insectivorous bats from northern china a metagenomic viral discovery approach identifies potential zoonotic and novel mammalian viruses in neoromicia bats within south 
africa new adenovirus groups in western palearctic bats molecular detection of viruses in kenyan bats and discovery of novel astroviruses, caliciviruses, and rotaviruses detection of diverse viruses in alimentary specimens of bats in macau surveillance for adenoviruses in bats in italy guidelines of the american society of mammalogists for the use of wild mammals in research national whitenose syndrome decontamination protocol. https ://s .amazo naws. com/org.white noses yndro me.asset s/prod/ a c c -b - e - bb- edc -natio nal_wns_decon _updat e_ species from feces: order-wide identification of chiroptera from guano and other non-invasive genetic samples geneious basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data molecular evolutionary genetics analysis version . bats of texas acknowledgements we thank jason shaw, bill caire, linda loucks publisher's note springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. key: cord- - ixb g authors: phillips, j. e.; jackwood, m. w.; mckinley, e. t.; thor, s. w.; hilt, d. a.; acevedol, n. d.; williams, s. m.; kissinger, j. c.; paterson, a. h.; robertson, j. s.; lemke, c. title: changes in nonstructural protein are associated with attenuation in avian coronavirus infectious bronchitis virus date: - - journal: virus genes doi: . /s - - - sha: doc_id: cord_uid: ixb g full-length genome sequencing of pathogenic and attenuated (for chickens) avian coronavirus infectious bronchitis virus (ibv) strains of the same serotype was conducted to identify genetic differences between the pathotypes. analysis of the consensus full-length genome for three different ibv serotypes (ark, ga , and mass ) showed that passage in embryonated eggs, to attenuate the viruses for chickens, resulted in . – . 
% of all the amino acid changes occurring in nsp within a virus type, whereas changes in the spike glycoprotein, thought to be the most variable protein in ibv, ranged from . to . % of all changes. the attenuated viruses did not cause any clinical signs of disease and had lower replication rates than the pathogenic viruses of the same serotype in chickens. however, both attenuated and pathogenic viruses of the same serotype replicated similarly in embryonated eggs, suggesting that mutations in nsp , which is involved in replication of the virus, might play an important role in the reduced replication observed in chickens leading to the attenuated phenotype. electronic supplementary material: the online version of this article (doi: . /s - - - ) contains supplementary material, which is available to authorized users. avian coronavirus infectious bronchitis virus (ibv) causes a highly contagious upper respiratory tract disease in chickens. live attenuated vaccines are used against the virus, but the disease is difficult to control because cross-protection does not usually occur between different serotypes. the respiratory disease caused by this virus can be mild to moderate and can vary depending on the breed of chicken infected as well as the strain of the virus [ ] . the virus is worldwide in distribution, and in addition to chickens, ibv has been isolated from peafowl (galliformes), and other gamma-coronaviruses have been isolated from teal (anas crecca), geese (anserinae), pigeons (columbiformes), and ducks (anseriformes) [ ] .
coronaviruses are enveloped viruses in the order nidovirales and are classified based on genome organization and antigenic characteristics as alpha (previously group ), beta (previously group ), and gamma (previously group )-coronaviruses with the avian coronaviruses belonging to the gamma-coronaviruses. subgroups within each group have been reported, and recently, comparative full-length genome analysis placed a novel coronavirus from a beluga whale in subgroup b and three new coronavirus isolates from passerine birds in subgroup c [ ] . infectious bronchitis virus and related isolates as well as turkey coronavirus (tcov) are assigned to subgroup a. coronaviruses have a single-stranded positive-sense rna genome ranging in size from to kb, with a cap and a poly-a tail. transcription occurs through a leader-primed rna synthesis mechanism that results for ibv in six co-terminal subgenomic mrna molecules. four structural proteins-spike (s), envelope (e), membrane (m), and nucleocapsid (n)-along with the viral rna make up the enveloped virion. the n protein binds to the viral rna forming the ribonucleoprotein (rnp) complex. the e and the m protein are membrane bound proteins that play a role in virus assembly [ ] . the s glycoprotein on the surface of the virus mediates attachment to the host cell, is responsible for fusion of the host cell membrane and viral envelope, and in ibv, it contains epitopes that define serotype and induce neutralizing antibodies [ ] . the s glycoprotein of ibv is post-translationally cleaved into s and s subunits, and the s subunit is reported to have three hypervariable regions [ ] [ ] [ ] . mutations, insertions, deletions, and recombination in s contribute to the genetic diversity of ibv, which is recognized as different genetic or serologic types of the virus [ ] . two polyproteins a and ab account for approximately two-thirds of the viral genome-coding region and make up the replication transcription complex (rtc). 
the polyprotein ab is translated through a- frame-shift translation mechanism that occurs approximately - % of the time [ ] . the ibv a and ab polyproteins are post-translationally cleaved into nonstructural proteins (nsps), nsps through by a papain-like protease (plp) and the main protease (mpro), also referred to as the c-like protease [ ] . ibv does not have an nsp equivalent found in some other coronaviruses. the plp contained within nsp is divided into pl and pl papain-like proteases. the pl protease, present in other coronaviruses, is truncated and nonfunctional in ibv, thus pl cleaves nsps , , and [ ] . the mpro contained within nsp cleaves nsps through . the biological characteristics of many nsps have been previously reported [ , , [ ] [ ] [ ] [ ] [ ] [ ] . in addition to nsps and , which contain proteases pl and mpro, respectively, nsps , , and contain hydrophobic residues predicted to play a role in anchoring the rtc to the golgi. nonstructural proteins , , , and are reported to have rna-binding activity. nonstructural protein / is the rna-dependent rna-polymerase, nsp is a rna helicase, nsp is an exoribonuclease, nsp is an endoribonuclease, and nsp is a methyltransferase. adaptation of ibv to different hosts has been associated with changes in the s glycoprotein, suggesting that spike plays a key role in pathogenicity [ , ] . however, the ectodomain of the s glycoprotein from the beaudette strain of ibv, an attenuated laboratory strain, was replaced with an s from a pathogenic strain (mass strain) of the same serotype. this chimeric virus was shown to induce an immune response but remained nonpathogenic in chickens, indicating that the s glycoprotein is not solely responsible for pathogenicity of ibv [ , ] . in another study, a chimeric ibv was created with the replicase genes a and ab from the attenuated beaudette strain, and all of the structural genes from the pathogenic mass strain including the s gene. 
this chimeric virus was not pathogenic in chickens, indicating that the replicase proteins also appear to be determinants of ibv pathotype [ , ] . genetic differences reported in a and s between virulent and avirulent strains of ibv also led others to suggest that the replicase proteins, in addition to s, are involved in the pathotype of the virus [ ] . to examine the sequence changes in individual genes associated with attenuation of ibv for chickens, we sequenced and compared the full-length consensus genomes of pathogenic ibv strains and egg-passaged attenuated (for chickens) strains from three different serotypes. we also examined the replication of pathogenic and attenuated viruses in embryonated eggs and in chickens to determine whether there are differences in growth rate between the pathotypes. pathogenic and attenuated (for chickens) ibv strains from three different serotypes were used in this study. the pathogenic arkansas-delmarva poultry industry ark/ark-dpi/ and the massachusetts strain mass/mass / were obtained from dr. j. gelb, jr. (university of delaware, newark, de). the pathogenic georgia virus, ga /cwl / , was isolated in our laboratory in [ ] . the pathogenic viruses were propagated in -day-old embryonated chicken eggs (ark/ark-dpi/ pass , mass/mass / pass , and ga /cwl / pass ) as previously described [ ] . the attenuated viruses of the same strain and serotype were obtained from intervet and were designated ark-attenuated (mildvac-ark), mass -attenuated (mildvac-h), and ga -attenuated (mildvac-ga- ).
whole-genome nucleotide and deduced amino acid sequence analysis
viral rna extraction, rt-pcr, library construction, and sequencing were conducted as previously described [ ] . briefly, the viruses were filtered through a . -µm filter and then through a . -µm filter (millipore, billerica, ma) prior to rna extraction.
viral rna was purified using the high pure rna isolation kit according to the manufacturer's recommendation (roche diagnostic corporation, foster city, ca) and re-suspended in depc-treated water. reverse transcription (rt) and polymerase chain reaction (pcr) amplification were performed with the takara rna la pcr kit (takara bio inc., otsu, shiga, japan) using a random primer and an amplification primer in a strand displacement amplification reaction following the manufacturer's protocol. the sequence of the random reverse transcription primer was -agc ggg ggt tgt cga atg ttt gan nnn n- , and the amplification primer sequence, which is designed to anneal to the complement of the conserved region on the random primer, was -agc ggg ggt tgt cga atg ttt ga- . both primers were obtained from integrated dna technologies, inc. (coralville, ia). for the rt reaction, a master mix was prepared, which included mgcl ( mm), rna pcr buffer ( ) , dntp mixture ( mm), rnase inhibitor ( units/µl), reverse transcriptase ( . units/µl), degenerate primer ( . µm), and rna ( . µl/reaction); then µl per sample was aliquoted into a thermocycler tube. the reaction conditions for the rt reaction were min at °c for primer annealing, then an hour at °c for extension, followed by a five-minute incubation at °c to inactivate the enzyme and a five-minute period at °c. a pcr master mix, which included at the final concentrations mgcl ( . mm), la pcr buffer ( ) , sterilized distilled water ( . µl), takara la taq ( . u/ µl), and primer ( . µm), was prepared, and µl of the rt reaction was added to µl of the mix. the amplification reaction consisted of a °c step for min followed by cycles of °c for s, °c for s, and °c for min. ten pcr reactions were combined for each virus, purified using the qiaquick pcr purification kit (qiagen, foster city, ca), and then run on a % agarose gel to visualize the amplified product. the pcr products were size selected by cutting out amplicons between and bp from the gel. 
the amplicons were purified using the qiaquick (qiagen) gel purification kit. the topo cloning kit (invitrogen, life technologies, carlsbad ca) was used to clone the pcr products into the pcr-xl-topo vector according to the manufacturer's recommendations. then, one shot topo electrocompetent escherichia coli cells (invitrogen) were transformed using µl of competent cells mixed with µl of the ligation reaction and electroporated with settings at kv and x using a gene pulser (biorad, hercules, ca). the electroporated cells were incubated at °c in µl of super optimal broth medium for h on a rotary shaker. the cultures were mixed with % glycerol and frozen at - °c until plated on q-trays (genetix, boston, ma) containing liquid broth agar cat# - (mp biomedicals, llc, solon, oh) with µg/ml of kanamycin. the q-trays were pre-warmed at °c before the entire culture (approximately µl) was spread on the plates and incubated overnight at °c; colonies were then robotically picked with a q-bot (genetix, boston, ma). plasmid dna from the libraries of cloned cdna fragments for each virus was isolated using an alkaline lysis method modified for the -well format and incorporating both hydra and tomtek robots (http://www.intl-pag.org/ /abstracts/p c_p _xi.html). cycle sequencing reactions were performed using the bigdye terminator cycle sequencing kit version . (applied biosystems, foster city, ca) and mj research (watertown, ma) thermocyclers. finished reactions were filtered through sephadex filter plates into perkin-elmer microamp optical -well plates. a / -strength sequencing reaction on an abi was used to sequence each clone from both the and ends. each viral genome was sequenced to approximately coverage. the accuracy of the sequence was ensured by generating data in both the and the directions. gaps and areas with less than coverage were identified, and specific primers were synthesized (idt) for rt-pcr amplification and sequencing of the ambiguous areas. 
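the gap-finding step described above (locating regions whose sequencing depth falls below a chosen minimum) can be illustrated with a minimal sketch. this is not the pipeline used in the study, which relied on seqman pro; the function names, the half-open coordinate convention, and the example depth thresholds are all assumptions made for illustration.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), 0-based, end exclusive (assumed convention)


def coverage_profile(genome_length: int, reads: List[Interval]) -> List[int]:
    """Return the per-base read depth for a genome of the given length."""
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (genome_length + 1)
    for start, end in reads:
        start = max(0, start)
        end = min(genome_length, end)
        if start < end:
            diff[start] += 1
            diff[end] -= 1
    depth = []
    running = 0
    for i in range(genome_length):
        running += diff[i]
        depth.append(running)
    return depth


def low_coverage_regions(depth: List[int], min_depth: int) -> List[Interval]:
    """Return maximal intervals whose depth is below min_depth."""
    regions = []
    region_start = None
    for i, d in enumerate(depth):
        if d < min_depth:
            if region_start is None:
                region_start = i
        elif region_start is not None:
            regions.append((region_start, i))
            region_start = None
    if region_start is not None:
        regions.append((region_start, len(depth)))
    return regions


if __name__ == "__main__":
    # Toy example: two overlapping reads on a 10-base "genome".
    depth = coverage_profile(10, [(0, 5), (3, 8)])
    print(depth)
    print(low_coverage_regions(depth, 2))
```

each interval returned by `low_coverage_regions` is a candidate region for targeted primer design and re-sequencing; a `min_depth` of 1 reports only true gaps.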
the rt-pcr was conducted as described above, and the reaction conditions were °c for min, °c for min, then cycles of °c for s, °c for s, °c for s, followed by cycles of °c for s, °c for s, °c for s with s added per cycle. the final elongation step was °c for min, and the reaction was then cooled to °c. the pcr products were sequenced in both directions using the abi prism bigdye terminator v . (applied biosystems, foster city, ca) and the specific primers that were used for amplification at a concentration of ng. the amount of cdna added to the reaction ranged from to ng, and the sequencing reactions were analyzed on an abi (applied biosystems). chromatogram files and trace data were read and assembled using seqman pro, and genome annotation was conducted with seqbuilder (dnastar, inc., v. . . , madison, wi). low-quality segments and vector sequence were trimmed from the ends of each sequence and removed from further analysis. full-length genomes were uploaded to the national center for biotechnology information (ncbi) open reading frame (orf) finder (http://www.ncbi.nlm.nih.gov/gorf/) to identify orfs. nucleotide and deduced amino acid alignments were generated using clustalw, and phylogenetic trees with , bootstrap replicates were constructed in the megalign program (dnastar, inc.). hydrophilicity analyses using the hopp-woods and kyte-doolittle methods were conducted with the protean program (dnastar, inc.).

the viruses were titrated in day of incubation embryonated eggs to obtain a % embryo infectious dose (eid ) according to previously published procedures ( ). two-week-old chickens were given eid of virus in µl of pbs equally divided intraocularly and intranasally. due to isolator availability, different numbers of birds were tested for each virus. six birds were given ark/ark-dpi/ , birds were given ark attenuated, birds each were given mass/mass / , mass attenuated, and ga attenuated, and birds were given ga /cwl / . each of the negative control groups consisted of birds. 
clinical signs and lesions were recorded, and tracheal swabs were collected and placed in ml of ice-cold pbs (ph . ) at days post-exposure [ ] . the presence of virus in the tracheal swab supernatant was determined by quantitative real-time rt-pcr [ ] . tracheas were collected in % neutral buffered formalin, routinely processed into paraffin, and -lm sections were cut for hematoxylin and eosin staining. epithelial hyperplasia, lymphocyte infiltration, and the severity of epithelial deciliation were scored for each trachea with being normal and being severe [ ] . as a measure of adaptation, we examined the growth of the ark/ark-dpi/ , ark attenuated, mass/mass / and mass -attenuated in embryonated eggs and chicks. because of limited isolator availability, we did not include the ga viruses in this experiment. virus growth in embryonated eggs was examined by inoculating eid of each virus into eggs at days of incubation via the chorioallantoic route. for each virus, allantoic fluid was harvested from five eggs at , , , , , and h after inoculation. the amount of virus present in fresh (not previously frozen) allantoic fluid was determined by quantitative real-time rt-pcr [ ] . to examine virus growth in chicks, eid of each virus was inoculated into specific pathogen-free chicks at day of age via the ocular/nasal route. tracheal swabs were collected from each of five birds at , , , , , and h after inoculation and placed in ml of ice-cold pbs (ph . ). once the birds were swabbed, they were removed from the study. the amount of virus present in the fresh (not previously frozen) tracheal swab supernatant was determined by quantitative real-time rt-pcr [ ] . sequences generated in this study were submitted to genbank and assigned the following accession numbers: ark/ark-dpi/ (gq ); ark-attenuated (gq ); ga /cwl / (gq ); ga -attenuated (gq ); mass/mass / (gq ); and mass -attenuated (gq ). 
the consensus sequence of the full-length genomes of ark/ ark-dpi/ , ark-attenuated, ga /cwl / , ga attenuated, mass/mass / , and mass -attenuated were sequenced, and the genome sizes were found to be , nt, , nt, , nt, , nt, , nt, and , nt, respectively. the genome organization consisting of a untranslated region (utr), polyproteins a and ab, spike, a, b, envelope, membrane, b, a, b, nucleocapsid, and utr was the same for all six viruses (table ) . gene locations for the nsps in orf a and ab are shown in table . the b protein, previously recognized in m [ ] , is amino acids long and located downstream from the membrane protein in all the viruses sequenced. a blast search was conducted, and we found the protein to have % sequence identity with the b protein from tcov (tcov, genbank accession number eu . ). in addition, a b protein downstream of the nucleocapsid protein was similar to the predicted b orf reported for tcov (genbank accession number eu . ). the b orf was identified in the ark and ga viruses but not in the mass viruses. alignment and phylogenetic analysis of the full-length genomes show that ark/ark-dpi/ has . % sequence identity with ark-attenuated, ga /cwl / has . % sequence identity with ga -attenuated, and mass/ mass / has . % sequence similarity with mass attenuated (fig. ) . nucleotide and amino acid sequence differences were identified between each of the pathogenic and attenuated viruses (table ) . when the genome sequences are compared, there are nucleotide (nt) changes resulting in amino acid changes in the coding regions between the ark viruses, nt changes resulting in amino acid changes between the ga viruses, and , nt changes resulting in amino acid changes between the mass viruses (see table and supplemental data tables and ). the size of the utr is nt for all the viruses ( table ). the number of nt differences between the ark viruses for the utr was with a . % identity. the ga viruses have nt differences with . 
% identity, and the mass viruses have nt differences with . % identity in the utr ( table ). the leader junction sequence, nucleotides - ( -cttaacaa), were found to be identical for the ark and mass viruses, whereas the ga /cwl / pathogenic virus leader junction sequence is -ctcaacaa and the ga attenuated virus sequence is -ctttacaa. the transcriptional regulatory sequences (trs) were identical in all of the viruses and were -ctgaacaa- for mrnas and , and -cttaacaa- for mrnas , , and . the size of the utrs is nt for ark/ark-dpi/ pathogenic and ark-attenuated, nt for ga / cwl / , nt for ga -attenuated, and nt for mass/mass / , and mass -attenuated ( table ). the number of nt differences within the utrs for the ark viruses is with . % identity. the ga viruses have nt differences resulting in . % identity, and the mass viruses have nt differences with . % identity within the utrs ( table ) . the utrs contain the s m motif, which is nt long with a sequence identity of . % or higher between the six viruses. analysis of the locations and number of sequence differences between pathogenic and attenuated viruses of the same serotype for individual nsps in polyproteins a and a/b (table ) shows that nsp has the highest number of amino acid differences among all the nsps. in addition, nsp has the greatest number of differences when coding regions across the entire genome are compared. a schematic representation of nsp and number of amino acid changes in each domain is presented in fig. . the nsp orf has . % of all amino acid differences observed between ark/ark-dpi/ and ark-attenuated (including a ten amino acid deletion in the attenuated virus at positions - ), . % of all amino acid differences observed between ga /cwl / and ga -attenuated (including an eight amino acid deletion in the pathogenic virus at positions - and a three amino acid deletion in the pathogenic virus at positions - ), and . 
% of all amino acid differences observed between mass/mass / and mass-attenuated (including a ten amino acid deletion in the attenuated virus at positions - ). these changes represent . , . , and . differences per amino acids within nsp for ark, ga and mass , respectively. we also found a virus subpopulation within the ark/ark-dpi/ strain, which had a ten amino acid deletion in nsp at positions - similar to the ark-attenuated virus. the catalytic triad of the pl protease, amino acids cys , his , asp [ ] was conserved among all of the viruses, and a hydrophobicity plot of nsp predicted four transmembrane regions between amino acids , and , (data not shown). the fewest amino acid changes for the nsps between pathogenic and attenuated viruses within a serotype are found in nsps - , which are the rna-binding proteins. the polyprotein ab- frame-shift slippery sequence ( -uuuaaac) is conserved among all six viruses but the location was found at nt , for ark/ark-dpi/ , nt , for ark-attenuated, nt , for ga /cwl / , nt , for ga -attenuated, nt , for mass/mass / and nt , for mass -attenuated. the percent amino acid identity for the s glycoprotein is . % for ark viruses, . % for ga viruses, and . % for mass viruses (fig. ) . the number of amino acid differences within the s glycoprotein between pathogenic and attenuated viruses is , , and for ark, ga , and mass , respectively ( table ). the s glycoprotein for the ark viruses has . % ( . differences/ amino acids) of all amino acid differences, which makes it the third most variable orf in the entire genome after nsp and . for the ga viruses, the s glycoprotein has . % ( . differences/ amino acids) of all amino acid differences, which makes it the third most variable orf in the entire genome after nsp and orf b. the s glycoprotein for the mass viruses has . % of all amino acid differences ( . differences/ amino acids), which makes it the fourth most variable orf in the entire genome after nsp , , and . 
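the three measures reported above for each orf (percent identity, differences per 100 amino acids, and the share of all genome-wide amino acid differences that fall in a given orf) can be computed from a pairwise alignment as in the following sketch. it assumes that the two sequences are already aligned to equal length, that `-` marks a gap, and that each gapped position counts as one difference; the study's own counting rules are not stated, and the function and orf names are hypothetical.

```python
from typing import Dict


def compare_aligned(seq_a: str, seq_b: str, gap: str = "-") -> Dict[str, float]:
    """Compare two aligned sequences of equal length, column by column."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap and b == gap:
            continue  # column is empty in both sequences, so skip it
        compared += 1
        if a != b:
            differences += 1  # substitution, or a gap opposite a residue
    if compared == 0:
        raise ValueError("no comparable positions")
    return {
        "compared": compared,
        "differences": differences,
        "percent_identity": 100.0 * (compared - differences) / compared,
        "differences_per_100": 100.0 * differences / compared,
    }


def share_of_differences(per_orf: Dict[str, int]) -> Dict[str, float]:
    """Express each ORF's difference count as a percentage of the total."""
    total = sum(per_orf.values())
    if total == 0:
        return {name: 0.0 for name in per_orf}
    return {name: 100.0 * count / total for name, count in per_orf.items()}


if __name__ == "__main__":
    # Toy alignment: one substitution and one deletion over 8 columns.
    print(compare_aligned("MKTAYIAK", "MKTAHI-K"))
    # Hypothetical per-ORF counts, not values from the study.
    print(share_of_differences({"orf_x": 6, "orf_y": 3, "orf_z": 1}))
```

normalizing by length (differences per 100 residues) matters because a long orf can hold the largest share of all differences simply by being long.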
orf b has the fewest differences, with no differences observed between the ark viruses, whereas the ga and mass viruses each have one amino acid difference. for orf b, no amino acid differences are observed for the ark viruses, amino acid differences are observed between the ga viruses, and amino acid differences are observed between the mass viruses. the ark virus b proteins have only one amino acid mutation and are . % similar to each other, whereas the ga virus b proteins have amino acid mutations, amino acid deletions, and substitution and are only . % similar. because this protein has not been previously recognized in ibv, a nucleotide blast search rather than an amino acid search was conducted and showed that the ga /cwl / virus has % identity with mass h (fj ) and the ga -attenuated virus has % identity with ark-dpi (eu ). to determine whether the ga -attenuated virus b sequence was a subpopulation within the ga /cwl / virus, two forward primers (ga a # -tcacgctcaagttcaagacctg- , and ga a # -cagctttaggtgagaatgaact- ) and two reverse primers (ga a # -tacgataaaacaaactaatgagaa- , and ga a # -ttgataggaaagcacagaaatag- ) specific for the ga -attenuated b sequence were used in combination in an rt-pcr assay, but no amplicons were observed.

table footnote: a positions are based on ab from tcov (accession number yp_ ) and presented as the residue position, with being the methionine at the beginning of orf a and ab, followed by the single letter code for the amino acid at that position.

the data on pathogenicity of the viruses in -week-old spf chicks are presented in table .

table footnotes:
a birds were given % embryo infectious doses intraocularly/intranasally and examined for clinical signs, virus, and lesions at days post-inoculation.
b virus was detected in tracheal swabs by real-time rt-pcr as previously described by callison et al. [ ].
c epithelial hyperplasia, lymphocyte infiltration, and the severity of epithelial deciliation were scored for each trachea with one being normal and four being severe.
d a representative control group from one of the experiments is presented. all of the data from the negative control groups were the same.

[...] (fig. a) . the ark-attenuated virus, which is adapted to embryonated eggs, only killed [...] chicks inoculated with virus at day of age showed statistical differences (p < . ) in the amount of virus detected in the trachea between the ark/ark-dpi/ and ark-attenuated viruses at , , , and h post-inoculation, with the pathogenic ark/ark-dpi/ having the higher amount of virus at each of the sample times (fig. b) . although the difference was not statistically significant, the chicks given the pathogenic ark/ark-dpi/ virus also had more virus detected in the trachea than the chicks given the ark-attenuated virus at and h post-inoculation.

many studies have examined sequence changes in the structural proteins of ibv and found that most of the changes associated with adaptation to a particular host or with a particular virus pathotype occur in the spike glycoprotein [ , , ] , but only a few studies have examined changes across the entire genome associated with biological characteristics of the virus [ , ] . ammayappan et al. [ ] found a total of amino acid changes between the genomes of ark dpi , a pathogenic virus, and ark dpi , an attenuated virus, with four amino acid changes in nsp and six amino acid changes in the s glycoprotein. based on those data, it was suggested that changes in the replicase sequence in addition to structural proteins might play a role in pathogenicity. fang et al. [ ] found that . % of all amino acid substitutions across the entire genome were located in the spike glycoprotein following adaptation of an attenuated avian coronavirus to primate cells, suggesting that spike plays a role in host adaptation. 
in this study, we analyzed the consensus full-length genome for the pathogenic and attenuated viruses of three different ibv types and showed that within a virus type, . to . % of all the amino acid changes between the pathotypes occurred in nsp , whereas changes in spike ranged from . to . % of all changes. it should be noted, however, that spike had the highest number of differences between different serotypes of the virus, which is consistent with previous reports [ ] [ ] [ ] [ ] . a high percentage of differences between pathogenic and attenuated viruses within a serotype in nsp suggests this region plays a key role in pathogenicity. the nsp is a complex protein with multiple domains, making it an attractive target for antiviral drug design [ , ] . it is approximately , amino acid residues in length and consists of an acidic domain, an adp-ribose phosphatase, the pl protease (a deubiquitinating protease), and y and transmembrane domains. the acidic domain is of unknown function; however, there is some evidence that it possesses nucleic acid binding activity because it is consistently co-purified with single-stranded rna [ ] . previous studies with other organisms indicate that electrostatic interactions from this type of domain play a key role in ligand binding [ ] . influenza a viruses also contain a polymerase acidic protein (pa) that is required for the transcription and replication activity of the viral polymerase [ ] . differences between pathogenic and attenuated ibv strains within a serotype, including deletions in ark and mass viruses, were in and around the acidic domain within nsp (fig. ) . thus, it is likely that the acidic domain plays a role in attenuation in chickens, but the exact function(s) of the amino acids in this domain is unclear. it was interesting that we observed an eight and a three amino acid deletion in the pathogenic virus ga /cwl / at positions - and - , respectively, compared to the ga -attenuated virus. 
since sequence insertions are not likely to occur during the attenuation process, the ga -attenuated virus possibly represents a minor undetected subpopulation in the pathogenic virus, which was selected by passage in embryonated eggs. the adp-ribose- phosphatase domain within nsp is relatively conserved between the pathogenic and attenuated strains. this domain has been shown in the beaudette laboratory attenuated strain of ibv not to function as an adp-ribose binding protein [ ] . however, the triple glycine sequence that forms part of the adp-ribose binding site (gly -gly -gly ), which was not conserved in beaudette, is conserved in all of the viruses sequenced herein [ ] . this suggests that the adp-ribose- protein may be functional in the pathogenic and attenuated ibv viruses and is consistent with the results of the mass -x domain as reported by xu et al. [ ] . the adp-ribose- phosphatase may be important in pathogenicity of ibv because it has been shown to play a role in adp ribosylation, a post-translational protein modification involved in dna damage repair and transcription regulation [ ] . in addition, it was reported that the adp-ribose- is dispensable for viral replication in tissue culture, suggesting that this domain is involved in regulation of viral replication rather than the actual replication process [ ] . the pl domain is a papain-like protease that is responsible for the cleavage of the nsp / and / sites. most coronaviruses have two papain-like proteases; however, in ibv the pl protease is truncated and is nonfunctional [ ] . the structure of the pl protease domain was determined to be a ''thumb-palm-finger'' motif [ ] . this domain has also been shown to be a potent ifn antagonist by inhibiting the phosphorylation and nuclear translocation of interferon regulatory factor (irf- ) causing a disruption in the activation of the type i ifn response through toll-like receptor (tlr ) or retinoic acid-inducible gene i (rig-i) [ ] . 
although the catalytic triad of the pl protease is conserved, amino acid changes between the pathogenic and attenuated viruses are observed in the pl protease, which could affect the efficiency of this ifn antagonist, leading to altered viral replication in the cell. the disruption of ifn signaling has been shown in many viral infections, including sars-cov, dengue virus, and paramyxoviruses [ ] [ ] [ ] . the ibv pl viral protease was also shown to have characteristics similar to ubiquitin-specific proteases [ ] . deubiquitinating proteases, which remove ubiquitin from proteins that have been marked by cellular mechanisms for atp-dependent degradation, could provide a mechanism by which the virus alters the cellular environment to favor replication. the y domain, containing transmembrane domains at its n-terminus, was originally described by gorbalenya et al. [ ] and has been predicted to consist of three domains y , y , and y , which may act together to perform an enzymatic function [ ] . the transmembrane domain is inserted into the endoplasmic reticulum (er) membrane co-translationally and plays an important scaffolding role for the replication transcription complex [ ] . recently, it was shown that three transmembrane domains were predicted for the sars-cov nsp but only two were found to span the er membrane, orienting the protease domain of nsp on the cytoplasmic side where viral replication occurs [ , ] . in murine hepatitis virus (mhv), five transmembrane domains were predicted but only two domains were found to span the membrane, also locating the protease domain on the cytoplasmic side [ , ] . our sequence data for ibv predict four transmembrane domains within nsp . assuming the protease domain is located on the cytoplasmic side of the membrane, we predict that either two or all four transmembrane domains would be used. 
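transmembrane predictions of the kind discussed above are commonly derived from a sliding-window hydropathy plot. the sketch below uses the published kyte-doolittle scale with a 19-residue window and a cutoff of 1.6, which are conventional defaults for spotting membrane-spanning candidates; they are assumptions here, not the settings used in the protean analysis, and the function names are hypothetical.

```python
from typing import List, Tuple

# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 19) -> List[float]:
    """Mean hydropathy of every full window; entry i covers residues i..i+window-1."""
    seq = seq.upper()
    if len(seq) < window:
        return []
    scores = [KD[aa] for aa in seq]  # raises KeyError on non-standard residues
    total = sum(scores[:window])
    profile = [total / window]
    for i in range(window, len(scores)):
        total += scores[i] - scores[i - window]  # slide the window by one residue
        profile.append(total / window)
    return profile


def candidate_tm_segments(
    seq: str, window: int = 19, threshold: float = 1.6
) -> List[Tuple[int, int]]:
    """Return 0-based, end-exclusive residue ranges covered by runs of windows at or above threshold."""
    profile = hydropathy_profile(seq, window)
    segments = []
    run_start = None
    for i, score in enumerate(profile):
        if score >= threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            segments.append((run_start, i - 1 + window))
            run_start = None
    if run_start is not None:
        segments.append((run_start, len(profile) - 1 + window))
    return segments


if __name__ == "__main__":
    # Synthetic test sequence: a hydrophobic stretch flanked by charged residues.
    toy = "D" * 20 + "L" * 25 + "K" * 20
    print(candidate_tm_segments(toy))
```

a hydropathy scan only proposes candidate segments. as the sars-cov and mhv results cited above show, not every predicted segment actually spans the membrane, so topology has to be confirmed experimentally.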
a chimera ibv containing the replicase genes a and a/b from the attenuated beaudette strain and the structural genes from the pathogenic mass strain was not pathogenic in chickens, indicating that the replicase proteins appear to be determinants of pathotype in ibv [ , ] . our data strongly support these studies and further indicate that changes in nsp play a key role in ibv pathotype. it should also be emphasized that pathogenicity in avian coronaviruses is likely polygenic, since we and others [ ] observed amino acid substitutions in other viral proteins including spike. the b orf detected in tcov (genbank accession numbers acb and acb ) is identified in ark and ga viruses herein. only one amino acid difference was observed between the ark viruses, but differences as well as amino acid deletions and insertion are observed between ga viruses. an attempt to identify a subpopulation in the ga /cwl / pathogenic virus with the ga attenuated gene b was unsuccessful. it is not clear why gene b is so variable between the ga viruses but it appears recombination rather than mutations over time may have played a role. a nucleotide blast analysis indicated that the ga /cwl / virus was % similar to mass h a vaccine virus and the ga -attenuated virus was % similar to ark-dpi a pathogenic virus, suggesting an origin for those genes. nonetheless, assuming the b orf is expressed, it apparently does not play a role in defining pathotype. interestingly, we find differences between pathogenic and attenuated viruses in the and utrs. the and utrs play key roles in transcription and replication of coronaviruses [ ] . however, the differences between the ark and mass viruses, which are nt and nt, respectively, for the utr, and nt and nt, respectively, for the utr did not appear to affect replication as determined in embryonated eggs. 
the trs sequences for generation of the subgenomic mrnas were identical in all of the viruses; however, the leader junction sequences were different for ga viruses. different leader junction sequences could be important for attenuation since efficiency of subgenomic mrna production would affect growth of the virus [ ] . differences are observed in the amount of virus detected in chickens given viruses with different pathotypes. when the same amount of virus was administered, birds given the attenuated virus had less virus detected in the trachea than birds given the homologous pathogenic virus at all sampling times, and the difference was statistically significant for most of the time points. thus, it appears that the amount of ibv replication in the trachea correlates with the ability of the virus to cause disease in chickens. attachment, entry, and replication of the attenuated (for chickens) virus were not impaired in embryonated eggs because it grew to the same titer (with the exception of one time point) as the pathogenic virus in -day-old embryonated eggs. inefficient attachment and entry into chicken host cells in vivo could be due to changes in spike, and decreased replication of the attenuated viruses could be due to the inability of the virus to overcome some as yet unidentified innate defense mechanism(s) in chicken cells that is not present in embryonic cells. domains within nsp associated with the deubiquitinating protease or ifn antagonists are likely candidates for further research. in summary, we find that most changes associated with attenuation of ibv for chickens are located within nsp and that the attenuated viruses have reduced replication in chickens but not in -day-old embryonated eggs. changes in spike suggest that attachment and entry may have been affected, and changes in nsp suggest that the attenuated virus lost the ability to overcome some innate host cell defense mechanism in the mature chicken cell. 
the exact mechanism(s) surrounding the interaction of virus and host processes affecting virus replication have yet to be determined for ibv, but identifying the sequence changes in the virus responsible for reduced replication and attenuation is an important step in elucidating those mechanisms. finally, changes observed in nsp and spike as well as in other viral genes support the polygenic nature of pathogenicity in avian coronaviruses.

infectious bronchitis
epitopes of neutralizing antibodies are located within three regions of the s spike protein of infectious bronchitis virus
infectious bronchitis, in a laboratory manual for the isolation, identification, and characterization of avian pathogens
code of federal regulations, standard requirements for ibv vaccines. animal and plant health inspection service, us national archives and records administration
proc. natl. acad. sci. usa

acknowledgments: this work was supported by usda, csrees award number - - . the authors appreciate the assistance that was provided by lauren byrd, carey stewart, and joshua jackwood in conducting these studies.

key: cord- - qja j h authors: li, weike; li, tiansong; liu, yuxiu; gao, yuwei; yang, songtao; feng, na; sun, heting; wang, shengle; wang, lei; bu, zhigao; xia, xianzhu title: genetic characterization of an isolate of canine distemper virus from a tibetan mastiff in china date: - - journal: virus genes doi: . /s - - -z sha: doc_id: cord_uid: qja j h

canine distemper (cd) is a highly contagious, often fatal, multisystemic, and incurable disease in dogs and other carnivores, which is caused by canine distemper virus (cdv). although vaccines have been used as the principal means of controlling the disease, cd has been reported in vaccinated animals. the hemagglutinin (h) protein is one of the most important antigens for inducing protective immunity against cd, and antigenic variation of recent cdv strains may explain vaccination failure. 
in this study, a new cdv isolate (tm-cc) was obtained from a tibetan mastiff that died of distemper, and its genome was characterized. phylogenetic analysis of the h gene revealed that the cdv-tm-cc strain is unique among other cdv strains and can be classified into the asia- group with the chinese strains, hebei and hlj - , and the japanese strain, cyn -hv. the h gene of cdv-tm-cc shows low identity ( . % nt and . % aa) with the h gene of the classical onderstepoort vaccine strain, which may explain the inability of the tibetan mastiff to mount a protective immune response. we also performed a comprehensive phylogenetic analysis of the n, p, and f protein sequences, as well as potential n-glycosylation sites and cysteine residues. this analysis shows that an n-glycosylation site at aa - within the f protein of cdv-tm-cc is specific for the wild-type strains ( p, a / , and ) and the asia- group strains, and may be another important factor for the poor immune response. these results provide important information for the design of cd vaccines in the china region and elsewhere. analysis of cdv strains from various animal samples has demonstrated an important relationship with the h gene/ glycoprotein, which has changed by genetic/antigenic drift. as the key protein for cdv, h is used for attachment to cell receptors as the first step of infection and mediates adequate host immune response [ ] . the h protein is considered to have the highest antigenic variation and can reflect genetic changes in comparative studies of cdv strains [ ] [ ] [ ] [ ] . this variation may affect neutralization-related sites with disruption of important epitopes. analysis of cdv strains from different animal species and geographical settings has revealed that the geographic pattern is an important factor in the genetic/antigenic drift affecting the h gene/glycoprotein of cdv [ ] [ ] [ ] [ ] [ ] [ ] [ ] . 
therefore, the h gene may be used for identification and phylogenetic classification of cdv strains, which have been classified into seven major genetic lineages, namely america- and - , asia- and - , arctic-like, europe, and wildlife [ , ] , and may also serve as an indication of the antigenic response to the virus. three other proteins, the nucleocapsid (n) protein, the phosphoprotein (p), and the fusion (f) protein, also have important roles for cdv and could provide additional sources of antigenic variability among strains. the n protein has immunosuppressive properties and is the major component of the cdv virion. the n-terminal domain of the n protein is generally well conserved, while the c-terminal end is poorly conserved and is considered hypervariable. the c-terminal tail of the n protein also contains the majority of its phosphorylation sites and antigenic sites [ , ] . during active infection, antibodies against the n protein predominate in the host and account for most of the complement-fixing antibody [ , ] . the p protein is relatively well conserved and plays a vital role in transcription and replication [ ] . this protein is an essential component of the viral rna phosphoprotein complex (vrnap) [ ] and also functions as a chaperone for the n protein. the f protein is a type i integral membrane protein that mediates viral penetration by fusion between the virion envelope and the host cell plasma membrane at neutral ph. it is synthesized as an inactive precursor, f , and must be proteolytically cleaved to produce the functionally active fusion protein, which consists of disulfide-linked f and f polypeptides [ ] . like the h protein, the f protein has high antigenic variation. in this study, the wild-type cdv-tm-cc strain was isolated from the spleen of a -year-old tibetan mastiff that developed clinical signs of cd after having received all standard vaccines. 
to determine whether this occurrence may be explained by variations in specific nucleotide or amino acid residues of the cdv circulating in china, we sought to genetically characterize the cdv-tm-cc strain. verodogslam cells constitutively expressing the cdv receptor dog signaling lymphocyte activation molecule (slam) were cultured in dulbecco's modified eagle medium (dmem; gibco) supplemented with % heat-inactivated fetal bovine serum (fbs) with an additional µg of g per ml. the wild-type cdv-tm-cc strain was originally isolated from spleen homogenate ( % w/v suspension) from a tibetan mastiff that succumbed to natural infection. virus was propagated in verodogslam cells and stored at - °c. total rna was prepared from verodogslam cells infected with cdv-tm-cc according to the manufacturer's instructions (total rna kit i, omega). the reverse transcription reactions were performed using m-mlv reverse transcriptase (invitrogen) with oligo d(t) and random primers. based on the complete consensus genomic sequence of cdv (genbank), two sets of primers were designed to amplify the entire genome (oligo . design software), as shown in table . pcr amplification was carried out using phusion high-fidelity dna polymerase (new england biolabs). clones (amplicons encompassing the full-length cdv-tm-cc genome) were obtained from thirty rt-pcr reactions using cdv-specific oligonucleotides. sequences were assembled and compared using dna sequence analysis software (dnastar), and the complete consensus genomic sequence was determined.
table legend: a genbank number is provided for each of the strains of cdv that were compared with cdv-tm-cc in this study. the geographical location of strain isolation and the species/organ of isolation are also indicated, as well as the clade into which the strains are categorized.
to genetically characterize the cdv-tm-cc strain, the deduced amino acid sequence was compared to f and h gene fragments of the variant field isolates shown in table . a phylogenetic tree was constructed based on the deduced amino acid sequences in supplementary table using mega . , and multiple sequence alignment was carried out using clustalw. statistical significance of the phylogeny was estimated by bootstrap analysis over a , pseudoreplicate data set. the wild-type cdv-tm-cc strain was isolated from the spleen of a -year-old tibetan mastiff in jilin province that had succumbed to cd after having received all standard vaccines ( weeks first immunization, weeks second immunization, weeks third immunization with distemper, adenovirus type , parvovirus, parainfluenza quadruple vaccine; canine coronavirus disease killed virus vaccine portion, usa). the virus was propagated in verodogslam cells and the virulence of the strain was confirmed (data not shown). to identify sequence features that may explain the failure of the vaccine strain to protect the dog against cd, we sequenced the entire genome using two sets of overlapping primers (table ). within the cdv genome, the h gene is a major determinant of disease and also has one of the highest rates of mutation. consequently, the phylogenetic relationship of cdv strains is often based on the deduced amino acid sequence of the h protein. the h gene of the cdv-tm-cc strain has , nucleotides and the inferred protein sequence has amino acids, similar to the other cdv strains. amino acid analysis of the h protein from cdv-tm-cc and other cdv strains in genbank (table ) identified seven clades of cdv strains (america- , america- , asia- , asia- , europe, arctic-like, and europe wildlife). cdv-tm-cc was classified into the asia- group with the strains cyn -hv (japan), hebei (china), and hlj- (china) (fig. a, b). glycosylation is an important factor in determining the antigenicity of many proteins [ ] .
prediction of the glycosylation sites of the h gene (http://www.cbs.dtu.dk/services/netnglyc/, netnglyc . server) identified a total of eight potential glycosylation sites at positions . notably, the - n-glycosylation site is specific for virulent strains [ , ] with the exception of a / . the n-glycosylation site is specific for the asian- strains, suggesting that it was acquired later [ , ] . cdv-tm-cc has both of these predicted glycosylation sites, which could explain its virulence properties.
phylogenetic analyses of the amino acid sequence of the n and p proteins
to determine whether the conservation of cdv-tm-cc also extends to other proteins within the virus, we assessed the similarity of the n and p proteins. consistent with the results for the h protein, the homology of the deduced cdv-tm-cc amino acid sequence of the n protein to the asia- strains (cyn -hv, hlj - , and hebei) was high with . - . % identity, as shown in fig. . the n protein sequence of cdv-tm-cc also showed . % identity with the asia- group (strains m cr, lm, c, con, and l), and . % identity with the onderstepoort strain. moreover, cdv-tm-cc had high similarity ( . , . , and . % identity) with wild-type strains , a / , and p. the lowest homology of the cdv-tm-cc n protein sequence ( . - . % aa identity) was found with arctic-like strains cdv , shuskiy, and phoca-caspian- . this relatively high similarity between the n protein of cdv-tm-cc and other cdv strains is consistent with the generally high conservation among n proteins. the phylogenetic relationship of cdv-tm-cc based on the deduced amino acid sequence of the p protein was also analyzed (fig. ). similar to the results for the h protein, cdv-tm-cc was classified into the asia- group, but was in a separate branch from the classical onderstepoort vaccine strains. these results verify the classification of cdv-tm-cc as an asia- group virus.
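the percent identity values quoted above are the kind of figure that can be computed directly from a pairwise alignment. the following is a minimal sketch, assuming the two sequences have already been aligned (for example with clustalw) and that columns containing a gap in either sequence are excluded from the denominator; the source does not state which denominator was used, so this convention and the function name percent_identity are assumptions for illustration only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: gaps are written as '-' and gapped columns are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # toy aligned peptides, not taken from any cdv strain
    print(percent_identity("MKT-LV", "MRTALV"))
```

other conventions (for example dividing by the full alignment length or by the shorter sequence) give slightly different numbers, which is one reason identity values from different tools are not always directly comparable.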
the signal peptide is a short amino acid sequence at the n-terminus of the majority of newly synthesized proteins that are destined towards the secretory pathway and is a highly divergent region [ ] . analysis of the - aa signal peptide region of the f protein of cdv-tm-cc demonstrated the same set of amino acid variations in comparison with the onderstepoort strain as for the other asia- strains (cyn -hv, hlj - , and hebei): s/ k, t/ p, (fig. ) . among the asia- strains (m cr, lm, c, con, and l), variations in comparison with the onderstepoort strain were found in t/ s, s/ a, r/ w, s/ y, n/ k, r/ k, i/ v, and n/ k. additionally, both the asia- and asia- strains had clade-specific amino acid variation in p/ q. moreover, the cdv-tm-cc strain had characteristic additional variations in p/ y and c/ y. therefore, the signal peptide region of cdv-tm-cc has both asia group-specific and individual variations. among the cdv strains, amino acid variation was also found in k/ n in the f region (aa - ) for the asia- group. generally, there was high conservation within the hydrophobic fusion peptide (fp) domain at the n-terminus of the membrane anchored f subunit, with the exception of a/ v in the - and - strains (fig. ) . amino acid variations between the asia- and asia- groups were also found in a region between the helical bundles (hb) and heptad repeats b (hrb) at v/ s, r/ k, and l/ i; within the trans-membrane (tm) domain at c/ y, q/ r, and h/ f; and within the cytoplasmic tail (ct) domain, at r/ k. among the asia group strains, the hra (aa - ) and hb (aa - ) domains were highly conserved, with the exception of a q/ a variation in the hra domain. likewise, the amino acids were highly conserved in the hrb (aa - ) domain in all cdv strains except for hebei ( d/ n) and p ( v/ i). common amino acid changes in other regions of cdv strains in comparison to the onderstepoort strain were found at k/ r and s/ g. 
the potential n-glycosylation sites (n-x-s/t) of the f protein were highly conserved at nls, nvs, nct, and nqs in the f region among all cdv strains as reported previously [ ] [ ] [ ] (fig. ) . moreover, the asia- group (strains cyn -hv, hlj - , hebei, and cdv-tm-cc) had specific potential n-glycosylation sites at nrt and nat in the signal peptide region, with the exception of the cdv-tm-cc strain, which had the sequence nkt. five of these six potential glycosylation sites of the cdv-tm-cc strain were at the same positions within the known virulent cdv strains (a / , p and ) at aa - , - , - , - , and - , whereas nkt was unique for cdv-tm-cc, and nrt and nit were unique for p. cysteine is an α-amino acid that plays an important role in intramolecular disulfide bond formation and the steric structure of proteins. in the f protein of cdv-tm-cc, a total of cysteine residues were detected. among them, residues (aa , , , , , , , , , , , , , and ) were located at identical positions in all cdv strains; however, several substitutions were characteristic of individual strains, such as r/ c in the america group (strains onderstepoort, - , - , snyder hill, cdv , shuskiy and phoca-caspian- ) and y/ c in cdv-tm-cc. the presence of amino acid variations, as well as specific n-glycosylation sites and cysteine residues within the f protein, could affect the immune response to cdv-tm-cc. improved vaccination has reduced the frequency and magnitude of cd [ ] . distemper vaccination failures are uncommon, but outbreaks of cd continue to occur among vaccinated individuals and populations [ , , , ] . the most common factor in cd occurrence is a lack of the h protein, a major structural protein of cdv, mediates host selection and pathogenicity, and the rate of genetic variation for its gene is greater than for other genes.
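the n-x-s/t motif referred to above can be located with a simple pattern scan. the sketch below assumes the conventional sequon definition in which x is any residue except proline; it only lists candidate motifs and does not reproduce the scoring performed by a prediction server such as netnglyc. the function name find_sequons and the test peptide are hypothetical.

```python
import re

# Lookahead so that overlapping motifs (e.g. N-N-S-T) are all reported.
# Assumption: X in N-X-S/T is any residue except proline.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position of the asparagine, tripeptide) for each motif."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    # made-up peptide containing NLS, NPT (excluded), NCT and NKT
    for position, motif in find_sequons("MNLSAANPTGNCTNNKT"):
        print(position, motif)
```

a motif found this way is only a potential site; whether it is actually glycosylated depends on protein context, which is why the text speaks of "potential" n-glycosylation sites.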
with geographically distinct lineages, many studies have demonstrated that phylogenetic analysis can be carried out in accordance with the deduced amino acid sequences of the h protein [ , , , ] . in this study, phylogenetic analysis based on the h protein identified seven clades of cdv strains (america- , america- , asia- , asia- , europe, arctic-like, and europe wild-life), and cdv-tm-cc was classified into the asia- group, with the highest identity to the chinese strains, hlj - and hebei, and the japanese strain, cyn -hv. potential n-glycosylation sites may differ for the h protein of the wild-type and vaccine strains of cdv. usually, only - potential sites are found within vaccine strains (such as onderstepoort), in comparison with - sites in wild-type cdv strains (for example, p). in particular, the - n-glycosylation site, which is specific for the wild-type strain [ , ] , is suggestive of the pathogenicity of the cdv-tm-cc strain. furthermore, the - n-glycosylation site has been acquired in the asian- strains [ , ] . further study may determine whether these differences in glycosylation the n protein is a highly conserved immunogenic protein that can elicit cellular and humoral immunity [ ] . based on sequence differences between the gene of the wild strains and vaccine strain, the n protein may affect the seroprotection rate of the host and lead to immune failure. like the h protein, the n protein of cdv-tm-cc showed the highest homology with the asia- group. high homology was also observed with the asia- group (strains m cr, lm, c, con, and l) and wild-type strains ( , a / , and p). moreover, the lowest homology was found between cdv-tm-cc and the onderstepoort strain. variation in the immunodominant epitope of the virus may change the structure, and therefore, we can speculate that the t cell-mediated immune response may be altered by variations in this protein. 
the p gene is extremely well conserved and, therefore, is particularly important in the phylogenetic classification. based on the phylogenetic relationship of the deduced amino acid sequence of the p protein, cdv-tm-cc was also classified into the asia- group. these results highlight the importance of considering the geographical setting to control the occurrence of the disease in a more efficient manner. the f protein is a surface glycoprotein that mediates viral entry into the host cell by fusion of the virion envelope and the host cell plasma membrane at a neutral ph. within the f protein, the signal peptide region (aa - ) has the lowest amino acid homology, especially at positions - and - . however, our analysis shows that the signal peptide region is relatively well conserved among the asia- group, except for specific individual amino acids, indicating that the signal peptide of the f protein is geographically distinct. in addition, three amino acids specific to the cdv-tm-cc strain ( k, y, and y) are located in the signal peptide region. a previous study reported that the amino acids k and l are specific for the cdv vaccine strains; however, we also found k in the wild-type strains in the america group (a / , , and p) and asia- group ( c, m cr, l, con, and lm). the f protein of the cdv-tm-cc strain has six potential glycosylation sites. among them, differences were found to reside mainly in the signal peptide region, but no clear rule could explain the differences between the wild-type and vaccine strains or the geographical variation, including the occurrence of a strain-specific site ( - nkt) for cdv-tm-cc. four additional potential glycosylation sites were recognized at positions - , - , - in the f region and - in the f region, as reported previously [ ] [ ] [ ] .
the - n-glycosylation site is specific for the wild-type strains ( p, a / , and ) and the asia- group (hebei, hlj - , and cyn -hv), and may be another important factor in vaccination failure. the fusion peptide (fp) domain was also found to be highly conserved among all cdv strains, except for a/ v in - and - . in short, the genetic/antigenic drift observed in the currently circulating cdv strains should be considered as a possible factor leading to the resurgence of cd cases. analysis of cdv strains detected globally and from a variety of host species will provide a more in-depth understanding of the global ecology of cdv and will provide the basis for the improvement of current cdv vaccines. the wild-type cdv-tm-cc strain, originally isolated from spleen homogenate from a fully vaccinated tibetan mastiff in china, was classified into the asia- group cluster of cdv strains based on the sequence of its h protein and verified by the sequence of its p protein. variations in specific amino acid residues, n-glycosylation sites, and cysteine residues throughout the cdv-tm-cc genome may explain the failure of the dog to mount vaccine-mediated protection against cd. these results provide the foundations for the global improvement in current cdv vaccines.
virus infections of carnivores
role of glycosylation of notch in development
acknowledgments this work was supported by ecology of zoonoses and research of infection and immunity mechanisms ( cb ).
key: cord- -co j uig authors: kobayashi, tomoya; murakami, shin; yamamoto, terumasa; mineshita, ko; sakuyama, muneki; sasaki, reiko; maeda, ken; horimoto, taisuke title: detection of bat hepatitis e virus rna in microbats in japan date: - - journal: virus genes doi: . /s - - - sha: doc_id: cord_uid: co j uig
several recent studies have reported that various bat species harbor bat hepatitis e viruses (bathev) belonging to the family hepeviridae, which also contains human hepatitis e virus (hev).
the distribution and ecology of bathev are not well known. here, we collected and screened bat fecal samples from nine bat species in japan to detect bathev rna by rt-pcr using hev-specific primers, and detected three positive samples. sequence and phylogenetic analyses indicated that these three viruses were bathevs belonging to genus orthohepevirus d like other bathev strains reported earlier in various countries. these data support the first detection of bathevs in japanese microbats, indicating their wide geographical distribution among multiple bat species. bats are known to be natural reservoirs of various zoonotic viruses such as rabies virus, nipah virus, and severe acute respiratory syndrome (sars) coronavirus [ ] [ ] [ ] . in addition, hepatitis e virus (hev)-like viruses have been detected in bats in several countries [ , ] . bat hev (bathev) belongs to family hepeviridae, genus orthohepevirus and is a non-enveloped, positive-sense, single-stranded rna virus. orthohepevirus is divided into four species, orthohepevirus a-d [ , ] . orthohepevirus a contains hev, which is a causative agent for human acute hepatitis. orthohepevirus b includes avian-hev associated with hepatitis-splenomegaly syndrome. orthohepevirus c is divided into c (rodent-hev) and c (carnivore-associated hev) genotypes. orthohepevirus d comprises bathev, which has . - . % identity with hev [ ] . recently, bathevs were detected from macrobats and microbats in a variety of countries [ , ] . however, limited information about the distribution of bathevs in other regions and their ecology is available. here, we surveyed several japanese microbat species to detect bathevs. to examine whether bathevs exist in japanese bats, we collected bat fecal samples from nine bat species captured in five different prefectures of japan in , with permission from the ministry of the environment, japan and the respective local government (fig. a ). 
bats were caught using a harp trap and kept in a pouch for an hour to check the signs of disease as well as to obtain fresh feces. all captured bats did not show any obvious symptoms. the feces were added in a medium containing antibiotics, and frozen in dry ice. we extracted rna from the fecal samples and performed rt-pcr using a primer set ( table ) that was designed specifically against the conserved region of the rna-dependent rna polymerase (rdrp) of hev, to screen the hev genomes. two samples (bthev-ej and bthev-ej ) from the japanese short-tailed bat (eptesicus japonensis) and sample (bthev-ps ) from the brown long-eared bat (plecotus sacrimontis) were found to be positive. although we attempted to sequence the entire genome of the bathevs, we failed to amplify the whole genome using rt-pcr. therefore, we amplified partial orf , which encodes non-structural protein including rdrp, and entire capsid coding region of bthev-ej , -ej , and -ps (corresponding to nucleotides (nt) - , nt - , and nt - of bathev/bs , respectively) using specific primers ( table ). the sequences of bthev-ej and bthev-ej , which were collected in the same site on different days, were % identical (nt out of ). the identity between bthev-ej /-ej and bthev-ps using a part of the rdrp was about % at the nucleotide level, suggesting the presence of multiple bathevs in japan. blast analysis indicated that bthev-ej /-ej showed the highest sequence identities to bathev/bs , a german strain detected from the serotine bat (eptesicus serotinus), among strains previously reported in other countries. on the other hand, bthev-ps showed the highest sequence identity to bthev/nms b, a german strain detected from the daubenton's bat (myotis daubentonii). in particular, % identity observed between bthev-ej /-ej and bathev/bs or % identity observed between bthev-ps and bthev/nms b were greater than that observed between japanese strains. 
these data suggest that similar viruses exist in geographically distant regions. we then phylogenetically analyzed the sequences by maximum-likelihood analysis using clustalw and mega version . [ ] . a phylogenetic tree constructed using the partial amino acid sequences of rdrp indicated that the japanese viruses were included in orthohepevirus d (fig. b) , demonstrating that all bathevs are classified in this species. we also amplified the full-length capsid (orf ) sequences by rt-pcr and analyzed them phylogenetically. the resulting tree confirmed that the novel japanese viruses were included in orthohepevirus d (fig. c) . the sequences of bthev-ej /-ej were placed in a position neighboring the german bathev/bs strain, confirming the phylogenetic similarity between these strains. all bats captured in this study were insectivores and hibernate in winter. although there is no information about migration of e. japonensis and p. sacrimontis, bat species closely related to them were reported to migrate only a few kilometers from their colonies per night [ , ] . they had different ecology in the terms of the habitat. bthev-ej and -ej were detected on different days from two e. japonensis bats, both of which used eaves of the same house as night roost in nagano. although the e. japonensis bats formed mixed colony at the roost with myotis ikkonikovi and rhinolophus ferrumequinum bats, bathevs were only detected in e. japonensis, implying bthev-ej /-ej might have a narrow host range. p. sacrimontis bats usually form a small colony without other species of bats. indeed, p. sacrimontis bats, from which bthev-ps was detected in this study, were captured near such small colony in a ruin in aomori. thus, bthev-ps is likely circulating in the p. sacrimontis since the bats have low opportunity to come in contact with other species of bats. the closely related bathevs (bthev-ej /-ej and bat hev/bs ) have been detected in different species of eptesicus bats (e. serotinus and e. 
japonensis). since the distribution areas of these bats are not overlapping, viruses ancestral to bthev-ej /-ej and bathev/bs might have infected ancestral eptesicus and might have branched into different species in the process of evolution. for virus isolation, we inoculated the rt-pcr-positive fecal samples not only into several bat cells (bkt, fbkt, and demkt cells) but also into other mammalian cell lines (madin-darby canine kidney (mdck), african green monkey veroe , human a , madin-darby bovine kidney (mdbk), and swine pk cells) since we suspected that the bat fecal samples may contain several pathogens other than bathevs. all inoculated cells were incubated for - days with media changes at - day intervals. after the incubation, cells were blindly passaged three times. however, we could neither recover any infectious viruses nor detect bathev rna in the inoculated cells by rt-pcr. in conclusion, the present study showed the presence of several bathev strains, which were independently classified into orthohepevirus d, in japanese bats, suggesting wide geographical distribution of bathev among multiple bat species. although these data suggest limited transmissibility of bathev to other animals, further studies are needed to determine its zoonotic potential.
to obtain fresh feces, bats were kept in a pouch for an hour, and the feces were then collected by a sterilized cotton bud and transferred to ml of dulbecco's modified eagle's medium (dmem) supplemented with u/ml of penicillin, mg/ml of streptomycin, µg/ml of gentamycin, and µg/ml of amphotericin. the feces were suspended well and then centrifuged at , ×g for min at °c. the supernatants were used for rna extraction with isogen ls reagent (nippon gene). cdna was synthesized using prime script rt reagent kit (takara bio) with a mixture of random hexamer and oligo dt primers. pcr amplifications were performed using the kod fx neo (toyobo) with consensus hev primer sets (panhev f and r), which were designed in this study to amplify a -bp fragment of the rna-dependent
fields virology
members of the international committee on the taxonomy of viruses hepeviridae study group
acknowledgements we thank mr. mitsuru mukohyama for helping us
key: cord- - jgqofbr authors: kocherhans, rolf; bridgen, anne; ackermann, mathias; tobler, kurt title: completion of the porcine epidemic diarrhoea coronavirus (pedv) genome sequence date: journal: virus genes doi: . /a: sha: doc_id: cord_uid: jgqofbr
the sequence of the replicase gene of porcine epidemic diarrhoea virus (pedv) has been determined. this completes the sequence of the entire genome of strain cv , which was found to be , nucleotides (nt) in length (excluding the poly a-tail). a cloning strategy, which involves primers based on conserved regions in the predicted orf products from other coronaviruses whose genome sequence has been determined, was used to amplify the equivalent, but as yet unknown, sequence of pedv. primary sequences derived from these products were used to design additional primers resulting in the amplification and sequencing of the entire orf of pedv. analysis of the nucleotide sequences revealed a small open reading frame (orf) located near the ′ end (no – ), and two large, slightly overlapping orfs, orf a (nt – ) and orf b (nt – ). the orf a and orf b sequences overlapped at a potential ribosomal frame shift site. the amino acid sequence analysis suggested the presence of several functional motifs within the putative orf protein. by analogy to other coronavirus replicase gene products, three protease and one growth factor-like motif were seen in orf a, and one polymerase domain, one metal ion-binding domain, and one helicase motif could be assigned within orf b.
comparative amino acid sequence alignments revealed that pedv is most closely related to human coronavirus (hcov)- e and transmissible gastroenteritis virus (tgev) and less related to murine hepatitis virus (mhv) and infectious bronchitis virus (ibv). these results thus confirm and extend the findings from sequence analysis of the structural genes of pedv. porcine epidemic diarrhoea virus (pedv) is a causative agent for diarrhoea in pigs, particularly in neonates. the disease has been recognised for approximately thirty years, but the causative virus was only first described in [ ] , while another ten years elapsed before a method was developed for propagation of the virus in cell culture [ ] . during this time, outbreaks of the disease were reported from numerous european countries as well as korea, china and japan. the epidemiology and pathogenesis of the disease have been well described by pensaert [ ] . the biological behaviour, electron microscopic appearance and polypeptide structure of pedv resulted in its provisional classification as a coronavirus [ , , ] . coronaviruses belong to the taxonomic order of nidovirales and contain a single stranded rna genome of positive polarity, which is approximately thirty kilobases in length. the genes encoding the structural proteins are located at the end of the genome. an astonishing two-thirds of the genome consist of the replicase gene, which is located at the end of the genome. the replicase proteins are encoded by orf a and orf b. these two long, slightly overlapping orfs are connected by a ribosomal frame shift site in all coronaviruses sequenced to date. this regulates the ratio of the two polypeptides encoded by orf a and the readthrough product orf ab. about ± % of the translation products are terminated at the end of orf a, and ± % continue to the end of orf b. the polypeptides are post-translationally processed by viral encoded proteases [reviewed by ]. 
these proteases are encoded within orf a; the polymerase-and the helicase-function are encoded by orf b. we have previously completed the sequencing of the nucleocapsid-(n), membrane-(m), small membrane-(e), orf and spike-(s) genes of the pedv strain cv [ ± ]. the alignment of the deduced amino acid sequences indicated that pedv occupies an interesting intermediate position between the two well-characterized members of the group i coronaviruses, transmissible gastroenteritis virus (tgev) and human coronavirus (hcov)- e. in this study, we have continued to determine and analyse nucleotide sequences of pedv. to our knowledge, only two group i coronaviruses have been sequenced completely, hcov- e and tgev [ , ] . in addition, two strains of mouse hepatitis virus (mhv), jhm and a belonging to the group ii coronaviruses, and infectious bronchitis virus (ibv) have been completely sequenced [ ± ] . therefore, the sequence presented in this paper is the sixth sequence of a coronavirus covering the entire genome. growth of cell adapted pedv strain cv was performed essentially as has been described elsewhere [ , ] , except that virus-infected cells were harvested at approximately h post infection. cells were freeze-thawed three times and cell debris removed by low speed centrifugation. virus was pelleted by centrifugation for h at , rpm and c in a sw rotor of a beckman centrifuge. virus pellets prepared from two cm flasks were pooled and resuspended in ml trizol tm (gibco-brl), and rna was prepared as recommended by the manufacturer. in order to obtain the first partial pedv specific sequences, the predicted amino acid sequences of the hcov- e and tgev polymerase orfs were aligned and homologous regions identified. the homologous regions were used to design degenerate primers [ ] that were used for rt-pcr amplifications. these initial amplicons were cloned and sequenced [ ] . 
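a degenerate primer of the kind described above is a pool of related oligonucleotides written with iupac ambiguity codes. as a rough illustration, the sketch below expands such a primer into its concrete sequences and reports its degeneracy (the pool size). the ambiguity table is the standard iupac one; the function names and the example primer are hypothetical and are not the primers used in the study.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total


def expand(primer: str) -> list[str]:
    """List every concrete sequence in the degenerate pool."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    example = "TGGGAYATGN"  # hypothetical primer, for illustration only
    print(degeneracy(example))
    print(expand("GAY"))
```

keeping the degeneracy low, by choosing conserved peptide stretches rich in residues with few codons, is the usual reason for aligning the homologous proteins before designing such primers.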
later, a mixture of up to six antigenome sense primers based on pedv specific sequences or the degenerate primers and random hexamer primer (purchased from schmidheini ag; balgach, switzerland) was used for first strand cdna synthesis. rna prepared from two cm flasks of virus-infected cells was denatured for min at c and first strand cdna synthesis was performed in a ml total reaction volume using superscriptii tm (gibco-brl; basel, switzerland) according to the manufacturer's protocol. this was modified to create longer reverse transcription products by including a denaturation step at c for min following the first h incubation at c, followed by the addition of ml superscriptii tm and a second prolongation step of h at c. template rna was digested by adding ml rnaseh (gibco-brl; basel, switzerland) to the reaction mix and incubating at c for min. pcr amplification was performed as described elsewhere. in brief, pfu dna polymerase (stratagene; basel, switzerland) was used for the amplifications, which were performed on a dna engine (mj research) machine. pcr fragments were subsequently cloned into pbluescript ii ks or puc vectors using standard procedures. the nucleotide sequence was determined on these cdna clones. direct sequencing was performed on an rt-pcr product (see fig. b ), which was purified from an agarose gel. the contigs of the sequence determinations were constructed using seqman (dna*, lasergene, madison wi, usa). we previously reported the determination of the pedv leader sequence on the mrna encoding the n-gene [ ] . this sequence was used for the primer design in order to amplify the end of the genome. the leader sequence was used for the in silico construction of the genomic rna sequence, which is available on genbank database (accession number af ). virus sequences covering replicase genes were obtained from the genembl sequence database.
the files with the accession numbers x , z , af , and m for hcov- e, tgev (purdue ), mhv-a , and ibv (beaudette) respectively were used. the deduced amino acid sequences were compared as indicated in the text using pileup and gap (gcg package version . ; madison, wi, usa). the files generated by pileup were used in distances (gcg package version . ; madison, wi, usa) to determine the kimura protein sequence distances, which were subsequently used for the construction of unrooted dendrogram using treegen on the cbrg server (http://cbrg.inf.ethz.ch/) the cloning approach we used previously to clone the pedv m and n genes involved designing primers based on conserved regions of the coronavirus m and n genes to amplify the equivalent to the unknown pedv sequence. in this study, we employed this technique to clone parts of the orf of pedv. such a method is useful for viruses which do not grow to high titre, avoids lengthy screening of clones and could potentially be applied to the cloning of any group i coronavirus. however, the large size of orf and the paucity of sequence data from other coronaviruses made this an ambitious objective. a number of conserved functional domains were identified in the predicted orf products, but these domains are mainly located in the orf b region and leave large regions of the orf a product with no known function and only a low level of sequence conservation between different coronavirus genomes. in order to clone and determine the sequences for the pedv orf , the predicted amino acid sequences of the hcov- e and tgev orf were aligned and homologous regions identified. the hcov- e and tgev orfs were sufficiently closely related to allow complete alignment of the predicted expression products. in contrast, the mhv and ibv sequences were much more divergent, and could only be aligned with the group i sequences in some of the conserved regions. 
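the kimura protein sequence distance mentioned in the methods above corrects the observed fraction of differing residues for multiple substitutions at the same site. a minimal sketch follows, assuming kimura's empirical formula d = -ln(1 - p - 0.2 p^2), where p is the fraction of differing sites among gap-free aligned columns; whether the gcg distances program applies exactly this form and gap handling is an assumption here, and the function name is hypothetical.

```python
import math


def kimura_protein_distance(aligned_a: str, aligned_b: str) -> float:
    """Kimura's empirical protein distance, d = -ln(1 - p - 0.2 * p**2).

    Assumption: columns with a gap ('-') in either sequence are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    p = differing / compared
    argument = 1.0 - p - 0.2 * p * p
    if argument <= 0.0:
        raise ValueError("sequences too divergent for this correction")
    return -math.log(argument)


if __name__ == "__main__":
    # toy sequences with half of the sites differing
    print(round(kimura_protein_distance("AAAA", "AACC"), 4))
```

a matrix of such pairwise distances is the usual input to a distance-based tree-building step such as the dendrogram construction described in the text.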
degenerate primers were designed from regions conserved between the hcov- e and tgev and, where possible, mhv and ibv orf . these primers were used both to prime reverse transcription and for the pcr amplifications. sequence data derived from these pcr products allowed us to design sequence-specific primers which were then used to amplify the entire orf (see fig. b ). numerous small cdna clones, five large cdna clones and one rt-pcr product covering the two-thirds of the pedv genome were used to determine the nucleotide sequence of the pedv orf (fig. ). this analysis completes the nucleotide sequence of pedv, which is thereby the sixth complete sequence determined from a coronavirus genome [ – , ]. the genome of pedv (cv ) excluding the poly a-tail is nt in length. analysis of the newly determined nucleotide sequence revealed a pattern of orfs typical of coronaviruses. a small orf with the potential to code for a -amino acid peptide was found at the end of the genome from nucleotide position – . such small orfs (uorfs) are present in all coronaviruses sequenced so far. the uorfs of hcov- e [ ] and ibv [ ] are eleven codons in length, while that of mhv is eight codons long [ , ]. that of tgev can only encode a three-amino acid peptide [ ]. two long orfs of and nt, which overlap by nt, covered most of the newly determined sequence. by analogy to published coronavirus sequences [ , , ], the orfs were designated orf a and orf b. the predicted orf a of pedv extended from nucleotide to . this resulted in a -codon orf. the overlapping orf b starting at nucleotide and ending at nucleotide had the capacity to code for amino acids. it has been proposed for coronaviruses and other members of the order nidovirales [ ] that the nucleotide sequences in the overlapping regions of orf a and orf b are able to fold into a pseudoknot tertiary structure [ , ]. 
this region allows the ribosome shifting of the reading frame during translation of the orf a and subsequently continues the translation in orf b. the function of these rna structures as ribosomal frame shift sites was demonstrated for the analogous sequences of ibv [ ] and hcov- e [ ] . it seems likely that the translation of the pedv orf b is mediated by such a ribosomal frame shifting. the nucleotide sequences of pedv, hcov- e, and tgev covering the ribosomal frame shift site are more conserved to each other than to mhv-a or ibv. in order to identify the sequence which could be involved in the formation of the tertiary structure, the nucleotide sequences covering the end of orf a and the beginning of orf b from hcov- e [ ] and tgev [ ] were aligned with the corresponding sequence of pedv. fig. a shows the predicted frame shift region of pedv based on this comparison. the so-called slippery site (uuuaaac) at which frame shifting occurs is identical in all coronaviruses sequenced so far. the stems and loops required to provide the tertiary structure of the frame shift regions of tgev and hcov- e were compared and fig. b shows the predicted tertiary structure required for the frame shift of pedv based on this comparison. pairwise comparison of the deduced amino acid sequences (using gap) revealed that orf b of pedv is more conserved than orf a to corresponding sequences of other coronaviruses. the percentage of similarities and identities is shown in table . the putative protein sequence of orf a was most similar to the sequence of orf a of hcov- e ( . %) and less similar to the corresponding orf a of tgev ( . %), mhv-a ( . %) and ibv ( . %). the same relationship, but at a higher level of similarity, was true for the deduced amino acid sequence of the predicted pedv orf b. it was most similar to the amino acid sequence of hcov- e orf b and tgev orf b ( . % and . %, respectively). the similarity to the orf b from mhv-a and ibv was around %. 
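the slippery site mentioned above (uuuaaac) can be located in a sequence with a plain string search. the following is only a minimal sketch: the function name `find_slippery_sites` and the toy sequence are hypothetical, and the search does not test for the downstream pseudoknot that a functional frame shift signal also requires.

```python
def find_slippery_sites(rna, motif="UUUAAAC"):
    """Return 0-based start positions of every occurrence of motif in rna.

    DNA input is accepted: T is converted to U before searching.
    """
    seq = rna.upper().replace("T", "U")
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        # Continue one base further so overlapping hits are not missed.
        start = seq.find(motif, start + 1)
    return positions


# Toy sequence (not a real viral sequence) containing two copies of the motif.
toy = "GGAUC" + "UUUAAAC" + "GGGUACG" + "UUUAAAC" + "CC"
print(find_slippery_sites(toy))
```

a hit only marks a candidate site; whether frame shifting occurs there depends on the surrounding rna structure, as discussed in the text.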
the deduced amino acid sequences of orf a and orf b from pedv were aligned with the corresponding sequences of hcov- e, tgev, mhv-a , and ibv using pileup. the degrees of amino acid homologies are graphically presented as dendrograms (fig. a,b) . the multiple sequence alignments revealed several putative functional domains common to coronavirus sequences [ , ] located on the deduced amino acid sequence of orf ab of pedv. some of these had been used to design the primers for the rt-pcr amplification. in the orf a region the following motifs were observed. two motifs indicative of papain-like proteases (plp) were present at amino acid positions ± and ± . the plp motif is found twice in the replicase genes of hcov- e, tgev and mhv, but only once in that of ibv. in this respect, pedv resembles hcov- e, tgev and mhv rather than ibv. a highly conserved region (x-domain) was found between the two plp motifs. despite this motif being present in all coronavirus sequences, its function is not yet known. a picornavirus c-like ( c ) protease domain is located between amino acids and of the pedv orf a. all corona-and arteriviruses encode this motif, which is the main protease for the coronavirus mediated processing of the polyproteins. three markedly hydrophobic domains conserved among coronaviruses are found in orf a. the first is located after the second plp motif and the others flank the cl motif. finally, a growth factorlike (gfl) domain was located close to the end of orf a (amino acid position ± ). in the orf b region, three structural protein motifs could be recognized, which all play a role in viral replication. a sub-sequence at amino acid position ± containing the characteristic tripeptide orf of pedv sdd (or gdd in most rna viruses) [ ] is probably the active site for the rna dependent rna polymerase. a metal ion-binding domain covering amino acids ± and a helicase motif at amino acid positions ± were also observed in the pedv orf b product. 
alignments of the deduced amino acid sequences of the cl protease and the polymerase motif from five different coronaviruses are shown in fig. a and b, respectively. the findings concerning conserved domains are summarised in fig. a . a deletion of about amino acids located between the x-domain and the second plp motif in the putative orf a sequence of tgev compared to that of hcov- e was reported by eleouet et al. [ ]. this additional sequence was present in the pedv orf a product. the alignment (using gap) of the hcov- e and pedv amino acid sequences revealed . % similarity and . % identity in this region. earlier sequence analysis of pedv based on the structural protein sequences has shown that pedv is most closely related to hcov- e and tgev [ – , ], less related to mhv-a , and least related to ibv. however, it was not possible to determine the relative similarities of hcov- e, tgev and pedv. in this study, the similarities and identities of the amino acid sequence alignments based on orf a and orf b show clearly that pedv is most closely related to hcov- e and, moreover, that hcov- e is more similar in sequence to pedv than it is to tgev. in addition to the sequence analysis, the presented work offers various possibilities for future research on coronaviruses. functional analysis and processing of the as yet uncharacterised pedv orf is now possible. recently, almazan et al. and yount et al. achieved the generation of infectious tgev from cdna [ , ] and thiel et al. succeeded in generating full length cdna clones of hcov- e and ibv in a recombinant vaccinia virus system [ ]. the sequence and the cdna clones covering the entire genome of pedv would allow the development of a mini-genome system to study viral replication or the generation of an assembled, infectious cdna clone. 
bearing in mind the close relationship of pedv and hcov- e, the latter approach could be used to exchange functional parts of these viruses to gain new insights into their biology. the authors thank christa meyer for excellent technical assistance. these studies were supported by the swiss national science foundation, grant # - . . key: cord- - x p wc authors: tao, pan; dai, li; luo, mengcheng; tang, fangqiang; tien, po; pan, zishu title: analysis of synonymous codon usage in classical swine fever virus date: - - journal: virus genes doi: . /s - - -z sha: doc_id: cord_uid: x p wc using the complete genome sequences of classical swine fever viruses (csfv) representing all three genotypes and all three kinds of virulence, we analyzed synonymous codon usage and the relative dinucleotide abundance in csfv. the general correlation between base composition and codon usage bias suggests that mutational pressure rather than natural selection is the main factor that determines the codon usage bias in csfv. furthermore, we observed that the relative abundance of dinucleotides in csfv is independent of the overall base composition but is still the result of differential mutational pressure, which also shapes codon usage. in addition, other factors, such as the subgenotypes and aromaticity, also influence the codon usage variation among the genomes of csfv. this study represents the most comprehensive analysis to date of csfv codon usage patterns and provides a basic understanding of the mechanisms for codon usage bias. electronic supplementary material: the online version of this article (doi: . /s - - -z) contains supplementary material, which is available to authorized users. synonymous codons are not used randomly. rather, some codons are used more frequently than others. 
mutational pressure and translational selection were thought to be the main factors that account for codon usage variation among genes in different organisms [ ] [ ] [ ] [ ] . understanding the extent and causes of biases in codon usage is essential to the understanding of viral evolution, particularly the interplay between viruses and the immune response [ ] . however, in contrast to many organisms such as bacteria, yeast, drosophila, and mammals, where codon usage bias and nucleotide composition have been studied in great detail [ ] , the factors shaping synonymous codon usage bias and nucleotide composition in viruses, especially in animal viruses, have been studied only to a limited extent. for human rna viruses, it has been observed that codon usage bias is related to mutational pressure, g ? c content, the segmented nature of the genome and the route of transmission of the virus [ ] . for some vertebrate dna viruses, genome-wide mutational pressure, rather than natural selection for specific coding triplets, is the main determinant of codon usage [ ] . analysis of the bovine papillomavirus type (bpv ) late genes has revealed a relationship between codon usage and trna availability [ ] . in the mammalian papillomaviruses, it has been proposed that differences from the average codon usage frequencies in the host genome strongly influence both viral replication and gene expression [ ] . codon usage may play a key role in regulating latent versus productive infection in epstein-barr virus [ ] . recently, it was reported that codon usage is an important driving force in the evolution of astroviruses and small dna viruses [ , ] . clearly, studies of synonymous codon usage in viruses can reveal much about the molecular evolution of viruses or individual genes. such information would be relevant in understanding the regulation of viral gene expression. 
to date, little codon usage analysis has been performed on classical swine fever virus (csfv), which is the pathogen that causes classical swine fever (csf), an economically important and highly contagious disease of swine. although eradicated from many countries, csf continues to cause serious problems in different parts of the world [ ] . csfv is an enveloped virus with a single stranded rna genome, which contains a single open reading frame (orf) encoding a polyprotein that, following cellular and viral proteasemediated co-and post-translational processing, gives rise to - final cleavage products [ ] . studies on the phylogenetic relationship of csfvs have divided the viruses into main genotypes and subgenotypes based on sequence comparisons of nt of e sequence [ ] . based on differences in virulence, csfvs can also be divided into three clusters, namely, highly virulent strains, moderately virulent strains, and avirulent strains [ ] . recently, we have analyzed the positive selection pressure acting on the csfv envelope protein genes, e rns , e , and e , and identified several specific codons subject to diversifying positive selection in e rns and e [ ] . in order to better understand the characteristics of the csfv genome and to reveal more information about the viral genome, we have analyzed the codon usage and dinucleotide composition. in this report, we sought to address the following issues concerning codon usage in csfv: (i) the extent and causes of codon bias in csfv; (ii) the relationship between csfv genotype and codon usage; and (iii) how csfv virulence might affect codon usage. three complete genomes of csfv were previously sequenced by our laboratory (af , af , and af ) [ , ] . the other available complete cds of csfv were downloaded from genbank in march and sequences with [ % sequence identities were excluded. a total of csfv genomes [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] representing subgenotypes ( . , . , . , . , . 
, and . ) and all kinds of virulence (highly virulent strains, moderately virulent strains, and avirulent strains) were used in this study. the genotyping of csfv genomes was performed using the csfv sequence database (http://viro .tiho-hannover. de/eg/eurl_virus_db.htm) based on nt of e sequence [ ]. the serial number (sn), mononucleotide composition of each genome, genbank accession numbers, subgenotype, virulence, and other detailed information are listed in table . relative synonymous codon usage (rscu) values of each codon in each orf were used to measure the synonymous codon usage [ ]. rscu values are largely independent of amino acid composition and are particularly useful in comparing codon usage between genes, or sets of genes that differ in their size and amino acid composition. the effective number of codons (enc) was used to quantify the codon usage bias of an orf [ ], which is the best overall estimator of absolute synonymous codon usage bias [ ]. the enc values range from to . the larger the extent of codon preference in a gene, the smaller the enc value is. in an extremely biased gene where only one codon is used for each amino acid, this value would be ; in an unbiased gene, it would be . the index gc s was used to calculate the fraction of the nucleotides g + c at the synonymous third codon position (excluding met, trp, and the termination codons). similarly, gc s is the fraction of the nucleotides g + c at the synonymous first and second positions. the general average hydrophobicity (gravy) score and the frequency of aromatic amino acids (aromo) in the hypothetical translated gene product were also computed. all the indices mentioned above were calculated using the program codonw, version . . the relationships between variables and samples can be explored using multivariate statistical analysis. correspondence analysis (coa) was used to study the major trend in codon usage variation among orfs. 
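as an illustration of the rscu index and the third-position g + c index described above, the following minimal sketch computes both for a single coding sequence. it is not the codonw implementation used in the study; the function names `rscu` and `gc3s`, the use of the standard genetic code, and the toy sequence are assumptions made for the example.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def _codons(cds):
    """Split a coding sequence into complete, unambiguous codons."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3) if seq[i:i + 3] in CODON_TABLE]


def rscu(cds):
    """RSCU = observed codon count / mean count of its synonymous family.

    Met, Trp and stop codons are skipped, leaving at most 59 codons.
    Families that never occur in the sequence are omitted.
    """
    counts = Counter(_codons(cds))
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    values = {}
    for aa, family in families.items():
        if aa == "*" or len(family) == 1:
            continue
        total = sum(counts[c] for c in family)
        if total == 0:
            continue
        for c in family:
            values[c] = counts[c] * len(family) / total
    return values


def gc3s(cds):
    """Fraction of G + C at third positions of synonymous codons."""
    third = [c[2] for c in _codons(cds) if CODON_TABLE[c] not in ("*", "M", "W")]
    if not third:
        raise ValueError("no synonymous codons found")
    return sum(base in "GC" for base in third) / len(third)


toy_cds = "CCACCACCGCCT"  # four proline codons, purely illustrative
print(rscu(toy_cds))
print(gc3s(toy_cds))
```

an rscu of 1 means a codon is used as often as expected under equal usage within its family; values above or below 1 indicate over- or under-use.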
in order to minimize the effects of amino acid composition on codon usage, each orf is represented as a -dimensional vector; each dimension corresponds to the rscu value of one sense codon (excluding aug, ugg, and stop codons). major trends within this dataset can be determined using measures of relative inertia and genes ordered according to their positions along the axis of major inertia. the relative abundance of dinucleotides in the csfv orfs was assessed using the method described by karlin and burge [ ] . the odds ratio q xy = f xy /f x f y , where f x denotes the frequency of the nucleotide x and f xy the frequency of the dinucleotide xy, etc., for each dinucleotide were calculated. as a conservative criterion, for p xy [ . (or . ), the xy pair is considered to be of high (or low) relative abundance compared with a random association of mononucleotides [ ] . statistical analysis correlation analysis was carried out using spearman's rank correlation analysis method. all statistical analyses, as well as cluster analysis, were carried out using the statistical analysis software spss version . . in order to investigate the extent of codon bias in csfv, the rscu values of different codon in each orf was to investigate synonymous codon usage variation among csfv viruses, coa was implemented for all csfv orfs selected for this study. figure depicts the position of each orf on the plane defined by the first and second principal axes generated by coa on rscu values of orfs. the first principal axis accounts for . % of the total variation. the next three axes account for . %, . %, and . % of the variation, respectively. this observation indicates that although the first major axis explains a substantial amount of variation in trends in codon usage, the second major axis also has an appreciable impact on total variation in synonymous codon usage. 
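the dinucleotide odds ratio defined in the methods above (frequency of the dinucleotide xy divided by the product of the frequencies of x and y) can be sketched as follows. the function name `dinucleotide_odds` and the toy sequence are hypothetical, the single-strand form of the ratio is assumed, and the thresholds for high or low relative abundance are deliberately left out.

```python
from collections import Counter
from itertools import product


def dinucleotide_odds(seq):
    """Return the odds ratio f_xy / (f_x * f_y) for each dinucleotide xy.

    Frequencies are taken from one strand only; overlapping dinucleotides
    are counted, and any pair containing a non-ACGT symbol is ignored.
    """
    s = seq.upper().replace("U", "T")
    mono = Counter(b for b in s if b in "ACGT")
    pairs = (s[i:i + 2] for i in range(len(s) - 1))
    di = Counter(p for p in pairs if set(p) <= set("ACGT"))
    n_mono = sum(mono.values())
    n_di = sum(di.values())
    if n_di == 0:
        raise ValueError("sequence too short or contains no valid bases")
    odds = {}
    for x, y in product("ACGT", repeat=2):
        fx = mono[x] / n_mono
        fy = mono[y] / n_mono
        if fx == 0 or fy == 0:
            continue  # ratio undefined when a base is absent
        odds[x + y] = (di[x + y] / n_di) / (fx * fy)
    return odds


toy_seq = "ACGT" * 25  # illustrative repeat, not a viral sequence
ratios = dinucleotide_odds(toy_seq)
print(round(ratios["CG"], 3), ratios["AA"])
```

a ratio near 1 indicates that the dinucleotide occurs about as often as expected from the mononucleotide composition alone.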
it is worth noting that several csfv chinese c strains that can replicate efficiently in rabbits but not in swine have similar coordinates (fig. ) to two csfv riems strains, which can replicate efficiently in swine. this suggests that the host may not influence the codon usage bias between the csfv c strain and other csfv strains. in fact, our study demonstrated that a -nt insertion (cuuuuuucuuuu) at position of utr may be responsible for the characteristics of the csfv chinese c strain [ ]. mutational pressure is the main factor accounting for codon usage variation in csfv. mutational pressure and translational selection are thought to be the main factors that account for codon usage variation in different organisms [ ] [ ] [ ] [ ]. hence, in order to establish which factor can explain codon usage in csfv, the g + c content at the first and second codon positions (gc s) was first compared with that at the synonymous third position (gc s). it was found that gc s and gc s are significantly correlated (r = . , p < . ). this suggests that they are most likely the result of mutational pressure, as natural selection would be expected to act differently on different codon positions. additionally, wright [ ] suggested that the enc-plot (enc plotted against gc s) be used as part of a general strategy to investigate patterns of synonymous codon usage. genes whose codon choice is constrained only by a g + c mutation bias will lie on or just below the curve of the predicted values. as shown in fig. , all of the spots lie below the expected curve, indicating that the codon usage bias in these genomes is greatly influenced by the g + c compositional constraints. furthermore, the correlation between the first or second axis values in coa and gc s or gc s values of each strain was analyzed. 
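the expected curve of the enc-plot discussed above can be computed directly. the sketch below assumes the commonly quoted form of wright's expected-enc formula, enc = 2 + s + 29 / (s^2 + (1 - s)^2), where s is the g + c fraction at synonymous third positions; the function name `expected_enc` is hypothetical.

```python
def expected_enc(gc3s):
    """Expected ENC when codon usage is shaped only by G + C composition.

    Assumes Wright's formula: ENC = 2 + s + 29 / (s**2 + (1 - s)**2),
    with s the G + C fraction at synonymous third codon positions.
    """
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("gc3s must lie between 0 and 1")
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


# Points of the expected curve at a few illustrative composition values.
for s in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(s, round(expected_enc(s), 2))
```

plotting observed enc values against this curve shows how far each genome departs from the expectation under compositional constraint alone; points well below the curve suggest additional influences on codon choice.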
as shown in table , the first axis value in coa of each selected genome, which contains most of the variation in synonymous codon usage bias between these genomes, is closely correlated with the gc composition at the first, second, and third codon position. the second axis in the coa of each gene is also closely correlated with the gc s. this analysis indicated that most of the codon usage bias among different orfs is directly related to the nucleotide composition. therefore, the compositional constraint is the main determinant of the variation in synonymous codon usage among different csfv orfs. the relative abundance of dinucleotide and cpg suppression also shape the codon usage in csfv it has been reported that dinucleotide biases can affect codon bias. to study the possible effect of the composition of dinucleotides on codon usage in csfv, the relative abundances of the dinucleotides in the csfv genomes were calculated. as shown in table , the frequencies of occurrence for dinucleotides were not randomly distributed and no dinucleotides were present at the expected frequencies. the relative abundance of cpg showed the most marked deviation from the ''normal range'' (mean ± s.d. = . ± . ). the relative abundance of upg and cpc also showed slight deviation from the ''normal range'' (mean ± s.d. = . ± . and . ± . , respectively). among the dinucleotides, are correlated with the first axis value in coa; are correlated with the second axis value in coa (table ). these observations indicated that the composition of dinucleotides, which are independent of the overall base composition but still the result of differential mutational pressure, also determines the variation in synonymous codon usage among different csfv orfs. in the rest four cpc containing codons for proline, cca (mean . ) is markedly over-used; ccg (mean . ), which also is a cpg containing codon, is slightly suppressed; ccu (mean . ) and ccc (mean . ) are almost equally used. 
the effect of selection pressure on codon usage. as shown in fig. , the majority of the actual enc values are slightly lower than the expected enc values. this implies that although codon bias is mainly explained by mutational pressure, there are other factors, with less of an effect, that also influence the codon bias. to test whether any selection pressure contributes to the codon usage variation between these csfvs, we performed a correlation analysis between axis values in coa and aromaticity or gravy score of each polyprotein. it was found that both axis and axis are significantly correlated with the aromaticity score (r = - . , p < . , r = . , p < . , respectively), indicating that the frequency of aromatic amino acids (phe, tyr, trp) in the hypothetical translated gene product of each orf is also related to the observed variation in codon bias. no significant relationship was found between axis values in coa and gravy using spearman's correlation (table ). beyond the factors mentioned above, we were also concerned with how csfv genotype and virulence might affect codon usage. based on the variation in rscu values among the csfv genomes, a cluster tree was generated by the hierarchical clustering method. as shown in fig. , these csfv genomes were divided into sublineages. sublineages i- and i- contain all subgenotype . strains, and sublineage i- contains almost all avirulent strains in genotype . . sublineages i- , ii- , ii- , ii- , and ii- contain the subgenotypes . , . , . , . , and . , respectively. it should be noted that the distance between sublineages ii- and ii- is smaller than the distance between sublineages ii- and ii- (fig. ). since sublineages ii- and ii- contain the subgenotypes . and . , respectively, which, in turn, belong to genotype , the distance between the two sublineages is smaller than the distance between sublineage ii- and sublineage ii- (which contains the subgenotype . ). 
this may be because of the special characteristics of strain in subgenotype . (see discussion). mean values of csfvs relative dinucleotide ratios ± s.d table summary of correlation analysis between the first two axes in coa and sixteen dinucleotides in the selected viruses , indicating that the overall extent of codon usage bias in csfv genomes is low. in fact, jenkins et al. [ ] have previously reported that the overall extent of codon usage bias in rna viruses is low with an average enc value close to . nevertheless, we still wished to determine the factors that constrain codon usage in csfv. according to the selection-mutation-drift model [ , ] , mutational pressure and translational selection are generally thought to be the main factors that account for codon usage variation between genes in different organisms [ ] [ ] [ ] [ ] . in our study, the general correlation between codon usage bias and base composition we observed suggests that mutational pressure is the main factor that determines codon usage bias in csfv; this conclusion is also supported by the highly significant correlation between gc s and gc s (r = . , p \ . ), and the result of enc-plot (fig. ). since mutation rates in rna viruses are much higher than those in dna viruses [ ] , it is understandable that mutational pressure is the major cause of codon usage bias in the csfv strains included in this study. the majority of the actual enc values are slightly lower than the expected enc values (fig. ) , indicating that there are other factors, albeit with smaller effects, that also influence codon bias. we then asked how csfv genotype and virulence might affect codon usage. our cluster analysis revealed that the csfv genotype also constrains codon usage, since different csfv strains with the same genotype were clustered together with only one exception, csfv strain (fig. ) . csfv strain (af ) was, however, postulated to be a recombinant virus by he et al. [ ] . 
to date phylogenetic analyses have been performed largely on one or three genomic regions but not the complete genome, which might limit it to genotype recombinant viruses. on the other hand, our rscu-based cluster was based on the complete cds of each virus. therefore, it is expected that differences will arise between phylogenetic analyses of recombinant viruses using the two different clustering methods. our results suggest that csfv strain might indeed be a recombinant virus and also raised interesting questions about csfv evolution and the relative contribution of intertypic recombination to the generation of csfv genetic diversity. furthermore, our results indicate that virulence is not significantly influenced by codon bias, since not all avirulent strains were clustered together. although of the avirulent strains of subgenotype . were clustered together ( fig. subgenotype . b) , the other avirulent strains were clustered with highly virulent strains, and moderately virulent strains were also not clustered together (fig. ) . at present, however, only small numbers of complete cds of csfv are available, and these only six cover subgenotypes. clearly, more complete sequences are needed to allow us to make more precise judgments. due to a previous report about cpg under-representation in rna and small dna viruses [ ] , we wanted to determine if the relative abundances of dinucleotides in csfv affects codon usage. the frequencies of occurrence for dinucleotides were not randomly distributed and no dinucleotides were present at the expected frequencies ( table ). the general correlation between the axis values in coa and the relative dinucleotide abundances (table ) suggests that codon usage in csfv can also be strongly influenced by underlying biases in dinucleotide frequencies. as a case in point, all cpg containing codons are markedly suppressed. the marked cpg deficiency is a common phenomenon in small eukaryotic viruses [ , ] . 
the cpg deficiency was proposed to be related to the immunostimulatory properties of unmethylated cpgs, which are recognized by the host's innate immune system as a pathogen signature [ , ]. indeed, unmethylated cpg motifs in dna sequences can be recognized by tlr [ ], and unmethylated cpg motifs in ssrna may stimulate monocytes through a novel mechanism [ ]. this notion was further supported by the fact that cpg is not suppressed in the genomes of most large viruses [ , ] because they might encode a range of proteins that interfere with cellular pathogen recognition. as a case in point, vaccinia poxvirus encodes agonists of tlrs [ ]. in csfv, ruggli et al. and our group have shown that n pro and e rns proteins can prevent both poly(ic)- and ndv-mediated ifn-α/β induction [ ] [ ] [ ] [ ]. inhibition by n pro protein is thought to involve an inactivation of interferon regulatory transcription factor (irf- ) [ ]. however, no evidence has been found to support the notion that n pro and e rns proteins interfere with the recognition of unmethylated cpg motifs in ssrna. it is likely that the codon usage bias in csfv is also related to the selective forces of its host's innate immune system. taken together, our study reveals that codon usage bias in csfv is slight, and mutational pressure is the main factor that affects codon usage variation in csfv. other factors, such as dinucleotide composition, genotype, aromaticity, and even innate immune selective forces also significantly influence codon usage bias. however, due to a lack of sequence data and detailed information about these isolates, it is currently impossible to perform an exhaustive analysis of csfv codon usage. clearly, a more comprehensive analysis is needed, based on more available data, to reveal more about the viral genome. to our knowledge, this work is the first report of codon usage analysis in csfv, and it provides a basic understanding of the mechanisms that give rise to codon usage bias. 
the results we have reported are also useful in understanding the processes involved in csfv evolution. key: cord- - kxgu t authors: oem, jae-ku; an, dong-jun title: phylogenetic analysis of bovine astrovirus in korean cattle date: - - journal: virus genes doi: . /s - - - sha: doc_id: cord_uid: kxgu t bovine astrovirus (bastv) belongs to a genetically divergent lineage within the genus mamastrovirus. the present study showed that bastv was associated with the gastroenteric tracts of cattle in nine positive fecal samples from cattle, whereas no positive samples were found in the brain tissues of downer cattle. interestingly, the positive diarrheal samples were obtained mainly from calves aged days– months. bayesian inference tree analysis of the partial orf ab and capsid (orf ) gene sequences of bastvs identified four divergent groups. eleven bastvs, four porcine astroviruses, and two deer astroviruses (dastvs; ccastv- and - ) belonged to group ; group contained two bastvs (bastk – and bastk – ) with another two in group (bastk – and bastk – ); and group comprised the bastv-neuros strain derived from a cattle brain tissue sample and an ovine astrovirus. the same divergent groups were obtained when the pairwise alignments were produced using both amino acid and nucleotide sequences. the korean bastvs isolated from infected cattle had a nationwide distribution and they belonged to groups , , and . electronic supplementary material: the online version of this article (doi: . /s - - - ) contains supplementary material, which is available to authorized users. astroviruses are single-stranded positive-sense rna viruses that measure approximately . - . kb in length. the family astroviridae comprises two genera: mamastrovirus infects mammals and avastrovirus infects birds [ ] . 
human astrovirus was first reported in children with diarrhea in [ ] and mamastroviruses were found subsequently in a variety of wild hosts, including sheep, cow, pig, dog, cat, red deer, mouse, mink, bat, cheetah, brown rat, roe deer, sea lion, dolphin, and rabbit [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ]. bovine astrovirus (bastv) was one of the first astroviruses [ ] to be discovered and it has been isolated in the usa and the uk. two bastv serotypes were established based on the results of a virus neutralization assay [ ]. the genomic characterization and sequence analysis of astroviruses in bovine fecal specimens collected in hong kong provided evidence of potential recombination in orf [ ]. recently, the complete genome of a novel bastv associated with neurological disease in cattle was sequenced, i.e., boastv-neuros , which was phylogenetically related to an ovine astrovirus (oastv). a previous study suggested that boastv-neuros infection was a potential cause of neurological disease in cattle [ ]. however, genetically diverse lineages of bastvs have not been identified in many countries because there have been few studies of astroviruses derived from cattle. thus, the present study investigated the genetic groupings of korean bastvs and examined their relationships with the age of cattle infected with bastvs. in total, fecal samples were collected from cattle with certain or suspected diarrheal disease at cattle farms throughout korea between january and december . the cattle comprised calves aged < days and cattle aged > days. based on the fecal condition, samples were from animals with diarrhea and from nondiarrheic animals. the cattle comprised korean cattle and holstein cattle. nonambulatory cattle, which are commonly referred to as ''downer'' cattle, are unable to stand or walk. cattle brain tissue samples with a histopathological diagnosis of encephalitis were also collected from downer cattle between and . 
out of brain samples collected from downer cattle, were found to be positive for akabane virus and two were bovine viral diarrhea virus (bvdv) whereas no pathogenic agent for encephalitis was detected from the other. viral rna was extracted from the feces using trizol ls b , according to the manufacturer's instructions. bastv was detected in fecal specimens by rt-pcr using a specific primer set for the orf ab and orf regions of bastv (bastv-f, -gtgtttggcatgtgggtyaarcc- and bastv-r: -rtcvyybktggtggt- ), which were designed based on known strains deposited in genbank (accession no. hq -hq ). the rt-pcr process amplified a -nt long fragment at °c for min, °c for min, °c for s, °c for s, and °c for min, followed by cycles using virus-specific conditions. the bastv associated with neurological disease in cattle was also detected in brain tissue specimens by rt-pcr, as described previously [ ] . products with the expected size were cloned using the pgem-t vector system ii tm (promega, cat. no. a , usa). the cloned gene was sequenced with t and sp sequencing primers using an abi prism Ò xi dna sequencer at the macrogen institute (macrogen co. ltd). the sequences of all the bastv-positive samples were submitted to genbank under accession numbers kf -kf . nine of fecal samples from korean cattle were positive for bastv and all of the bastvs were related to diarrhea. however, bastv was not detected in cattle brain tissue samples. although bastv was first reported in england in [ ] , the association between bovine astroviruses and gastroenteric diseases in cattle is still not clear. a recent study reported that bastv is not associated directly with severe diarrheic disease in calves under natural conditions [ , , , ] . in the present study, nine korean bastvs were associated with clinical diarrhea in cattle where calves aged \ month accounted for . % of cases (table ) . 
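the bastv-f and bastv-r primers quoted above contain iupac degenerate bases (y, r, v, b, k), so each primer stands for a pool of concrete oligonucleotides. the following is a minimal sketch, not the authors' tooling, of how the pool size can be counted and the variants enumerated. the function names `degeneracy` and `expand_degenerate` are hypothetical, and the primer strings are copied as they appear in this text, which may have lost characters during extraction.

```python
from itertools import product

# standard iupac nucleotide ambiguity codes (lowercase, dna alphabet)
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at",
    "k": "gt", "m": "ac", "b": "cgt", "d": "agt",
    "h": "act", "v": "acg", "n": "acgt",
}


def degeneracy(primer):
    """Number of distinct concrete sequences a degenerate primer represents."""
    count = 1
    for base in primer.lower():
        count *= len(IUPAC[base])
    return count


def expand_degenerate(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.lower()]
    return ["".join(combo) for combo in product(*choices)]


# primer strings as printed in the text above (assumed complete)
forward = "gtgtttggcatgtgggtyaarcc"
reverse = "rtcvyybktggtggt"

if __name__ == "__main__":
    print("forward variants:", degeneracy(forward))
    print("reverse variants:", degeneracy(reverse))
    for seq in expand_degenerate(forward):
        print(seq)
```

counting before expanding is a deliberate choice: a highly degenerate primer can encode a large pool, and the count shows whether full enumeration is worthwhile.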
a previous study showed that bastvs were excreted by - % of calves on farms [ ] while a recent study of rectal swab samples from asymptomatic adult cattle showed that only . % ( / ) contained bastv [ ] . this difference suggests that bastvs are more frequent in young calves than in adult cattle. to investigate the relationships between astroviruses and other bovine viruses that cause diarrhea in cattle, a screening test was conducted using specific primers for the detection of bovine rotavirus (brv) [ ] , bovine coronavirus (bcv) [ ] , bvdv [ ] , and bovine kobuvirus (bkv) [ ] , as described previously. co-infections with other viruses were associated with the clinical symptoms of diarrhea in only two cases: the bastk / strain derived from a -day-old calf was co-infected with brv and the bastk / strain from a -day-old calf was co-infected with both brv and bvdv (table ) . although the association between bkv and diarrhea or gastroenteritis is unclear, bkv was detected as a co-infection with six of the korean bastvs, the exceptions being bastk / , bastk / , and bastk / (table ) . in cattle, two astrovirus serotypes have been recognized based on serological investigations, i.e., boastv- and boastv- infections [ ] , and recent phylogenetic analyses support the classification of bastvs and the newly discovered astroviruses in roe deer (ccastv) under the proposed mamastrovirus genocluster gi [ , ] . all of the astrovirus sequences were aligned using the clustal x alignment program [ ] . the nucleotide sequences were translated and the shared nucleotide and amino acid sequence identities among the astrovirus strains were calculated using bioedit . [ ] . the analysis of the diversity of bastvs in the present study identified four divergent groups (table ) . bayesian trees were generated with mrbayes . . [ , ] using best-fit models, which were selected with mrmodeltest . [ ] for nucleotide sequences and prottest . [ ] for amino acid sequences.
markov chain monte carlo analyses were run using , , generations for each nucleotide and amino acid sequence. the best-fit model of the orf ab nucleotide sequence selected by mrmodeltest . software was trnef?g, according to the results of a hierarchical likelihood ratio test. the likelihood parameter was set to nst = and rate = gamma for the datasets, and the gamma distribution shape parameter was . . the substitution model rmat was . and - [ ] . recently, the boastv-neuros strain was detected in the brain tissues of cattle and the analysis of its genetic diversity showed that it was most closely related to the oastv prototype, which was identified in [ ] , whereas it was phylogenetically distant from a recently reported oastv [ ] and the hong kong bastvs [ ] . this suggests the occurrence of multiple cross-species transmission events among hosts and other animal species. however, it appears that the histopathogenic findings of encephalitis in korean downer cows were not associated with the detection of boastv-neuros in brain tissue. the bi analysis of the partial orf ab and/or orf genes also showed that all of the known bastvs could be separated into four groups (fig a, b) , in the same way as the diversity analysis. group of the bi tree contained six hong kong bastvs and five korean bastvs, groups and included only korean bastvs, and the boastv-neuros strain was the only member of group . in conclusion, the present study identified four bastvs groups based on the phylogenetic analysis and their shared pairwise amino acid sequence identities. the bastv detection rate in cattle feces was higher in calves aged \ month compared with adult cattle. thus, continuous surveillance of novel diversity in bastvs should be conducted on many cattle farms throughout the world because of the risk of emerging astroviruses associated with neurological disease in cattle. 
[figure: bayesian inference trees (fig a, b); tip labels include porcine (pastk, poastv, pastv), bovine (b /hk), roe deer (ccastv), wild boar (wbastv) and human astrovirus sequences with genbank accession numbers]
acknowledgments we are grateful to dr. soo-kyung joo for technical assistance.
key: cord- - q ic ye authors: zhang, jianqiang; yim-im, wannarat; chen, qi; zheng, ying; schumacher, loni; huang, haiyan; gauger, phillip; harmon, karen; li, ganwu title: identification of porcine epidemic diarrhea virus variant with a large spike gene deletion from a clinical swine sample in the united states date: - - journal: virus genes doi: . /s - - - sha: doc_id: cord_uid: q ic ye two genetically different porcine epidemic diarrhea virus (pedv) strains have been identified in the usa: us prototype (also called non-s indel) and s indel pedvs. in february , a pedv variant (usa/ok - / ) was identified in a rectal swab from a sow farm in oklahoma, usa. complete genome sequence analyses indicated this pedv variant was genetically similar to us non-s indel strain but had a continuous -nt ( -aa) deletion in the n-terminal domain of the spike gene compared to non-s indel pedvs. this is the first report of detecting pedv bearing large spike gene deletion in clinical swine samples in the usa. porcine epidemic diarrhea virus (pedv) is the causative agent of porcine epidemic diarrhea (ped) that was first recorded in europe in the s [ , ] . pedv spread to asia during the s and s and became endemic in pigs in asian countries [ ] . in , a severe ped outbreak occurred in china characterized by high morbidity in pigs of all ages and high mortality in neonatal piglets [ , ] .
in , ped outbreaks were reported for the first time in the usa [ ] and caused substantial economic losses [ ] . subsequently, us-like pedvs were identified in other american countries and also emerged or re-emerged in some asian and european countries [ ] . global pedvs exhibit significant genetic diversities. recently, lin et al. [ ] proposed to categorize global pedv strains into classical, s indel, emerging north american non-s indel, and emerging asian non-s indel strains. in the usa, at least two genetically different pedv strains have been identified: the highly virulent pedv first identified in april associated with severe ped outbreaks was referred to as 'us prototype' or 'us original' or 'non-s indel' strain [ , ] ; a clinically milder pedv variant identified in the usa in january which was different from the original highly virulent pedv strains, as reflected by insertions and deletions in the spike (s) gene, was designated as 's indel' pedv [ , ] . in this case report, we describe, for the first time, identification of a pedv variant with a large spike gene n-terminal domain deletion from a clinical swine sample in the usa. at the iowa state university veterinary diagnostic laboratory (isu vdl), a nucleocapsid (n) gene-based real-time rt-pcr (rrt-pcr) is routinely used for the screening detection of pedv from clinical specimens [ ] [ ] [ ] . if positive, a spike gene-based multiplex rrt-pcr can be further used to differentiate non-s indel from s indel pedv strains. in february , rectal swabs collected from a edited by juergen a richt. genbank accession numbers: the complete genome sequences of two porcine epidemic diarrhea viruses described in this study have been deposited in genbank under accessions mg and mg . sow farm in oklahoma, usa, were submitted to the isu vdl for pedv pcr testing. the samples were positive for pedv by the n gene-based rrt-pcr. 
subsequent pedv s gene-based differential rrt-pcr revealed that these samples were negative for s indel pedv (c t > ) but positive for non-s indel pedv. generally, the pedv s gene-based differential rrt-pcr gave - c t higher than the n genebased rrt-pcr on the same sample. however, the sample # gave unexpected results: strong positive by n gene-based rrt-pcr (c t . ) but weak positive for non-s indel pedv by the differential rrt-pcr (c t . ). to determine the possible reasons for this observation, the sample # and another control sample # (c t . by n gene-based rrt-pcr and c t . for non-s indel by the s gene-based differential rrt-pcr) were sequenced using next-generation sequencing technology following previously described procedures [ , ] . the pedv in the sample # (usa/ ok - / ) and the sample # (usa/ok - / ) had whole genome sequences of , and , nucleotides in length, respectively. the sequences of these two pedvs have been deposited into genbank (mg and mg ). phylogenetic analyses based on the whole genome sequences and the spike gene indicated that both ok - and ok - belong to the us non-s indel cluster (fig. ) . however, compared to the ok - and other non-s indel pedv strains, the ok - pedv had a large continuous deletion of -nt ( -aa) in the spike gene/protein (nt ∆ - ; aa ∆ - ; fig. ). the remaining genome of the ok - pedv, other than the s deletion region, had approximately . % nt identity to other non-s indel pedv strains. a gel-based rt-pcr [ ] was used to differentiate the ok - pedv from non-s indel pedv. twenty more samples were collected from the same farm; all of them contained non-s indel pedv but none of them contained ok - -like pedv, indicating the prevalence of ok - -like pedv in swine populations may be very low. virus isolation attempts on the sample # in vero cells (atcc ccl- ) were unsuccessful. 
the remaining sample # ( µl diluted in µl culture medium) was orally inoculated into two -day-old pedv-negative piglets ( ml/pig) but did not result in active infection. pedv spike (s) protein is a type i membrane glycoprotein with a signal peptide (amino acid residues - ), a large [ ] . the s protein assembles into homotrimers that form the clubshaped projections (spikes) on the virion surface. pedv s protein has multiple functions including ( ) mediating receptor binding through its s subunit (aa - ) and fusion of the viral and cellular membranes during cell entry through its s subunit (aa - ); ( ) harboring neutralization epitopes. specifically, the n-terminal domain (aa - ) exhibits sialic acid binding activity; the receptor-binding domain (aa - ) is believed to interact with a protein receptor; and a fusion peptide domain (aa - ) mediates virus-cell membrane fusion during cell entry [ ] . neutralization epitopes have been reported within the amino acid residues - , - , - , and - [ ] [ ] [ ] [ ] . the aminopeptidase n protein (apn) serves as a receptor for several alphacoronaviruses such as canine coronavirus type ii, feline coronavirus type ii, transmissible gastroenteritis virus (tgev), porcine respiratory coronavirus (prcv), and human coronavirus e [ ] . porcine apn was considered to be the putative receptor of pedv with some supporting evidence [ ] [ ] [ ] [ ] ; however, some recent studies indicate that porcine apn may not be a functional receptor for pedv [ , ] . the n-terminal domain of pedv s protein is one of the most variable regions in the pedv genome. the insertions and deletions of s indel pedv strains and the large deletion (aa ∆ - ) of the pedv variant (ok - ) identified in this study are all located in the n-terminal domain region. it is predicted that deletion of -aa at this region would not interfere with either the protein receptor binding or the neutralization epitopes - , - , and - . 
-d structural analyses of the s protein also suggest that this -aa deletion may not interfere with trimer formation. however, the neutralization epitope within residues - and the sialic acid binding activity of the virus may be affected by this aa deletion. in fact, some studies have shown that pedv strains having variations in the n-terminal domain of s protein exhibited different sialic acid binding activities [ , ] . it remains to be determined whether the activity of sialic acid binding by the s protein affects virus entry into cells, replication in cells, and pathogenicity in pigs. in addition, -aa deletion in the ok - pedv variant may affect virus virulence and pathogenicity. construction of a recombinant pedv carrying -aa deletion in this it was previously reported [ ] that a cell cultureadapted us pedv isolate tc-pc -p contained nt ( -aa) deletion in the s protein (aa ∆ - ) but such deletions were not present in the original clinical sample oh/pc / (fig. ) . a japanese pedv strain tottori / , identified in a clinical sample, contained -nt ( -aa) deletion in the s protein (aa ∆ - ) [ ] . a korean pedv strain mf / , identified in a clinical sample, contained a -nt ( -aa) deletion in the s protein but in a different location (aa ∆ - ) [ ] . a recent study reported the coexistence of pedv with a large s gene deletion and pedv with intact s gene in domestic pigs in japan [ ] . it has been demonstrated that the usa/ tc-pc -p and jpn/tottori / isolates harboring a large s gene deletion are less virulent than non-s indel pedvs in experimentally inoculated pigs [ , , ] . in terms of tgev, a large ( -aa) deletion in the spike gene changed the viral tropism from intestinal to respiratory and this tgev mutant was later renamed as prcv [ ] . in contrast, the pedv variant tc-pc -p with large s gene deletion did not change intestinal tropism [ , ] . 
in summary, a new pedv variant strain (usa/ ok - / ) belonging to the non-s indel cluster but with a -nt deletion ( -aa deletion) in the n-terminal domain of the s gene was identified in this study. this is the first report of a pedv strain with a large deletion in the s gene identified in clinical swine samples in the usa. this pedv with large s gene deletion was present on the same farm where non-s indel pedv with intact s gene was detected but it appeared that the prevalence of ok - -like pedv in swine populations may be low. additional molecular epidemiological studies are needed to monitor the emergence of novel pedv variants and determine their prevalence levels in us swine. pig farm acknowledgement this study was supported by the iowa state uni- key: cord- -h ftslps authors: junwei, ge; baoxian, li; lijie, tang; yijing, li title: cloning and sequence analysis of the n gene of porcine epidemic diarrhea virus ljb/ date: - - journal: virus genes doi: . /s - - -z sha: doc_id: cord_uid: h ftslps the nucleocapsid (n) gene of the porcine epidemic diarrhea virus (pedv) strain ljb/ which was previously isolated in heilongjiang province, china, was cloned, sequenced and compared with published sequences of other avian and mammalian coronavirus. the nucleotide sequence encoding the entire n gene open reading frame (orf) of ljb/ was bases long and encoded a protein of amino acids with predicted mr of kda. it consisted of adenines ( . %), cytosines ( . %), guanines ( . %) and thymines ( . %) residues. sequence comparison with other pedv strains selected from genbank revealed that the ljb/ n gene has a high sequence homology to those of other pedv isolates, . % with js , . % with chinju , . % with br / , and . % with cv . the encoded protein shared . % amino acid identities compared with cv , . % with brl/ , % with js , . % with chinju , respectively. 
the amino acid sequence contained seven potential protein kinase c phosphorylation sites, nine casein kinase ii phosphorylation sites, one tyrosine kinase phosphorylation site, and two camp- and cgmp-dependent protein kinase phosphorylation sites. porcine epidemic diarrhea virus (pedv) is classified as a member of the coronaviridae and causes acute enteritis in pigs [ ] . the disease was first reported in england in and has since been reported in many countries, including germany, canada, japan, korea, france, belgium, and switzerland [ ] [ ] [ ] [ ] . it was first reported in china in [ ] , where it caused serious economic losses due to the death of neonatal piglets and weight loss in infected pigs. clinical signs of ped include anorexia, vomiting, diarrhea, and dehydration. morbidity and mortality in infected neonatal piglets less than days old approach % because of severe diarrhea and dehydration. however, mortality in infected piglets older than days is less than % [ ] . the coronavirus genome consists of a positive-sense, single-stranded rna molecule that is - kb in size [ ] . virions are enveloped, pleomorphic, and - nm in diameter, and they have club-shaped peplomers approximately nm in length. coronaviruses possess four major structural proteins: a phosphorylated nucleocapsid (n) protein and three envelope proteins, namely the membrane protein (m), spike protein (s), and envelope protein (e). the m and s proteins are the major envelope proteins, while the amount of e protein in the virion is low [ ] [ ] [ ] . the s glycoprotein makes up the large surface projections of the virion, and the m and e proteins are essential for viral envelope formation and release [ , ] . studies indicate that the n proteins of coronaviruses are extensively phosphorylated and highly basic, and that they bind to the viral genomic rna to form a helical ribonucleoprotein (rnp) [ ] .
a variety of functional activities have been ascribed to the n proteins of previously known coronaviruses, including participation in transcription of the viral genome, formation of the viral core, and packaging of viral rna [ ] . the n protein is highly immunogenic; moreover, the cellular immune response against the n protein of some animal coronaviruses can enhance recovery from the virus infection [ , ] . the n protein can accumulate intracellularly even before it is packaged into the mature virus [ ] and is the most abundant virus-derived protein throughout the infection, probably because its template mrna is the most abundant subgenomic rna [ ] . these features make it a suitable candidate for accurate and early diagnosis and for the development of genetically engineered vaccines [ ] . the aim of the present study was to determine the complete nucleotide sequence of the pedv n gene and to obtain more information about pedv isolates from different regions. in this study, the rna of pedv was extracted directly from the fecal samples of piglets naturally infected with pedv ljb/ . the n gene was cloned, sequenced and compared with those of other pedv strains. these data are useful for further study of the molecular biology of pedv strains that are prevalent in china. virus strain pedv ljb/ was collected from the feces of piglets suffering from severe diarrhea in heilongjiang, china. the fecal sample was processed following the methods of fan jinghui and li yijing [ ] . the fecal sample was diluted - in a disruption buffer ( mm tris-hcl [ph . ], % (w/v) pvp- , % (w/v) peg , mm nacl, . % (v/v) tween ), vortexed, incubated at room temperature for min, and centrifuged using a beckman f rotor at × g at °c for min. the supernatant was removed and used for the extraction of the viral rna using the trizol reagent (invitrogen usa) according to the manufacturer's protocol, and the rna was dissolved in diethyl pyrocarbonate-treated distilled water.
a pair of sense and antisense primer was designed and aligned based on nucleotide sequences of the n gene of cv and brl/ available in genbank. the sense primer ¢-ttatggcttctgtcagcttt- ¢ and antisense primer ¢-acattgtttaatttcctgtatc- ¢ were used to amplify the n gene coding for the n protein of pedv strain ljb/ . synthesis of the first-strand cdna for n gene was carried out by reverse transcription using promega reverse transcription reagent. the viral rna ( ll) was mixed with . ll of pm of the antisense primer, incubated at °c for min, and then placed on ice for min. after that, ll of · rt buffer, ll of . mm dntp mixture, ll of rnase inhibitor ( u/ll), ll of reverse transcriptase ( u/ll), . ll h o was added and mixed gently. the reaction mixture was incubated for min at °c, and was terminated by heating for min at °c. rnase h ( ll) was added to degrade rna template for min at °c prior to pcr amplification. pcr was carried out in a ll volume by mixing the cdna above with . ll of each pm sense and antisense, mm each of datp, dgtp, dttp, dctp, ll of · pcr buffer ( mm tris-hcl, . mm mgc , mm kc , ph . ), and . u taq dna polymerase (takara biotechnology (dalian) co. ltd.). cycles were as follows: °c for s, followed by cycles of °c for s, °c annealing for s, °c extension for min and a final extension of °c for min. the pcr product was analyzed by electrophoresis through an agarose gel (fig. ) , and visualized by staining with ethidium bromide, the target cdna band was extracted from the gel using the qiagen Ò gel extraction kit according to the manufacturer's instructions. the purified pcr products were cloned into the pgem-t easy vector (promega, madison, usa) with t dna ligase. the plasmids were transformed into e. coli dh a using standard molecular technique. plasmid dna was extracted by alkaline-lysis from e. coli dh a culture and verified by using restriction enzyme digestion, pcr and electrophoresis in % agarose (fig. ) . 
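as a quick sanity check on the primer pair quoted above, the sketch below computes length, g+c percentage, a rough melting temperature, and the reverse complement of each primer. it is an illustration only and not part of the authors' protocol: the function names are hypothetical, the primer strings are copied as printed in this text, and the melting temperature uses the wallace rule, tm = 2(a+t) + 4(g+c), which is only a coarse estimate for short oligonucleotides.

```python
# complement table for the four dna bases (lowercase)
COMPLEMENT = {"a": "t", "t": "a", "g": "c", "c": "g"}


def reverse_complement(seq):
    """Return the reverse complement of a dna sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.lower()))


def gc_percent(seq):
    """Percentage of g and c bases in the sequence."""
    seq = seq.lower()
    return 100.0 * (seq.count("g") + seq.count("c")) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature in degrees c by the wallace rule."""
    seq = seq.lower()
    at = seq.count("a") + seq.count("t")
    gc = seq.count("g") + seq.count("c")
    return 2 * at + 4 * gc


# primer sequences as printed in the text above (assumed complete)
sense = "ttatggcttctgtcagcttt"
antisense = "acattgtttaatttcctgtatc"

if __name__ == "__main__":
    for name, primer in (("sense", sense), ("antisense", antisense)):
        print(name, len(primer), round(gc_percent(primer), 1),
              wallace_tm(primer), reverse_complement(primer))
```

the reverse complement of the antisense primer is the sequence expected on the coding strand at the far end of the amplicon, which is useful when locating the primer in an alignment.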
the recombinant plasmid carrying an insert of the correct size was named pgem-t-n, and at least three independent plasmid clones were analyzed, confirmed and sequenced. the nucleotide sequence of the n gene of ljb/ was determined by takara biotechnology (dalian) co. ltd. amino acid sequences were aligned using the clustal w method, and phylogenetic trees were constructed using the neighbor-joining method. analyses were done using the megalign application of the lasergene software package. sequence motifs were identified with the psi-blast program using the swiss-prot database through the myhits web server (http://myhits.isb-sib.ch). the nucleocapsid gene was successfully amplified by rt-pcr. the pcr products were approximately . kb in size and were cloned into the pgem-t easy vector. the complete nucleotide sequence of the nucleocapsid gene has been deposited in genbank under accession number dq . sequence analysis indicated that the complete open reading frame (orf) of the nucleocapsid gene of pedv ljb/ consists of bases and codes for a basic protein of amino acids. it consisted of adenines ( . %), cytosines ( . %), guanines ( . %) and thymines ( . %), with a g+c content of . %. the motif search indicated that the ljb/ n protein had seven potential protein kinase c phosphorylation sites, nine casein kinase ii phosphorylation sites, one tyrosine kinase phosphorylation site, and two camp- and cgmp-dependent protein kinase phosphorylation sites. the gene had nucleotide mismatches compared to cv ; a substantial portion ( %, / ) of the substitutions were transversions, and about % of the substitutions were non-synonymous mutations. table shows that the percent similarity of the n nucleotide sequences varied from . % to . % between ljb/ and the other four strains of pedv, and a high degree of identity ( . - . %) was observed between the nucleotide sequences of pedv strains.
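the base composition, g+c content, percent identity, and transition/transversion counts reported above can all be derived from the sequences themselves. the following is a minimal sketch under stated assumptions: the two sequences are already aligned without gaps (the text reports no insertions or deletions), the function names are hypothetical, and the short sequences in the usage block are toy data, not the ljb/ or cv n genes.

```python
from collections import Counter

PURINES = {"a", "g"}
PYRIMIDINES = {"c", "t"}


def base_composition(seq):
    """Percentage of each of a, c, g, t among the unambiguous bases."""
    counts = Counter(base for base in seq.lower() if base in "acgt")
    total = sum(counts.values())
    return {base: 100.0 * counts[base] / total for base in "acgt"}


def gc_content(seq):
    """g+c content in percent."""
    comp = base_composition(seq)
    return comp["g"] + comp["c"]


def compare_aligned(ref, query):
    """Count transitions and transversions between two gap-free aligned sequences."""
    if len(ref) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for x, y in zip(ref.lower(), query.lower()):
        if x == y:
            continue
        # purine<->purine or pyrimidine<->pyrimidine is a transition
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    mismatches = transitions + transversions
    return {
        "mismatches": mismatches,
        "transitions": transitions,
        "transversions": transversions,
        "identity": 100.0 * (len(ref) - mismatches) / len(ref),
    }


if __name__ == "__main__":
    # toy sequences for illustration only
    result = compare_aligned("atggcttctg", "atggcctcag")
    print(result)
    print(base_composition("aacg"), gc_content("aacg"))
```

classifying non-synonymous substitutions would additionally require translating both sequences codon by codon, which is omitted here to keep the sketch short.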
the alignment of the nucleotide sequences shows that no deletion or insertion event was detected, and there are large regions of absolute identity, such as the region from nucleotide to nucleotide ( - bases). the entire nucleocapsid protein of pedv ljb/ was aligned with the published sequences of cv , brl/ , chinju and js . this alignment indicates that overall the sequences are highly conserved, with some regions showing no variation at all, and the nucleotide substitutions in the ′ region ( - bases) did not cause amino acid changes, which may suggest that the n-terminal region of the protein is more conserved than the c-terminal region. two-way comparisons among the nucleocapsid proteins of these five strains of pedv indicate that the identities range from . % to . %, with cv and brl/ having the highest identity, and ljb/ and chinju the lowest. a phylogenetic tree was prepared to further examine relationships between pedv and other coronaviruses based on a comparison of n protein amino acid sequences (fig. ) . phylogenetic analysis showed that pedv was more closely related to group (tgev, hcv e and fipv) than to members of group korea are more closely related to each other than they are to the two european isolates cv and brl/ . in the present study, the n gene of ljb/ was cloned and sequenced. the resulting sequence revealed that the n gene has an orf of nucleotides coding for a protein of amino acids. sequence comparison with other pedv strains selected from genbank indicated that the n gene of pedv is highly conserved even among isolates from different geographic regions, and the alignment showed that there are some regions of absolute identity in the sequences.
previous studies showed that the chinju n protein had potential t- or s-linked phosphorylation sites and seven potential casein kinase ii phosphorylation sites. the results of this study indicated that the ljb/ n protein had seven potential protein kinase c phosphorylation sites, nine casein kinase ii phosphorylation sites, one tyrosine kinase phosphorylation site, and two camp- and cgmp-dependent protein kinase phosphorylation sites. the entire nucleocapsid protein of pedv ljb/ was aligned with the published sequences of cv , brl/ , chinju and js . this alignment of nucleocapsid protein sequences indicates that overall the sequences are highly conserved, with some regions showing no variation at all. this information may be useful for the development of genetically engineered n protein for vaccines to prevent pedv infections. shuichi et al. developed a pcr method for detecting pedv based on part of the nucleocapsid nucleotide sequence and then compared this region among strains of the virus. restriction analysis of the pcr products showed that cv and all the korean strains could be digested with dra i and ecor i, but the korean strains were not digested with pst i. we found that the n genes of ljb/ and js (another chinese isolate) have the same restriction patterns as the korean strains [ ] . coronaviruses have been subdivided into three major antigenic groups based on antigenic differences identified by serological analyses and nucleotide sequence analyses [ , ] . group i members are the porcine transmissible gastroenteritis virus (tgev) and epidemic diarrhea virus (pedv), feline and canine coronavirus (fcov and ccov), and human coronavirus e (hcov- e). group ii includes porcine hemagglutinating encephalomyelitis virus (hev), murine hepatitis virus (mhv), bovine, equine, and rat coronavirus (bcov, ecov, and rtcov), and human coronavirus oc (hcov-oc ). group iii is specific for avian species including turkey coronavirus (tcov), pheasant coronavirus and avian infectious bronchitis virus (ibv). the coronavirus n protein has been shown to be highly variable in size as well as in amino acid composition between the viruses that comprise the three coronavirus antigenic groups but highly conserved within these groups. group i viruses have the smallest nucleocapsid proteins with - residues, group ii viruses have the largest with - residues, and group iii viruses have residues. all pedv strains had amino acid residues and thus a longer n protein than other group i members, which indicates that pedv is an exception to the rule that the coronavirus n protein is highly conserved within these groups. in this study, we determined the nucleotide sequence of the n gene of pedv ljb/ and analyzed it to establish the phylogenetic relationships between several strains of pedv. this work showed that the nucleotide sequence can form a basis for further epidemiological study of pedv infections.
fig. . amino acid sequences were aligned using the clustal method, and phylogenetic trees were constructed using the neighbor-joining method. analyses were done using the megalign application of the lasergene software. genbank accession numbers of sequences in the phylogenetic tree are: ljb/ dq ; chinju af ; cv nc ; brl/ z (britain isolate); js ay (china field isolate).
acknowledgement the financial support of this work was provided by grants from "project of the tenth-five" of heilongjiang provincial scientific and technique committee, china.
key: cord- -icedihm authors: pawestri, hana a.; nugraha, arie a.; han, alvin x.; pratiwi, eka; parker, edyth; richard, mathilde; van der vliet, stefan; fouchier, ron a.
m.; muljono, david h.; de jong, menno d.; setiawaty, vivi; eggink, dirk title: genetic and antigenic characterization of influenza a/h n viruses isolated from patients in indonesia, – date: - - journal: virus genes doi: . /s - - - sha: doc_id: cord_uid: icedihm since the initial detection in , indonesia has reported human cases of highly pathogenic avian influenza h n (hpai h n ), associated with an exceptionally high case fatality rate ( %) compared to other geographical regions affected by other genetic clades of the virus. however, there is limited information on the genetic diversity of hpai h n viruses, especially those isolated from humans in indonesia. in this study, the genetic and antigenic characteristics of hpai h n viruses isolated from humans were analyzed. full genome sequences were analyzed for the presence of substitutions in the receptor binding site, and polymerase complex, as markers for virulence or human adaptation, as well as antiviral drug resistance substitutions. only a few substitutions associated with human adaptation were observed, a remarkably low prevalence of the human adaptive substitution pb -e k, which is common during human infection with other h n clades and a known virulence marker for avian influenza viruses during human infections. in addition, the antigenic profile of these indonesian hpai h n viruses was determined using serological analysis and antigenic cartography. antigenic characterization showed two distinct antigenic clusters, as observed previously for avian isolates. these two antigenic clusters were not clearly associated with time of virus isolation. this study provides better insight in genetic diversity of h n viruses during human infection and the presence of human adaptive markers. these findings highlight the importance of evaluating virus genetics for hpai h n viruses to estimate the risk to human health and the need for increased efforts to monitor the evolution of h n viruses across indonesia. 
electronic supplementary material: the online version of this article ( . /s - - - ) contains supplementary material, which is available to authorized users. edited by william dundon. highly pathogenic avian influenza (hpai) viruses are a global concern for both animal and human health [ ] . hpai h n viruses of the a/goose/guangdong/ / lineage were first discovered in in china and have since continued to circulate in poultry and wild birds across asia, the middle east, europe and africa [ , ] . in total, countries have been affected and millions of birds have succumbed to the disease or have been culled to prevent its further spread [ ] . hpai h n viruses infect humans sporadically and may cause severe disease with a high case fatality rate among confirmed hospitalized patients. to date, there are confirmed human cases, of which have died [ ] . hpai h n viruses continue to evolve through genetic drift and reassortment events with other avian influenza a viruses, resulting in multiple genetic clades and subtypes [ , ] . hpai h n viruses were first identified in indonesia from poultry outbreaks on the island of java in and have since spread to other parts of the country [ , ] . during subsequent years, clade . viruses became enzootic in indonesia [ ] . however, a new hpai h n clade . virus has been detected during poultry outbreaks since . human infections with hpai h n viruses have been reported in indonesia so far, with higher case fatality rates among reported cases ( %) compared to the rest of the world afflicted by the virus. we previously showed that high nasopharyngeal viral load was associated with more severe outcome of human h n infections in indonesia and that the virus was more commonly detected in blood relative to other geographical regions affected by hpai h n [ , ] .
strikingly, although the number of detections in humans peaked in [ ] [ ] in indonesia and subsequently declined in the following years, the case fatality rate in indonesia increased over time from % in to % since . this increase was associated with higher viral load prior to treatment and the presence of mutations in the matrix protein that confer adamantane resistance [ ] . however, reasons for the higher viral load and case fatality rate are still unclear. more detailed sequence analyses are warranted to investigate the presence of known virulence markers and substitutions related to possible human adaptation, which can help explain the higher viral loads and increased mortality. in response to the outbreaks, the indonesian government implemented a strategy to reduce the incidence of hpai h n virus infections in poultry, including stamping out of infected poultry, culling of contiguous flocks and poultry vaccination [ ] . several vaccines were developed and implemented over time to match the circulating strains in poultry and for pandemic preparedness [ , ] . the initial vaccine used was based on the a/chicken/legok/ isolate, a clade . . virus [ , ] . by , the vaccine strain was updated to subclade . . and . . viruses, based on isolates a/chicken/west java/ / and a/chicken/nagrak/ / , respectively [ ] . to date, a new subclade of h n has emerged ( . . . ) and a new vaccine was developed based on isolate a/duck/sukoharjo/bbvw- - / [ ] . however, as a consequence of the large-scale vaccination, antigenic drift was induced in poultry and consequently the vaccines became less effective [ ] . despite vaccination efforts, the number of poultry outbreaks remained high and the epidemic in poultry continued to spread among out of indonesian provinces with over , reported poultry outbreaks since [ ] .
the large number of poultry outbreaks continues to pose a threat for future zoonotic infections in humans, antigenic drift and possible host adaptations that could increase the pandemic risk of circulating viruses [ , ] . improved insights into the genetic and antigenic characteristics of hpai h n viruses from indonesia provide a better understanding of its epidemiology, the high case fatality rate and for a pandemic risk assessment [ ] . sequencing data contain valuable information about viral genetic characteristics, including presence of known human adaptive markers, resistance against available antiviral drugs or other changes that can explain the high and rising mortality, while antigenic characterization will help assess the potential protection of pre-pandemic vaccines. here, we conducted a study to characterize the viral genetics of hpai h n viruses isolated from patients in indonesia between and that could explain the virulence leading to the high case fatality rate. we investigated the presence of known molecular determinants of virulence, receptor binding properties and antiviral susceptibility using whole genome sequencing of the hpai h n viruses. in addition, we investigated the antigenic properties of these human virus isolates and compared them to previous antigenic changes in hpai h n viruses from poultry, in order to assess the usefulness of and protection by current available pre-pandemic virus vaccines. as part of the national procedure for avian influenza case investigation in indonesia, respiratory specimens were collected from suspected h n cases admitted to hospitals throughout indonesia and sent to the national reference laboratory for influenza at the national institute of health research and development (nihrd) in jakarta. suspected cases were defined according to world health organization (who) criteria [ ] . 
the nihrd is the reference laboratory under the indonesian ministry of health responsible for laboratory testing and event-based surveillance of emerging infectious diseases in humans, including avian influenza a/h n virus. because indonesian clinical specimens are obtained from suspected h n cases as part of the national outbreak procedure for hpai h n case investigations, requirement for informed consent has been waived by the indonesian ministry of health. the specimens and data were collected from january to december , according to the national outbreak investigation protocol following circulation of hpai h n viruses in south east asia [ ] . all of the specimens collected were stored and analyzed at the nihrd. laboratory identification and confirmation was determined using realtime reverse transcriptase-polymerase chain reaction (rt-pcr) typing and subtyping assay according to the centers for disease control (cdc) (atlanta, united states) protocol [ ] . specimens for all laboratory-confirmed cases were selected for subsequent virus isolation and genetic analyses, based on the specimen with the lowest cycle threshold (c t ) value according to the real-time rt-pcr, available for each patient. the selected specimens positive for influenza a(h n ) virus with a ct value below were grown in -to -dayold specific pathogen-free (spf) embryonated chicken eggs in a biosafety level (bsl ) facility [ ] . after incubation at °c for h, the egg allantoic fluid was harvested and hemagglutination titers were determined by hemagglutination assay. a total of positive cultures were obtained due to variable specimen quality and limited availability of specimen volumes. the viral rna was extracted from µl of influenza virus positive allantoic fluids using high pure rna isolation kit (roche) with on-column dnase treatment according to the manufacturer's instructions. 
the rna was reverse transcribed into cdna with the uni m primer (agc raa agc agg ) [ ] using superscript iii reverse transcriptase (invitrogen, carlsbad, ca, usa) according to the manufacturer's protocol. pcr amplification was performed with gene-specific whole genome degenerate primer sets (primer sequences available upon request) [ ] [ ] [ ] using platinum taq dna polymerase high fidelity (invitrogen). the pcr products were then purified with the exosap-it purification kit (affymetrix, inc, santa clara, ca) according to the manufacturer's protocol. the complete coding sequences were sequenced using the big dye terminator v . cycle sequencing kit (applied biosystems, foster city, ca, usa). the products of the sequencing reactions were cleaned using the big dye x terminator kit (applied biosystems, foster city, ca, usa) according to the manufacturer's instructions and sequenced in a -capillary xl genetic analyzer (applied biosystems, foster city, ca, usa). all nucleotide sequences obtained from this study have been deposited in the gisaid database (see supplemental table s ). the assembly and editing of sequences from all eight gene segments were performed using codon code software (gene codes, usa). all sequences were aligned using clustalw as available within bioedit software version . . . [ ] . to infer the evolutionary relationships between the viruses, maximum likelihood (ml) phylogenetic trees were constructed using raxml . . with the gtrgamma nucleotide substitution model [ , ] . an ml phylogenetic tree was constructed using the combined nucleotide alignment of hemagglutinin (ha) sequences from the newly sampled viruses and reference sequences used to define the h nomenclature system (https ://www.who.int/ influ enza/gisrs _labor atory / _h sma lltre ealig nment .txt; fig. ) [ , ] .
sequence data of human and avian h n viruses from indonesia with all eight influenza virus gene segments ( viral isolates as of january ) was downloaded from the (gisaid) epiflu database [ ] . individual ml trees were reconstructed for each gene segment to compare the genetic diversity of the newly sampled viruses against those previously collected from indonesia (fig. s ). tanglegrams were visualized using the baltic toolkit (https ://githu b.com/evogy tis/balti c). amino acid sequences were analyzed to identify substitutions potentially linked to human adaptation, virulence, antiviral resistance and antigenic properties as listed in the cdc h n genetic change inventory [ ] . in addition to this inventory, we also used flusurver to identify potentially relevant substitutions present in our sequence dataset (https ://www.gisai d.org, https ://flusu rver.bii.a-star.edu.sg). flu-surver is a web-based tool to rapidly screen the sequences for potential mutations based on the curated and published literature. virus titers were determined by hemagglutination assay and antigenic characterization was performed by hemagglutination inhibition (hi) assays according to who protocols [ , ] . the ferret antisera specifically reactive to defined h hemagglutinin clades were raised as described previously [ ] . all antisera were pretreated overnight at °c with receptor destroying enzyme (rde vibrio cholerae neuraminidase), followed by inactivation for h at °c. the hi assays were performed using the following procedures: twofold serial dilutions of µl antisera starting at a : were mixed with µl of a virus containing hemagglutinating units (hau) and were incubated at °c for min. then, µl of % turkey erythrocytes was added and incubated at °c for h. the hi titer is determined as the reciprocal value of the highest serum dilution that completely inhibited the hemagglutination of the turkey erythrocytes. antigenic properties were determined for representative novel isolates. 
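as a minimal sketch of the hi read-out described above (twofold serial dilutions, with the titer taken as the reciprocal of the highest serum dilution that completely inhibits hemagglutination), the following python function is illustrative only and is not the authors' procedure. the starting dilution is elided in the text, so `start_dilution=10` in the usage line is an assumed placeholder, and the function and parameter names are hypothetical.

```python
from typing import Optional, Sequence


def hi_titer(inhibited: Sequence[bool], start_dilution: int) -> Optional[int]:
    """Return the HI titer for one serum/virus pair.

    inhibited[i] is True when well i (serum dilution 1:start_dilution * 2**i)
    shows complete inhibition of hemagglutination.
    Returns None when even the first well is not completely inhibited,
    i.e. the titer is below the starting dilution.
    """
    titer = None
    for i, well_inhibited in enumerate(inhibited):
        if not well_inhibited:
            break  # stop at the first well that is not completely inhibited
        titer = start_dilution * 2 ** i
    return titer


# assumed example: start at 1:10, complete inhibition in the first four wells
print(hi_titer([True, True, True, True, False, False], start_dilution=10))  # prints 80
```

the sketch stops at the first well without complete inhibition, which is a conservative reading of plates with irregular patterns; that choice is an assumption rather than something stated in the text.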
selection was based upon available ha titer of virus stocks and availability of at least two independent replicate experiments, measuring hi titers for all available ferret sera. analysis of antigenic properties was conducted using antigenic cartography methods as described previously [ , ] . briefly, the hi titers are converted to a distance matrix in which the distance between one antigen and one antiserum corresponds to the difference between the log value of the maximum observed titer to the antiserum from any antigen and the log value of the titer of the antigen to the antiserum. this distance matrix is used as input for multidimensional scaling algorithms, which arrange the antiserum and antigen points in space to best satisfy the target distances specified by the hi data by minimizing the error. therefore, the distances between the points in an antigenic map represent antigenic distance as measured by the hi assay, in which the distances between antigens (virus isolates) and antisera are inversely related to the log hi titer. although only distances between antigens and antisera are measured in the hi assay, antigenic maps allow the indirect measurement of antigenic distances between two viruses. [fig. caption fragment: ... of sample collection. who reference strains are used to define the h nomenclature system [ , ] .] during the course of this study between and , over poultry outbreaks of hpai h n viruses were reported [ ] and cases of laboratory-confirmed human hpai h n virus infection were collected. of these cases, we successfully cultured virus isolates for genetic and antigenic characterization. table shows a summary of the epidemiological and other data of these cases. among the patients, the median age was (range - ), ( %) were male, ( %) were female, and ( %) received oseltamivir treatment. specimens were collected at median days post onset of symptoms (range - ).
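the conversion of hi titers into antigen-antiserum target distances used by antigenic cartography, as described in the methods above, can be sketched as follows. this is a minimal illustration, not the cartography software itself: the log base is elided in the text and base 2 is assumed here (consistent with one antigenic unit corresponding to a twofold difference in hi titer), the titers are invented toy values, and the multidimensional scaling step is not shown.

```python
import math
from typing import Dict, Tuple

# (antigen, antiserum) -> HI titer
Titers = Dict[Tuple[str, str], float]


def target_distances(titers: Titers) -> Titers:
    """Convert HI titers to antigen-antiserum target distances.

    distance = log2(max titer observed for that antiserum) - log2(titer)
    """
    max_log: Dict[str, float] = {}
    for (_, serum), titer in titers.items():
        max_log[serum] = max(max_log.get(serum, float("-inf")), math.log2(titer))
    return {
        (antigen, serum): max_log[serum] - math.log2(titer)
        for (antigen, serum), titer in titers.items()
    }


# toy titers, invented for illustration (not data from the study)
toy = {("v1", "s1"): 640, ("v2", "s1"): 160, ("v1", "s2"): 40, ("v2", "s2"): 320}
for pair, distance in target_distances(toy).items():
    print(pair, round(distance, 3))
```

in this toy table, the virus with the highest titer against an antiserum gets distance 0 to it, and each twofold drop in titer adds one unit of distance.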
there were a total of specimens collected in , specimens in , specimens in , specimens in , specimens in , specimens in , specimens in and specimens in . these samples were collected from regions with high incidence of poultry h n outbreaks [ , ] , including west java ( %), followed by jakarta ( %) and the banten province ( %). a ml phylogenetic tree based on hemagglutinin (ha) sequences was constructed to infer the evolutionary relationships between the newly isolated viruses and hpai h n viruses circulating globally (see "materials and methods"; fig. ) . to further elucidate the phylogenetic relationships between the novel h n viruses and those collected from indonesia previously, ml phylogenetic trees were constructed for each individual influenza virus gene segment (fig. s ). there were also no clear distinct phylogenetic groupings between human and avian viruses in any of the gene segments analyzed, indicating that viruses infecting both host types in indonesia were genetically similar (fig. s d ). next, we compared the amino acid substitutions found in these newly isolated viruses against molecular markers known to alter viral phenotypes such as virulence, drug resistance and human host adaptation (table s ) [ ] [ ] [ ] [ ] [ ] . the ha protein can affect the virulence and host range of hpai h n viruses due to ( ) the presence of a multibasic cleavage site, as well as changes to ( ) host cell receptor specificity, ( ) n-linked glycosylation patterns and ( ) ha stability. the pathogenicity of avian influenza viruses is determined by the cleavability of the ha glycoprotein. the presence of multiple basic amino acid residues at the cleavage site of ha allows the glycoprotein to be cleaved into mature subunits ha and ha by furin-like proteases, which are ubiquitously expressed.
in contrast, ha proteins containing a single basic residue are cleaved by trypsin-like proteases, predominantly expressed in the respiratory and intestinal tract of birds and the respiratory tract of humans. all of the new isolates analyzed in this study are highly pathogenic avian influenza viruses that encode a multibasic ha cleavage site [ , ] . the cleavage site motif pqresrrkkr↓g was found in of the newly isolated viruses while other variations (i.e., pqregrrkkr↓g, pqreskrkkr↓g, pqresrrrkr↓g and pqresrrkrr↓g) were observed in the remaining isolates. another key feature of ha related to both virulence and human adaptation is the receptor specificity. conserved residues within the receptor binding site (rbs) of ha are required for binding to sialic acid receptors (sia), while several other residues in domains surrounding the rbs are key determinants of receptor specificity. residues in these domains, the -loop, -helix and the -loop, determine the specificity for either the avian-type receptor or human-type receptor, α - -linked sia or α - -linked sia, respectively. several key residues within these domains have been identified at positions including , , , , and (h numbering). high conservation of amino acid sequences was found at the receptor binding site (rbs). all isolates possessed conserved residues at positions n , e , n , q and g , the most apparent residues involved in receptor binding specificity (as reviewed in [ ] ), indicating preferential binding of the viruses to avian-like α - -linked sia [ ] . interestingly, polymorphism was observed at position , where a methionine or isoleucine was found instead of the more common arginine or lysine. however, the exact role of this residue in receptor specificity remains to be determined [ , ] . n-linked glycosylation of the influenza virus ha protein plays important roles in protein folding and modulates virus pathogenicity and evasion of neutralizing antibodies [ , ] .
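the multibasic character of the cleavage-site motifs listed above can be checked with a short sketch that counts the consecutive basic residues (arginine or lysine) immediately upstream of the cleavage position marked by the arrow. this is an illustrative helper with a hypothetical name; the text gives no numeric threshold for calling a site multibasic, so the sketch only reports the count.

```python
BASIC = {"R", "K"}  # arginine and lysine


def basic_run_before_cleavage(motif: str, arrow: str = "↓") -> int:
    """Count consecutive basic residues immediately upstream of the cleavage arrow."""
    upstream = motif.upper().split(arrow)[0]
    count = 0
    for residue in reversed(upstream):
        if residue not in BASIC:
            break
        count += 1
    return count


# the five motifs reported in the text
motifs = ["pqresrrkkr↓g", "pqregrrkkr↓g", "pqreskrkkr↓g", "pqresrrrkr↓g", "pqresrrkrr↓g"]
for m in motifs:
    print(m, basic_run_before_cleavage(m))
```

each of the five reported motifs ends in a run of five basic residues before the cleavage position, whereas a site with a single basic residue would give a count of 1.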
in addition, glycans within the vicinity of the rbs region may alter receptor binding affinity and/or specificity. like other clade . viruses, the h n viruses from indonesia contain seven potential n-linked glycosylation sites. n-linked glycosylation sites at positions , , , and are highly conserved among many ha subtypes [ ] . in addition, h n viruses can contain n-linked glycosylation sites at positions and . the absence of glycosylation site was linked to human receptor specificity and affinity, and aerosol transmission in ferrets, an animal model representative for aerosol transmission between humans [ , ] . however, no substitution (i.e., n x or t x) removing this glycosylation site was observed in the new indonesian samples. besides changes to receptor specificity and glycosylation patterns, the protein stability of ha is also important for human host adaptation, transmission and possibly virulence [ ] . nonetheless, we did not find in any of the virus isolates any ha substitutions (i.e., h y, t i [ , ] and y h, h q, and k i (h numbering) [ , ] ) that are known to increase replication and virulence of avian influenza virus h n or h n in mammalian animal models by mediating ha protein stability. however, it is expected that other positions and substitutions within the ha trimer could affect stability and therefore be involved in human adaptation and transmissibility of h n viruses, which would require further research to identify. all of the novel hpai h n isolates were found to contain the deletion of amino acids between positions and in the stalk region of the na glycoprotein. this shorter stalk length of na was previously linked to increased virulence of h n viruses in mammals [ ] [ ] [ ] . furthermore, the neuraminidase (na) protein serves as a target for na inhibitors (nai) such as oseltamivir, zanamivir, peramivir and laninamivir, which block the na enzyme active site to limit influenza virus egress. eighteen of patients were treated with oseltamivir in our study.
however, none of the newly isolated viruses encoded known nai resistance mutations (i.e., v a, i v, e v, g k, v a, r k, d n, s n, h y, r k, n s (n numbering)) [ ] . this corresponds with our earlier study showing that acquisition of nai resistance is extremely rare in h n -infected individuals in indonesia, be it before or during treatment [ ] . of note, q h was observed in of the isolates. although q l is associated with reduced sensitivity to zanamivir and oseltamivir, q h had no effect on sensitivity to nai when tested in h n pdm or h n [ ] . we previously showed that substitutions related to amantadine resistance are common in h n viruses in indonesia [ ] even though amantadine treatment is not administered anymore. the prevalence of amantadine resistance-related substitutions increased over time from . % in , to % in and % during subsequent years [ ] . various amantadine resistance substitutions in the m protein were also found in all isolates, including v a ( viruses), v t ( virus), s g ( virus), and s n ( viruses). interestingly, isolates in this study collected in more recent years often encode resistance mutations in both positions and . these results indicate that indonesian h n viruses are sensitive to na inhibitors but resistant to m inhibitor, despite the absence of amantadine treatment [ ] . besides receptor specificity, polymerase activity is known to be a hallmark for host adaptation and virulence. the polymerase complex is a heterotrimer that consists of the pb , pb and pa subunits. the pb protein is an important determinant of virulence and host range. pb substitutions such as e k and d n [ ] can dramatically increase polymerase activity of avian influenza viruses in mammalian cells. in particular, pb -e k is a key molecular determinant of host range [ ] and a virulence factor during human infection with hpai h n [ ] . 
both pb -e k ( of viruses) and pb -d n ( virus) substitutions were observed in a limited number of the novel viruses presented in this study. this is in strong contrast with human h n viruses collected in other geographic regions where pb -e k substitution is common [ ] . we also checked if there are other known substitutions found in the polymerase complex and nucleoprotein that enhance polymerase activity of avian influenza viruses in human cells reported in previous studies [ ] . while there are some genetic variations present in some of these positions, there were no obvious markers of human adaptation or virulence that could be linked to the high case fatality rate of h n infections in humans in indonesia. both pb -f and ns have immune regulatory roles for influenza virus. the full-length pb -f protein ( aa) inhibits type i interferon response mediated by the mitochondrial antiviral signaling protein [ , ] . however, the open reading frame (orf) of the auxiliary protein which occurs in the second orf of the pb gene segment can be truncated or lost [ , ] . all of the indonesian h n viruses were found to encode the full-length pb -f protein. furthermore, the n s substitution in pb -f known to increase virulence [ ] was not found in any of the novel viruses. on the other hand, the four-amino-acid sequence motif (esev) at the carboxyl terminus of ns facilitates the nonstructural protein to bind to cellular pdz-containing proteins that are involved in host cellular signaling pathways [ ] . the esev motif was found in all of the new indonesian isolates. ns mutations such as p s, d e, l f and i m were also found to modulate the virulence of h n viruses [ ] . additionally, substitutions in ns (n s, g r) as well as ns (t a, m i) proteins may result in decreased antiviral responses in the host [ ] . however, none of these substitutions were found in the h n viruses. 
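marker screening of the kind reported above amounts to checking, for each isolate, whether a listed substituted residue is present at a listed position. the sketch below is a minimal illustration under stated assumptions: the sequence and marker list are invented toy values (the real positions are elided in the text and are not reconstructed here), the names are hypothetical, and positions are assumed to be already expressed in the numbering scheme of the sequence being screened, so any renumbering via alignment to a reference is left out.

```python
from typing import Dict, List, Tuple

# (reference residue, 1-based position, substituted residue)
Marker = Tuple[str, int, str]


def screen_markers(protein: str, markers: List[Marker]) -> Dict[str, bool]:
    """Report, for each marker, whether the substituted residue is present."""
    hits: Dict[str, bool] = {}
    for ref, pos, alt in markers:
        name = f"{ref}{pos}{alt}"
        # positions outside the sequence are reported as absent
        residue = protein[pos - 1] if 0 < pos <= len(protein) else None
        hits[name] = residue == alt
    return hits


# toy sequence and toy markers, invented for illustration only
toy_protein = "MKTKLQ"
toy_markers = [("E", 4, "K"), ("L", 5, "F")]
print(screen_markers(toy_protein, toy_markers))  # {'E4K': True, 'L5F': False}
```

a curated inventory would supply the marker list in practice; the sketch only reports presence or absence and makes no claim about phenotype.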
besides the presence of virulence and human adaptive markers, it is important to monitor and understand the antigenic properties of circulating influenza viruses. vaccination is a primary measure to control or prevent h n outbreaks in poultry and could be used to protect humans from h n infections, should these viruses become pandemic. influenza viruses can easily escape from available vaccines by substitutions in the major antigenic sites on the globular head domain of ha [ ] . here we investigated the antigenic properties of these human hpai h n viruses. to characterize the antigenic diversity of indonesian human influenza h n viruses by hi test, we selected a panel of ferret antisera able to detect antigenic variation between representative viruses [ ] . we included ferret antisera against clade . we determined the antigenic properties of isolates analyzed in hi assays using this panel of ferret antisera. antigenic cartography was used to visualize the antigenic relatedness of the hpai h n isolates in a d space (fig. ) . the antigenic map showed that the human hpai h n viruses from indonesia clustered into two antigenic groups. the first group of viruses clustered around two representative antisera of clade . this finding is similar with a previous study, describing different antigenic clusters within avian h n viruses in indonesia isolated from poultry [ ] . this study identified a small number of residues immediately adjacent to the rbs within the antigenic sites in the globular head domain of ha, which are primarily responsible for antigenic changes. we investigated genetic diversity at these positions. amino acid differences were identified at these six antigenically important positions located near the receptor binding sites [ ] : , , , , and (h numbering as there is no equivalent for position in h n viruses; table ). 
all viruses antigenically clustering into the first antigenic group possessed residues s , s , i , n , a and m , except for isolates , and , which contained t and i , respectively. the n t substitution does not seem to have an antigenic effect, although the m i could have a small antigenic effect as indicated by the placement on the outside of the cluster. the virus isolates of cluster contained residues s , s , i , d , a and r , typical for a/indonesia/ / antigenic-like viruses. the study by koel et al. has previously shown that substitutions d n and r m are indeed responsible for the antigenic differences between these two clusters. isolate , contained a ; however, this substitution does not seem to have major antigenic effects as this isolate clusters with other viruses. [fig. caption: both axes represent antigenic distance: one square on the antigenic map represents a distance of one antigenic unit, corresponding to a twofold difference in the hi assay. the antigenic map was generated using antigenic cartography, a method that uses multidimensional scaling algorithms to place virus and antiserum points in a d space such that their relative position in the map reflects the hi titers with minimal error. the distance between a virus-and-antiserum pair is inversely related to the hi titer of the virus to that antiserum. the color coding of the human hpai h n isolates is based on their year of isolation as depicted in fig. . virus isolate names and antisera are abbreviated to isolate number/year.] it was previously shown that a combination of substitutions at position and is necessary to result in antigenic changes [ ] . these data showed that viruses belonging to two distinct antigenic groups infected humans in indonesia. interestingly, the presence of viruses from different years of isolation in both clusters (as indicated by the color coding of the viruses in fig. ) indicates that these different antigenic variants were co-circulating.
full protection in humans would therefore likely have required a multivalent vaccine, including at least a/indonesia/ / -and a/chicken/east java/ / -like viruses, which are currently under development or approved for human and/or poultry use [ , ] . indonesia has suffered numerous hpai h n virus outbreaks in poultry farms, live bird markets and backyard poultry, which have resulted in reported human cases with a case fatality rate of over %. this high case fatality rate is in sharp contrast with lower case fatality rates in [...]. we recently showed that the high case fatality rate of human indonesian h n cases correlated with viral load prior to treatment and increased from % in to % since [ ] . we found that this high case fatality rate coincided with the high prevalence of amantadine resistance-conferring m substitutions; however, no mechanistic explanation for the role of such substitutions in virulence of hpai h n is known yet. the aforementioned study did not include any further sequencing data looking into specific virulence and human adaptive markers. our analyses of the full genomic sequences for the isolates did not indicate any potential genetic changes that might explain the increase in case fatality rate and virulence over time. we did not find any known genetic markers associated with human adaptation or virulence in the ha or polymerase complex genes. there were no changes in the rbs region of ha that were indicative of a switch towards the human-type receptor. although some genetic diversity was observed in the polymerase genes, well-known substitutions such as pb -e k and pb -d n, which are often selected upon infection of humans and affect the virulence of avian influenza viruses such as h n [ , , , ] , were not commonly found in the new samples. however, there could also be other currently unknown adaptive substitutions present in the new human samples.
confirmatory investigations into the effects of these substitutions on the activity of the polymerase complexes of both human and avian hpai h n viruses should be done in future studies. further sampling and research, involving the collection of more full genome sequences from avian and human viruses as well as both in vitro and in vivo characterization of virus replication and pathogenicity, are warranted to determine if indonesian hpai h n viruses are indeed more virulent than h n viruses of other genetic clades circulating in other geographic areas. this should also address whether there is a specific selection for more virulent viruses in humans only, or whether indonesian hpai a/h n viruses are more virulent in general, including in poultry in indonesia, such that zoonotic events involve more virulent viruses and therefore result in higher case fatality rates. further characterization of virus isolates from different periods of time will have to show if more recent viruses are indeed more virulent and whether this could be attributed to specific molecular markers. a possible reason why human h n cases have declined is the implementation of large-scale vaccination of poultry against h n . most countries affected by h n virus outbreaks, including indonesia, have implemented poultry vaccination as a key strategy for the control of h n infections. currently, the h n vaccine for poultry in indonesia is an inactivated bivalent vaccine containing h n viruses belonging to clades . . and . . . as for the selection of human seasonal vaccine strains, antigenic analyses are required to understand and predict vaccine effectiveness. in the current study, antigenic analyses of human h n viruses from till identified two antigenic groups of human clade . . viruses that co-circulate in indonesia.
based on a previous study by koel et al., the antigenic differences could be explained by alternative amino acids present at several key residues at the rim of the receptor binding site. no evidence of antigenic change over time or of association with geographical location was observed. therefore, a combination of two of the currently available pre-pandemic human h n vaccines, a/indonesia/ / and a/indonesia/nihrd / , would have been required to optimize protection against the two different antigenic groups. in summary, we performed genetic and antigenic analyses of h n influenza viruses isolated from humans between and . we observed low levels of genetic diversity and only sporadic occurrence of known substitutions associated with human adaptation and virulence (e.g., pb - k). however, the analysis only captured the majority variants and did not include minority variants present during infection. additionally, we have limited our genetic analyses to known substitutions only. to ascertain and better understand the high mortality associated with human hpai h n virus infections in indonesia, it is essential to perform more in-depth analysis of genetic diversity during human infections with hpai h n virus and to functionally characterize the observed substitutions. furthermore, our data showed that two antigenic groups co-circulated in indonesia, with no evidence of antigenic change over time. a combination of available pre-pandemic vaccines would have been required to protect against the viruses circulating during the study period.
global epidemiology of avian influenza a h n virus infection in humans, - : a systematic review of individual case data
lessons from emergence of a/goose/guangdong/ -like h n highly pathogenic avian influenza viruses and recent influenza surveillance efforts in southern china
summary of avian influenza activity in europe
world organization for animal health (oie)
cumulative number of confirmed human cases of avian influenza a/(h n ) reported to who
phylogenetic clustering by linear integer programming (phy-clip)
world health organization/world organisation for animal hf, agriculture organization hnewg ( ) revised and updated nomenclature for highly pathogenic avian influenza a (h n ) viruses. influenza other respir viruses
h n hpai global overview
overview on poultry sector and hpai situation for indonesia with special emphasis on the island of java
genetic characterization of clade . . . avian influenza a(h n ) viruses
viral factors associated with the high mortality related to human infections with clade . influenza a/ h n virus in indonesia
fatal outcome of human influenza a (h n ) is associated with high viral load and hypercytokinemia
indonesia national committee for avian influenza control and pandemic influenza preparedness. national strategic plan for avian influenza control and pandemic influenza preparedness
antigenic and genetic characteristics of zoonotic influenza viruses and development of candidate vaccine viruses for pandemic preparedness
field effectiveness of highly pathogenic avian influenza h n vaccination in commercial layers in indonesia
overview on poultry sector and hpai situation for indonesia with special emphasis on the island of java
antibody titer has positive predictive value for vaccine protection against challenge with natural antigenic-drift variants of h n high-pathogenicity avian influenza viruses from indonesia
evidence for differing evolutionary dynamics of a/ h n viruses among countries applying or not applying avian influenza vaccination in poultry
fao ( ) fifth report on the global programme for the prevention and control of hpai
avian influenza a (h n ) infection in humans
risk factors of poultry outbreaks and human cases of h n avian influenza virus infection in west java province, indonesia
pandemic preparedness and the influenza risk assessment tool (irat)
world health organization ( ) who guidelines for investigation of human cases of avian influenza a(h n )
pedoman pengambilan dan pengiriman spesimen yang berhubungan dengan flu burung
cdc realtime rtpcr (rrtpcr) protocol for detection and characterization of swine influenza (version
antigenic variation in h n clade . viruses in indonesia from
universal primer set for the full-length amplification of all influenza a viruses
identification, characterization, and natural selection of mutations driving airborne transmission of a/h n virus
genome analysis linking recent european and african influenza (h n ) viruses
genetic diversity and host adaptation of avian h n influenza viruses during human infection
bioedit: an important software for molecular biology
raxml-iii: a fast program for maximum likelihood-based inference of large phylogenetic trees
using raxml to infer phylogenies
continued evolution of highly pathogenic avian influenza a(h n ): updated nomenclature
world health organization/world organisation for animal hf, agriculture organization hewg. nomenclature updates resulting from the evolution of avian influenza a(h ) virus clades . . . a, . . , and . . during - . influenza other respir viruses
gisaid: global initiative on sharing all influenza data: from vision to reality
genetic changes inventory: a tool for influenza surveillance and preparedness
studies of antigenic differences among strains of influenza a by means of red cell agglutination
world health organization. manual for the laboratory diagnosis and virological surveillance of influenza
antigenic variation of clade . h n virus is determined by a few amino acid substitutions immediately adjacent to the receptor binding site
mapping the antigenic and genetic evolution of influenza virus
unggas kondisi s/d oktober
karlsson ea ( ) inventory of molecular markers affecting biological characteristics of avian influenza a viruses
host and viral determinants of influenza a virus species specificity
adaptation of avian influenza a virus polymerase in mammals to overcome the host species barrier
host adaptation and transmission of influenza a viruses in mammals
h n genetic changes inventory: a tool for influenza surveillance and preparedness
molecular pathogenesis of h highly pathogenic avian influenza: the role of the haemagglutinin cleavage site motif
the multibasic cleavage site of the hemagglutinin of highly pathogenic a/vietnam/ / (h n ) avian influenza virus acts as a virulence factor in a host-specific manner in mammals
h n receptor specificity as a factor in pandemic risk
structure and receptor specificity of the hemagglutinin from an h n influenza virus
enhanced human-type receptor binding by ferret-transmissible h n with a k t mutation
recent avian h n viruses exhibit increased propensity for acquiring human receptor specificity
the n-linked glycosylation site at position on the head of hemagglutinin and the virulence of h n avian influenza virus in mice
influenza virus n-linked glycosylation and innate immunity
glycosylation focuses sequence variation in the influenza a virus h hemagglutinin globular domain
airborne transmission of influenza a/ h n virus between ferrets
experimental adaptation of an influenza h ha confers respiratory droplet transmission to a reassortant h ha/h n virus in ferrets
amino acid substitutions that affect receptor binding and stability of the hemagglutinin of influenza a/h n virus
amino acid residues in the fusion peptide pocket regulate the ph of activation of the h n influenza virus hemagglutinin protein
influenza virus neuraminidase structure and functions
the neuraminidase stalk deletion serves as major virulence determinant of h n highly pathogenic avian influenza viruses in chicken
a -amino-acid deletion in the neuraminidase stalk and a five-amino-acid deletion in the ns protein both contribute to the pathogenicity of h n avian influenza viruses in mallard ducks
summary of neuraminidase amino acid substitutions associated with reduced inhibition by neuraminidase inhibitors
zanamivir-resistant influenza viruses with q k or q r neuraminidase residue mutations can arise during mdck cell culture creating challenges for antiviral susceptibility monitoring
a single amino acid in the pb gene of influenza a virus is a determinant of host range
molecular basis for high virulence of hong kong h n influenza a viruses
the effect of the pb mutation k on highly pathogenic h n avian influenza virus is dependent on the virus lineage
multiple polymerase gene mutations for human adaptation occurring in asian h n influenza virus clinical isolates
a single n s mutation in the pb -f protein of influenza a virus increases virulence by inhibiting the early interferon response in vivo
influenza a virus pb -f protein contributes to viral pathogenesis in mice
a novel influenza a virus mitochondrial protein that induces cell death
a new influenza virus virulence determinant: the ns protein four c-terminal residues modulate pathogenicity
the ha and ns genes of human h n influenza a virus contribute to high virulence in ferrets
substitutions near the receptor binding site determine major antigenic change during influenza virus evolution
summary of status of development and availability of a(h n ) candidate vaccine viruses and potency testing reagents
a molecular and antigenic survey of h n highly pathogenic avian influenza virus isolates from smallholder duck farms in central java, indonesia during
phylogenetic characterization of h n avian influenza viruses isolated in indonesia from
selection of h n influenza virus pb during replication in humans