key: cord-0874656-km6cgk5i
authors: Villalobos-Agüero, Ricardo A.; Ramírez-Carvajal, Lisbeth; Zamora-Sanabria, Rebeca; León, Bernal; Karkashian-Córdoba, James
title: Molecular characterization of an avian GA13-like infectious bronchitis virus full-length genome from Costa Rica
date: 2021-04-17
journal: Virusdisease
DOI: 10.1007/s13337-021-00667-6
sha: 0df2ebec7b8dcfe19b01b2322b44a25694d7fb5f
doc_id: 874656
cord_uid: km6cgk5i

We describe the first whole-genome sequence of a GA13-like isolate of avian infectious bronchitis virus, CK/CR/1160/16 (MN757859), obtained in 2016 in the province of Alajuela, Costa Rica. This virus caused an outbreak with great economic impact on the local poultry industry. The genome sequence is 27 696 bp in length, with the following genome organization: 5′-UTR-Pol-S-3a-3b-E-4b-4c-M-5a-5b-N-6b-3′-UTR. The complete genome sequence has the highest sequence identity (94.03%) with DMV/1639/GA9977/2019 (MK878536) from Georgia, USA, and the lowest identity (86.03%) with ck/CH/LHLJ/08-6 (KX252788) from China. Analysis of the S1 subunit indicates that the Costa Rican isolate belongs to genotype I, lineage 17 (GI-17) and displays 96.89% identity with the S1 subunit of Ga-13/14255/14 (KM087780) (USA). Possible recombination events in genes S, E, M, 4b and 4c were detected, with Massachusetts, Connecticut, Arkansas and Ma5 as potential parental types. This study highlights the importance of the epidemiological and molecular surveillance of avian infectious bronchitis.

Infectious bronchitis (IB) is a viral disease that affects chickens. It causes high morbidity and economic losses in the poultry industry around the world [1, 10]. The etiological agent is Avian coronavirus [8], or avian infectious bronchitis virus (IBV), a member of the genus Gammacoronavirus, family Coronaviridae, order Nidovirales [2, 8, 13]. The virion has a lipid envelope and the genome is a positive-sense linear RNA of approx.
27.6 kb [1, 2, 10], with 5′-UTR-Pol-S-3a-3b-E-M-5a-5b-N-3′-UTR as the most common genome organization [24, 35]. The first gene encodes proteins involved in replication and transcription, and it has two open reading frames (ORFs), 1a and 1b [1, 4, 13, 34], which are translated into polyproteins 1a and 1ab, the latter through a change in the reading frame [1, 4, 34]. The last third of the genome contains the structural genes for the spike (S), envelope (E), membrane (M), and nucleocapsid (N) proteins, with additional ORFs that encode the accessory proteins 3a, 3b, 5a and 5b [24, 28, 34]. Different strains of IBV have been described around the world, including Massachusetts, Beaudette, Holte, 4/91, Arkansas, Connecticut, D274, and QX-like, among others [2, 5, 12]. The classification of IBV genotypes is based on the S1 sequence of the spike gene, and six genotypes comprising 32 lineages have been described [1, 34]. Mass-like, Ark-like and Penn-like variants, plus one unassigned genotype designated IBV-CR-53 [5, 23], have been reported in Costa Rica since 1990 [17, 23]. From May 2016 until mid-2017, an IBV outbreak spread across farms in Costa Rica. Poultry exhibited mild respiratory infections and mortality; however, great economic losses due to carcass condemnation were reported [30]. The IBV isolate associated with this outbreak was classified as a Georgia 13-like type (GA13-like), based on the S1 gene sequence [30].

We isolated the virus associated with this outbreak in 9-to-11-day-old embryonated specific-pathogen-free (SPF) eggs [2]. For RNA isolation, allantoic fluid was collected and purified through a 30% sucrose cushion by ultracentrifugation at 27 800 rpm for 4 h [19]. The RNA from the pellet was extracted using TRIZOL (Ambion® CA, ES) in accordance with the manufacturer's instructions [35]. RNA was converted to double-stranded DNA using random hexamers, Superscript III and Klenow enzyme.
The NGS library was prepared with Nextera™ XT (Illumina®) reagents. RNA and dsDNA quantification and quality control were conducted using a Quantus Fluorometer® (Promega) and a QIAxcel® (QIAGEN®). The library was sequenced on an Illumina MiSeq using a paired-end (2 × 250 bp) protocol. The quality of the sequencing run was assessed with Sequencing Analysis Viewer (SAV) (Illumina®). Short-read quality was analyzed using FastQC, and quality trimming was conducted using Trimmomatic [7]. De novo assembly was conducted using SPAdes [6], and the longest contig was compared with viral sequences at NCBI using BLAST. Bowtie2 was used to align all short reads to the draft genome, and Artemis [9] was used to visualize the alignment. Automatic annotation was done using Prokka [31], followed by manual curation using information from the ViPr database. Each CDS and genome feature was compared to existing sequences using BLASTP (nr database). Possible recombination events were evaluated using the recombination detection program RDP4 V.4.95 [1, 2, 33, 35], accepting only events detected by at least five of the seven available methods [35].

The phylogenetic analysis was conducted using sequences of the S1 gene region and the IBV whole-genome sequences available in the GenBank database. The sequences were aligned using the MAFFT algorithm available on the Guidance2 server [18, 32], and PartitionFinder2 was used to determine the best substitution model [20, 21]. The phylogenetic trees were inferred by Bayesian inference with MrBayes 3.2.6 [15]. Phylogenetic analyses were performed on the CIPRES Science Gateway Cluster [26]. The whole-genome sequence of isolate CK/CR/1160/16 was deposited in the GenBank database under accession number MN757859, and the raw data were deposited in the SRA under accession number SRR10547950, BioProject number PRJNA592262, and BioSample number SAMN13419001.
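The acceptance rule described above for recombination events (support from at least five of the seven detection methods) can be sketched in a few lines of Python. This is only an illustration of the filtering logic, not the RDP4 implementation: the method names in `METHODS`, the dictionary layout of an event, and the example events are assumptions made for this sketch and are not taken from the study.

```python
# Assumed names of the seven detection methods; the text does not list them.
METHODS = ("RDP", "GENECONV", "BootScan", "MaxChi", "Chimaera", "SiScan", "3Seq")


def supported_events(events, min_methods=5):
    """Keep events detected by at least `min_methods` of the seven methods."""
    kept = []
    for event in events:
        # Count only recognised method names, ignoring duplicates.
        hits = set(event["detected_by"]) & set(METHODS)
        if len(hits) >= min_methods:
            kept.append(event)
    return kept


# Hypothetical events for illustration only (not data from the study).
example_events = [
    {"name": "event_A",
     "detected_by": ["RDP", "GENECONV", "BootScan", "MaxChi", "Chimaera", "SiScan"]},
    {"name": "event_B",
     "detected_by": ["RDP", "MaxChi", "3Seq"]},
]

kept_names = [e["name"] for e in supported_events(example_events)]
print(kept_names)
```

Requiring agreement among several independent methods is a common way to reduce false positives, at the cost of possibly discarding weakly supported but genuine events.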
The complete sequence was 27 696 bp long, consistent with previously reported lengths [2, 29, 35]. It contained thirteen ORFs distributed among nine genes, two UTRs, and a noncoding region between genes N and 6b (Table 1). The genome organization of isolate CK/CR/1160/16 was 5′-UTR-Pol-S-3a-3b-E-4b-4c-M-5a-5b-N-6b-3′-UTR, which differs from the classic IBV gene distribution [24, 35] but has been reported previously [1, 27, 33].

The phylogenetic analysis of the S1 region shows that isolate CK/CR/1160/16 clusters with sequences that belong to genotype I, lineage 17 (GI-17), which includes variants from California and Pennsylvania isolated in the 1990s [34] (Fig. 1a). Moreover, the Costa Rican isolate CK/CR/1160/16 is closely related to isolate Ga-13/14255/14 (KM087780), with a bootstrap value of 100 (Fig. 1a). The sequences from other regions of the Americas form two different clusters: GI-11 (unique to South America, including sequences from Argentina and Brazil) and GI-16 (reported in Asia and Italy, and including sequences from Argentina and Chile) [34] (Fig. 1a). The phylogeny obtained using the whole-genome sequences shows that isolate CK/CR/1160/16 clusters with isolates from the USA, specifically with CAV/CAV56b/91 (GU393331) and Cal99/NE15172/95 (FJ904714) from California [25, 33] and with DMV/1639/GA9977/2019 (MK878536) from Georgia [14] (Fig. 1b). These sequences correspond to the GI-17 classification, based on the S1 region.

Comparison of isolate CK/CR/1160/16 with other whole-genome sequences (Table 1) indicates that this isolate has the highest nucleotide sequence identity (94.03%) with DMV/1639/GA9977/2019; against this genome, the nucleotide sequence identities for every gene were higher than 88%. The IBV sequence with the lowest identity (86.03%) was ck/CH/LHLJ/08-6.
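As an illustration of what a percentage such as 94.03% expresses, the sketch below computes nucleotide identity between two pre-aligned sequences. The text does not state which tool or gap convention was used for the reported values, so the choice to skip gapped columns, the function name `percent_identity`, and the toy sequences are assumptions made for this example only.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are excluded from the count
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned sequences for illustration only (not IBV data).
toy_a = "ATGCT-GACA"
toy_b = "ATGCTAGGCA"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.2f}%")
```

Other conventions (for example, counting gapped columns as mismatches) give slightly different percentages, so identity values are only directly comparable when computed the same way.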
The S1 gene region is highly variable, with nucleotide sequence identities varying between 58.3 and 88.48% among the different IBV serotypes, due to a high mutation frequency as well as recombination events [2, 3, 11, 29]. It is therefore notable that the isolate in this study exhibits a very high sequence identity in the S1 gene (96.89%) with the Georgia 13 genotype (Ga-13/14255/14). Finally, ORF 6b shows the lowest sequence identity among all the coding regions in the genomes analyzed in this study.

Two possible recombination events were detected, each supported by at least six of the seven methods in the RDP4 software. The first event begins at nucleotide 20,410 and ends at position 23,695, within the S gene (Fig. 2a). In this case, the minor parent was inferred to be a Massachusetts strain (FJ904722), and the putative major parent a Connecticut type (KF696629). The second recombination event starts at nucleotide 24,291 and ends at position 25,518, comprising part of the E, M, 4b and 4c genes (Fig. 2b). Its major parent belongs to an Arkansas type (EU418976), and its minor parent to Ma5 (KY6226045).

Table 1 notes: Cov, NGS coverage; *, polyprotein 1ab frameshift; **, IBV variant with the highest percentage of similarity in the S1 region. Boldface indicates the highest nucleotide sequence identity in different genes; underline indicates the lowest nucleotide identity in some genes.

Recombination hotspots in region S of the genome, which are associated with the appearance of new virus variants, have been described in the past [22, 33]. Recombination events in genes 3 and M have also been detected previously [22, 33]. Our results show that the 2016 outbreak of IBV in Costa Rica was caused by a virus that belongs to the GI-17 group, which includes strains native to the United States.
More specifically, we confirmed that the outbreak was caused by a Ga-13/14255/14 strain similar to the one that circulated in the United States in 2016 [16, 30]. Our whole-genome analyses provide the first evidence that isolate CK/CR/1160/16 may be the result of recombination among at least four different variants (Mass, Connecticut, Arkansas and Ma5). Detection of recombination events supports the need to maintain epidemiological surveillance, monitor the variants present in Latin America and optimize vaccination schemes, as outbreaks usually originate from variants not covered by vaccine serotypes [35].

The raw data for isolate CK/CR/1160/16, a GA13-like strain, have been deposited in the NCBI Sequence Read Archive (SRA) under accession number SRR10547950, BioProject number PRJNA592262, and BioSample number SAMN13419001. The whole-genome sequence has been deposited in the GenBank database under accession number MN757859.

1. Complete genome sequences of two avian infectious bronchitis viruses isolated in Egypt: evidence for genetic drift and genetic recombination in the circulating viruses
2. Characterization and analysis of the full-length genome of a strain of the European QX-like genotype of infectious bronchitis virus
3. Comparative features of infections of two Massachusetts (Mass) infectious bronchitis virus (IBV) variants isolated from Western Canadian layer flocks
4. Complete genomic sequence analysis of infectious bronchitis virus Ark DPI strain and its evolution by recombination
5. Global distributions and strain diversity of avian infectious bronchitis virus: a review
6. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing
7. Trimmomatic: a flexible trimmer for Illumina sequence data
8. Ratification vote on taxonomic proposals to the International Committee on Taxonomy of Viruses
9. Artemis: an integrated platform for visualization and analysis of high-throughput sequence-based experimental data
10. Coronavirus avian infectious bronchitis virus
11. Phylodynamic analysis and molecular diversity of the avian infectious bronchitis virus of chickens in Brazil
12. Infectious bronchitis virus variants: a review of the history, current situation and control measures
13. Phylogenetic and phylogeographic mapping of the avian coronavirus spike protein-encoding gene in wild and synanthropic birds
14. First complete genome sequence of currently circulating infectious bronchitis virus strain DMV/1639 of the GI-17 lineage
15. MRBAYES: Bayesian inference of phylogenetic trees
16. Impact of respiratory diseases with special emphasis to emerging infectious bronchitis virus
17. Diagnosis and epidemiology of IBV infections in Costa Rica
18. MAFFT multiple sequence alignment software version 7: improvements in performance and usability
19. Partial purification of IBV and subsequent isolation of viral RNA for next-generation sequencing
20. PartitionFinder: combined selection of partitioning schemes and substitution models for phylogenetic analyses
21. PartitionFinder 2: new methods for selecting partitioned models of evolution for molecular and morphological phylogenetic analyses
22. Evidence of genetic diversity generated by recombination among avian coronavirus IBV
23. Infectious bronchitis virus and infectious bursal disease virus; a study performed at the Universidad Nacional of Costa Rica
24. Infectious bronchitis viruses with a novel genomic organization
25. Attenuated live vaccine usage affects accurate measures of virus diversity and mutation rates in avian coronavirus infectious bronchitis virus
26. Creating the CIPRES science gateway for inference of large phylogenetic trees
27. Complete genome analysis of Iranian IS-1494 like avian infectious bronchitis virus. Virusdisease
28. Molecular characterization of infectious bronchitis viruses isolated from broiler chicken farms in Iran
29. Full genome sequence analysis of a newly emerged QX-like infectious bronchitis virus from Sudan reveals distinct spots of recombination
30. Effectiveness of a vaccine program based on the Protectotype® concept against an infectious bronchitis variant virus strain challenge (GA13) in Costa Rica
31. Prokka: rapid prokaryotic genome annotation
32. GUIDANCE2: accurate detection of unreliable alignment regions accounting for the uncertainty of multiple parameters
33. Recombination in avian gamma-coronavirus infectious bronchitis virus
34. S1 gene-based phylogeny of infectious bronchitis virus: an attempt to harmonize virus classification
35. Molecular characterization of an infectious bronchitis virus strain isolated from northern China in 2012

Acknowledgements This study was funded by "Fundación para el Fomento y Promoción de la Investigación y Transferencia de Tecnología Agropecuaria" (FITACORI), Costa Rica, Grant No. 3-006-115123, and by the "Vicerrectoría de Investigación" and "Sistema de Estudios de Posgrado", University of Costa Rica.

Funding Funding for this study was provided by "Fundación para el Fomento y Promoción de la Investigación y Transferencia de Tecnología Agropecuaria" (FITACORI), Costa Rica, Grant No. 3-006-115123, and by the "Vicerrectoría de Investigación" and "Sistema de Estudios de Posgrado", University of Costa Rica.

Conflict of interest The authors declare that they have no conflict of interest.

Ethical approval All procedures performed on animals were in accordance with the ethical standards of the "Comité Institucional de Cuido y Uso de Animales" (CICUA) of the University of Costa Rica.