key: cord-0937633-xmiefdez authors: Michalakis, Yannis; Sofonea, Mircea T.; Alizon, Samuel; Bravo, Ignacio G. title: SARS-CoV-2 viral RNA levels are not “viral load” date: 2021-09-04 journal: Trends Microbiol DOI: 10.1016/j.tim.2021.08.008 sha: a01b7fdb03e7418df5666c8819d56cd9f35f4587 doc_id: 937633 cord_uid: xmiefdez

Ct values are commonly used as proxies for SARS-CoV-2 “viral load”. Since coronaviruses are positive single-stranded RNA ((+)ssRNA) viruses, current RT-qPCR target amplification does not distinguish replicative from transcriptional RNA. Although analyses of Ct values remain informative, equating them with viral load may lead to flawed conclusions, as it is presently unknown whether (and to what extent) variation in Ct reflects variation in viral load or in gene expression.

Keywords: SARS-CoV-2; viral load; Ct; RT-qPCR; viral replication; viral transcription

The SARS-CoV-2 pandemic has prompted an unprecedented, large-scale use of diagnostic tests: serological tests detecting antibodies, antigenic tests detecting viral proteins, and RT-qPCR tests detecting viral RNA [1]. Because the genetic information in coronaviruses is carried by RNA molecules, the first step in PCR-based tests is the reverse transcription (RT) of the viral RNA to DNA, which is subsequently amplified and quantified through quantitative PCR (qPCR). DNA quantification is typically achieved by measuring the fluorescence emitted by certain molecules bound to the amplified double-stranded DNA. The outcome is a numeric value, commonly called "Ct" for "cycle threshold" or "Cq" for "quantification cycle", corresponding to the amplification cycle at which the detected fluorescence exceeds baseline levels. Thus, larger amounts of viral RNA in a sample lead to larger amounts of reverse-transcribed DNA and lower Ct values (see e.g. [2] for a brief presentation of RT-qPCR and a review of quantitative analysis methods).
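The inverse relation between the amount of target RNA and Ct can be made concrete with a minimal sketch. It assumes the usual idealised model of exponential amplification, in which the amount of target is multiplied by (1 + efficiency) at each cycle; the function name `relative_amount` and the `efficiency` parameter are illustrative choices, not part of any specific assay or of the analyses cited here. Note that the result is a ratio of amplified target sequence between two samples measured with the same assay, which is precisely not the same thing as a ratio of viral loads.

```python
def relative_amount(ct_a: float, ct_b: float, efficiency: float = 1.0) -> float:
    """Ratio of starting target amount in sample A over sample B.

    Assumes idealised exponential amplification: the target amount is
    multiplied by (1 + efficiency) per cycle, with the same efficiency
    and the same fluorescence threshold for both samples.
    efficiency = 1.0 means perfect doubling at each cycle (an assumption).
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    # A lower Ct means the threshold was reached earlier, i.e. more
    # starting template, hence the sign of the exponent.
    return (1.0 + efficiency) ** (ct_b - ct_a)


if __name__ == "__main__":
    # Sample A crosses the threshold 3 cycles before sample B.
    print(relative_amount(20.0, 23.0))
    # Same Ct difference with a hypothetical 90% efficiency.
    print(relative_amount(20.0, 23.0, efficiency=0.9))
```

Under perfect doubling, a Ct that is three cycles lower corresponds to an eightfold larger starting amount of the amplified target; with lower efficiency the same Ct difference corresponds to a smaller ratio.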
A relatively large number of PCR-based tests to detect infection by SARS-CoV-2 have been developed, most of them targeting several locations in the viral genome. A PCR test is considered positive if the Ct value is below a predefined threshold for at least one of the targets, i.e. genomic nucleotide sequences, amplified by the test. The number of genomic targets below the threshold and the precise values of the thresholds vary across manufacturers and tests. The simultaneous amplification of multiple genomic locations by some tests was originally conceived to introduce robustness and increase specificity, but serendipity turned it into a way to detect "variants of concern", as mutations in the target sequence prevented PCR amplification and led to negative results for specific variants [3].

Mass testing has generated extensive data consisting of Ct values for different viral targets per sample. Most often they serve diagnostic purposes, and their use in this context raises no conceptual concern. Several studies, however, have used these Ct values as proxies of viral load¹, which is understandable, not only because these values were available anyway, but also because alternative quantification methods (e.g. plaque assays) are labor intensive and still not well standardised.

Unfortunately, an important aspect of the biology of coronaviruses is neglected when using Ct values as a proxy of viral load. Because they are positive single-stranded RNA ((+)ssRNA) viruses, newly synthesized (+) strand RNAs can be used either for replication or transcription. This makes it unclear which process, replication or transcription, is quantified by RT-qPCR. To make matters more complicated, coronaviruses produce two kinds of mRNA molecules, genomic (i.e. full-size) and a variety of subgenomic (sgmRNA) segments [4]. All genomic and sgmRNAs contain the genomic 5' leader sequence as well as the 3' polyadenylated end.
All sgmRNAs are nested into the 3' end of the genomic mRNA: the smallest sgmRNA contains only the ORF at the 3' end of the genome (in SARS-CoV-2, the ORF located closest to the 3' end whose corresponding sgmRNA has hitherto been amplified is called N [5, 6]); the second smallest contains the two ORFs lying at the 3' end of the full-size RNA (N and 8), and so forth up to the largest sgmRNA, which carries all viral ORFs except 1a and 1b. Only the genomic mRNA carries all viral ORFs. Thus, the ORF at the 3' end of the SARS-CoV-2 genome, i.e. N, is carried by all viral mRNAs; the ORF immediately upstream of it, i.e. 8, is carried by all viral mRNAs but those carrying only N; and the ORFs at the 5' end, i.e. 1a and 1b, are carried only by full-size, genomic mRNA (Figure 1). Upon translation and processing, the 1a and 1ab polyproteins form the RNA polymerase, while ORFs present in sgmRNAs encode structural and accessory proteins. If all viral genomic and sgmRNA types occurred at equal frequency (which is not the case; see next paragraph), the RNA sequences of the different ORFs would still occur at different frequencies, because the closer an ORF lies to the 3' end, the larger the number of RNA types carrying it.

¹ Pretty much every study referring to SARS-CoV-2 viral load uses Ct as its proxy. Rather than singling out one or several randomly, we opted to not cite any specific reference.

Coronaviruses have evolved mechanisms to regulate gene expression through translational and, more to the point of this article, transcriptional regulation [4]. Finkel and colleagues [6] showed, using RNA sequencing techniques on cell cultures, that different SARS-CoV-2 transcripts occur at different abundances, with variation spanning one order of magnitude (see

declining within a population should not be problematic, provided sufficient sampling and appropriate standardisation.
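The argument made earlier about nested sgmRNAs can be illustrated with a short sketch that counts, for each ORF, how many RNA types carry it. The ORF list below is a simplified assumption made for illustration (one sgmRNA starting at each ORF downstream of 1b); only 1a, 1b, 8 and N are named in the text, and the names `ORF_ORDER`, `GENOMIC_ONLY`, `rna_types` and `carriers_per_orf` are hypothetical. The count corresponds to the counterfactual case in which all RNA types are equally frequent, which, as stated above, is not the case in reality.

```python
# Simplified 5'->3' ORF order (an assumption made for illustration only).
ORF_ORDER = ["1a", "1b", "S", "3a", "E", "M", "6", "7a", "7b", "8", "N"]
# ORFs carried only by the full-size genomic RNA, as described in the text.
GENOMIC_ONLY = {"1a", "1b"}


def rna_types(orf_order, genomic_only):
    """List the ORF content of the genomic RNA and of each nested sgmRNA.

    The genomic RNA carries every ORF. Each sgmRNA carries one ORF that is
    not genomic-only plus everything downstream of it, up to the 3' end.
    """
    types = [list(orf_order)]  # full-size genomic RNA
    for i, orf in enumerate(orf_order):
        if orf not in genomic_only:
            types.append(list(orf_order[i:]))  # nested 3' co-terminal sgmRNA
    return types


def carriers_per_orf(orf_order, genomic_only):
    """Count how many RNA types carry each ORF."""
    types = rna_types(orf_order, genomic_only)
    return {orf: sum(orf in t for t in types) for orf in orf_order}


if __name__ == "__main__":
    for orf, n in carriers_per_orf(ORF_ORDER, GENOMIC_ONLY).items():
        print(f"ORF {orf}: carried by {n} RNA type(s)")
```

Under these assumptions there are ten RNA types: N is carried by all ten, 8 by nine, and 1a and 1b by the genomic RNA only. An RT-qPCR target located in N would therefore amplify from many more RNA species than a target located in 1a or 1b, even before any difference in transcript abundance is taken into account.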
On the other hand, using Ct values to predict contagiousness may be riskier for several reasons: sample variability [9], Ct variation across qPCR targets [10], variation among patients in their physiological status [10, 11], infection age [10, 11], and viral variants [12], to name a few. It should be emphasised that, with the exception of sample variability, these sources of variation may result in changes in viral replication and/or viral gene expression, though it is presently unknown whether and to what extent they would lead to relative increases or decreases. Despite experimental biases associated with RT-qPCR protocols and differential robustness with respect to input sample quality [9], quantitative analyses of Ct values may be highly informative

[1] Diagnostics for SARS-CoV-2 detection: A comprehensive review of the FDA-EUA COVID-19 testing landscape
[2] Evaluation of qPCR curve analysis methods for reliable biomarker discovery: Bias, resolution, precision, and implications
[3] Estimated transmissibility and impact of SARS-CoV-2 lineage B.1.1.7 in England
[4] Continuous and discontinuous RNA synthesis in Coronaviruses
[5] The Architecture of SARS-CoV-2 Transcriptome
[6] The coding capacity of SARS-CoV-2
[7] Subgenomic messenger RNA amplification in coronaviruses
[8] Spike mutation D614G alters SARS-CoV-2 fitness
[9] Ct values from SARS-CoV-2 diagnostic PCR assays should not be used as direct estimates of viral load
[10] Epidemiological and clinical insights from SARS-CoV-2 RT-PCR cycle amplification values
[11] Ct threshold values, a proxy for viral load in community SARS CoV-2 cases, demonstrate wide variation across populations and over time
[12] Estimating infectiousness throughout SARS-CoV-2 infection course