key: cord-256607-wpywh8c9 authors: Itokawa, Kentaro; Sekizuka, Tsuyoshi; Hashino, Masanori; Tanaka, Rina; Kuroda, Makoto title: Disentangling primer interactions improves SARS-CoV-2 genome sequencing by the ARTIC Network’s multiplex PCR date: 2020-06-01 journal: bioRxiv DOI: 10.1101/2020.03.10.985150 sha: doc_id: 256607 cord_uid: wpywh8c9

Since December 2019, the coronavirus disease 2019 (COVID-19), caused by the novel coronavirus SARS-CoV-2, has rapidly spread to almost every nation in the world. Soon after the pandemic was recognized by epidemiologists, a group of biologists comprising the ARTIC Network devised a multiplexed polymerase chain reaction (PCR) protocol and primer set for targeted whole-genome amplification of SARS-CoV-2. The ARTIC primer set amplifies 98 amplicons, split across only two PCRs, that span nearly the entire viral genome. The original primer set and protocol showed a fairly small amplification bias when clinical samples with relatively high viral loads were used. However, when a sample’s viral load was low, several amplicons, especially amplicons 18 and 76, exhibited low coverage or complete dropout. We have determined that these dropouts were due to dimer formation between the forward primer for amplicon 18, 18_LEFT, and the reverse primer for amplicon 76, 76_RIGHT. Replacement of 76_RIGHT with an alternatively designed primer was sufficient to produce a drastic improvement in coverage of both amplicons. Based on this result, we replaced 12 primers in total in the ARTIC primer set that were predicted to be involved in 14 primer interactions. The resulting primer set, version N1 (NIID-1), exhibits improved overall coverage compared to the ARTIC Network’s original (V1) and modified (V3) primer sets.

In the original ARTIC primer set V1, PCR amplicons 18 and 76 were amplified by the primer pairs 18_LEFT & 18_RIGHT and 76_LEFT & 76_RIGHT, respectively. Those primers were included in the same multiplexed reaction, "Pool 2."
We noticed that two of those primers, 18_LEFT and 76_RIGHT, were perfectly complementary to one another over 10 nt at their 3′ ends (Fig 1). Indeed, we observed NGS reads derived from the predicted dimer in the raw FASTQ data. From this observation, we reasoned that the acute dropouts of those amplicons were due to an interaction between 18_LEFT and 76_RIGHT, which could compete with amplification of the designated targets. Next, we replaced one of the two interacting primers, 76_RIGHT, in the Pool 2 reaction with a newly designed primer, 76_RIGHTv2 (5′-TCTCTGCCAAATTGTTGGAAAGGCA-3′), which is located 48 nt downstream from 76_RIGHT. Figures 1A and 1B show the coverage obtained with the V1 set and with the V1 set in which 76_RIGHT was replaced with 76_RIGHTv2, for cDNA isolated from a clinical sample obtained during the COVID-19 cruise ship outbreak, which was previously analyzed (EPI_ISL_416596) (Sekizuka et al. 2020). The replacement of the primer drastically improved the read depth in the regions covered by amplicons 18 and 76 without any notable adverse effects. The replacement of primer 76_RIGHT improved coverage not only for amplicon 76 but also for amplicon 18, supporting the hypothesis that the single primer interaction caused dropout of both amplicons.

Given this observation, we identified an additional 13 primer interactions using in silico analysis (Fig 2A and B). Those primer interactions, predicted by the PrimerROC algorithm (Johnston et al. 2019), which gave the highest score to the interaction between 18_LEFT and 76_RIGHT among all 4,743 possible interactions, were likely involved in producing the low coverage frequently seen in our routine experiments. Next, we designed an additional 11 alternative primers, which resulted in a new primer set (ARTIC primer set ver. NIID-1 (N1)) including 12 primer replacements from the original V1 primer set (Table S1).
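The kind of 3′-end complementarity described above can be illustrated with a short sketch. This is not the PrimerROC algorithm cited in the text. It is a much simpler exact-match check, and the primer names and sequences (`toy_A`, `toy_B`, `toy_C`) are hypothetical placeholders rather than ARTIC primers. The function reports the longest stretch over which the 3′ ends of two primers are perfectly complementary when aligned antiparallel, and a small helper ranks all primer pairs in a pool by that length.

```python
from itertools import combinations

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def three_prime_overlap(a, b):
    """Longest k such that the last k nt of a and the last k nt of b
    form a perfect antiparallel duplex (0 if there is none)."""
    best = 0
    for k in range(1, min(len(a), len(b)) + 1):
        if a[-k:].upper() == revcomp(b[-k:]):
            best = k
    return best


def rank_pairs(pool):
    """Score every unordered primer pair in a pool, highest overlap first."""
    scored = [
        (three_prime_overlap(pool[x], pool[y]), x, y)
        for x, y in combinations(sorted(pool), 2)
    ]
    return sorted(scored, reverse=True)


# Hypothetical toy primers (NOT the ARTIC sequences): toy_A and toy_B were
# constructed so that their last 8 nt are perfectly complementary.
pool = {
    "toy_A": "GGATCCAAGTCAGCTTGACA",
    "toy_B": "CCTTAGGATTCATGTCAAGC",
    "toy_C": "ACGTACGTTTGGCCAATTGG",
}

print(three_prime_overlap(pool["toy_A"], pool["toy_B"]))  # → 8
print(rank_pairs(pool)[0])  # → (8, 'toy_A', 'toy_B')
```

An exact-match check like this only detects perfect 3′ duplexes. A thermodynamic scoring method would also capture partially mismatched interactions, so this sketch should be read as an illustration of the idea rather than a replacement for the analysis used in the paper.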
The N1 primer set eliminated all interactions shown in Fig 2A and was expected to improve amplification of up to 22 amplicons (1, 7, 9, 13, 15, 18, 21, 29, 31, 32, 36, 38, 45, 48, 54, 59, 66, 70, 73, 76, 85, and 89). Alongside this modification, the ARTIC Network itself released another modified version of the primer set, known as V3, on 24 March 2020 (Loman and Quick 2020), after we reported our result on the replacement of primer 76_RIGHT in a preprint (Itokawa et al. 2020a). The V3 primer set included 22 spike-in primers, which were directly added into the V1 primer set to aid amplification of 11 amplicons (7, 9, 14, 15, 18, 21, 44, 45, 46, 76, and 89).

Fig 1 [...] replacement for the same clinical sample (previously deposited to GISAID with ID EPI_ISL_416596, Ct = 28.5, 1/25 input per reaction). Regions covered by amplicons with the modified primer (76_RIGHT) and the interacting primer (18_LEFT) are highlighted in green and orange, respectively. For all data, reads were downsampled to normalize average coverage to 250X. Horizontal dotted line indicates depth = 30. These two experiments were conducted with the same PCR master mix (except primers) and in the same PCR run in the same thermal cycler.

[...] the N1 primer set resulted in improved robustness of coverage over a broader range for relevant amplicons. The improvement, however, made the potentially weak amplicons 74 and 98 more apparent (Fig 3). The abundance of amplicon 74 gradually decreased with decreasing annealing/extension temperature (Ta); in contrast, the abundance of amplicon 98 decreased with increasing Ta. These amplicons appeared equally weak in all three primer sets rather than specifically in the N1 primer set. So far, we have not identified interactions involving the primers for those amplicons. The gradient experiment also revealed a relatively narrow range of optimal Ta for the V1 and V3 primer sets, around 65 °C, which was broadened for the N1 primer set.
Nevertheless, while Ta = 65 °C is a good starting point, fine-tuning this value may help improve sequencing quality, since even slight differences between thermal cyclers, such as systematic and/or well-to-well accuracy differences and under- or overshooting, may affect the results of multiplex PCR (Ho Kim et al. 2008). Finally, we further compared the V1, V3 and N1 primer sets for three other clinical samples using a standard temperature program (Ta = 65 °C). In all three clinical samples (Fig 4 and S1), the N1 primer set showed the most even coverage [...]

Fig 3 Abundance of 98 amplicons at 8 different annealing/extension temperatures with the three different primer sets on the same clinical sample (previously deposited to GISAID with ID EPI_ISL_416584, Ct = 16, 1/300 input per reaction). For all data, reads were downsampled to normalize average coverage to 500X before analysis. The green lines and points indicate the abundances of amplicons whose primers in the V1 primer set were modified in the N1 primer set. The orange lines and points indicate the abundances of amplicons whose primers were not modified but for which adverse primer interactions were predicted to be eliminated in the N1 primer set. Other amplicons, which were not subject to modification, are indicated by black lines and points. The plots in the left column show results for all 98 amplicons, while only amplicons targeted by modification are shown in the plots in the right column. Horizontal dotted line indicates fragment abundance = 30. Red vertical lines indicate the normal annealing/extension temperature, 65 °C. All of these experiments were conducted with the same PCR master mix (except primers) and in the same PCR run in the same thermal cycler.

Regions covered by amplicons with modified primers and by amplicons with unmodified but interacting primers are highlighted in green and orange, respectively. For all data, the reads were downsampled to normalize average coverage to 250X.
Horizontal dotted line indicates depth = 30. These two experiments were conducted with the same PCR master mix (except primers) and in the same PCR run in the same thermal cycler.

[...] position several nucleotides toward the 5′ ends, but extension or trimming on either end was applied when the melting temperature (Tm) predicted by the NEB website tool (https://tmcalculator.neb.com/) was considered too low or too high. See details of the primer modifications in Table S1.

Table_S1.pdf Differences between the N1 primer set and the original V1 primer set.

Fig_S1.pdf Depth plots of the original (V1) and two modified ARTIC primer sets (V3 and N1) for two clinical samples (newly deposited to GISAID with ID EPI_ISL_416749, Ct = 27.3 for A and previously [...]
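The figure captions describe normalizing average coverage to a fixed value (250X or 500X) and marking a depth threshold of 30. A minimal sketch of that bookkeeping follows, under stated assumptions: the per-amplicon depths are hypothetical numbers, not data from the paper, and depths are rescaled deterministically instead of randomly downsampling reads as was done for the figures. The function names `normalize_depths` and `low_coverage_amplicons` are illustrative only.

```python
def normalize_depths(depth_by_amplicon, target_mean=250.0):
    """Rescale per-amplicon depths so that their mean is at most target_mean.

    This approximates the effect of downsampling reads; depths are never
    scaled up, because downsampling cannot add reads.
    """
    if not depth_by_amplicon:
        raise ValueError("no amplicons given")
    mean = sum(depth_by_amplicon.values()) / len(depth_by_amplicon)
    if mean <= 0:
        raise ValueError("mean depth must be positive")
    fraction = min(1.0, target_mean / mean)
    return {amp: depth * fraction for amp, depth in depth_by_amplicon.items()}


def low_coverage_amplicons(depth_by_amplicon, threshold=30.0):
    """Return the sorted amplicon numbers whose depth is below threshold."""
    return sorted(a for a, d in depth_by_amplicon.items() if d < threshold)


# Hypothetical raw depths keyed by amplicon number (illustration only).
raw = {17: 1200.0, 18: 40.0, 75: 1500.0, 76: 60.0}

scaled = normalize_depths(raw, target_mean=250.0)
print(low_coverage_amplicons(scaled, threshold=30.0))  # → [18, 76]
```

Normalizing before applying the threshold keeps comparisons between primer sets fair when sequencing runs differ in total read count, which appears to be the purpose of the downsampling described in the captions.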