key: cord-0279332-rbvyotew authors: Mueller, Stefan H.; Fitschen, Lucy J.; Shirbini, Afnan; Hamdan, Samir M.; Spenkelink, Lisanne M.; van Oijen, Antoine M. title: Rapid single-molecule characterisation of nucleic-acid enzymes date: 2022-03-03 journal: bioRxiv DOI: 10.1101/2022.03.03.482895 sha: ab73f14333d1aa48ec1219984701a6df80a309fb doc_id: 279332 cord_uid: rbvyotew

The activity of enzymes is traditionally characterised through bulk-phase biochemical methods that only report on population averages. Single-molecule methods are advantageous in elucidating kinetic and population heterogeneity but are often complicated and time-consuming, and they frequently lack statistical power. We present a highly generalisable and high-throughput single-molecule assay to rapidly characterise proteins involved in DNA metabolism. The assay relies exclusively on changes in total fluorescence intensity of surface-immobilised DNA templates as a result of DNA synthesis, unwinding or digestion. Combined with an automated data-analysis pipeline, our method provides enzymatic activity data for thousands of molecules in less than an hour. We demonstrate our method by characterising three fundamentally different nucleic-acid enzyme activities: digestion by the phage λ exonuclease, synthesis by the phage Phi29 polymerase, and unwinding by the E. coli UvrD helicase. We observe a previously unknown activity of the UvrD helicase: the removal of proteins tightly bound to the ends of DNA.

fluorimetry. These methods have the drawback of averaging over large ensembles of molecules and, therefore, provide no access to information on subpopulations, dynamic molecular mechanisms and intermediate states. However, knowledge of these properties is often crucial to a full understanding of the molecular processes underlying DNA metabolism and the enzymes involved. To describe such properties, researchers have developed techniques to observe single molecules in real time.
These methods often rely on imaging fluorescent tags or manipulating molecules using optical tweezers [5]. In recent years these techniques have revealed unexpected dynamics [6]-[9] and quantitatively characterised interactions on the molecular scale [10], [11]. While these techniques have yielded new insights into molecular properties of enzymes and protein dynamics, a major disadvantage of single-molecule approaches is the time-consuming and complex nature of the experiments and data analysis needed to acquire statistically significant data. Because of these challenges, single-molecule studies are difficult for other researchers to reproduce, and the statistical power of many studies is comparatively small.

Here, we describe a single-molecule assay that can be used to characterise any enzyme that catalyses the conversion between double-stranded DNA (dsDNA) and single-stranded DNA (ssDNA). The assay provides kinetic information on large numbers of molecules in one experiment and is simple to implement relative to existing single-molecule experiments. By using fluorescent probes that selectively stain ssDNA or dsDNA and by monitoring fluorescence intensity changes of surface-immobilised, randomly coiled DNA templates, we can visualise the conversion of dsDNA to ssDNA in real time for hundreds of molecules simultaneously. As a proof of principle, we characterise three enzymes with different functions. We visualise exonucleolytic degradation of the DNA template catalysed by phage λ exonuclease (λ exo) (Fig. 1a), strand-displacement synthesis by the phage Phi29 DNA polymerase (Phi29 DNAp) (Fig. 1b), and unwinding of DNA by the E. coli UvrD helicase (Fig. 1c). We report rate constants and distributions, determined by characterising thousands of single-molecule reactions for each of the three enzymes.
The statistical power of our study greatly exceeds that of previous single-molecule studies of these enzymes ([12]-[15]), yet our assay is comparatively easy to implement and, owing to a highly automated data-analysis pipeline, less time-consuming than previously described methods. Using this assay we observed the removal of protein roadblocks from DNA ends by the UvrD helicase, an activity that was previously unknown.

DNAp-mediated strand-displacement synthesis (left). Phi29 DNAp binds the primed 3′ end of the template. In the presence of dNTPs the template strand becomes single-stranded, as the newly synthesised daughter strand is displaced and eventually dissociates from the surface and therefore becomes invisible in TIRF microscopy. This displacement leads to the instantaneous drop in DNA-stain intensity (right; black line). In parallel, we use fluorescently labelled RPA to visualise the increasing amount of exposed ssDNA (magenta line). (c) UvrD helicase assembles at the available 5′ end (left). In the presence of ATP, dsDNA is converted to ssDNA, leading to a decrease in DNA-stain intensity (right).

Our assay uses easily constructed 2.6-kb linear dsDNA templates. These templates have a biotin on one end to allow for attachment to the surface of a microfluidic flow chamber through biotin-neutravidin binding. After surface immobilisation of the templates, we stain the dsDNA by introducing the DNA-intercalating dye SYTOX Orange into the flow cell. Finally, we initiate the enzymatic reaction by adding the enzyme and required cofactors. The activity of any enzyme that alters the amount of dsDNA can be monitored by measuring SYTOX Orange fluorescence intensity.

The trimeric λ exo catalyses the removal of nucleotides from linear or nicked dsDNA in the 5′ to 3′ direction. During degradation of DNA, the enzyme encircles both strands [16], [17].
On our template, λ exo loads at the free, non-tethered end and subsequently converts the dsDNA to ssDNA by digesting nucleotides from the 5′ end (Fig. 1a). As dsDNA is converted to ssDNA, staining by SYTOX Orange becomes much weaker. We can therefore monitor the digestion of dsDNA in real time by integrating the DNA-stain intensity for each individual molecule over time (see Fig. 2a,b). Over a total of six experiments we record trajectories of over 2500 individual molecules. Our method is sensitive to complex stochastic kinetics of individual molecules, such as pausing (see Fig. 2a, middle trajectory). However, the vast majority of trajectories show very uniform and linear behaviour. We use piecewise linear fits to determine the digestion rate for individual molecules. We find rate distributions with means of 10.9±7.1 and 21.4±6.2 (mean ± standard deviation (STD)) for reactions at 25°C and 35°C, respectively (see Fig. 2c), consistent with previously measured values [12], [18], [19]. The large standard deviation of the observed distributions highlights the presence of intermolecular disorder. Such effects have previously been studied by single-molecule techniques, but typically with much smaller sample sizes [12], [20], [21].

During strand-displacement synthesis by Phi29 DNAp, the net amount of dsDNA stays constant until the template strand is fully replicated and the daughter strand dissociates from the template (see Fig. 2a). This dissociation is visible as a sudden drop in DNA-stain intensity. To visualise the kinetics of DNA synthesis, we additionally introduce fluorescently labelled S. cerevisiae replication protein A (RPA), a single-stranded DNA-binding protein with very high affinity for ssDNA. Furthermore, while free RPA is present in solution, bound RPA exchanges rapidly [22], [23]. This effect mitigates photobleaching and makes RPA a good marker for ssDNA (see Supplementary Fig. 1).
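The piecewise linear fits used to extract per-molecule rates can be illustrated with a short sketch. This is not the authors' pipeline: the three-segment shape (flat, constant slope m between breakpoints a and b, flat again) follows the description in the Methods, whereas the baseline name `c`, the grid search over breakpoints, and every numerical value below are assumptions made purely for illustration.

```python
import random

def clipped_time(t, a, b):
    """Elapsed reaction time: 0 before a, t - a between a and b, b - a after b."""
    return min(max(t, a), b) - a

def fit_line(x, y):
    """Ordinary least squares for y = c + m * x; returns (c, m, residual sum of squares)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    m = sxy / sxx
    c = my - m * mx
    rss = sum((yi - (c + m * xi)) ** 2 for xi, yi in zip(x, y))
    return c, m, rss

def fit_piecewise(times, signal, step=5.0):
    """Grid-search breakpoints a < b; baseline c and slope m follow in closed form."""
    t0, t1 = times[0], times[-1]
    grid = [t0 + i * step for i in range(1, int((t1 - t0) / step))]
    best = None
    for a in grid:
        for b in grid:
            if b <= a:
                continue
            x = [clipped_time(t, a, b) for t in times]
            c, m, rss = fit_line(x, signal)
            if best is None or rss < best[-1]:
                best = (a, b, c, m, rss)
    return best

# Synthetic trajectory (hypothetical values): baseline 1000, slope -5 between t=50 and t=200
rng = random.Random(0)
times = [float(t) for t in range(300)]
signal = [1000.0 - 5.0 * clipped_time(t, 50.0, 200.0) + rng.gauss(0.0, 10.0)
          for t in times]

a_fit, b_fit, c_fit, m_fit, _ = fit_piecewise(times, signal)
print(f"a={a_fit:.0f} b={b_fit:.0f} baseline={c_fit:.1f} slope={m_fit:.2f}")
```

For fixed breakpoints the model is linear in the baseline and slope, so only a and b need to be searched. A gradient-based optimiser would serve equally well; the grid search is used here only because it needs no starting guess. Converting the fitted slope from intensity per unit time into base pairs per second would additionally require normalising by the intensity of the full-length template.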
For every synthesised nucleotide on the daughter strand, one nucleotide of ssDNA is left behind on the surface-tethered strand. Therefore, the change in RPA signal over time corresponds to the replication rate of Phi29 DNAp, provided that RPA binds ssDNA faster than new dsDNA is synthesised [24]. Fig. 3c shows data from our experiments. Surprisingly, most single-molecule trajectories seemed to exhibit a non-linear dependence of RPA signal on time. However, for individual traces this behaviour is difficult to distinguish from statistical noise and pausing kinetics. To increase our signal-to-noise ratio we synchronised all trajectories to the time of dissociation of the daughter strand, an event that is easy to identify. Subsequent averaging across many single-molecule traces yields a synchronised average trajectory that does not suffer from the same caveats as ensemble-averaging methods and contains information on the underlying kinetics [25], [26]. Indeed, the post-synchronised average trajectory clearly exhibits non-linear behaviour (see Fig. 3d). As a control, to show that this is not an artefact of our analysis, we synchronised trajectories from the previous experiments on λ exonuclease (see Supplementary Fig. 2). Unlike for Phi29 DNAp, the synchronised trajectory of λ exonuclease is linear. The observed non-linear Phi29 DNAp dynamics become more obvious at higher dNTP concentrations, i.e. at higher replication rates (see Fig. 3d). This observation suggests that our initial assumption, that RPA binding kinetics are faster than DNA synthesis, is not generally true. Our data are well described by single-exponential functions, with a KM of Phi29 DNAp for dNTPs of (8±3) μM (mean±SEM) and a maximum synthesis rate of 160±25 bp·s⁻¹ (mean±SEM, see Fig. 3d). Our KM value is about four times lower than previously reported values and our vmax value is consistent with previous measurements [27].
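The post-synchronisation step described above (aligning every trajectory to its own dissociation event before averaging) can be sketched as follows. This is an illustrative sketch only; the function name, the frame-index representation of the dissociation event, and the toy traces are assumptions, not part of the published analysis software.

```python
def synchronised_average(traces, event_indices, window):
    """Average traces after aligning each one to its own event frame.

    traces        : list of per-molecule intensity lists (one value per frame)
    event_indices : frame index of the dissociation event in each trace
    window        : number of frames before the event to keep
    """
    sums = [0.0] * window
    counts = [0] * window
    for trace, event in zip(traces, event_indices):
        for offset in range(window):
            frame = event - window + offset  # frames counted back from the event
            if 0 <= frame < len(trace):      # skip frames outside the recording
                sums[offset] += trace[frame]
                counts[offset] += 1
    return [s / n if n else float("nan") for s, n in zip(sums, counts)]

# Toy traces (hypothetical): the same ramp 1, 2, 3, 4 starting after different delays
ramp = [1.0, 2.0, 3.0, 4.0]
delays = [1, 3, 2]
traces = [[0.0] * d + ramp + [0.0, 0.0] for d in delays]
events = [d + len(ramp) for d in delays]  # first frame after the ramp ends

aligned = synchronised_average(traces, events, window=4)
print(aligned)
```

Because each toy trace reaches its event at a different time, a frame-by-frame average without alignment would smear the ramp, whereas the aligned average recovers its shape. With real data the event frame would come from detecting the sudden drop in DNA-stain intensity.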
The fact that we find lower KM values than expected agrees with the hypothesis that, at high dNTP concentrations, the observed kinetics are limited by RPA binding. This is also supported by the good agreement of our estimated vmax with the literature, since the maximum increase in RPA signal is still limited by strand-displacement synthesis by Phi29 DNAp. To gain information on RPA binding kinetics within our system we examine the increase in RPA fluorescence signal immediately before the dissociation of the daughter strand (see Fig. 4c). In this regime RPA binding is no longer limited by Phi29 DNAp activity. The observed data are well described by first-order binding kinetics and yield a bimolecular association rate constant kon of 11.1±0.9 nt·nM⁻¹·s⁻¹ (mean±SEM), consistent with previously reported values [24], [28]. A first-order kinetic model for RPA binding implies that the speed of binding at any given time depends on the number of binding partners available (see Methods). We therefore hypothesise that at the very beginning of the reaction almost no free ssDNA is present, and RPA binding is therefore slow. As replication proceeds, ssDNA is generated. As more binding sites become available, RPA binding becomes faster, the amount of free ssDNA decreases again, and binding slowly converges to saturation as the daughter strand dissociates (see Fig. 4a). This picture implies a fluctuation in the amount of free ssDNA, which is not bound by RPA but is very weakly stained by SYTOX Orange (S.O.). Since staining of ssDNA is much less efficient than staining of dsDNA, such a minor increase is not visible in the individual single-molecule trajectories. However, the post-synchronised trajectories of the S.O. signal (see Fig. 4d) indeed show a clear fluctuation in intensity. Our data indicate that RPA binding is stimulated by the amount of free ssDNA, and that RPA displaces SYTOX Orange from ssDNA.

Next, we sought to test if our assay is suitable for the study of helicases.
Helicases are one of the biggest families of proteins, present in all domains of life. As an example we characterise the E. coli UvrD helicase, a member of the SF1 family of helicases. Apart from unwinding DNA in the 3′ to 5′ direction in its dimeric form, it is also involved in methyl-directed mismatch repair and acts as an antirecombinase by removing RecA filaments from ssDNA [29]-[31]. The monomeric form of UvrD processively translocates on ssDNA [15]. At first, we wanted to study ATP-dependent unwinding by the UvrD helicase on the previously used 2.6-kb forked template. We expected unwinding of DNA, and therefore a continuous decrease in the fluorescence intensity of S.O.-stained template DNA, as UvrD unwinds the substrate, potentially from both ends. Surprisingly, instead of the expected continuous decrease in intensity, we observe a discrete drop in fluorescence intensity, i.e., diffraction-limited spots simply disappear (see Fig. 5a). This observation indicates dissociation of the full template from the cover-slide surface rather than DNA unwinding. Together with control reactions either lacking ATP or using inactivated UvrD, this shows that the UvrD helicase can actively remove the neutravidin bound to the 5′-DNA end. We

The trajectories still show discrete fluorescence drops within one frame, suggesting a dissociative process that is completed within 30 ms. UvrD loading on the free 3′-DNA end and unwinding DNA towards the surface within 30 ms would correspond to a rate of 80,000 nt·s⁻¹. Since this high rate would be in stark contradiction to the literature [14], [30], [32], [33], we conclude that UvrD loads in close proximity to the surface and exhibits an enzymatic activity different from DNA unwinding. Next, we wanted to understand if loading on ssDNA is required for the removal of neutravidin. To do so, we made two different versions of our previous DNA substrate (henceforth referred to as Substrate 1).
First, we removed the ssDNA region adjacent to the tethered 5′ end (Substrate 2) to examine if displacement of protein blocks requires UvrD assembly on ssDNA. Second, we placed the biotin on the 3′-DNA end (Substrate 3), to see if this activity has the same 3′-5′ directionality as unwinding and translocation on ssDNA (see Fig. 5a). DNA unwinding by UvrD was previously reported to be inefficient if initiated from short 3′ overhangs or even blunt ends. Surprisingly, the removal of a neutravidin block is efficient, even from blunt ends (Fig. 5b, solid line). However, neutravidin bound to 3′-biotinylated DNA cannot be displaced by UvrD at all (see Fig. 5b, dotted line).

To gain more insight into the mechanism involved, we calculated first-passage time (FPT) distributions. FPT distributions are a powerful analysis tool, widely used to analyse and model stochastic processes, such as animal migration, the spread of COVID-19 virus particles and also helicase dynamics [34]-[36]. The FPT tn is the time from the start (addition of ATP) to the end of a reaction (dissociation of the DNA template) for an individual molecule (see Fig. 5c). The distribution of FPTs conveys information on the number and rate constants of all rate-limiting steps during the reaction [37]. We preincubate Substrate 1 with UvrD and subsequently initiate the reaction by adding ATP. For Substrate 1, we observe a single-exponential FPT distribution, a hallmark of the absence of intermediate steps (see Fig. 5c, grey histograms). For Substrate 2, which lacks available ssDNA for UvrD to assemble close to the 5′ end, the FPTs are well described by a gamma distribution (see Fig. 5b,c, yellow histograms). This observation indicates the presence of multiple slow reaction intermediates required to remove the neutravidin [37].
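The link between the shape of an FPT distribution and the number of rate-limiting steps can be made concrete with a small simulation. The sketch below is illustrative only and makes several assumptions: steps are sequential, irreversible and share one rate; the rate (0.5 per second) and sample size are arbitrary; and the gamma parameters are estimated by the method of moments, a simple stand-in for the distribution fitting described in [37].

```python
import random

def simulate_fpt(n_steps, rate, rng):
    """First-passage time through n_steps sequential, irreversible steps of equal rate."""
    return sum(rng.expovariate(rate) for _ in range(n_steps))

def gamma_moments(fpts):
    """Method-of-moments estimate of the gamma shape (step number) and rate."""
    n = len(fpts)
    mean = sum(fpts) / n
    var = sum((t - mean) ** 2 for t in fpts) / (n - 1)
    shape = mean * mean / var  # equals 1 for a single-exponential distribution
    rate = mean / var
    return shape, rate

rng = random.Random(1)
one_step = [simulate_fpt(1, 0.5, rng) for _ in range(20000)]
four_step = [simulate_fpt(4, 0.5, rng) for _ in range(20000)]

n1, k1 = gamma_moments(one_step)
n4, k4 = gamma_moments(four_step)
print(f"one-step:  shape={n1:.2f} rate={k1:.2f}")
print(f"four-step: shape={n4:.2f} rate={k4:.2f}")
```

A single rate-limiting step gives an exponential distribution (gamma shape close to 1), whereas several sequential steps of similar rate give a peaked gamma distribution whose shape parameter approximates the number of steps. If the steps had very different rates, the estimated shape would fall below the true step count, so the value is best read as an effective number of rate-limiting steps.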
Since the mean of the measured FPT distributions is much longer for Substrate 2 than for Substrate 1, we conclude that the rate-limiting steps in this case correspond to unwinding of the template from the blunt end (away from the surface), which subsequently allows UvrD to bind on the 5′ end next to the biotin. To obtain the number of reaction intermediates and corresponding rate constants, we fit the data with gamma distributions, as previously described [37] (see Fig. 5d and Methods). Our data suggest four intermediate steps, a number that does not vary with ATP concentration. Taken together with reported step sizes of unwinding by UvrD of 3-6 nucleotides [27], [38], our data indicate that unwinding of 12-24 nt is required for subsequent displacement of neutravidin.

Finally, we set out to observe DNA unwinding by UvrD. To do so, we utilise Substrate 3 (see Fig. 5a). The 3′-biotin prevents disruption of the biotin-neutravidin bond, while UvrD can load on the opposite end. The 60-nt 3′-dT tail provides a substrate for UvrD dimer assembly and initiation of DNA unwinding in the presence of ATP. Unwinding by UvrD results in a gradual reduction of the DNA-stain intensity as dsDNA is converted to ssDNA. We find that UvrD is capable of unwinding the 2.6-kb template (Fig. 6a,b). As before, we use linear fits to determine a rate for each trajectory (see Methods). We find a broad distribution, with a median of 29.5±28.3 bp·s⁻¹ (median±STD, see Fig. 6c), consistent with values measured before in bulk and single-molecule studies [30], [38], [39]. However, to our knowledge, this is the first study reporting unwinding of long (>100 bp) DNA substrates, despite its potential importance during biological processes such as methyl-directed mismatch repair, which can require more than 1000 bp to be unwound [40].

We report a highly generalisable and high-throughput single-molecule assay with fully automated data analysis to study DNA-based enzymatic processes.
This assay allows the extraction of features and kinetics otherwise hidden in the noise of single-molecule measurements. To demonstrate the strengths of this assay, we characterised DNA degradation, synthesis, and unwinding. Furthermore, we observe removal of a DNA-bound neutravidin by the UvrD helicase. Since proteins bound to DNA can present roadblocks to DNA replication in vivo [9], this new activity might have physiological relevance. Reproducibility of fluorescence microscopy methods was previously identified as a major issue [41]. Quantitative fluorescence microscopy is inherently difficult to reproduce, owing to the large number of factors involved. Fluorescence intensity varies with the specific imaging apparatus, including the light sources, lenses and objectives used and their precise alignment. Our assay produces data in which mechanistic features are directly visible. This aspect allows for internal normalisation of fluorescence intensity and therefore circumvents this problem. Another source of uncertainty in microscopy data is human bias during image analysis. We developed highly automated image-analysis software for our assay to minimise this problem. Our method and analysis pipeline should be broadly applicable to measuring the activity of any enzyme that converts dsDNA to ssDNA or vice versa. Furthermore, owing to its high-throughput nature, the method has the potential to be implemented in evolution or drug-screening studies.

As a starting material we used the 4kbf plasmid, a plasmid 4 kb in length and derived from pUC19, previously developed by Dr. Jacob Lewis. The plasmid was simultaneously digested with restriction endonucleases BsaI and BstXI (NEB). The resulting 2.6-kb fragment was separated from the 1.4-kb fragment and uncut plasmid by agarose gel purification (Promega Gel Wizard Kit).
A set of oligonucleotides that form a biotinylated and primed fork was ligated to one end of the fragment and the final product was purified on a Sepharose-4B column as previously described [42]. The final product was stored at 4 °C. The full plasmid map and oligonucleotide sequences are described in the Supplementary Data section.

Flow chambers for microscopy were prepared as described before [7], [21], [43]. Briefly, cover slips

Dr. Michael O'Donnell, fluorescent labelling of RPA was performed as previously described [7], [44]. Before the reaction was initiated, initial fluorescence intensities were recorded to determine the baseline of RPA intensity at the fork. Next, 5 units of Phi29 DNAp (NEB) were loaded in the presence of 20 nM RPA and the specified concentration of dNTPs.

The sequence-optimised UvrD gene tagged with an 8xHis-tag at the N-terminus was cloned into pE-

Data analysis was carried out using Fiji [45] and Python. The raw data were first corrected for a non-uniform excitation-beam profile and mechanical drift of the microscope stage during the measurement (see Supplementary Fig. 2,3). Next, all fluorescent spots corresponding to DNA templates bound to the surface were detected using a threshold approach (see Supplementary Fig. 4) and the intensity of the DNA, and if present RPA, was measured over time. Next, all trajectories were fitted with the following piecewise linear function with three segments: The parameter a denotes the time when enzymatic activity begins and the slope changes from 0 to a constant value m. The parameter b denotes the time when the whole substrate was processed and the slope becomes 0 again. The intensity during the first segment (x