key: cord-0078837-phhjprlb
authors: Hirten, Robert P; Tomalin, Lewis; Danieletto, Matteo; Golden, Eddye; Zweig, Micol; Kaur, Sparshdeep; Helmus, Drew; Biello, Anthony; Pyzik, Renata; Bottinger, Erwin P; Keefer, Laurie; Charney, Dennis; Nadkarni, Girish N; Suarez-Farinas, Mayte; Fayad, Zahi A
title: Evaluation of a Machine Learning Approach Utilizing Wearable Data for Prediction of SARS-CoV-2 Infection in Healthcare Workers
date: 2022-05-18
journal: JAMIA Open
DOI: 10.1093/jamiaopen/ooac041
sha: 013bc3134eb11ffde262fb9e166a64caef954b5b
doc_id: 78837
cord_uid: phhjprlb

OBJECTIVE: To determine whether a machine learning model can detect SARS-CoV-2 infection from physiological metrics collected from wearable devices.

MATERIALS AND METHODS: Health care workers from seven hospitals were enrolled and prospectively followed in a multicenter observational study. Subjects downloaded a custom smartphone app and wore Apple Watches for the duration of the study period. Daily surveys related to symptoms and the diagnosis of COVID-19 were answered in the app.

RESULTS: We enrolled 407 participants, 49 (12%) of whom had a positive nasal SARS-CoV-2 polymerase chain reaction test during follow-up. We examined five machine-learning approaches and found that gradient-boosting machines (GBM) had the most favorable validation performance. Across all testing sets, our GBM model predicted SARS-CoV-2 infection with an average area under the receiver operating characteristic curve (auROC) of 86.4% (confidence interval [CI] 84-89%). The model was calibrated to value sensitivity over specificity, achieving an average sensitivity of 82% (CI ±∼4%) and specificity of 77% (CI ±∼1%). The most important predictors included parameters describing the circadian heart rate variability mean (MESOR) and peak-timing (acrophase), and age.

DISCUSSION: We show that a tree-based ML algorithm applied to physiological metrics passively collected from a wearable device can identify and predict SARS-CoV-2 infection.
CONCLUSION: Applying machine learning models to the passively collected physiological metrics from wearable devices may improve SARS-CoV-2 screening methods and infection tracking.

LAY SUMMARY: The goal of the study is to determine whether SARS-CoV-2 infections, which cause Coronavirus Disease 2019 (COVID-19), can be detected using machine learning algorithms applied to the information collected by wearable devices. Four hundred and nine health care workers were enrolled from 7 hospitals in New York City. Participants downloaded a custom smartphone application and were provided with an Apple Watch if they did not have one of their own. Daily questions collected information from participants about how they felt and whether they had been diagnosed with COVID-19. We found that a type of machine learning algorithm, called gradient boosting machines, was able to reliably predict SARS-CoV-2 infections by combining various metrics collected from the Apple Watch. We found markers of heart rate variability, the small differences in time between heartbeats, to be important in identifying infections. These findings demonstrate that wearable devices may improve screening for SARS-CoV-2 infections and the overall tracking of infections.

Infection prediction traditionally relies on the development of characteristic symptomatology, prompting confirmatory diagnostic testing. However, SARS-CoV-2 infection poses a challenge to this traditional paradigm given its variable symptomatology, prolonged incubation period, high rate of asymptomatic infection, and variable access to testing. 1, 2 Ongoing case surges throughout the world, prompted by the Delta variant, are characterized by greater infectivity and raise the possibility that SARS-CoV-2 may become endemic. While highly effective vaccines against SARS-CoV-2 have been developed, limited vaccine supplies, low vaccination rates in some communities, and the evolution of variants have prompted ongoing infectious spread. 3 Novel means to identify and predict SARS-CoV-2 infection are needed.

Wearable devices are commonly used and can measure multi-modal continuous data throughout daily life. 4 Increasingly, they have been applied in health and disease. 5 Researchers have previously demonstrated that adding wearable sensor data to symptom tracking apps can increase the ability to identify Coronavirus Disease 2019 (COVID-19) patients. 6 Additionally, the combination of heart rate, activity, and sleep metrics measured from wearable devices was able to identify 63% of COVID-19 cases before symptoms, further demonstrating the promise of this approach.
6, 7 Our group launched the Warrior Watch Study, which employed a custom smartphone app to remotely monitor health care workers (HCWs) throughout the Mount Sinai Health System. 8 This app delivered surveys to the subjects' iPhones and enabled passive collection of Apple Watch data. We previously demonstrated that significant changes in heart rate variability (HRV), the small differences in time between each heartbeat that reflect autonomic nervous system (ANS) function, collected from the Apple Watch, occurred up to 7 days before a COVID-19 diagnosis. 8, 9

OBJECTIVE

Building on these observations, our primary aim was to determine the feasibility of training and validating machine learning approaches that combine HRV measurements with resting heart rate (RHR) metrics to predict COVID-19 before diagnosis via nasal polymerase chain reaction (PCR).

We recruited HCWs for this prospective observational study from seven hospitals in New York City (The Mount Sinai Hospital, Morningside Hospital, Mount Sinai West, Mount Sinai Beth Israel, Mount Sinai Queens, New York Eye and Ear Infirmary, Mount Sinai Brooklyn). 8 Subjects were ≥18 years of age, employees at one of these hospitals, had at least an iPhone series 6, and were willing to wear an Apple Watch Series 4 or higher. Underlying autoimmune or inflammatory diseases, as well as medications known to interfere with ANS function, were exclusionary. The study was approved by the Mount Sinai Hospital Institutional Review Board, and all subjects provided informed consent prior to enrollment.

Subjects downloaded the Warrior Watch Study app, signed the electronic consent, and completed baseline demographic questionnaires. Prior COVID-19 diagnosis, medical history, and occupation classification within the hospital were collected via in-app assessments.
Subjects completed daily surveys to report any COVID-19 related symptoms, symptom severity, the results of any SARS-CoV-2 nasal PCR tests, and SARS-CoV-2 antibody test results. A positive diagnosis was defined as a self-reported positive SARS-CoV-2 nasal PCR test. Each subject was asked to report the date he or she was diagnosed with a SARS-CoV-2 infection, which correlates with the date the nasal PCR took place. Subjects were asked to wear the Apple Watch for at least 8 hours per day (Figure 1a).

Subjects wore an Apple Watch Series 4 or higher, a commercially available wearable device that connects via Bluetooth to the participant's iPhone. The Apple Watch uses infrared and visible-light light-emitting diodes and photodiodes that act as a photoplethysmogram, generating time series peaks from each heartbeat. 10 Heart rate measurements are calculated over a moving average window while the device is worn. HRV is automatically calculated in ultra-short 60-second recording periods as the standard deviation of the inter-beat interval of normal sinus beats (SDNN), a time-domain index. 9 SDNN reflects sympathetic and parasympathetic nervous system activity. The Warrior Watch Study app collects the generated SDNN and heart rate measurements at survey completion.

Our primary analysis consisted of measurements of HRV. HRV follows a circadian pattern that can be characterized by three parameters, namely the MESOR (M: the mean HRV during the day), amplitude (A: maximum HRV during the day), and the acrophase (Ψ: describing when the maximum occurs). 8 We previously developed a mixed-effects COSINOR model to compare HRV circadian patterns at the group level and showed that changes in those parameters were associated with infection.
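To make the three circadian parameters concrete, the following is a minimal Python sketch of a conventional single-component cosinor fit, y(t) = M + A·cos(2πt/24 + φ), solved by ordinary least squares. It is not the authors' mixed-effects COSINOR model (which was fitted in R); the function name `fit_cosinor`, the 24-hour period, the synthetic SDNN readings, and the textbook definition of amplitude as the peak height above the MESOR are all assumptions made for illustration.

```python
import numpy as np


def fit_cosinor(hours, values, period=24.0):
    """Least-squares fit of y(t) = M + A * cos(2*pi*t/period + phi).

    Returns MESOR (M), amplitude (A), acrophase (phi, radians) and the
    clock hour at which the fitted curve peaks.
    """
    t = np.asarray(hours, dtype=float)
    y = np.asarray(values, dtype=float)
    omega = 2.0 * np.pi / period

    # Linearised form: y = M + beta*cos(omega*t) + gamma*sin(omega*t),
    # with beta = A*cos(phi) and gamma = -A*sin(phi).
    design = np.column_stack(
        [np.ones_like(t), np.cos(omega * t), np.sin(omega * t)]
    )
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    mesor, beta, gamma = coef

    amplitude = float(np.hypot(beta, gamma))
    acrophase = float(np.arctan2(-gamma, beta))
    # The cosine peaks where omega*t + phi = 0 (modulo one period).
    peak_hour = float((-acrophase / omega) % period)
    return {"mesor": float(mesor), "amplitude": amplitude,
            "acrophase": acrophase, "peak_hour": peak_hour}


# Hypothetical, noise-free SDNN readings (ms) at irregular times of day,
# generated with MESOR 40 ms, amplitude 10 ms and a peak at 04:00.
hours = np.array([0.5, 3.0, 7.25, 11.0, 14.5, 19.0, 22.0])
sdnn = 40.0 + 10.0 * np.cos(2.0 * np.pi * (hours - 4.0) / 24.0)
fit = fit_cosinor(hours, sdnn)
print(fit)
```

Because the model is linear in the cosine and sine coefficients, irregularly timed samples pose no difficulty for the fit itself; the difficulty described in the text arises when too few samples are available within a single day.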
8 Given these findings, daily measurements of HRV were incorporated as potential diagnostic biomarkers for our machine-learning approach. HRV measurements for each day were sparse and were not taken at regular intervals. Thus, daily estimates of HRV COSINOR parameters M, A and Ψ could not be calculated. Due to this limitation, we estimated the daily HRV parameters for each subject and day (t_n) using HRV data from a seven-day sliding window (t_{n-6} to t_n), thereby creating daily smoothed estimates reflecting changes in the last 7 days (Figure 1b). To aid the optimization procedures, each subject's initial estimates were obtained by fitting the first two weeks of that subject's data to a mixed-effect COSINOR model with A, M, and Ψ as random effects. 8 From this model, the subject-specific COVID-negative baseline A, M, and Ψ were derived and used to initialize the iterative 7-day smoothed estimates within each subject. If the number of days with data in the 7-day window was < 3, the window was expanded to 14 days (t_{n-14}). In rare cases, no data was available over 14 days, and parameters were imputed using the Last Observation Carried Forward (LOCF) imputation method. During each window, we also measured the maximum, minimum, mean, and standard deviation (SD) of the RHR.

For each day and subject, there were a total of 8 digital biomarkers used to develop our predictive models: HRV-amplitude, HRV-MESOR, HRV-acrophase, daily RHR, RHR-max, RHR-min, RHR-sd, RHR-mean, and 3 demographic variables known to impact HRV: BMI, age, and gender. 11 This smoothed approach ensures that small and transient changes in the HRV profile do not dramatically affect daily HRV metrics; rather, our feature engineering approach detects large and sustained changes from the subject's COVID-negative baseline.

Data was split into independent training and testing sets, ensuring that observations with proximity in time (±4 days), for the same subject, were in the same set.
The rationale is that measurements taken on chronologically similar days (e.g., day 6 and day 7) would have similar HRV metrics, and thus would create time-dependency bias if they appeared in different sets (e.g., day 6 in training, day 7 in testing). This procedure created 100 training and testing sets, containing 90% and 10% of the data respectively. Care was also taken to ensure that the prevalence of COVID-19 positive (COVID+) diagnoses in each set was similar to the prevalence in the full data set.

Machine learning model training and evaluation were performed using the caret and pROC packages, with tuning parameters estimated using 25 validation sets, selected using the same sampling procedure as the testing data. To safeguard against biases induced by the low prevalence of COVID+ samples, we considered several sampling methods to balance the data during model training, ultimately using class weights to give more weight to the minority class. Models were trained on each of the 100 training sets, and their performance (auROC, partial-auROC, auPRC, accuracy, precision, sensitivity/recall, specificity, balanced accuracy) was assessed on the corresponding testing set and presented as mean with 95% CI.

The sensitivity of the diagnostic algorithm was prioritized, since a wearable device used as a non-invasive screening modality would serve to prompt a confirmatory PCR test. Our models were trained to maximize partial-auROC (sensitivity boundary of >75%), with tuning parameters estimated using the 25 validation samples. When exploring the training data, validation performance for several different machine-learning algorithms was assessed (gradient-boosting machines, elastic-net, partial least squares, support vector machines and random forests). A gradient boosting machine model (GBM) was selected as the best performing and was used to develop our statistical classifier.
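The idea of keeping temporally adjacent observations from one subject in the same set can be sketched as a group-wise (blocked) split. The Python example below is an illustration under stated assumptions, not the authors' R procedure: the function name `blocked_split`, the fixed 9-day block length (chosen here to mirror the ±4-day window), and the synthetic cohort are hypothetical. It also does not stratify by COVID+ prevalence, which the text says was additionally controlled, and days on either side of a block boundary can still land in different sets.

```python
import numpy as np


def blocked_split(subject_ids, days, block_len=9, test_frac=0.10, seed=0):
    """Return a boolean test mask in which every (subject, time-block)
    group is assigned wholly to training or wholly to testing."""
    subject_ids = np.asarray(subject_ids)
    days = np.asarray(days)
    rng = np.random.default_rng(seed)

    # One group per subject per run of `block_len` consecutive days.
    keys = np.array(
        [f"{s}:{d // block_len}" for s, d in zip(subject_ids, days)]
    )
    groups, inverse = np.unique(keys, return_inverse=True)
    counts = np.bincount(inverse, minlength=len(groups))

    # Add randomly ordered groups to the test set until it holds
    # roughly `test_frac` of all observations.
    target = test_frac * len(keys)
    in_test = np.zeros(len(groups), dtype=bool)
    n_test = 0
    for g in rng.permutation(len(groups)):
        if n_test >= target:
            break
        in_test[g] = True
        n_test += counts[g]
    return in_test[inverse]


# Hypothetical cohort: 20 subjects followed for 60 consecutive days each.
subjects = np.repeat(np.arange(20), 60)
days = np.tile(np.arange(60), 20)
test_mask = blocked_split(subjects, days, seed=1)
print(f"test fraction: {test_mask.mean():.3f}")
```

Repeating the call with 100 different seeds would give 100 train/test partitions in the spirit of the procedure described above. A stricter variant could also discard training observations that fall within 4 days of any test observation for the same subject.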
When calibrating the model, the validation predictions were used to optimize the probability threshold such that the sensitivity was above 78%. The average value of this probability threshold, over all 100 iterations, was then used to define the final decision rule, where cases with a predicted probability above this threshold were considered COVID+. We used a previously described method to estimate each feature's relative influence/importance in the model, over all 100 training sets. 12 All analyses were performed in R, version 4.0.2, including the caret and pROC packages. 13, 14

Four hundred and seven HCWs were enrolled between April 29th, 2020, and March 2nd, 2021 (Table 1). The mean age of participants at enrollment was 38 years (SD 9.8), and 34.2% were men. A positive SARS-CoV-2 nasal PCR was reported by 12.0% (49/407) of participants during follow-up (Figure 1c). The median follow-up time was 73 days (range, 3-253 days) for a total of 28,528 days of observations. A median of 4 HRV samples per participant per day were collected at varying times, along with daily measures of RHR. Subjects who were diagnosed with COVID-19 were less likely to report a baseline negative SARS-CoV-2 nasal PCR test (73.5% vs 96.6%, respectively; p<0.001).

Given the low prevalence of COVID+ observations (<1% of all daily observations were COVID+), and to avoid biased performance metrics resulting from a single split, the data was split into 100 training (including ~90% of the data) and testing (~10%) sets, using a strategy that guarantees independence between testing and training sets. This procedure produced robust estimates of the model performance in the testing set as well as 95% CI (Figure 1d).
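As an illustration of how a metric evaluated on many repeated splits can be summarised as a mean with a 95% CI, here is a short Python sketch. The text does not state how the intervals were computed, so the normal-approximation interval for the mean used here is an assumption, and the function name `mean_with_ci` and the example values are hypothetical.

```python
import math
import statistics


def mean_with_ci(values, z=1.96):
    """Mean and normal-approximation 95% CI of the mean across splits."""
    n = len(values)
    mean = statistics.fmean(values)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, mean - z * sem, mean + z * sem


# Hypothetical per-split auROC values (one per train/test split).
aurocs = [0.80, 0.85, 0.90, 0.85]
mean, lower, upper = mean_with_ci(aurocs)
print(f"auROC = {mean:.3f} (95% CI {lower:.3f}-{upper:.3f})")
```

A percentile interval over the 100 per-split values would be an equally plausible choice. Because the repeated splits overlap, either interval is best read as describing variability across resamples of the same cohort.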
The validation performance of several different machine-learning methods was explored, but ultimately, GBM had the most favorable performance, particularly compared to linear methods such as elastic net regularization (Table 2), suggesting a non-linear relationship between HRV and SARS-CoV-2 infection. As would be expected, ROC curves calculated for GBM using all training samples show a high AUC (>99%) (Figure 2a and 2b), whereas performance in the validation sets reached an AUC of 85%. The validation sets were selected to minimize time-correlation between training and validation, and to provide less biased performance estimates. We also calculated the area under the precision recall curve (AUPRC), a metric that is more informative for imbalanced data, which reached 19%, much higher than the prevalence of positive outcomes. 15 It is important to note that, since this wearable device-based algorithm would be used as a screening test, we optimized the model to value sensitivity/recall metrics rather than metrics based on precision.

We calibrated the final decision rule to guarantee high sensitivity, as a wearable device-based algorithm would be utilized as a screening test (Table 3). This calibrated decision rule increases the true positive rate by allowing for a larger rate of false-positive results. To keep the testing performance unbiased, we used the validation data to optimize the decision rule to guarantee a sensitivity >78% (Figure 2c). The optimal decision threshold was 0.21 (Figure 2c) and produced an average validation accuracy (Figure 2d) of 78% (CI ±~1%), with 77% sensitivity and 78% specificity, thus indicating a specificity loss of 18% for a 19% gain in sensitivity compared to the standard 0.5 decision threshold.
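The calibration step described above, choosing a probability threshold on validation predictions so that sensitivity meets a target, can be sketched as follows. This is a minimal Python illustration under assumptions, not the authors' R code: the function names, the toy labels and probabilities, and the convention of treating a probability greater than or equal to the threshold as positive are all hypothetical choices.

```python
import numpy as np


def threshold_for_sensitivity(y_true, y_prob, target=0.78):
    """Largest threshold whose sensitivity on (y_true, y_prob) is >= target."""
    y_true = np.asarray(y_true).astype(bool)
    y_prob = np.asarray(y_prob, dtype=float)
    pos_scores = np.sort(y_prob[y_true])[::-1]  # positives, high to low
    if pos_scores.size == 0:
        raise ValueError("need at least one positive example")
    # At least k positives must score >= threshold to reach the target.
    k = max(1, int(np.ceil(target * pos_scores.size)))
    return float(pos_scores[k - 1])


def sensitivity_specificity(y_true, y_prob, threshold):
    """Sensitivity and specificity when prob >= threshold means positive."""
    y_true = np.asarray(y_true).astype(bool)
    pred = np.asarray(y_prob, dtype=float) >= threshold
    sensitivity = float(np.mean(pred[y_true]))
    specificity = float(np.mean(~pred[~y_true]))
    return sensitivity, specificity


# Toy validation predictions (hypothetical): 5 positives, 5 negatives.
y_val = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
p_val = [0.9, 0.8, 0.6, 0.3, 0.1, 0.7, 0.4, 0.2, 0.05, 0.02]
thr = threshold_for_sensitivity(y_val, p_val, target=0.78)
sens, spec = sensitivity_specificity(y_val, p_val, thr)
print(thr, sens, spec)
```

In a repeated-split design, one threshold would be derived per iteration from that iteration's validation predictions, and the thresholds would then be averaged (for example with `np.mean`) to obtain a single decision rule to apply to held-out test data. Lowering the threshold to gain sensitivity necessarily admits more false positives, which is the trade-off the text describes.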
When the calibrated diagnostic rule was applied to testing data, an AUC >85% (Figure 2d-e) was achieved. Accuracy was 77%, and specificity was 77% (CI ±~1%) (Figure 2d). The mean sensitivity was 82% (CI ±~4%).

The four most important/influential predictors were HRV acrophase, HRV MESOR, age and BMI (Figure 3a), with median importance >70%. RHR metrics (maximum, minimum, SD, mean), as well as HRV amplitude, were less influential (median importance 25-50%). Sex had importance equal to 0 in most models. To visualize the relationship between feature values and model prediction, we selected the 9 patients for whom the model was best able to predict COVID-19 (validation AUC >79%), and plotted the acrophase, amplitude, MESOR and max RHR, as well as the predicted probability, for each day (Figure 3). This analysis revealed a complex relationship between HRV parameters and SARS-CoV-2 infection. It was notable that, for some subjects, the predicted probability increased when HRV amplitude decreased, which is consistent with our previously published analysis. 8

Our results demonstrate that a machine learning approach applied to the physiological metrics measured by a wearable device identifies and predicts SARS-CoV-2 infections in a manner suitable for a screening test. This highlights the potential utility of assessing individual changes in passively collected physiological data from wearable devices to facilitate the management of the COVID-19 pandemic.

Infections alter physiological metrics, differentiating infected and uninfected states. Changes in vital signs in the setting of infection, including increased heart rate, elevated respiratory rate, and altered body temperature, have been well described. 16, 17 In addition to these traditional physiological metrics, ANS function, measured by HRV, is altered during illness. Several small studies have shown that changes in HRV can identify and predict infections.
18, 19 Building on these observations and the growing capabilities of wearable technology, wearable devices have been increasingly explored in the setting of infection. They provide a unique means to measure physiological parameters and offer an advantage over periodic

HRV has been evaluated in SARS-CoV-2 infections. A small study of 17 subjects with SARS-CoV-2 found that rises in inflammation markers were preceded by low HRV, while another study of 14 subjects with SARS-CoV-2 in the intensive care unit demonstrated that high frequency HRV was higher and SDNN was lower in patients who later died. 23, 24 These findings were followed by a larger study of 271 subjects hospitalized with SARS-CoV-2 infections, which calculated HRV from 10 seconds of electrocardiogram recordings at admission. SDNN was predictive of survival (hazard ratio = 0.53) in subjects over 70 years of age. 25 These studies demonstrate that changes in HRV are useful in the context of COVID-19. While they demonstrate a relationship in a cross-sectional fashion, we sought to leverage the longitudinal nature of HRV collection using wearable devices to expand upon these observations.

Our group previously demonstrated that changes in the circadian pattern of HRV were associated with a COVID-19 diagnosis. 8 We demonstrated that significant changes, particularly in the amplitude of SDNN, were observed over the 7 days before diagnosis in both symptomatic and asymptomatic individuals. Based on this observation, we built a machine learning algorithm that incorporated HRV

Our findings highlight how changes in circadian features of HRV can be used to identify inflammatory events, such as SARS-CoV-2 infections. Traditionally, HRV analyses rely on assessing relative sympathetic and parasympathetic ANS tone. However, by evaluating subtle alterations in HRV architecture, nuanced changes in the ANS can be identified, perhaps enhancing the identification of physiological changes.
In our model, alterations in HRV features were more influential predictors of infection than heart rate metrics. This observation warrants further exploration in other disease states and may identify a physiological feature that can improve predictive wearable-based algorithms in other diseases.

It is important to recognize that while wearable device-derived physiological metrics offer the ability to identify SARS-CoV-2 infections, these changes are likely not specific to this condition. Other infections, such as influenza, or exacerbations of chronic inflammatory conditions, can result in physiological deviations in HRV and other metrics. 21, 26 Chronic diseases were excluded in our study; however, recognition of this limitation in all wearable-based algorithms is important, especially when application to real-world data is considered. Operationalization of such algorithms therefore requires a minimum prevalence of the condition to be predicted, which will improve their positive predictive value. While our study was able to control for prior infection with SARS-CoV-2 in the analysis, our prior work demonstrated that its impact on the HRV circadian pattern was short lived, with statistically significant alterations for 7 days from the date of COVID-19 diagnosis, mitigating the long-term impact of prior infection on machine learning models incorporating this data.

There are several limitations to our study. First, HRV was collected sporadically by the Apple Watch. We employed statistical modeling to account for this. However, a denser data set using continuous data would likely further improve our predictions. Second, the model we employed used a 7-day smoothing approach. This approach observes infection-induced changes in HRV later than if HRV were estimated using a single-day method. Thus, the approach we employed is conservative.
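To make the earlier point about prevalence and positive predictive value (PPV) concrete, the following Python sketch applies Bayes' rule to a fixed sensitivity and specificity. The inputs of 0.82 and 0.77 are the average test-set sensitivity and specificity reported in the results; the prevalence values and the function name `positive_predictive_value` are illustrative assumptions.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV = TP / (TP + FP) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Reported average test-set operating point; prevalences are hypothetical.
for prevalence in (0.01, 0.05, 0.10, 0.50):
    ppv = positive_predictive_value(0.82, 0.77, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}")
```

At a prevalence of about 1%, which is of the same order as the fraction of COVID+ daily observations in this cohort, most flagged days would be false positives. This fits the intended use of the algorithm as a prompt for confirmatory PCR testing, not as a diagnosis.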
It is important to mention that our approach relies on first establishing a COVID-negative baseline HRV profile for each patient and attempts to learn when changes from this baseline are associated with being COVID-19 positive. Thus, to mimic the clinical implementation of this approach, we used a data splitting approach that allowed samples from the same patient to be in both the training and testing sets, albeit at different time points. This approach is not beyond critique, since the testing and training sets are not fully independent, which could lead to an overestimate of performance. Although we argue that our approach appropriately emulates the real-

DC, EPB, LK, GNN, MSF, ZAF critically revised the manuscript for important intellectual content. RPH, LT, MD, EG, MZ, AK, DH, AB, RP, DC, EPB, LK, GNN, MSF, ZAF provided final approval of the version of the manuscript to be published and agree to be accountable for all aspects of the work. All authors approve the authorship list. All authors had full access to all the data in the manuscript and had final responsibility for the decision to submit for publication. RPH, ZAF, MSF, MD, and LT have verified the underlying data.

The data was split into 100 training and testing sets; models were fit to the training data and performance was estimated using 10-fold CV. The 10-fold CV predictions were used to define a decision rule that increases sensitivity, and this decision rule was applied to the predictions in the testing data to obtain the final performance.
https://mc.manuscriptcentral.com/jamiao Manuscripts submitted to JAMIA Open

REFERENCES

Clinical features of patients infected with 2019 novel coronavirus in Wuhan
Temporal dynamics in viral shedding and transmissibility of COVID-19
A comprehensive SARS-CoV-2 genomic analysis identifies potential targets for drug repurposing
Wearable Devices Are Well Accepted by Patients in the Study and Management of Inflammatory Bowel Disease: A Survey Study
Digital Health: Tracking Physiomes and Activity Using Wearable Biosensors Reveals Useful Health-Related Information
Wearable sensor data and self-reported symptoms for COVID-19 detection
Pre-symptomatic detection of COVID-19 from smartwatch data
Use of Physiological Data From a Wearable Device to Identify SARS-CoV-2 Infection and Symptoms and Predict COVID-19 Diagnosis: Observational Study
Heart rate variability. Standards of measurement, physiological interpretation, and clinical use. Task Force of the European Society of Cardiology and the North American Society of Pacing and Electrophysiology
Greedy function approximation: A gradient boosting machine
pROC: an open-source package for R and S+ to analyze and compare ROC curves
Building Predictive Models in R Using the caret Package
The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets
Fever and cardiac rhythm
The Importance of Respiratory Rate Monitoring: From Healthcare to Sport and Exercise. Sensors (Basel)
Sample asymmetry analysis of heart rate characteristics with application to neonatal sepsis and systemic inflammatory response syndrome
Continuous multi-parameter heart rate variability analysis heralds onset of sepsis in adults
Wearable devices for the detection of COVID-19
Harnessing wearable device data to improve state-level real-time surveillance of influenza-like illness in the USA: a population-based study
Presenting Characteristics, Comorbidities, and Outcomes Among 5700 Patients Hospitalized With COVID-19 in the New York City Area
Heart Rate Variability as a Possible Predictive Marker for Acute Inflammatory Response in COVID-19 Patients
Is the heart rate variability monitoring using the analgesia nociception index a predictor of illness severity and mortality in critically ill patients with COVID-19? A pilot study
Longitudinal Autonomic Nervous System Measures Correlate With Stress and Ulcerative Colitis Disease Activity and Predict Flare