key: cord-310063-8nbmrjrw authors: Selva, K. J.; van de Sandt, C. E.; Lemke, M. M.; Lee, C. Y.; Shoffner, S. K.; Chua, B. Y.; Nguyen, T. H. O.; Rowntree, L. C.; Hensen, L.; Koutsakos, M.; Wong, C. Y.; Jackson, D. C.; Flanagan, K. L.; Crowe, J.; Cheng, A. C.; Doolan, D. L.; Amanat, F.; Krammer, F.; Chappell, K.; Modhiran, N.; Watterson, D.; Young, P.; Wines, B.; Hogarth, P. M.; Esterbauer, R.; Kelly, H. G.; Tan, H.-X.; Juno, J. A.; Wheatley, A. K.; Kent, S. J.; Arnold, K. B.; Kedzierska, K.; Chung, A. W. title: Distinct systems serology features in children, elderly and COVID patients date: 2020-05-18 journal: nan DOI: 10.1101/2020.05.11.20098459 sha: doc_id: 310063 cord_uid: 8nbmrjrw

SARS-CoV-2, the pandemic coronavirus that causes COVID-19, has infected millions worldwide, causing unparalleled social and economic disruptions. COVID-19 results in higher pathogenicity and mortality in the elderly compared to children. Examining baseline SARS-CoV-2 cross-reactive coronavirus immunological responses, induced by circulating human coronaviruses, is critical to understand such divergent clinical outcomes. The cross-reactivity of coronavirus antibody responses of healthy children (n=89), adults (n=98), elderly (n=57), and COVID-19 patients (n=19) was analysed by systems serology. While moderate levels of cross-reactive SARS-CoV-2 IgG, IgM, and IgA were detected in healthy individuals, we identified serological signatures associated with SARS-CoV-2 antigen-specific Fcγ receptor binding, which accurately distinguished COVID-19 patients from healthy individuals and suggested that SARS-CoV-2 induces qualitative changes to antibody Fc upon infection, enhancing Fcγ receptor engagement.
Vastly different serological signatures were observed between healthy children and elderly, with markedly higher cross-reactive SARS-CoV-2 IgA and IgG observed in elderly, whereas children displayed elevated SARS-CoV-2 IgM, including receptor binding domain-specific IgM with higher avidity. These results suggest that less-experienced humoral immunity associated with higher IgM, as observed in children, may have the potential to induce more potent antibodies upon SARS-CoV-2 infection. These key insights will inform COVID-19 vaccination strategies, improved serological diagnostics and therapeutics.

In-depth characterization of cross-reactive SARS-CoV-2 Ab responses in healthy children compared to healthy elderly is needed to understand whether pre-existing human coronavirus (hCoV)-mediated Ab immunity potentially contributes to COVID-19 outcome. We designed a cross-reactive CoV multiplex array, including SARS-CoV-2, SARS-CoV-1, MERS-CoV and hCoV (229E, HKU1, NL63) spike (S) and nucleoprotein (NP) antigens (Extended Data Figure 1b). CoV-antigen-specific levels of isotypes (IgG, IgA, IgM) and subclasses (IgG1, IgG2, IgG3, IgG4, IgA1, IgA2), along with C1q binding (a predictor of ADCA via the classical pathway) and FcγRIIa, FcγRIIb and

medRxiv preprint doi: https://doi.org/10.1101/2020.05.11.20098459; this version posted May 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
To interrogate Ab functionality and cross-reactivity between antigens of selected CoV signatures, we conducted a correlation network analysis, focusing upon significant correlations of Ab features selected by Elastic-Net. The children's network (Figure 1f) demonstrates how SARS-CoV-2 Abs that engaged FcγRIIa-H131 are associated with SARS-CoV-2 IgG, specifically of the IgG1 subclass. SARS-CoV-2 FcγRIIa-H131 immune complex formation was also significantly correlated with multiple other SARS-CoV-2 Fc responses, including FcγRIIb, C1q, and FcγRIIIa. Of interest, SARS-CoV-1 Abs that engaged FcγRIIIa-V158 were highly correlated with SARS-CoV-2 Ab responses, potentially due to their high sequence similarity (77%) (Extended Data Figure 3)

with FcγR signatures to a range of CoV. Amino acid (aa) alignment analyses of NP and S1 proteins showed that there is 91% (NP) and 77% (S1) aa similarity between SARS-CoV-2 and SARS-CoV-1 proteins, while SARS-CoV-2 and MERS share 47% (NP) aa similarity, hCoVs NL63 and 229E share 29% and 26% aa similarity, respectively, in NP, and hCoV HKU1 and 229E share 28% and 27% in S1,

Table 3) were screened for SARS-CoV-2 antigen-specific serological profiles (Extended Data Figure 4). An individual who was SARS-CoV-2-exposed but remained SARS-CoV-2 PCR-negative was also assessed (Donor DD1). Elevated SARS-CoV-2-specific Ab responses in COVID-19 patients relative to healthy individuals or the exposed but PCR-negative individual were observed across multiple titrations (Extended Data Figure 4).
In particular, we found that in the majority of COVID-19 patients, the SARS-CoV-2 antigen-specific Abs bound to FcγRIIIa-V158 and FcγRIIa-H131 soluble dimers at high levels, even at 1:800 plasma titrations, suggesting potent ADCC and ADCP

notably being the healthy exposed SARS-CoV-2 PCR-negative individual (Figure 3d)

The majority of COVID-19 moderate/severe samples were collected upon hospital presentation, whereas mild samples were collected upon convalescence (Extended Data Table 3). Moreover, there was no significant difference between these groups after adjusting for multiple comparisons, which is not surprising given the small sample size (Extended Data Table 4)

with Ab avidity, we therefore conducted urea dissociation assays on a subset of children, elderly and COVID-19 plasma samples (Figure 4c-iv, 4d-iv). No differences in IgA avidity were found between children, elderly and COVID-19 patients. Avidity of RBD-specific IgM from the elderly was significantly weaker than that of COVID-19 patients (p=0.0177), while children's responses, spanning a large range of avidities, were not significantly different (p=0.0696).
These data, in combination with the overall higher IgM frequency in children, suggest that children may tend to have more potent RBD-specific IgM, which may mature more rapidly upon SARS-CoV-2 exposure, as compared to the elderly.

HLA-DQB1, -DRB1 and -DPB1 alleles in our healthy donor cohort (a,d,g)

Melbourne. Healthy elderly donors (age 65-92) were recruited at the Deepdene Medical Clinic (Victoria). All healthy donors were recruited prior to the SARS-CoV-2 pandemic. SARS-CoV-2-infected patients (age 21-75) were recruited at the Alfred Hospital (AH). Convalescent individuals who recovered from COVID-19 were recruited by James Cook University (DD) or University of Melbourne (CP).
Eligibility criteria for COVID-19-acute and convalescent recruitment were age ≥18 years and having at least one swab PCR-positive for SARS-CoV-2. Each patient was categorized into one of the following 6 severity categories: very mild (stay at home, minimal symptoms), mild (stay at home with symptoms), moderate (hospitalized, not requiring oxygen), severe/moderate (hospitalized with low flow oxygen), severe (hospitalized with high flow oxygen) or critical (intensive care unit (ICU)). Heparinised blood was centrifuged for 10 min at 300 g to collect plasma, which was

Krammer) 24, SARS-CoV-2 Trimeric S (gift from Adam Wheatley) and SClamps of both SARS-CoV-2 and MERS-CoV (gift from University of Queensland) (Extended Data Figure 1b). Tetanus toxoid (Sigma) and influenza hemagglutinin (H1Cal2009; Sino Biological) were also added to the assay as positive controls, while BSA-blocked beads were included as negative controls. Magnetic carboxylated beads (Bio Rad) were covalently coupled to the antigens using a two-step carbodiimide reaction, in a ratio of 10 million beads to 100 µg of antigen, with the exception of the deglycosylated NPs mentioned above, for which 40 µg were used instead. Briefly, beads were washed and activated in 100 mM monobasic sodium phosphate, pH 6.2, followed by the addition of Sulfo-N-

For the detection of IgM, biotinylated mouse anti-human IgM (mAb MT22; MabTech) was added at 1.3 µg/ml, 25 µl per well. After incubation at RT for two hours on a shaker, the plate was washed, and streptavidin, R-Phycoerythrin conjugate (SAPE; Invitrogen) was added at 1 µg/ml, 25 µl per well.
The plate was then incubated at RT for two hours on a shaker before being washed and read as mentioned

Statistical Analysis

The volcano plot of children versus the elderly was generated using Prism 8. Statistical significance was determined using the Holm-Sidak method, with alpha = 0.05 adjusted for 196 tests (Ab features). Each feature was analyzed individually, without assuming a consistent SD. The overall multiplex dataset was analysed for normal distribution using the Shapiro-Wilk test in Prism 8. The data were further analysed in SPSS Statistics 26 (IBM Corp.) using the Kruskal-Wallis one-way analysis with a Bonferroni correction to determine the p-values; differences between groups were considered significant at an adjusted p-value of 0.000035 (Extended Data Table 2). ELISA data were analyzed using Kruskal-Wallis one-way analysis with Dunn's multiple comparison in Prism 8. Differences between very mild/mild and moderate/severe/critical patients were analysed using the Mann-Whitney test and were considered significant at a p-value of 0.05 (Extended Data

individually if it contained any negative values, by adding the minimum value for that feature back to all samples within that feature. Following this, all data were log transformed using the following equation, where x is the right-shifted data and y is the right-shifted, log-transformed data: y = log10(x + 1).
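As an illustration only, the right-shift and log transformation described above can be sketched in Python. The sketch assumes that "adding the minimum value back" means shifting a feature by the magnitude of its negative minimum, so that its smallest value becomes zero; the function names and the example values are hypothetical, and this is not the authors' analysis code.

```python
import math


def right_shift(values):
    """Shift a feature so its minimum becomes zero, only if it has negative values."""
    lowest = min(values)
    if lowest >= 0:
        # Features without negative values are left untouched.
        return list(values)
    # Assumption: the shift equals the magnitude of the (negative) minimum.
    return [v - lowest for v in values]


def log_transform(values):
    """Apply y = log10(x + 1), where x is the right-shifted feature."""
    return [math.log10(x + 1) for x in right_shift(values)]


# Hypothetical background-subtracted readings for one Ab feature.
feature = [-50.0, 0.0, 949.0]
shifted = right_shift(feature)        # smallest value is now 0
transformed = log_transform(feature)  # log10 of (shifted value + 1)
```

The "+ 1" inside the logarithm keeps the shifted minimum (zero) finite after transformation, since log10(0 + 1) = 0.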
This process transformed the majority of the features to a normal distribution. In all subsequent multivariate analyses, the data were further normalized by mean centring and variance scaling each feature using the zscore function in Matlab. For the HLA analysis, the same data normalization methods were used, except that positive controls were included and all samples with any HLA typing were included. Samples with one copy of each most frequent allele were removed to avoid double classification.

Feature Selection Using Elastic Net/PLSR and Elastic Net/PLSDA

To determine the minimal set of features (signatures) needed to predict numerical outcomes (age, days from symptom onset) and categorical outcomes (age cohort, COVID-19 infection status, HLA allele), a three-step process was developed based on 41. First, the data were randomly sampled without replacement to generate 2000 subsets. The resampled subsets spanned 80% of the original sample

PCA, PLSDA, and PLSR scores and loadings plots were plotted in Prism version. Statistical analyses were performed in SPSS.
Multiple sequence alignment was performed in Jalview 2.10.5 and the distance matrix for multiple alignments was computed in Ugene 1

The source data underlying Figs 1-4, Extended Data Figs 1, 2, 4, 5, 6, and Extended Data Tables 2 and 4 are provided as a Source Data file. The coding used for analysis can be found in the Source Coding file. All other data are available from the authors upon request.

High-throughput, multiplexed IgG subclassing of antigen-specific antibodies from clinical samples
SARS-CoV-2 Seroconversion in Humans: A Detailed Protocol for a Serological Assay, Antigen Production, and Test Setup
Enhanced binding of antibodies generated during chronic HIV infection to mucus component MUC16
Regularization and variable selection via the elastic net

We thank all the participants involved in the study, Ebene Haycroft and Brendan Watts for

This work was supported by Jack Ma Foundation to KK Craig Foundation to KLF and KK, NHMRC Leadership Investigator Grant to KK (1173871), NHMRC Program Grant to KK (1071916) Research Grants Council of the Hong Kong Special Administrative Region, China (#T11-712/19-N) to KK. AWC is supported by a NHMRC Career DLD by a NHMRC Principal Research Fellowship (#1137285). SJK by NHMRC Senior Principal Research CES has received funding from the European Union's Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No 792532. LH is supported by the Melbourne International Research Scholarship (MIRS) and the Melbourne International Fee Remission Scholarship (MIFRS) from The University of Melbourne

Correspondence and requests for materials should be addressed to Katherine Kedzierska and Amy

Figure 1.
Vastly different serological signatures between children and elderly
(a) Volcano plot of children (orange) versus elderly (dark blue); open circles are not significantly different between the two groups. Data were z-scored prior to analysis. (b) PCA of all 196 Ab features for healthy children, adults (light blue square), and elderly. PLSDA scores (c) and loadings plots (d) using the 15-feature Elastic-Net selected signature for children versus elderly (0.88% calibration error, 1.44% cross-validation error). Variance explained on each LV is in parentheses. Statistically significant separation of groups was determined using a two-tailed t-test on LV1 scores, p<0.0001, t = 21.60. (e) Hierarchical clustering of Elastic-Net selected features for children and elderly. Levels are coloured from low (dark blue) to high (dark red). Correlation network analyses for children (f) and elderly (g)

Ab feature type (colour), antigen (shape), correlation strength (line thickness, alpha <0.05), and correlation coefficient

Analysis was performed on a subset of the healthy individuals (n=111) for whom HLA class II type was available. Donors heterozygous for the two most frequent HLA alleles were excluded from PLSDA and loading analysis

Figure 3. Healthy versus COVID-19 serological signatures
Hierarchical clustering (d) and PLSDA model scores (e) and loadings (f) were performed using the four-feature Elastic-Net selected SARS-CoV-2 antigen signature (1.49% calibration error, 1.49% cross-validation error). Variance explained by each LV is in parentheses. (g) Correlation network analysis for COVID-19 patients was performed to identify features significantly associated with the Elastic-Net selected features

Ab feature type (colour), antigen (shape), correlation strength (line thickness, alpha <0.05) and correlation coefficient (line colour).
Data were z-scored prior to analysis

COVID-19 Ab responses over time and RBD Abs in healthy versus COVID-19
PLSR model scores plot (a) and loadings plot (b) for all COVID-19 patient data on the Elastic-Net 15-feature signature. The model goodness of fit (R²) was 0.8361 and goodness of prediction

Percent variance explained by each latent variable is in parentheses. Multiplex MFI data for IgM (c-i)

Children (orange), adults (light blue), elderly (dark blue) and COVID-19 patients (red)

Wallis with Dunn's multiple comparisons, exact p-values are provided

Extended Data Figure 1. Multiplex Assay setup and optimization
(a-i-iii) Schematic of bead-based multiplex assay. (b) Overview of antigens included in the assay

Multiplex was validated by measuring a subset of healthy samples both in singleplex and multiplex

Strong correlations suggest that multiplexing did not affect measurement of Ab responses

size, or sampled all classes at the size of the smallest class for categorical outcomes, which corrected for any potential effects of class size imbalances during regularization. Elastic-Net regularization was then applied to each of the 2000 resampled subsets to reduce and select features most associated with the outcome variables. The Elastic-Net hyperparameter, alpha, was set to have equal weights between the L1 norm and L2 norm associated with the penalty function for least absolute shrinkage and selection operator (LASSO) and ridge regression, respectively 42. By using both penalties, Elastic-Net provides sparsity and promotes group selection. The frequency at which each feature was selected across the 2000 iterations was used to determine the signatures by using a sequential step-forward algorithm that iteratively added a single feature into the PLSR (numerical outcome) or PLSDA (categorical outcome) model, starting with the feature that had the highest frequency of selection and proceeding to the lowest frequency of selection.
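The resampling and frequency-counting steps for categorical outcomes can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the Elastic-Net fit itself is left as a pluggable callable (`select_features`, a hypothetical name), the toy selector below merely stands in for such a fit, and the feature names are invented for the example.

```python
import random
from collections import Counter


def balanced_subsample(labels, rng):
    """Draw every class, without replacement, at the size of the smallest class."""
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    smallest = min(len(members) for members in by_class.values())
    chosen = []
    for members in by_class.values():
        chosen.extend(rng.sample(members, smallest))
    return sorted(chosen)


def selection_frequency(labels, select_features, n_iter=2000, seed=0):
    """Count how often each feature is selected across resampled subsets.

    select_features is a hypothetical callable: it receives the sample
    indices of one subset and returns the names of the selected features
    (for example, those with non-zero Elastic-Net coefficients).
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_iter):
        subset = balanced_subsample(labels, rng)
        counts.update(set(select_features(subset)))
    return counts


def rank_by_frequency(counts):
    """Order features from most to least frequently selected (ties by name)."""
    return [name for name, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]


# Toy usage: 6 "child" and 4 "elderly" samples with invented feature names.
labels = ["child"] * 6 + ["elderly"] * 4


def toy_selector(subset):
    # Stand-in for an Elastic-Net fit; NOT a real model.
    picked = ["IgM_RBD"]
    if 0 in subset:
        picked.append("IgA_S1")
    return picked


counts = selection_frequency(labels, toy_selector, n_iter=200)
order = rank_by_frequency(counts)  # candidate order for step-forward addition
```

The ranked list gives the order in which features would be offered to the step-forward PLSR or PLSDA model. For numerical outcomes, the balanced draw would be replaced by a plain 80% subsample without replacement.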
Model prediction performance was assessed at each step and evaluated by 10-fold cross-validation classification error for categorical outcomes and 10-fold goodness of prediction (Q²) for numerical outcomes. Models whose classification error was within 0.01 of the minimum classification error, or whose Q² was within 0.01 of the maximum Q², were considered for the minimum signature. If multiple models fell within this range, the one with the fewest features was selected, and if there was a large disparity between calibration and cross-validation error (overfitting), the model with the least disparity and best performance was selected.

Matlab, was used in conjunction with Elastic-Net, described above, to identify and visualize signatures that distinguish categorical outcomes (age cohort, COVID-19 infection status). This supervised method assigns a loading to each feature within a given signature and identifies the linear combination of loadings (a latent variable) that best separates the categorical groups. A feature with a high loading magnitude indicates greater importance for separating the groups from one another. Each sample is then scored and plotted using its individual response measurements expressed through the latent variables (LVs). The scores and loadings can then be cross-referenced to determine which features are loaded in association with which categorical groups (positively loaded features are higher in positively scoring groups, etc.). All models go through 10-fold cross-validation, where iteratively 10% of the data is left out as the test set, and the rest is used to train the model.
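A minimal Python sketch of the 10-fold procedure and of the signature selection rule is given below, under stated assumptions. The midpoint-threshold classifier is a toy stand-in for PLSDA, all function names are hypothetical, and the selection rule covers only the "within 0.01 of the best error, then fewest features" step, not the overfitting check.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k disjoint test folds."""
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]


def error_rate(model, predict, X, y, idx):
    """Fraction of the indexed samples that the model misclassifies."""
    wrong = sum(predict(model, X[i]) != y[i] for i in idx)
    return wrong / len(idx)


def cross_validate(X, y, fit, predict, k=10, seed=0):
    """Return (training-set error, held-out error), each averaged over the folds."""
    train_errors, test_errors = [], []
    for test_idx in k_fold_indices(len(y), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        train_errors.append(error_rate(model, predict, X, y, train_idx))
        test_errors.append(error_rate(model, predict, X, y, test_idx))
    return sum(train_errors) / k, sum(test_errors) / k


def pick_signature(cv_error_by_n_features, tolerance=0.01):
    """Smallest model whose cross-validation error is within tolerance of the best."""
    best = min(cv_error_by_n_features.values())
    eligible = [n for n, err in cv_error_by_n_features.items() if err - best <= tolerance]
    return min(eligible)


def fit_midpoint(X, y):
    """Toy stand-in for PLSDA: threshold halfway between the two class means."""
    a = [x for x, label in zip(X, y) if label == 0]
    b = [x for x, label in zip(X, y) if label == 1]
    return (sum(a) / len(a) + sum(b) / len(b)) / 2


def predict_midpoint(threshold, x):
    return int(x > threshold)


# Two well-separated hypothetical groups of 10 samples each.
X = [float(v) for v in range(10)] + [float(v) for v in range(100, 110)]
y = [0] * 10 + [1] * 10
cal_err, cv_err = cross_validate(X, y, fit_midpoint, predict_midpoint)

# Hypothetical cross-validation errors for models with 1 to 4 features.
chosen = pick_signature({1: 0.20, 2: 0.048, 3: 0.045, 4: 0.04})
```

In the toy error table, the four-feature model has the lowest error, but the two-feature model falls within the 0.01 tolerance and is therefore preferred as the smaller signature.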
Model performance is measured through calibration error (average error in the training set) as well as cross-validation error (average error in the test set), with values near zero being best. All models were orthogonalized to enable clear visualization of results. Statistically significant separation of groups on the PLSDA score plots was determined using a two-tailed t-test on LV1 scores in Prism 8.

and mild cases were coloured black, while moderate to severe cases were coloured orange. For comparison, two healthy elderly plasma samples were included (green). DD1, who was SARS-CoV-2-exposed but remained SARS-CoV-2 PCR-negative, was also included (purple).