key: cord-0053802-puzf5cq0 authors: Juárez, Miguel A.; Pennisi, Marzio; Russo, Giulia; Kiagias, Dimitrios; Curreli, Cristina; Viceconti, Marco; Pappalardo, Francesco title: Generation of digital patients for the simulation of tuberculosis with UISS-TB date: 2020-12-14 journal: BMC Bioinformatics DOI: 10.1186/s12859-020-03776-z sha: 7ee4a705d4e35da8cea006ad623718801eb4458b doc_id: 53802 cord_uid: puzf5cq0

BACKGROUND: The STriTuVaD project, funded by Horizon 2020, aims to test one of the most advanced therapeutic vaccines against tuberculosis through a Phase IIb clinical trial. As part of this initiative, we have developed a strategy for generating in silico patients consistent with the characteristics of the target population, which can then be used in combination with in vivo data in an augmented clinical trial.

RESULTS: One of the most challenging tasks in using virtual patients is developing a methodology that reproduces the biological diversity of the target population, i.e., providing an appropriate strategy for generating libraries of digital patients. This has been achieved by creating the initial immune system repertoire in a stochastic way, and by identifying a vector of features that combines both biological and pathophysiological parameters and personalises the digital patient so that it reproduces the physiology and the pathophysiology of the subject.

CONCLUSIONS: We propose a sequential approach to sampling from the joint population distribution of the features in order to create a cohort of virtual patients with specific characteristics, resembling the recruitment process for the target clinical trial. The cohort can then be used to augment the information from the physical trial and so help reduce its size and duration.
Once a person is diagnosed with TB, one of the most critical issues is the duration of the therapy, because of the high costs involved, the increased chances of non-compliance (which increase the probability of developing an MDR strain), and the time during which the patient remains infectious to others. One promising way to shorten the therapy is the use of novel host-reaction therapies (HRT) as adjuvants to antibiotic therapy. Typical endpoints in clinical trials for HRTs are time to sputum culture conversion and incidence of recurrence. While for the former it is in some cases possible to obtain statistically powered evidence of efficacy in a phase II clinical trial, recurrence almost always requires a phase III clinical trial involving thousands of patients and huge costs.

The in silico trials for tuberculosis vaccine development (STriTuVaD) project is an EU-funded, multidisciplinary consortium testing the RUTI vaccine in a Phase IIb clinical trial. The RUTI® antitubercular vaccine, provided by Archivel Farma S.L, is a polyantigenic liposomal vaccine containing fragments of Mycobacterium tuberculosis cells, currently being developed as a therapeutic vaccine for patients with pulmonary tuberculosis. The vaccine, shown to be one of the most advanced therapeutic vaccines against drug-sensitive TB and MDR-TB, has already been studied in healthy volunteers and for the prevention of active TB in patients with latent TB [2]. To help in this development, we extend the Universal Immune System Simulator (UISS) [3, 4] to include the relevant determinants of such a clinical trial, and establish its predictive accuracy against the individual patients recruited in the trial. We then use it to generate digital patients, predict their response to the host-reaction therapy being tested, and combine these predictions with the observations made on physical patients using a new in silico-augmented clinical trial approach based on a Bayesian adaptive design.
This approach, if found effective, could drastically reduce the cost of innovation in this critical sector of public healthcare. To reproduce the biological diversity of the subjects to be simulated, we develop a strategy for generating libraries of digital patients: we identify a vector of features comprising both biological and pathophysiological parameters, which allows the digital patient to be personalised. In this paper we sketch the strategy we adopt to generate the cohort of digital patients, and show some preliminary results on the dynamics of TB in a subset of these patients.

We first briefly describe the UISS computational framework and its extension to model tuberculosis, UISS-TB. The interested reader can find more detail in [5]. UISS is a multi-agent framework for the simulation of immune system dynamics that can be extended to track specific diseases and related treatments. Unlike classical top-down approaches, where mean behaviours are modelled through systems of differential equations [6, 7, 8], agent-based models and multi-agent systems track individual entities, and it is the interactions between these entities that can give rise to global nonlinear behaviours. UISS has been developed as a multi-scale computer simulator of the immune system, as it takes into account both cellular and molecular entities and processes. UISS has a proven track record: it has been used to model the effects of a vaccine against the onset of mammary carcinoma [9, 10] and consequent lung metastases [11]; the initial stages of atherosclerosis [12]; melanoma [3]; and, more recently, multiple sclerosis [4, 13] and the efficacy of citrus-derived adjuvants for influenza vaccines and human papilloma virus [14, 15].
For its use within STriTuVaD, we have extended UISS to include TB dynamics along with the artificial immunity induced by vaccination strategies, as presented in [5]. In order to describe individuals, a vector of features comprising biological and pathophysiological parameters has been identified. The list of parameters, their respective ranges and their units are displayed in Table 1.

In order to create an in silico patient, one needs to provide a single value for each feature. These values could be taken from individual physical patients; however, if a cohort of digital patients is to be produced, one needs a mechanism for producing as many different input vectors as required, all of them biologically and physiologically plausible. Formally, this requires the characterisation of the joint distribution of the inputs in the population. We have compiled typical values and standard deviations for each feature, which provides a way to generate plausible values one component at a time. Proceeding in this way, however, would neglect the biological correlations between features and thus would not guarantee a physiologically plausible input vector. Hence, we must take these correlations into account.

In theory, one could elicit the joint distribution of the features vector, i.e. describe mathematically how each feature relates to the others in a space of 22 dimensions; but this would be extremely difficult, time consuming and data demanding. Our approach is to rely on current mathematical biology consensus and use a Gaussian to represent the population distribution. The additional advantage of this approach will be discussed in the next section. Formally, we say that the vector $f = (f_1, \ldots, f_d)$ follows a $d$-variate Gaussian distribution with joint probability density function

$$N_d(f \mid \mu, \Sigma) = (2\pi)^{-d/2}\, |\Sigma|^{-1/2} \exp\left\{ -\tfrac{1}{2} (f - \mu)^{\top} \Sigma^{-1} (f - \mu) \right\},$$

with mean $\mu = (\mu_1, \ldots, \mu_d)$ and covariance matrix $\Sigma = (\sigma_{ij})$, where

$$\sigma_{ij} = \rho_{ij}\, \sigma_i\, \sigma_j, \qquad i, j = 1, \ldots, d,$$

and $\rho_{ij}$ is the correlation between $f_i$ and $f_j$ (so that $\rho_{ii} = 1$ and $\sigma_{ii} = \sigma_i^2$). So, if we are able to elicit a measure of correlation between two inputs, we can calculate their covariance.
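As an illustration of the relation between elicited correlations and covariances, the following minimal Python sketch builds a covariance matrix from marginal standard deviations and pairwise correlations, checks that it is a valid (positive definite) covariance through a Cholesky factorisation, and draws one feature vector from the resulting Gaussian. It uses only the standard library. The three features, their means, standard deviations and correlations are hypothetical placeholders, not values from Table 1, and the function names are ours; this is not the R script used in the study.

```python
import math
import random


def covariance_from_correlation(sds, corr):
    """Build Sigma with Sigma[i][j] = corr[i][j] * sds[i] * sds[j]."""
    d = len(sds)
    return [[corr[i][j] * sds[i] * sds[j] for j in range(d)] for i in range(d)]


def cholesky(sigma):
    """Return lower-triangular L with L L^T = sigma.

    Raises ValueError when sigma is not positive definite, which signals
    that the elicited correlations are mutually inconsistent.
    """
    d = len(sigma)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                pivot = sigma[i][i] - s
                if pivot <= 0.0:
                    raise ValueError("covariance matrix is not positive definite")
                low[i][j] = math.sqrt(pivot)
            else:
                low[i][j] = (sigma[i][j] - s) / low[j][j]
    return low


def sample_gaussian(mu, low, rng):
    """Draw one vector from N(mu, L L^T) as mu + L z, with z standard normal."""
    z = [rng.gauss(0.0, 1.0) for _ in mu]
    return [m + sum(low[i][k] * z[k] for k in range(i + 1)) for i, m in enumerate(mu)]


# Hypothetical three-feature example (illustrative values only).
mu = [45.0, 25.0, 500.0]
sds = [10.0, 4.0, 300.0]
corr = [[1.0, 0.3, 0.1],
        [0.3, 1.0, 0.2],
        [0.1, 0.2, 1.0]]

sigma = covariance_from_correlation(sds, corr)
L = cholesky(sigma)
rng = random.Random(2020)  # fixed seed so the draw is reproducible
patient = sample_gaussian(mu, L, rng)
print(patient)
```

A Gaussian draw is unbounded, so in practice sampled values would still need to be checked against the admissible ranges of each feature; how this is handled in the study is not specified here.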
The diagonal elements of $\Sigma$, $\sigma_i^2$, are the marginal variances of each element $f_i$, and $\mu_i$ is the corresponding marginal mean. As mentioned above, we have already compiled a list with these values, so we have elicited values for $\mu$ and for the diagonal elements of $\Sigma$. Once $\mu$ and $\Sigma$ have been elicited, generating an in silico profile is a relatively trivial task: one must sample a point in the 22-dimensional space, consistent with $N_{22}(f \mid \mu, \Sigma)$. However, we can exploit the properties of the Gaussian distribution to produce a cohort consistent with some specific characteristics. Say, for instance, that our target population has a particular range of BL; we would then like to produce digital patients consistent with that specific profile. Formally, let $f_1$ represent BL and $f_{-1} = (f_2, \ldots, f_{22})$ the rest of the features; we would like to sample from $N_{21}(f_{-1} \mid f_1, \mu, \Sigma)$, i.e. the conditional distribution of the rest of the features given that BL has a specific value. This is a standard procedure, which can be readily implemented. We can go further and sort the list of features according to either their importance in determining the profile of a patient, or the precision of their elicited mean, variance and covariance, and then proceed to sample from the conditional distributions.

In general, let $f_s$ denote the vector of features with pre-specified values, so that $f = (f_s, f_r)$ with $f_s \in \mathbb{R}^{d-q}$, where $f_r \in \mathbb{R}^{q}$ is the vector of free features. Partition the mean and covariance conformably,

$$\mu = \begin{pmatrix} \mu_s \\ \mu_r \end{pmatrix}, \qquad \Sigma = \begin{pmatrix} \Sigma_{ss} & \Sigma_{sr} \\ \Sigma_{rs} & \Sigma_{rr} \end{pmatrix}.$$

The conditional distribution is $p(f_r \mid f_s = a) = N_q(f_r \mid \nu, \Omega)$, with

$$\nu = \mu_r + \Sigma_{rs} \Sigma_{ss}^{-1} (a - \mu_s), \qquad \Omega = \Sigma_{rr} - \Sigma_{rs} \Sigma_{ss}^{-1} \Sigma_{sr},$$

where $\Omega$ is the Schur complement of the block $\Sigma_{ss}$ in $\Sigma$. Judicious choice of $f_s$ and $f_r$ enables sampling sequentially, e.g. from least to most important feature. We created an R script [17] for the generation of digital patients, available from the corresponding author upon request.
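The conditioning step can be sketched as follows. This is a minimal, standard-library Python illustration of the textbook formulas for a Gaussian conditioned on a subset of its components, not the authors' R script. The function names (`solve`, `condition`, `cholesky`), the three-feature mean vector and the covariance matrix are all hypothetical; the first feature is fixed at a chosen value and the other two are sampled from the resulting conditional distribution.

```python
import math
import random


def solve(a, b):
    """Solve a X = b (b given as a list of rows) by Gauss-Jordan elimination."""
    n = len(a)
    m = [list(ra) + list(rb) for ra, rb in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular block: cannot condition on these features")
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def condition(mu, sigma, fixed):
    """Return (free indices, conditional mean, conditional covariance).

    `fixed` maps the index of each pre-specified feature to its value.
    """
    s = sorted(fixed)
    r = [i for i in range(len(mu)) if i not in fixed]
    sigma_ss = [[sigma[i][j] for j in s] for i in s]
    sigma_sr = [[sigma[i][j] for j in r] for i in s]
    x = solve(sigma_ss, sigma_sr)            # Sigma_ss^{-1} Sigma_sr
    diff = [fixed[i] - mu[i] for i in s]     # a - mu_s
    nu = [mu[ri] + sum(x[k][c] * diff[k] for k in range(len(s)))
          for c, ri in enumerate(r)]
    omega = [[sigma[ri][rj] - sum(sigma[ri][s[k]] * x[k][cj] for k in range(len(s)))
              for cj, rj in enumerate(r)] for ri in r]
    return r, nu, omega


def cholesky(sigma):
    """Lower-triangular factor of a positive definite matrix."""
    d = len(sigma)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            acc = sigma[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
    return low


# Hypothetical mean and covariance for three features (illustrative only).
mu = [45.0, 25.0, 500.0]
sigma = [[100.0, 12.0, 300.0],
         [12.0, 16.0, 240.0],
         [300.0, 240.0, 90000.0]]
fixed = {0: 35.0}  # pre-specified value for the first feature

free, nu, omega = condition(mu, sigma, fixed)
low = cholesky(omega)
rng = random.Random(2020)
z = [rng.gauss(0.0, 1.0) for _ in free]
draw = [nu[i] + sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(len(free))]

# Assemble the full feature vector: fixed values plus the conditional draw.
patient = list(mu)
for idx, value in fixed.items():
    patient[idx] = value
for idx, value in zip(free, draw):
    patient[idx] = value
print(nu, omega, patient)
```

Sequential sampling can be emulated with the same function by growing `fixed` one feature at a time: draw the next feature from its conditional distribution, add it to `fixed`, and repeat until no free features remain.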
We report results for three groups of 15 patients each. Within each group, the triple (Age, BMI, MtbSputum) is fixed, so that the groups roughly represent different profiles in the population and different initial bacterial loads. Profile 1 has (35, 21.4, 15), Profile 2 has (45, 28.2, 502), and Profile 3 has (55, 31.8, 910); the full set of values can be obtained from Additional file 1. These values can be used as input to the UISS-TB web interface, available from www.strituvad.eu (accessed on 28/07/20), by selecting the Tuberculosis disease model; the simulator is hence accessible to any user with a conventional computer and internet access. The GUI panel displays default values and admissible ranges for the parameters in the vector of features. Once the vector of features is completed, the user can click on the Submit button and a unique simulation identification number is assigned. The user can check the simulation status by selecting the appropriate simulation id and clicking on the check status button. When the simulation is complete, the user can visualise the resulting immune system dynamics.

In our case, the progression of each patient was simulated 50 times for 1 year, with the levels of the various species recorded every 600 seconds. The data from each patient require roughly 100 MB of disk storage. We use the total antibody count (Ab) to illustrate how the output can be characterised; e.g. Fig. 1 shows the total Ab count for one simulation of each of the 15 patients in Profile 1. In order to characterise the mean behaviour, we average the 50 repetitions per patient. Figure 2 depicts the median and quartiles for a selection of patients (columns) for each profile (rows). Variability clearly increases around the main and secondary peaks, while levels consistently fall back to nought after roughly 146 days (3500 h). The distribution of the time at which the peak level is reached is illustrated in Fig. 3: the peak occurs consistently within 112-116 days for all profiles, while Profile 3 shows slightly increased variability.
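The summaries described above (pointwise averages, medians and quartiles across repetitions, and the time at which the peak occurs) could be computed along the following lines. This is a minimal sketch using only the Python standard library. The function names and the toy trajectories are hypothetical, and it assumes each repetition is available as an equally long list of Ab counts sampled every 600 seconds; the actual output format of UISS-TB is not described here.

```python
import statistics

STEP_SECONDS = 600          # recording interval stated in the text
SECONDS_PER_DAY = 24 * 3600


def points_per_year(step_seconds=STEP_SECONDS, days=365):
    """Number of recorded time points in one simulated year."""
    return days * SECONDS_PER_DAY // step_seconds


def pointwise_mean(runs):
    """Average the repetitions at each time point."""
    return [statistics.fmean(values) for values in zip(*runs)]


def pointwise_quartiles(runs):
    """Return (first quartile, median, third quartile) at each time point."""
    summary = []
    for values in zip(*runs):
        q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
        summary.append((q1, med, q3))
    return summary


def peak_time_days(trajectory, step_seconds=STEP_SECONDS):
    """Time, in days, at which a trajectory attains its maximum."""
    peak_index = max(range(len(trajectory)), key=trajectory.__getitem__)
    return peak_index * step_seconds / SECONDS_PER_DAY


# Toy data: five hypothetical repetitions of a very short trajectory.
runs = [
    [0, 1, 5, 2, 0],
    [0, 2, 6, 3, 0],
    [0, 1, 7, 2, 0],
    [0, 3, 5, 1, 0],
    [0, 2, 4, 2, 0],
]

print(points_per_year())             # time points per repetition
print(pointwise_mean(runs))
print(pointwise_quartiles(runs)[2])  # quartiles at the third time point
```

With one year of output at 600-second resolution, each repetition holds 52,560 time points per recorded species, so streaming or downsampling may be preferable to loading all repetitions at once.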
UISS-TB is a state-of-the-art agent-based model capable of tracking the dynamics of TB infection in humans. Individual digital patients are defined by a vector of features known to be fundamental in TB infection dynamics and normally measured clinically, hence often readily available. In order to produce virtual cohorts of patients, we propose a sequential approach based on a characterisation of the distribution of these features in the population of interest. The approach allows any combination of features to be fixed, which makes it possible to mimic patient selection criteria and thus yields a method for setting up augmented in silico clinical trials.

Supplementary information accompanies this paper at https://doi.org/10.1186/s12859-020-03776-z. Additional file 1: Profile traces.

Ab: Antibody count; MDR: Multi-drug resistant; STriTuVaD: In silico trials for tuberculosis vaccine development; TB: Tuberculosis; UISS: Universal Immune System Simulator.

This is an extended version of [18]. This article has been published as part of BMC Bioinformatics Volume 21 Supplement 17 2020: Selected papers from the 3rd International Workshop on Computational Methods for the Immune System Function (CMISF 2019). The full contents of the supplement are available at https://bmcbioinformatics.biomedcentral.com/articles/supplements/volume-21-supplement-17.

The datasets generated and analysed during the current study are not publicly available due to size restrictions but are available from the corresponding author on reasonable request.

The authors declare that they have no competing interests.

[1] WHO: Global tuberculosis report
[2] RUTI vaccination enhances inhibition of mycobacterial growth ex vivo and induces a shift of monocyte phenotype in mice
[3] SimB16: modeling induced immune system response against B16-melanoma
[4] Agent based modeling of the effects of potential treatments over the blood brain barrier in multiple sclerosis
[5] Predicting the artificial immunity induced by RUTI® vaccine against tuberculosis using universal immune system simulator (UISS)
[6] ODEs approaches in modeling fibrosis: comment on "Towards a unified approach in the modeling of fibrosis: a review with research perspectives" by Martine Ben Amar and Carlo Bianca
[7] Modeling biology spanning different scales: an open challenge
[8] Induction of T-cell memory by a dendritic cell vaccine: a computational model
[9] Analysis of vaccine's schedules using models
[10] In silico modeling and in vivo efficacy of cancer-preventive vaccinations
[11] Modeling the competition between lung metastases and the immune system using agents
[12] Modeling immune system control of atherogenesis
[13] Agent based modeling of relapsing multiple sclerosis: a possible approach to predict treatment outcome
[14] A computational model to predict the immune system activation by citrus-derived vaccine adjuvants. Bioinformatics
[15] Combining agent based-models and virtual screening techniques to predict the best citrus-derived vaccine adjuvants against human papilloma virus
[16] Host-directed therapy of tuberculosis based on interleukin-1 and type I interferon crosstalk
[17] R: a language and environment for statistical computing. R Foundation for Statistical Computing
[18] Generation of digital patients for the simulation of tuberculosis with UISS-TB