key: cord-103435-yufvt44t authors: van Aalst, Marvin; Ebenhöh, Oliver; Matuszyńska, Anna title: Constructing and analysing dynamic models with modelbase v1.0 - a software update date: 2020-10-02 journal: bioRxiv DOI: 10.1101/2020.09.30.321380 sha: doc_id: 103435 cord_uid: yufvt44t Background Computational mathematical models of biological and biomedical systems have been successfully applied to advance our understanding of various regulatory processes, metabolic fluxes, effects of drug therapies and disease evolution or transmission. Unfortunately, despite community efforts leading to the development of SBML or the BioModels database, many published models have not been fully exploited, largely due to lack of proper documentation or the dependence on proprietary software. To facilitate synergies within the emerging research fields of systems biology and medicine by reusing and further developing existing models, an open-source toolbox that makes the overall process of model construction more consistent, understandable, transparent and reproducible is desired. Results and Discussion We provide here the update on the development of modelbase, a free expandable Python package for constructing and analysing ordinary differential equation-based mathematical models of dynamic systems. It provides intuitive and unified methods to construct and solve these systems. Significantly expanded visualisation methods allow convenient analyses of structural and dynamic properties of the models. Specifying reaction stoichiometries and rate equations, the system of differential equations is assembled automatically. A newly provided library of common kinetic rate laws highly reduces the repetitiveness of the computer programming code, and provides full SBML compatibility. Previous versions provided functions for automatic construction of networks for isotope labelling studies. Using user-provided label maps, modelbase v1.0 streamlines the expansion of classic models to their isotope-specific versions. Finally, the library of previously published models implemented in modelbase is continuously growing. Ranging from photosynthesis over tumour cell growth to viral infection evolution, all models are available now in a transparent, reusable and unified format using modelbase. Conclusion With the small price of learning a new software package, which is written in Python, currently one of the most popular programming languages, the user can develop new models and actively profit from the work of others, repeating and reproducing models in a consistent, tractable and expandable manner. Moreover, the expansion of models to their label specific versions enables simulating label propagation, thus providing quantitative information regarding network topology and metabolic fluxes. Mathematical models are accepted as valuable tools in advancing biological and medical research [1, 2] . In particular, models based on ordinary differential equations (ODE) found their application in a variety of fields. Most recently, deterministic models simulating the dynamics of infectious diseases gained the interest of the general public during our combat of the Covid-19 pandemic, when a large number of ODE based mathematical models has been developed and discussed even in nonscientific journals (see for example [3] [4] [5] ). Such focus on mathematical modelling is not surprising, because computational models allow for methodical investigations of complex systems under fixed, controlled and reproducible conditions. Hence, the effect of various perturbations of the systems in silico can be inspected systematically. Importantly, long before exploring their predictive power, the model building process itself plays an important role in integrating and systematising vast amounts of available information [6] . Properly designed and verified computational models can be used to develop hypotheses to guide the design of new research experiments (e.g., in immunology to study lymphoid tissue formation [7] ), support metabolic engineering efforts (e.g., identification of enzymes to enhance essential oil production in peppermint [8] ), contribute to tailoring medical treatment to the individual patient in the spirit of precision medicine (e.g., in oncology [2] ), or guide political decision making and governmental strategies (see the review on the impact of modelling for European Union Policy [9] ). Considering their potential impact, it is crucial that models are openly accessible so that they can be verified and corrected, if necessary. In many publications, modelling efforts are justified by the emergence of extraordinary amounts of data provided by new experimental techniques. However, arguing for the necessity of model construction only because a certain type or amount of data exists, ignores several important aspects. Firstly, computational models are generally a result of months, if not years, of intense research, which involves gathering and sorting information, simplifying numerous details and distilling out the essentials, implementing the mathematical description in computer code, carrying out performance tests and, finally, validation of the simulation results. Our understanding of many phenomena could become deeper if instead of constructing yet another first-generation model, we could efficiently build on the knowledge that was systematically collected in previously developed models. Secondly, the invaluable knowledge generated during the model construction process is often lost, mainly because of the main developer leaves the research team, but also due to unfavourable funding strategies. It is easier to obtain research funds for the construction of novel, even if perfunctory models, than to support a long-term maintenance of existing ones. Preservation of the information collected in the form of a computational model has became an important quest in systems biology, and has been to some extend addressed by the community. Development of the Systems Biology Markup Language (SBML) [10] for unified communication and storage of biomedical computational models and the existence of the BioModels repository [11] already ensured the survival of constructed models beyond the academic lifetime of their developers or the lifetime of the software used to create them. But a completed model in the SBML format does not allow to follow the logic of model construction and the knowledge generated by the building process is not preserved. Such knowledge loss can be prevented by providing simple-to-use toolboxes enforcing a universal and readable form of constructing models. We have therefore decided to develop modelbase [12] , a Python package that encourages the user to actively engage in the model building process. On the one hand we fix the core of the model construction process, while on the other hand the software does not make the definitions too strict, and fully integrates the model construction process into the Python programming language. This differentiates our software from 2/13 many available Python-based modelling tools (such as ScrumPy [13] or PySCeS [14] ) and other mathematical modelling languages (recently reviewed from a software engineering perspective by Schölzel and colleagues [15] ). We report here new features in modelbase v1.0, developed over the last two years. We have significantly improved the interface to make model construction easier and more intuitive. The accompanying repository of re-implemented, published models has been considerably expanded, and now includes a diverse selection of biomedical models. This diversity highlights the general applicability of our software. Essentially, every dynamic process that can be described by a system of ODEs can be implemented with modelbase. Implementation modelbase is a Python package to facilitate construction and analysis of ODE based mathematical models of biological systems. Version 1.0 introduces changes not compatible with the previous official release 0.2.5 published in [12] . All API changes are summarised in the official documentation hosted by ReadTheDocs. The model building process starts by creating a modelling object in the dedicated Python class Model and adding to it the chemical compounds of the system. Then, following the intuition of connecting the compounds, you construct the network by adding the reactions one by one. Each reaction requires stoichiometric coefficients and a kinetic rate law. The latter can be provided either as a custom function or selecting from the newly provided library of rate laws. The usage of this library (ratelaws) reduces the repetitiveness by avoiding boilerplate code. It requires the user to explicitly define reaction properties, such as directionality. This contributes to a systematic and understandable construction process, following the second guideline from the Zen of Python: "Explicit is better than implicit". From this, modelbase automatically assembles the system of ODEs. It also provides numerous methods to conveniently retrieve information about the constructed model. In particular, the get * methods allow inspecting all components of the model, and calculate reaction rates for given concentration values. These functions have multiple variants to return all common data structures (array, dictionary, data frames). After the model building process is completed, simulation and analyses of the model are performed with the Simulator class. Currently, we offer interfaces to two integrators to solve stiff and non-stiff ODE systems. Provided you have installed the Assimulo package [16] , as recommended in our installation guide, modelbase will be using CVode, a variable-order, variable-step multi-step algorithm. The CVode class provides a direct connection to Sundials, the SUite of Nonlinear and DIfferential/ALgebraic equation Solvers [17] which is a powerful industrial solver and robust time integrator, with a high computing performance. In case when Assimulo is not available, the software will automatically switch to the SciPy library [18] using lsoda as an integrator, which in our experience showed a lower computing performance. Sensitivity analysis provides a theoretical foundation to systematically quantify effects of small parameter perturbations on the global system behaviour. In particular, Metabolic Control Analysis (MCA), initially developed to study metabolic systems, is an important and widely used framework providing quantitative information about the response of the system to perturbations [19, 20] . This new version of modelbase now has a full suite of methods to calculate response coefficients and elasticises, and plotting them as a heat-map, giving a clear and intuitive colour-coded visualisation of the results. An example of such visualisation, for a re-implemented toy model of the upper part of glycolysis (Section 3.1.2 [21] ), can be found in Figure 1 . Many of the available relevant software packages for building computational models restrict the users by providing unmodifiable plotting routines with predefined settings that may not suit your personal preferences. With modelbase v1.0 we constructed our plotting functions allowing the user to pass optional keyword-arguments (often abbreviated as **kwargs), so you still access and change all plot elements, providing a transparent and flexible interface to the commonly used matplotlib library [22] . The easy access functions to visualise the results of simulations were expanded from the previous version. They now include plotting selections of compounds or fluxes, phase-plane analysis and the results of MCA. Models for isotope tracing modelbase has additionally been developed to aid the in silico analyses of label propagation during isotopic studies. In order to simulate the dynamic distribution of isotopes all possible labelling patterns for all intermediates need to be created. By providing an atom transition map in the form of a list or a tuple, all 2 N isotope-specific versions of a chemical compound are created automatically, where N denotes the number of possibly labelled atoms. Changing the name of previous function carbonmap to labelmap in v1.0 acknowledges the diversity of possible labelling experiments that can be reproduced with models built using our software. Sokol and Portais derived the theory of dynamic label propagation under stationary assumption [23] . In steady state the space of possible solutions is reduced and the labelling dynamics can be represented by a set of linear differential equations. We have used this theory and implemented an additional class LinearLabelModel which allows Figure 2 . Labelling curves in a linear non-reversible pathway. Example of label propagation curves for a linear non-reversible pathway of 5 randomly sized metabolite pools, as proposed in the paper by Sokol and Portais [23] . Circles mark the position at which the first derivative of each labelling curve reaches maximum. In the original paper this information has been used to analyse the label shock wave (LSW) propagation. To reproduce these results run the Jupyter Notebook from the Additional file 2. rapid calculation of the label propagation given the steady state concentrations and fluxes of the metabolites [23] . modelbase will automatically build the linear label model after providing the label maps. Such a model is provided in Figure 2 , where we simulate label propagation in a linear non-reversible pathway, as in Figure 1 in [23] . The linear label models are constructed using modelbase rate laws and hence can be fully exported as SBML file. Many models loose their readability due to the inconsistent, intractable or misguided naming of their components (an example is a model with reactions named as v1 -v10, without referencing them properly). By providing meta data for any modelbase object, you can abbreviate component names in a personally meaningful manner and then supply additional annotation information in accordance with standards such as MIRIAM [24] via the newly developed meta data interface. This interface can also be used to supply additional important information compulsorily shared in a publication but not necessarily inside the code, such like the unit of a parameter. With the newly implemented changes our package becomes more versatile and user friendly. As argued before, its strength lies in its flexibility and applicability to virtually any biological system, with dynamics that can be described using an ODE system. There exist countless mathematical models of biological and biomedical systems derived using ODEs. Many of them are rarely re-used, at least not to the extent that could be reached, if models were shared in a readable, understandable and reusable way [15] . As our 5/13 package can be efficiently used both for the development of new models, as well as the reconstruction of existing ones, as long as they are published with all kinetic information required for the reconstruction, we hope that modelbase will in particular support users with limited modelling experience in re-constructing already existing work, serving as a starting point for their further exploration and development. We have previously demonstrated the versatility of modelbase by re-implementing mathematical models previously published without the source code: two models of biochemical processes in plants [25, 26] , and a model of the non-oxidative pentose phosphate pathway of human erythrocytes [27, 28] . To present how the software can be applied to study medical systems, we used modelbase to re-implement various models, not published by our group, and reproduced key results of the original manuscripts. It was beyond our focus to verify the scientific accuracy of the corresponding model assumptions. We have selected them to show that despite describing different processes, they all share a unified construct. This highlights that by learning how to build a dynamic model with modelbase, you in fact do not learn how to build a one-purpose model, but expand your toolbox to be capable of reproducing any given ODE based model. All examples are available as jupyter notebooks and listed in the Additional Files. For the purpose of this paper, we surveyed available computational models and subjectively selected a relatively old publication of significant impact, based on the number of citations, published without providing the computational source code, nor details regarding the numerical integration. We have chosen a four compartment model of HIV immunology that investigates the interaction of single virus population with the immune system described only by the CD4 + T cells, commonly known as T helper cells [29] . We have implemented the four ODEs describing the dynamics of uninfected (T), latently infected (L) and actively infected CD4 + T cells (A) and infectious HIV population (V). We have reproduced the results from Figure 3 from the original paper [29] showing the decrease in the overall population of CD4+ T-cell (uninfected + latently infected + actively infected CD4+) over time, depending on the number of infectious particles produced per actively infected cell (N). To reproduce these results run the Jupyter Notebook from the Additional file 3. In the Figure 3 , we reproduce the results from Fig. 3 from the original paper, where by changing the number of infectious particles produced per actively infected cell (N) we follow the dynamics of the overall T cell population (T+L+A) over a period of 10 years. The model has been further used to explore the effect of azidothymidine, an antiretroviral medication, by decreasing the value of N after 3 years by 25% or 75%, mimicking the blocking of the viral replication of HIV. A more detailed description of the timedependent drug concentration in the body is often achieved with pharmacokinetic models. Mathematical models based on a system of differential equations that link the dosing regimen with the dynamics of a disease are called pharmacokinetic-pharmacodynamic (PK-PD) models [30] and with the next example we explore how modelbase can be used to develop such models. The technological advances forced a paradigm shift in many fields of our life, including medicine, making more personalised healthcare not only a possibility but a necessity. A pivotal role in the success of precision medicine will be to correctly determine dosing regimes for drugs [31] and PK-PD models provide a quantitative tool to support this [32] . PK-PD models have proved quite successful in many fields, including oncology [33] and here we used the classical tumour growth model by Simeoni and colleagues, originally implemented using the industry standard software WinNonlin [34] . As the pharmacokinetic model has not been fully described we reproduced only the highly simplified case, where we assume a single drug administration and investigate the effect of drug potency (k 2 ) on simulated tumour growth curves. In Figure 4 we plot the simulation results of the modelbase implementation of the system of four ODEs over the period of 18 days where we systematically changed the value of k 2 , assuming a single drug administration on Day 9. With the MCA suite available with our software we can calculate the change of the system in response to perturbation of all other system parameters. Such quantitative description of the systems response to local parameter perturbation provides support in further studies of the rational design of combined drug therapy or discovery of new drug targets, as described in the review by Cascante and colleagues [35] . Finally, compartmental models based on ODE systems have a long history of application in mathematical epidemiology [36] . Many of them, including numerous recent publications studying the spread of coronavirus, are based on the classic epidemic Susceptible-Infected-Recovered (SIR) model, originating from the theories developed by Kermack and McKendrick at the beginning of last century [37] . One of the most critical information searched for while simulating the dynamics of infectious disease is the existence of disease free or endemic equilibrium and assessment of its stability [38] . Indeed periodic oscillations have been observed for several infectious diseases, including measles, influenza and smallpox [36] . To provide an overview of more modelbase functionalities we have implemented a relatively simple SIR model based on the recently published autonomous model for smallpox [39] . We have generated damped oscillations and visualised them using the built-in function plot phase plane (see Figure 5 ). In the attached Jupyter notebook we present how quickly and efficiently in terms of lines of code, the SIR model is built and how to add and remove new reactions and/or compounds to construct further variants of this model, such as a SEIR (E-exposed) or SIRD (D-deceased) model. Figure 4 . Compartmental pharmacokinetic-pharmacodynamic model of tumour growth after anticancer therapy. We have reproduced the simplified version of the PK-PD model of tumour growth, where PK part is reduced to a single input and simulated the effect of drug potency (k 2 ) on tumour growth curves. The system of four ODEs describing the dynamics of the system visualised on a scheme above is integrated over the period of 18 days. We systematically changed the value of k 2 , assuming a single drug administration on Day 9. We have obtained the same results as in the Figure 3 in the original paper [34] . To reproduce these results run the Jupyter Notebook from the Additional file 4. SIR model with vital dynamics including birth rate has been adapted based on the autonomous model to simulate periodicity of chicken pox outbreak in Hida, Japan [39] . To reproduce these results run the Jupyter Notebook from the Additional file 5. We are presenting here updates of our modelling software that has been developed to simplify the building process of mathematical models based on ODEs. modelbase is fully embedded in the Python programming language. It facilities a systematic construction of new models, and reproducing models in a consistent, tractable and expandable manner. As ODEs provide a core method to describe the dynamical systems, we hope that our software will serve as the base for deterministic modelling, hence it's name. With the smoothed interface and clearer description of how the software can be used for medical purposes, such as simulation of possible drug regimens for precision medicine, we expect to broaden our user community. We envisage that by providing the MCA functionality, also users new to mathematical modelling will adapt a working scheme where such sensitivity analyses become an integral part of the model development and study. The value of sensitivity analyses is demonstrated by considering how results of such analyses have given rise to new potential targets for drug discovery [35] . We especially anticipate that the capability of modelbase to automatically generate labelspecific models will prove useful in predicting fluxes and label propagation dynamics through various metabolic networks. In emerging fields such as computational oncology, such models will be useful to, e.g., predict the appearance of labels in cancer cells. If you have any questions regarding modelbase, you are very welcome to ask them. It is our mission to enable reproducible science and to help putting the theory into action. Project name: modelbase Project home page: https://pypi.org/project/modelbase/ Operating system(s): Platform independent Programming language: Python Other requirements: None Licence: GNU General Public License (GPL), version 3 Any restrictions to use by non-academics: None Computational systems biology Computational oncology -mathematical modelling of drug regimens for precision medicine Effective containment explains subexponential growth in recent confirmed COVID-19 cases in China An updated estimation of the risk of transmission of the novel coronavirus (2019-nCov) COVID-19 outbreak on the Diamond Princess cruise ship: estimating the epidemic potential and effectiveness of public health countermeasures How Computational Models Can Help Unlock Biological Systems Model-driven experimentation: A new approach to understand mechanisms of tertiary lymphoid tissue formation, function, and therapeutic resolution Mathematical Modeling-Guided Evaluation of Biochemical, Developmental, Environmental, and Genotypic Determinants of Essential Oil Composition and Yield in Peppermint Leaves Modelling for EU Policy support: Impact Assessments The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models BioModels Database: A Repository of Mathematical Models of Biological Processes Building Mathematical Models of Biological Systems with modelbase ScrumPy : metabolic modelling with Python Modelling cellular systems with PySCeS Required characteristics for modeling languages in systems biology: A software engineering perspective. bioRxiv ScienceDirect Assimulo: A unified framework for ODE solvers SUNDIALS : Suite of Nonlinear and Differential / Algebraic Equation Solvers Algorithms for Scientific Computing in Python The Control of Flux: 2 1 A Linear Steady-State Treatment of Enzymatic Chains. General Properties, Control and Effector Strength Systems Biology: A Textbook Matplotlib: A 2d graphics environment Theoretical Basis for Dynamic Label Propagation in Stationary Metabolic Networks under Step and Periodic Inputs Minimum information requested in the annotation of biochemical models (MIRIAM) Short-term acclimation of the photosynthetic electron transfer chain to changing light: a mathematical model A mathematical model of the Calvin photosynthesis cycle Comparison of computer simulations of the F-type and L-type non-oxidative hexose monophosphate shunts with 31P-NMR experimental data from human erythrocytes 13C n.m.r. isotopomer and computersimulation studies of the non-oxidative pentose phosphate pathway of human erythrocytes Dynamics of hiv infection of cd4+ t cells Modeling of Pharmacokinetic/Pharmacodynamic (PK/PD) Relationships: Concepts and Perspectives Precision medicine: an opportunity for a paradigm shift in veterinary medicine HHS Public Access Precision dosing in clinical medicine: present and future Different ODE models of tumor growth can deliver similar results Predictive Pharmacokinetic-Pharmacodynamic Modeling of Tumor Growth Kinetics in Xenograft Models after Administration of Anticancer Agents Metabolic control analysis in drug discovery and disease The Mathematics of Infectious Diseases A contribution to the mathematical theory of epidemics Qualitative and bifurcation analysis using an SIR model with a saturated treatment function Emergence of oscillations in a simple epidemic model with demographic data The authors declare that they have no competing interests. Jupyter notebook with the modelbaseimplementation of the minimal pharmacokineticpharmacodynamic (PK-PD)model linking that linking the dosing regimen of an anticancer agent to the tumour growth, proposed by Simeoni and colleagues [34] . Jupyter notebook with the modelbase implementation of the classic epidemic Susceptible-Infected-Recovered (SIR) model parametrised as the autonomous model used to simulate periodicity of chicken pox outbreak in Hida, Japan [39] . The official documentation is hosted on ReadTheDocs.