key: cord-0048642-wa772ea8
authors: Stünkel, Patrick; König, Harald; Lamo, Yngve; Rutle, Adrian
title: Towards Multiple Model Synchronization with Comprehensive Systems
date: 2020-03-13
journal: Fundamental Approaches to Software Engineering
DOI: 10.1007/978-3-030-45234-6_17
sha: d51392dad6ac99796e93f2123ffb1412ac45bbcc
doc_id: 48642
cord_uid: wa772ea8

Model management is a central activity in Software Engineering. The most challenging aspect of model management is to keep models consistent with each other while they evolve. As a consequence, there has been increasing activity in this area, which has produced a number of approaches to address this synchronization challenge. The majority of these approaches, however, is limited to a binary setting; i.e. the synchronization of exactly two models with each other. A recent Dagstuhl seminar on multidirectional transformations made it clear that there is a need for further investigations in the domain of general multiple model synchronization simply because not every multiary consistency relation can be factored into binary ones. However, with the help of an auxiliary artifact, which provides a global view over all models, multiary synchronization can be achieved by existing binary model synchronization means. In this paper, we propose a novel comprehensive system construction to produce such an artifact using the same underlying base modelling language as the one used to define the models. Our approach is based on the definition of partial commonalities among a set of aligned models. Comprehensive systems can be shown to generalize the underlying categories of graph diagrams and triple graph grammars and can efficiently be implemented in existing tools.

Conceptual models, i.e. abstract specifications of the system under development, are recognized to be of major importance in software engineering [52]. Representing the whole system in a single global model is generally infeasible; hence, different teams design and maintain several models that focus on different aspects of the system. This collection of inter-related models is often referred to as a multimodel. A rigorous use of these models within the engineering process eventually requires consistency management of multimodels: the collection of models must obey global consistency rules and, as models are inevitably subject to change, global consistency becomes an issue [16].

Model synchronization is a means to maintain global consistency of inter-related models by combining consistency verification with (semi-)automatic consistency restoration. The cross-disciplinary research field of Bidirectional Transformations (BX) [8] investigates such means within different communities and provides a number of theoretical and practical results (see [2] for a recent survey). However, the majority of these approaches are limited to a binary setting, i.e. keeping pairs of models consistent. Stevens [44] recognized this limitation in her outreach to the modelling community, which led to an increased momentum in this area, as is evident from a recent Dagstuhl seminar on Multidirectional Transformations (MX) [7]. One way to address multiary synchronization is to consider it as a network of well-understood binary synchronization problems. However, not every multiary consistency rule can be factored into binary ones [9]; e.g. the class diagrams A 1 , A 2 and A 3 in fig. 1 are pairwise consistent but not consistent altogether, since class inheritance must be acyclic. Thus, multiary model synchronization is needed to keep global consistency. Another approach to global consistency management is the model merge approach [6]: It constructs the union of all models wherein the related elements are identified, see the lower half of fig. 1 (inter-relations given by sameness of class names). Thus, global consistency can be verified within a single artifact, the merge. However, the major drawback of this approach, apart from requiring additional computational overhead, is that it forgets the origin of elements; e.g. that class C was contained in A 1 and A 2 but not in A 3 . This is a problem if global consistency rules depend on this containment information.
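The observation that pairwise consistency does not imply global consistency can be replayed on a small instance. The following sketch assumes three class diagrams that each contribute a single inheritance edge and that are merged by class name; the concrete edges are hypothetical and are not taken from fig. 1.

```python
from itertools import combinations

# Hypothetical inheritance edges (subclass, superclass); NOT the content of fig. 1.
A1 = {("A", "B")}
A2 = {("B", "C")}
A3 = {("C", "A")}


def is_acyclic(edges):
    """Return True if the directed graph given by `edges` contains no cycle."""
    graph = {}
    for sub, sup in edges:
        graph.setdefault(sub, set()).add(sup)
        graph.setdefault(sup, set())
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "done":
            return True
        if state.get(node) == "visiting":
            return False  # back edge: a cycle was found
        state[node] = "visiting"
        ok = all(visit(successor) for successor in graph[node])
        state[node] = "done"
        return ok

    return all(visit(node) for node in graph)


models = [A1, A2, A3]
# Every union of two models is acyclic ...
pairwise_ok = all(is_acyclic(x | y) for x, y in combinations(models, 2))
# ... but the union of all three closes the cycle A -> B -> C -> A.
global_ok = is_acyclic(A1 | A2 | A3)
print(pairwise_ok, global_ok)  # prints: True False
```

No binary check between two of these models can detect the violation, which only becomes visible once all three are considered together.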

The most important information in multiary model synchronization is the inter-relations between models and their elements. We call the latter commonalities and cannot generally assume that they are always given by equality of names, as was the case in fig. 1. Thus, multimodels must be extended with such commonality information, which allows element traceability and global consistency verification. Aligning models via an additional commonality structure has some tradition, e.g. it is the foundation of Triple Graph Grammars (TGGs) [40], a formal and mature BX approach with a focus on Model Driven Engineering (MDE). In the TGG approach, models are considered to have a graph-based structure, i.e. there is a common underlying base modelling language, and we will also stick to this idea of a common base language.

In this paper, we propose a novel construction called comprehensive system, which serves as a foundation for various ways of multiary model management. It is based on a simple, non-intrusive and easy-to-handle linguistic extension of the base modelling language with commonality specifications, which allows working with an arbitrary number n ≥ 2 of heterogeneously typed (local) models as one single (global) model. Moreover, we will show that we are still able to apply mature methods for model verification and restoration in the same way as for single local models. Furthermore, we show that this approach is more expressive than, and overcomes the obstacles of, the model merge approach, and that it generalizes TGGs and graph diagrams [48], a recent generalization of TGGs.

Before defining comprehensive systems and their properties (sect. 5 and 6), we clarify terminology (sect. 2), introduce a running example (sect. 3), and provide an overview of the state of the art (sect. 4). An extended version of the proofs in sect. 6 is given in the technical report [47].

Every fast moving research field is prone to produce separate terms for the same concepts. Thus, we begin with a short definition of the most important terms in multi-model consistency management. We will stick to the imperative of MDE [42] and consider all Software Engineering (SE) artifacts as models:

Model: A model is an abstract specification of the system (or parts of it) under development. Models are atomic elements in the multimodel consistency management process. To be amenable to electronic processing, we assume them to be formal, i.e. following the format of a specific modelling language. We denote models by capital letters A, A′, A 1 , A 2 etc.

Metamodel and Conformance: Every modelling language is specified by an artifact called metamodel. We denote metamodels by capital letters M, M′, M 1 , M 2 etc. Models must conform to their respective metamodel, i.e. the model must be well-structured w.r.t. the metamodel and fulfill all constraints imposed on the metamodel, thus further narrowing the admissible model structure. The model is then called an instance of the metamodel. Conformance is also called local or intra-model consistency. We denote a single constraint by lowercase φ and a set of constraints by uppercase Φ. A metamodel with a set of constraints Φ imposed on it will be written M Φ .

Correspondence: A correspondence is a relation among a set of models. It is a consequence of commonalities (common concepts) shared by these models. A collection of models together with a correspondence among them is called a multimodel. In a similar way as for local models, global consistency rules can be imposed on a multimodel. It is considered (globally) consistent if all local constraints and global consistency rules are fulfilled. Consistency of a multimodel is also referred to as inter-model consistency.

Model Space: A model space is a set of models together with changes among them. In an MDE setting it can be considered to be given by a metamodel M : the set of all instances of M together with M -respecting instance changes, which describe how an instance A′ is the result of edits on A. We write Mod(M Φ ) to denote the respective model space.

We depict a collaborative modelling example within healthcare. More concretely, the task is to develop ICT support for a patient referral process. A referral is "the act of sending a patient to another physician for ongoing management of a specific problem with the expectation that the patient will continue seeing the original physician for co-ordination of total care" [41]. It is an important and recurring process in the healthcare domain. Hence, ICT support is desirable [51]. At the same time, development remains tricky since it requires multiple actors (software vendors, government officials, hospitals and physicians) to agree on common data structures, processes and interfaces. For our example, let us assume that the design of the system follows a model-based approach and there are three different models, each covering a different aspect of the system: There is a process model A 1 denoted in Business Process Model and Notation (BPMN) [30], a UML class diagram A 2 and a DMN decision table A 3 .

These three models are depicted in fig. 2 (ignore the cyan lines for the moment). The central ingredient is the process model A 1 . It represents a simplified version of the process developed in [51]. The process is triggered by a patient's appeal, beginning with an introductory consultation. Afterwards the main part of the process begins: Information about the patient and their medical history is extracted while, in parallel, a consultant is selected via a business-rule activity. The patient information is then sent to the consultant. The consultant can either approve the referral or reject it. In the latter case, another consultant has to be found. If a consultant accepts the referral, the process is finished. The other models in fig. 2 contain the respective data types (A 2 ) and specify the domain-specific behaviour of the "Select Consultant" activity (A 3 ). The latter is depicted as a table that assigns, for a given combination of values in the input side columns, a combination of values in the output side columns, i.e. based on diagnosis and urgency, an appropriate consultant is selected (which is identified by a practicionerId and specialization).

All models could be edited completely independently of each other were there no correspondence between them. It arises from the existence of abstractly "the same" information simultaneously contained in multiple models. Consider e.g. the column called diagnosis in A 3 , which is reflected by a process variable in A 1 (visualized by a file symbol) and an attribute named description in A 2 . We call these relations commonalities and depict them via cyan lines in fig. 2.

The resulting multimodel (models A 1 , A 2 , A 3 plus their commonalities) is subject to consistency rules [11] (see sect. 2), which define consistency of a multimodel. For our example, assume the following consistency rules:

CR1 For every business-rule activity in A 1 , there must exist a corresponding decision table in A 3 and vice versa.
CR2 Every column type in A 3 must refer to an existing data type in A 2 with the same name.
CR3 Every column in A 3 must have a corresponding public attribute (denoted by +) in A 2 and should be reflected by a process variable in A 1 .
CR4 Every process variable in A 1 must be reflected by either a class or an attribute in A 2 .

To actually maintain consistency of A 1 , A 2 and A 3 w.r.t. CR1-CR4, we begin with a review of the state of the art on how commonalities are identified, how consistency is verified and, if needed, how it is restored.

A seminal exposition of the process of multimodel consistency management is already given in [43]. It comprises four phases: (i) Detection of overlaps (we call them commonalities, see sect. 3), (ii) Detection of inconsistencies, (iii) Diagnosis of inconsistencies, and (iv) Handling of inconsistencies. The first step is also called model alignment. Many approaches do not consider an explicit diagnosis stage and combine (iii) and (iv) into a phase called consistency restoration, a.k.a. model repair [28]. Hence, existing work can be grouped into the three categories alignment, verification and restoration:

Alignment: The goal of model alignment is to identify relations between models, i.e. finding their commonalities. This procedure, a.k.a. model matching, has been studied in several domains: databases [35], ontologies [15], MDE [23], graph transformation [14] and software product lines [53]. Automatic model matching, in general, is NP-hard [36]. However, there may be domain-specific heuristics [53] which exploit underlying global identification mechanisms, e.g. social security numbers for persons or the ICD-10 ontology [54] for diseases. Surveys on this topic can be found in [15] (focus on ontologies), [35] (focus on databases) and [23] (focus on MDE). Further, it is important to note that model element matching requires that elements are transferable between models. This is e.g. directly given within the UML or multi-viewpoint modelling as there is a single underlying metamodel [3]. If this is not given a priori, matching on the level of metamodels [38, 10] has to precede the matching of model elements.

Verification: The goal of consistency verification is to find all consistency violations. A recent survey on this topic is found in [22]. The focus of the authors is on UML but the results are universal. They present four categories to classify verification approaches: system model (SMV), universal logic (ULV), heterogeneous transformation (HTV) and dynamic metamodelling (DMV). In the SMV approach, every model is translated into a comprehensive artifact where the verification is executed. ULV is a variant of the former where the translation is executed on the level of an underlying logic. HTV defines translations between each pair of models, and DMV considers extensions of each metamodel with elements from other metamodels or models to express global consistency.

Restoration: A comprehensive survey of model repair approaches is found in [28], whereas [2] is a recent survey of BX-based approaches. Insights from these surveys show that there are basically three categories of consistency restoration approaches: programming based (PBR) approaches, where consistency and its restoration are explicitly defined simultaneously; solver based (SBR) approaches, where consistency is abstractly posed as a logic formula and restoration is implemented using a solver or search-based algorithm; and finally, grammar based (GBR) approaches such as TGGs [19], which place themselves somewhere in between. The vast majority of these approaches, however, consider binary synchronization only. There are only a few notable exceptions, e.g. the solver based Echo [29] and the graph diagram framework [48, 49].

Architecture: Analyzing the underlying system architecture of these approaches, there are, in principle, two designs: We call them the network design and the span design. Consider the multimodel as a graph where nodes represent models and edges represent correspondences (for alignment), consistency relations (for verification) or repair functions (for restoration). In the network design there are edges between each pair of models. In the span design the graph has a hub-and-spoke layout, i.e. there is an additional hub-node that has an edge towards every model. Approaches in the categories SMV, ULV and SBR are associated with a span design since they perform a translation into an intermediate model, while approaches in the categories HTV, DMV and PBR are associated with the network design because they directly act on a pair of models. GBR approaches have used either of them.

Comparing the architectures, the network design puts the complexity on the edges whereas the span design puts complexity on the nodes (more specifically on a single node: the hub). The drawback of the network design is that the number of edges grows quadratically with the number of participating models and, if consistency relations cannot be factored into binary relations, hyperedges are required, which further increase the complexity. Another issue with this design is the coordination of concurrent changes. The drawback of the span design is the additional overhead of the hub-node model; however, the hub-node provides a means to coordinate concurrent changes.
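The growth rates behind this comparison can be made concrete with a few lines of arithmetic. The sketch below counts one edge per unordered pair of models for the network design, one hub-to-model edge per model for the span design, and, as an upper bound for the hyperedge case, all subsets of at least two models. The function names are illustrative only.

```python
from math import comb


def network_edges(n):
    """Binary edges in the network design: one per unordered pair of models."""
    return comb(n, 2)


def span_edges(n):
    """Edges in the span design: one from the hub to each model."""
    return n


def possible_hyperedges(n):
    """Upper bound on distinct hyperedge scopes: subsets with at least two models."""
    return 2 ** n - n - 1


for n in (2, 3, 5, 10):
    print(n, network_edges(n), span_edges(n), possible_hyperedges(n))
```

For n = 10 models this gives 45 binary edges against 10 hub edges, while up to 1013 different groups of models could in principle carry a multiary relation.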

In this section, we introduce comprehensive systems (sect. 5.1 to 5.3), which follow an SMV approach and mitigate the drawbacks of the span design. We will show in sect. 5.4 that comprehensive systems are a foundation for the PBR restoration approach and we conjecture that the same is true for SBR, because they do not fundamentally differ from the structure of local models, such that they can be fed into existing means for model verification and restoration. Moreover, sect. 5.5 briefly reports why our approach eliminates the model merge obstacles (see the discussion in the introduction and fig. 1).

Before introducing comprehensive systems concretely, we want to illustrate where they occur in typical conceptual workflows for multimodel consistency management. Fig. 3 depicts such a workflow, which is more or less informally used in many approaches of multimodel management, e.g. [16]. It comprises the phases mentioned in sect. 4: alignment, verification and restoration. The results of the first stage are the comprehensive metamodel with the global consistency rules imposed upon it, and the metamodel element commonalities, which are stored persistently to avoid expensive re-computation and possible information loss, cf. the motivation in [25]. These commonalities are then used to compute the comprehensive system under consideration, e.g. a model merge. It can be used in the subsequent phases shown in fig. 3.

In contrast to this additional computation, our definition of comprehensive system is based on a non-intrusive extension of existing models by commonalities without extensive computations. Furthermore, it enables natural internalizations of inter-relations between different local models into a single artifact. Our intention is to demonstrate this internalization informally in this section and formalize it in sect. 6, where we will also state that the resulting structure generalizes triple graphs [40] and graph diagrams [48] ; hence it is ready to be used in GBR approaches, too. 

We begin on the level of metamodels: Fig. 4a depicts a simplified metamodel M 1 of BPMN for our example. We do not endorse any specific MDE framework and denote metamodels in a UML class diagram-like style. Metamodels M 2 and M 3 for UML class diagrams and DMN models can be defined in the same way as metamodel M 1 (excerpts of them are shown in fig. 5). E-graphs [12] (see fig. 4b) give a formal interpretation to the class diagram syntax and may serve as an appropriate base modelling language B for our purposes, i.e. a shared linguistic (meta-)metamodel [26]. It consists of Graph Nodes GN and Data Nodes DN (complex and primitive types in the UML terminology), as well as Graph Edges GE (associations) and Node Attribute Edges NAE (attributes) together with appropriate owner and target functions. For the sake of simplicity we omitted edge attribute edges, which are usually included in E-graphs.

Every model A must conform to a metamodel M . Since models and metamodels can be depicted as E-graphs, the conformance relation is a typing homomorphism t : A → M between the E-graphs A and M . If, e.g., a is a flow node in A 1 , see fig. 2, then t(a) = FlowNode ∈ M 1 . Hence, the model space Mod(M ) is the category of E-graphs typed over M . E-graphs are only one possible base language and we will work with arbitrary base languages in sect. 6. Nevertheless, we will use the term "graph" to subsume all artifacts under consideration (models and metamodels). Thus, we will use the terms (graph and data) "nodes" and (graph and node attribute) "edges" for the contents of these graphs, see [12] for the original terminology.

If a set Φ of constraints (e.g. a set of formulas given in a specific logic) is imposed on M , then the space is reduced to the full subcategory Mod(M Φ ) of all consistent models typed over M w.r.t. Φ. Besides UML-internal constraints (e.g. the 1..1-multiplicity on src and tgt in fig. 4a) given in the modelling technique, there are often attached constraints φ ∈ Φ. An example of an attached constraint is φ := control_flow, see the note at FlowNode in fig. 4a, which is given in OCL in listing 1.1:

    (self.oclIsTypeOf(Event) and self.eventType = EventType::START) implies self.incoming->count() = 0
    and
    (self.oclIsTypeOf(Event) and self.eventType = EventType::END) implies self.outgoing->count() = 0

    Listing 1.1: The attached constraint control_flow in OCL

OCL is just an example of a possible means for defining attached constraints. As we do not endorse a specific metamodelling framework, and thus also do not endorse a specific technique for the definition of attached constraints, we treat all constraints uniformly and assume that all internal and external constraints can be modelled as diagrammatic constraints [37]. A diagrammatic constraint φ imposed on a metamodel M possesses an "arity graph" S φ and is imposed on M by a scope d φ : S φ → M (a homomorphism). The semantics is provided by a predicate check φ : Mod(S φ ) → Bool, which verifies whether a given structure typed over the arity fulfills this constraint. The scope highlights a fragment (the image of d φ ) of metamodel M , e.g. the blue coloured fragment in fig. 4a is the scope of the constraint φ from listing 1.1. For a typed graph t : A → M , the verification procedure verify(t) = check φ (query(t)) comprises two steps: First, query forgets all elements of A not typed over the scope and retypes the remaining elements w.r.t. d φ such that they are typed over S φ . That is, query implements the pullback of d φ and t. Then, check φ is invoked on the pullback result.
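The two-step verification procedure (query, then check) can be sketched in a few lines. The example below is a simplification under several assumptions: graphs are plain Python sets and dictionaries rather than E-graphs, the scope is injective (so that restricting and retyping coincides with the pullback), the metamodel element name SequenceFlow is hypothetical, and the checked constraint is the 1..1 multiplicity on src rather than the OCL constraint of listing 1.1.

```python
def query(nodes, edges, typing, scope):
    """Forget all elements not typed over the scope and retype the rest over S_phi.

    `scope` maps arity-graph elements to metamodel elements (the morphism d_phi).
    Assumes d_phi is injective, so this restriction agrees with the pullback.
    """
    inverse = {m: s for s, m in scope.items()}
    q_nodes = {a: inverse[typing[a]] for a in nodes if typing[a] in inverse}
    q_edges = {
        e: (src, tgt, inverse[typing[e]])
        for e, (src, tgt) in edges.items()
        if typing[e] in inverse
    }
    return q_nodes, q_edges


def check_exactly_one(q_nodes, q_edges):
    """Hypothetical check_phi: every X node has exactly one outgoing r edge."""
    for node, node_type in q_nodes.items():
        if node_type != "X":
            continue
        outgoing = [
            e for e, (src, _tgt, edge_type) in q_edges.items()
            if src == node and edge_type == "r"
        ]
        if len(outgoing) != 1:
            return False
    return True


def verify(nodes, edges, typing, scope):
    """verify(t) = check_phi(query(t))."""
    return check_exactly_one(*query(nodes, edges, typing, scope))


# Arity graph S_phi: X --r--> Y, imposed on a hypothetical metamodel fragment.
scope = {"X": "SequenceFlow", "Y": "FlowNode", "r": "src"}

# A tiny model typed over the metamodel.
nodes = {"flow1", "start", "task"}
edges = {"e1": ("flow1", "start"), "e2": ("flow1", "task")}
typing = {
    "flow1": "SequenceFlow", "start": "FlowNode", "task": "FlowNode",
    "e1": "src", "e2": "tgt",
}

print(verify(nodes, edges, typing, scope))
```

The tgt edge e2 is dropped by query because tgt lies outside the scope, so check_exactly_one only ever sees the fragment of the model that the constraint talks about.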

As seen in sect. 3, consistency rules play a major role in multimodelling. However, we cannot directly formalize them via the diagrammatic constraints described above since their definition involves elements spanning multiple models. Note that inter-relations between models arise from models sharing abstractly the "same" real-world concepts (see the intuitive cyan lines in fig. 2). We name these structural relations commonalities; they are also well-known in practice as traceability links [16, 39, 1]. There are different interpretations of what such a link can mean, e.g. identity, subset, extension, etc. [16]. In our framework, commonality semantics are kept abstract, i.e. we consider them as any kind of structural relation that allows us to define diagrammatic constraints in multimodels. For example, in order to formalize CR2, we need to declare a commonality between the terms DataType (in M 2 ) and ColumnType (in M 3 ). In addition to these binary commonalities, in which only two terms are matched, there are also ternary commonalities, e.g. String occurs in all three metamodels, and it is necessary to relate the BPMN term ProcessVariable with the UML term Attribute and the DMN term Column, together with their respective name and type features, to express CR3. These declarations may be formulated in an intuitive domain-specific language (DSL) shown in listing 1.2. The specification in listing 1.2 extends the modelling artifacts M 1 , M 2 and M 3 and we call its syntax a linguistic extension. Each relate-statement translates to an object, which is identified by an alias (keyword as) and which reifies the "tupling" of terms it relates. E.g. the object Var in lines 4-7 specifies a commonality of the triple ProcessVariable (M 1 ), Attribute (M 2 ), and Column (M 3 ). Var is an object in its own right and we call it a (commonality) representative.

However, not only the nodes (of the graphs) should be related: In listing 1.2 we see that the keyword with defines the two features, i.e. edges, type and name of the respective graphs to be related as well. Common edges require that their respective source and target nodes are also related, e.g. the type commonality entails commonality of Attribute and Column, which is already given by the surrounding relate-statement, as well as commonality of DataType and ColumnType (see lines 8-9). Hence, commonality specifications must preserve edge-node-incidences.

Consequently, it is reasonable to use the same language B for commonality representatives. In such a way, a commonality specification is itself an E-graph: The semantic interpretation of listing 1.2 is depicted in cyan in fig. 5 . The proper linguistic extension further comprises mappings, which assign to each commonality representative w the elements it relates. E.g. Decision is mapped to Activity and to Table in the respective metamodels. Since the assignment syntax in the above DSL also contains the target metamodel of the related elements (e.g. BPMN in relate(BPMN.Activity...)), these mappings decompose into 3 projection mappings p j : M 0 → M j (j ∈ {1, 2, 3}), depicted by dotted arrows in fig. 5 , e.g. p 1 (Decision) = Activity ∈ M 1 , as well as p 2 (Type) = DataType ∈ M 2 , the target metamodel now encoded in p's index. Since the corresponding tuples can be of arbitrary arity, these mappings may be partial:

p 1 (w) is undefined if w = Type, since Type relates only elements of M 2 and M 3 . Finally, the edge-node-incidence required above means that definedness of p j (e) entails definedness of p j (v), where v is the source of e, and

$$p_j(v) = \mathrm{source}(p_j(e))$$

for all edges e in M 0 (and likewise for targets).
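A minimal sketch of partial projections and the incidence condition follows. It models partiality by simply omitting keys from a dictionary. The representatives Decision, Type and Var and the related metamodel terms follow the text; the edge names (type, Attribute.type, Column.type) and the reduced metamodel fragments are assumptions made for illustration.

```python
# Graph of commonality representatives M0: edge name -> (source, target).
m0_edges = {"type": ("Var", "Type")}

# Hypothetical fragments of the local metamodels, in the same encoding.
m1_edges = {}
m2_edges = {"Attribute.type": ("Attribute", "DataType")}
m3_edges = {"Column.type": ("Column", "ColumnType")}

# Partial projections p_j : M0 -> Mj; a missing key means "undefined".
p1 = {"Decision": "Activity", "Var": "ProcessVariable"}
p2 = {"Var": "Attribute", "Type": "DataType", "type": "Attribute.type"}
p3 = {"Decision": "Table", "Var": "Column", "Type": "ColumnType",
      "type": "Column.type"}


def respects_incidence(m0_edges, mj_edges, p):
    """Check that p preserves sources and targets wherever it is defined on an edge."""
    for edge, (src, tgt) in m0_edges.items():
        if edge not in p:
            continue  # p_j(e) undefined: nothing to check for this edge
        image_src, image_tgt = mj_edges[p[edge]]
        # p_j(src) and p_j(tgt) must be defined and match the image edge's ends.
        if p.get(src) != image_src or p.get(tgt) != image_tgt:
            return False
    return True


for name, edges, p in (("p1", m1_edges, p1), ("p2", m2_edges, p2), ("p3", m3_edges, p3)):
    print(name, respects_incidence(m0_edges, edges, p))
```

Note that p1 is undefined on both Type and the type edge, which is consistent: the condition only constrains nodes whose incident edges are actually projected.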

The previous section showed that a linguistic extension of the base language with projection functions between commonality representatives and the elements they relate yields an alignment of metamodels M 1 , . . . , M n . The result is a comprehensive metamodel, in which commonalities are accurately specified with the help of (a graph of) commonality representatives. Formally, we obtain a new graph M 0 and partial projections

$$p_i : M_0 \rightharpoonup M_i$$

for all i ∈ {1, . . . , n}. Since all artifacts under consideration (models and metamodels) conform to the base B, see sect. 5.1, commonalities among models A 1 ∈ Mod(M 1 ), . . . , A n ∈ Mod(M n ) can be encoded in the same way, i.e. there is a graph A 0 of commonality representatives together with partial projections

$$p_i : A_0 \rightharpoonup A_i$$

for all i ∈ {1, . . . , n}. Again they can be specified in the same language as in listing 1.2, and can be stored physically, given that the modelling technique offers means to identify elements, e.g. primary keys in a database, position in an XML document, Uniform Resource Identifiers (URIs) [5], etc. The alignment of models A 1 , A 2 , and A 3 together with their commonalities is shown in fig. 2. Each cyan line represents a commonality representative and each line ends at the value under the respective projection. Some of the lines are binary, some ternary. In general, we would expect any arity, especially when the number n of model spaces increases. The complete contents of fig. 2 is called a comprehensive system: the cyan connections are its commonalities and the models A 1 , . . . , A n are its components. Models A i are typed over their metamodels, i.e. there are typing morphisms t i : A i → M i which can be combined to one big typing of all components. This typing extends to A 0 as well because elements a j and a k (j ≠ k) of model components A j and A k are relatable only if their types t j (a j ) and t k (a k ) are related via a representative w ∈ M 0 . Hence, a natural typing t 0 of a commonality representative v of a j and a k is t 0 (v) := w, such that

$$t_j(p_j(v)) = p_j(t_0(v)) \quad \text{and} \quad t_k(p_k(v)) = p_k(t_0(v)),$$

where p j on the left-hand side denotes the projection on A 0 and on the right-hand side the projection on M 0 . This shows that the typing extension t 0 integrates smoothly (respecting commonalities) into a typing of all parts of the comprehensive model, such that we end up with a single typed comprehensive system: t : A → M .
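The compatibility between typing and projections can be checked mechanically: for every representative v and every component j on which the model-level projection is defined, typing the projected element must give the same result as projecting the type of v. The sketch below assumes dictionaries for all mappings; the representative name Diag/descr is chosen for illustration, while the related elements and types follow the running example.

```python
# Metamodel level: partial projections p_j : M0 -> Mj.
p_meta = {
    1: {"Var": "ProcessVariable"},
    2: {"Var": "Attribute"},
    3: {"Var": "Column"},
}

# Model level: partial projections p_j : A0 -> Aj.
p_model = {
    1: {"Diag/descr": "Diagnosis"},
    2: {"Diag/descr": "description"},
    3: {"Diag/descr": "diagnosis"},
}

# Typing of representatives (t_0) and of the local models (t_1, t_2, t_3).
t0 = {"Diag/descr": "Var"}
t = {
    1: {"Diagnosis": "ProcessVariable"},
    2: {"description": "Attribute"},
    3: {"diagnosis": "Column"},
}


def typing_is_compatible(t0, t, p_model, p_meta):
    """Check t_j(p_j(v)) == p_j(t_0(v)) wherever the model-level p_j(v) is defined."""
    for j, projection in p_model.items():
        for v, element in projection.items():
            if p_meta[j].get(t0[v]) != t[j][element]:
                return False
    return True


print(typing_is_compatible(t0, t, p_model, p_meta))
```

A representative that links elements whose types are not related in M 0 makes the check fail, which is how ill-typed commonalities can be rejected at alignment time.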

Consider the OCL example and its generalization in terms of diagrammatic constraints in sect. 5.1. Theorem 1 in sect. 6 will show that comprehensive systems constitute a category with basically the same properties as the base language B. In particular, pullbacks can be computed in a similar way, see Corollary 1 in sect. 6. Thus, we can define the consistency rules CR1-CR4 from sect. 3 as diagrammatic constraints (φ i ) i∈{1,...,4} , now imposed on the comprehensive metamodel, which treat the commonality witnesses and projections as regular nodes and edges. Local constraints can be encoded as global constraints as well [24], such that we obtain a comprehensive system M Φ with a set Φ of constraints spanning local model elements but also elements of the linguistic extension. Any typed system t : A → M can then be checked against a constraint φ imposed via scope d : S φ → M by pullback of d and t in the category of comprehensive systems, see Theorem 1 in sect. 6. Hence, query implementation by pullbacks carries over from local models to comprehensive systems and we can reuse the theory of diagrammatic constraints to verify global consistency, which e.g. can be implemented by a straightforward translation of a respective model fragment and constraint to Alloy [20]. This can be used to formally verify that fig. 2 is consistent w.r.t. CR1-CR4.
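To illustrate how a global rule reads once commonalities are ordinary elements, the following sketch checks CR1 directly in Python. It is not the Alloy translation mentioned above, only a hand-written stand-in. Representatives are encoded as dictionaries from component index to element; the type name BusinessRuleActivity and the element names are assumptions made for the example.

```python
def check_cr1(t1, t3, representatives):
    """Return (unmatched business-rule activities in A1, unmatched tables in A3).

    CR1 holds if and only if both returned sets are empty.
    """
    activities = {a for a, ty in t1.items() if ty == "BusinessRuleActivity"}
    tables = {x for x, ty in t3.items() if ty == "Table"}
    # A commonality links an activity and a table if it projects onto both.
    links = {
        (r[1], r[3]) for r in representatives
        if 1 in r and 3 in r and r[1] in activities and r[3] in tables
    }
    matched_activities = {a for a, _ in links}
    matched_tables = {x for _, x in links}
    return activities - matched_activities, tables - matched_tables


# Hypothetical typings of A1 and A3 and one commonality representative.
t1 = {"Select Consultant": "BusinessRuleActivity", "Send Patient Information": "Task"}
t3 = {"SelectConsultantTable": "Table"}
representatives = [{1: "Select Consultant", 3: "SelectConsultantTable"}]

print(check_cr1(t1, t3, representatives))
```

The rule quantifies over local elements and commonalities in one pass, which is possible only because both live in the same artifact.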

A merged model is an artifact which is computed additionally from the local models A i . Basically, it is the union of all elements of the A i 's modulo their commonalities, see fig. 1. E.g. in the merge of models A 1 , A 2 , A 3 in fig. 2 there remains a single node, say Diag/descr of type Var (a type in M 0 , see fig. 5), which represents sameness of Diagnosis ∈ A 1 , description ∈ A 2 and diagnosis ∈ A 3 . We could implement global consistency rules on the merge by including the merge computation in the check-function as described in the algorithm in [24]. However, this leads to problems if the verification of a global constraint depends on the knowledge of containment in local models. This can be seen with consistency rule CR3, which relies on the containment of elements (in this case containment in A 2 and A 3 ). After merging Diagnosis and description into the single node Diag/descr, distinguishing its original local model would no longer be possible. In contrast, we do not lose this differentiation in comprehensive systems and can successfully check the validity of this constraint.
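The difference between the two artifacts can be seen on a toy fragment. The sketch below assumes that each element takes part in at most one commonality and that the merge keeps only element names; the model contents are a hypothetical excerpt, not the full fig. 2. It then evaluates the A 2 / A 3 part of CR3 on the comprehensive system, where the projections still record the origin of each element.

```python
# Hypothetical excerpt of the three local models (element names only).
models = {
    1: {"Diagnosis", "SelectConsultant"},
    2: {"description"},
    3: {"diagnosis", "urgency"},
}
# Commonality representatives: component index -> related element.
commonalities = [{1: "Diagnosis", 2: "description", 3: "diagnosis"}]


def merge(models, commonalities):
    """Union of all elements modulo commonalities; origins are not recorded."""
    merged, covered = set(), set()
    for rep in commonalities:
        merged.add("/".join(rep[j] for j in sorted(rep)))
        covered |= set(rep.items())
    for j, elements in models.items():
        merged |= {x for x in elements if (j, x) not in covered}
    return merged


def cr3_violations(columns, commonalities):
    """Columns of A3 without a commonality that also projects into A2."""
    return sorted(
        c for c in columns
        if not any(rep.get(3) == c and 2 in rep for rep in commonalities)
    )


print(merge(models, commonalities))
print(cr3_violations({"diagnosis", "urgency"}, commonalities))
```

The merged set contains plain names such as urgency with no indication of the contributing models, whereas the check on the comprehensive system can ask explicitly for a projection into A 2 and reports urgency as unmatched in this excerpt.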

This section is devoted to the formalization of comprehensive systems from sect. 5. In order to relate comprehensive systems to the TGG framework we need to employ category theory (CT) because TGGs are usually formulated in terms of CT. We recall the central terminology in the following section and refer to the introductory textbooks [4, 34, 50] for further references about CT.

A category C is a collection of mathematical objects and of morphisms, which are means to compare objects. For a category C, the set of objects is denoted |C| and for each pair A, B ∈ |C| the (hom-)set of morphisms from A to B is denoted by Arr C (A, B). For each object A ∈ |C| there exists a special identity morphism id A : A → A. Moreover, there is a neutral and associative composition operation • : Arr C (A, B) × Arr C (B, C) → Arr C (A, C) for all A, B, C ∈ |C|. The most prominent example is the base language of mathematics: Set, the category of sets and total mappings. A category C is said to be small if |C| is itself a set. Equivalence of two categories C and D, written C ≅ D, means that the network of objects and morphisms in C is identical to the one in D up to isomorphisms (e.g. bijections in Set) between objects.

A functor provides the means to compare two categories C and D: It is denoted F : C → D and maps objects of C to objects of D and morphisms of each set Arr C (A, B) to Arr D (F(A), F(B)). Moreover, it preserves identities and composition. F is called an embedding if it is injective on objects of C and injective on Arr C (A, B) for all A, B ∈ |C|. For fixed categories C and D and functors F, F′ : C → D, a natural transformation n : F ⇒ F′ is a family (n A : F(A) → F′(A)) A∈|C| of D-morphisms compatible with images of F and F′, i.e. for all C-arrows f : A → B

$$n_B \circ F(f) = F'(f) \circ n_A .$$

In such a way we get a new category, the functor category $D^C$, with objects all functors from $C$ to $D$ and arrows the natural transformations. Functors $F : C \to Set$ where $C$ is small play a special role: $F$ assigns to each $S \in |C|$ a (carrier) set $F(S)$ and to every $op \in Arr_C(S, S')$ a mapping $F(op) : F(S) \to F(S')$, i.e. $C$ is a signature (think metamodel) that is interpreted by $F$ (think instantiated). Hence, this is also called functorial or indexed semantics and $Set^C$ corresponds to the class of algebras for a signature $C$ (instance worlds for a metamodel). E.g. objects of $G := Set^B$ are E-Graphs, if $B$ is the category depicted in fig. 4b (identities are omitted), and E-Graph-homomorphisms are exactly the natural transformations. For set-based structures, we use the notation $A \hookrightarrow B$ to indicate included structures ($A$ in $B$) such as subsets or subgraphs.
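To make the functorial semantics concrete, the following sketch interprets a small signature in Set and checks the naturality condition for a candidate homomorphism. As a simplifying assumption it uses the schema of plain directed graphs (two sorts `E`, `V` and two operations `src`, `tgt`) rather than the E-Graph schema of fig. 4b; the dictionary encoding and the names `is_functor` and `is_natural` are likewise hypothetical.

```python
# Schema category, generating arrows only: src, tgt : E -> V.
SCHEMA = {"src": ("E", "V"), "tgt": ("E", "V")}


def is_functor(F):
    """Check that F interprets every schema arrow as a total mapping."""
    for op, (s, t) in SCHEMA.items():
        if set(F[op]) != F[s]:               # defined on the whole carrier F(s)
            return False
        if not set(F[op].values()) <= F[t]:  # lands in the carrier F(t)
            return False
    return True


def is_natural(n, F, G):
    """Check n_t(F(op)(x)) == G(op)(n_s(x)) for every arrow op : s -> t."""
    for op, (s, t) in SCHEMA.items():
        for x in F[s]:
            if n[t][F[op][x]] != G[op][n[s][x]]:
                return False
    return True


# A graph with one edge e : a -> b, and a graph with a single loop.
F = {"V": {"a", "b"}, "E": {"e"}, "src": {"e": "a"}, "tgt": {"e": "b"}}
G = {"V": {"v"}, "E": {"loop"}, "src": {"loop": "v"}, "tgt": {"loop": "v"}}

# Collapsing both nodes onto v and the edge onto the loop is a homomorphism.
n = {"V": {"a": "v", "b": "v"}, "E": {"e": "loop"}}

# A graph with one edge k : v -> w; reversing the edge is not a homomorphism.
H = {"V": {"v", "w"}, "E": {"k"}, "src": {"k": "v"}, "tgt": {"k": "w"}}
m = {"V": {"a": "w", "b": "v"}, "E": {"e": "k"}}

print(is_functor(F), is_natural(n, F, G), is_natural(m, F, H))
```

Only generating arrows are checked here; identities and composites are then handled automatically because mappings compose.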

Universal constructions in categories have proven to be of importance in many software theoretical methods. Intuitively, universal constructions can be described as a generalization of meets and joins in a preorder. Some well-known examples of universal constructions in Set are cartesian products and disjoint unions (coproducts). It is important to note that Set possesses all these universal constructions and thus every category $Set^C$ does as well, where the computation of universal constructions is carried out "pointwise".

We begin the formalization of comprehensive systems by fixing a sufficiently large natural number n and considering a synchronization scenario with model spaces (Mod(M j Φj )) j∈{1,...,n} , e.g. the spaces of UML class diagrams, BPMN process models and DMN tables.

Definition 1 (Base Modelling Language). The base modelling language is a small category B.

In order to distinguish between the different system components, we will work with copies $B_j$ of $B$. We let $|B_j| = \{s_j \mid s \in |B|\}$ and similarly let $op_j : s_j \to s'_j$ be an arrow in $Arr_{B_j}$ if $op : s \to s'$ is an arrow of $Arr_B$.

Definition 2 (Comprehensive Systems, Components, Commonalities).

A comprehensive system $C$ consists of
- functors $C_j : B_j \to Set$ for each $j \in \{1, \dots, n\}$, called components,
- a functor $C_0 : B_0 \to Set$ determining the commonality representatives, and
- a collection of partial functions $(p_{j,s} : C_0(s) \rightharpoonup C_j(s))_{s \in |B|, 1 \le j \le n}$, called projections, establishing the commonalities of $C$,

such that for all $op : s \to s'$ in $B$, all $1 \le j \le n$ and all $x \in C_0(s)$ the following statement holds: If $p_{j,s}(x)$ is defined, then

$$p_{j,s'}(C_0(op_0)(x)) \text{ is defined} \qquad (5)$$

and

$$p_{j,s'}(C_0(op_0)(x)) = C_j(op_j)(p_{j,s}(x)). \qquad (6)$$

Note that (5) and (6) generalize the edge-node-incidences, see sect. 5.2, which we already semi-formalized in (1). In the sequel, the index of functors $C_i$ will be omitted, since it can be derived from the domain of definition. Hence, a comprehensive system is a single functor $C$ with domain the $n + 1$ copies of $B$ and $(n + 1)b$ carrier sets, if $b$ is the cardinality of $|B|$. In view of the introductory remarks on functors in sect. 6.1, $C_0, \dots, C_n$ can be seen as $n + 1$ instance worlds for metamodel $B$, e.g. E-Graphs, each with $b = 4$ carrier sets.
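Conditions (5) and (6) can be checked mechanically. The sketch below does so for the simplified case of plain directed graphs (sorts `E`, `V`, operations `src`, `tgt`), which is an assumption standing in for the base language $B$. Partial projections are encoded as dictionaries in which a missing key means "undefined"; the function name and all element names are hypothetical.

```python
SCHEMA = {"src": ("E", "V"), "tgt": ("E", "V")}


def is_comprehensive_system(C, p):
    """C[0] holds the commonality representatives, C[1..n] the components.

    p[j][s] is a dict encoding the partial projection p_{j,s};
    a missing key means that the projection is undefined there.
    """
    for j in range(1, len(C)):
        for op, (s, t) in SCHEMA.items():
            for x, y in p[j][s].items():      # x with p_{j,s}(x) = y defined
                x_img = C[0][op][x]           # C_0(op_0)(x)
                if x_img not in p[j][t]:      # violates (5)
                    return False
                if p[j][t][x_img] != C[j][op][y]:  # violates (6)
                    return False
    return True


# Commonality graph: one edge u : w1 -> w2.
C0 = {"V": {"w1", "w2"}, "E": {"u"}, "src": {"u": "w1"}, "tgt": {"u": "w2"}}
# Component 1: one edge e : a -> b. Component 2: a single node c.
C1 = {"V": {"a", "b"}, "E": {"e"}, "src": {"e": "a"}, "tgt": {"e": "b"}}
C2 = {"V": {"c"}, "E": set(), "src": {}, "tgt": {}}

p = {
    1: {"V": {"w1": "a", "w2": "b"}, "E": {"u": "e"}},
    2: {"V": {"w1": "c"}, "E": {}},           # w2 and u have no counterpart
}

# Dropping the projection of w2 into component 1 leaves the edge dangling.
p_bad = {
    1: {"V": {"w1": "a"}, "E": {"u": "e"}},
    2: {"V": {"w1": "c"}, "E": {}},
}

print(is_comprehensive_system([C0, C1, C2], p))
print(is_comprehensive_system([C0, C1, C2], p_bad))
```

The second system fails because the commonality edge `u` is projected into component 1 while its target node is not, which is exactly the situation ruled out by (5).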

The fundamental linguistic extension is given by the partial functions. They act according to our example in sect. 5.2: In the tuple $(p_1(w), \dots, p_n(w))$ the $p_j$ determine sameness of its components based on the representative $w$. The next definition deals with different comprehensive systems. In this case, it is necessary to tell the respective partial mappings apart, so we write $p^C_{j,s}$ when we refer to the mappings of a particular system $C$.

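Let $C, C'$ be comprehensive systems as defined in Def. 2. A homomorphism between comprehensive systems is a family $(f_{j,s} : C_j(s) \to C'_j(s))_{s \in |B|, 0 \le j \le n}$ of total mappings that is compatible with arrows, i.e. $f_{j,s'} \circ C_j(op_j) = C'_j(op_j) \circ f_{j,s}$ for all $op : s \to s'$ in $B$, and with the projections, i.e. for all $x \in C_0(s)$ and $1 \le j \le n$:

$$\text{If } p^C_{j,s}(x) \text{ is defined, then } p^{C'}_{j,s}(f(x)) \text{ is defined and } p^{C'}_{j,s}(f(x)) = f(p^C_{j,s}(x)), \qquad (7)$$

where we write $f$ instead of $f_{j,s}$, if the indexing becomes clear from the context.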

A typical example is a typing morphism $t : A \to M$ for two comprehensive systems $A$ and $M$. Then equation (7) reflects property (4), i.e. compatibility of commonalities and typing. This can be seen in fig. 2: Its complete content is a comprehensive system $A$ typed over the comprehensive metamodel $M$ partly depicted in fig. 5. $A_0$ consists of all cyan (binary or ternary) lines and $p_{j,s}$ assigns to a line its line end in model $A_j$, where $s$ is the respective element type (node or edge).

Comprehensive Systems together with homomorphisms between them constitute a category CS.

Proof. An identity is a family of identities, composition is composition of mappings $f_{j,s}$. This yields neutrality and associativity. Moreover, composed homomorphisms are still compatible with arrows. Whereas this follows in the usual way for $op : s \to s'$, transitivity of the definedness implication in (7) also yields compatibility with partial functions.

An alternative but closely related approach to our construction is to consider commonalities, i.e. commonality representatives $A_0$ together with projections $(p^A_j)_{1 \le j \le n}$, not represented internally by means of the modelling technique but externally as $n$ spans of morphisms [24, 46]. To this end, let $G := Set^B$, see the remarks on functor categories in sect. 6.1. The resulting artifacts of the category in [46]

The proof of the following theorem relies mainly on cartesian closedness of the category of small categories, i.e. $G^I \cong Set^{B \times I}$ (internalization), and the fact that spans with one monic leg represent partial mappings, the middle object of the span being the domain of definition of the partial map. A detailed proof of the theorem is given in [47].

Proof. Follows from Theorem 1 and the fact that functor categories possess all pullbacks, their pointwise construction guaranteeing that spans with one monic leg are preserved, because pullbacks preserve monomorphisms.
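The correspondence between partial mappings and spans with one monic leg can be illustrated in Set. The following is a minimal sketch under the assumption that partial mappings are encoded as dictionaries; the function names `to_span` and `from_span` and the element names are hypothetical.

```python
def to_span(partial, source, target):
    """Represent a partial map source -> target as a span source <- dom -> target."""
    dom = set(partial)                       # domain of definition (middle object)
    assert dom <= source and set(partial.values()) <= target
    inclusion = {x: x for x in dom}          # monic (injective) leg into source
    total = dict(partial)                    # total leg into target
    return dom, inclusion, total


def from_span(dom, left, right):
    """Recover the partial map from a span whose left leg is injective."""
    assert len(set(left.values())) == len(left), "left leg must be monic"
    return {left[d]: right[d] for d in dom}


source = {"w1", "w2"}
target = {"a", "b"}
partial = {"w1": "a"}                        # undefined on w2

span = to_span(partial, source, target)
print(from_span(*span) == partial)

# Any isomorphic choice of middle object represents the same partial map.
print(from_span({0}, {0: "w1"}, {0: "a"}) == partial)
```

The middle object of the span is the domain of definition, so the projections of a comprehensive system and the external spans carry the same information.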

Auxiliary commonality structures have been used for model synchronization in the TGG framework [40]: Consistency relations between two model spaces are defined declaratively by a grammar. The grammar rules are defined over triple graphs, i.e. pairs of graphs connected by special correspondence graphs, which resemble structural commonalities. From the grammar rules, procedures for consistency verification [27], model transformation [13] and (concurrent) model synchronization [19, 18] can automatically be derived. The solution space, however, is limited to binary scenarios. Trollmann and Albayrak [48, 49] generalized the TGG framework to cope with multiple models within a graph diagram (GD) framework. If we assume that the involved models are also objects of the graph-like category $G$ (see above), then graph diagrams are the objects of a functor category $G^X$, but with a different schema category $X$: It has objects $|X| = R \uplus N$ and all non-identity morphisms connect a source from $R$ (relations) to a target from $N$ (models). There is at most one arrow in $Arr_X(r, m)$ for fixed $r \in R$ and $m \in N$. In such a way graph diagrams, i.e. functors $D : X \to G$, can specify relations of different arities.

They are, however, static: If $r \in R$ has $k$ outgoing morphisms with targets $m_1, \dots, m_k \in N$, $D(r)$ is a $k$-ary correspondence relation with representatives which relate exactly one element in each of the $k$ models $D(m_j)$. Consequently, the schema category has to change each time a new relation is added! Graph diagrams (GD) subsume TGGs, which have schema $X_{TGG} := 1 \xleftarrow{s} 0 \xrightarrow{t} 2$, i.e. $R = \{0\}$ and $N = \{1, 2\}$. Computations of triple graphs (and graph diagrams) during rule application as well as decomposing GD rules for forward and backward transformations are based on pushout constructions in $G^X$. In the rest of the section we show that our framework is more general than graph diagrams in that there is an embedding functor $T : G^X \to CS$, the translation functor, which preserves pushouts and hence is able to replay all GD computations in our framework, yet being able to cope with new relations without changing the schema category.

We use the following notations: For a morphism $f : A \to B$ in a category $C$ we write $A = dom(f)$ and $B = codom(f)$ for its domain and codomain and we use the shorthand notation $Arr_C(\_, B) := \{f \in Arr_C \mid codom(f) = B\}$. We write $\coprod_{i \in I} D_i$ to depict the coproduct of a collection $(D_i)_{i \in I}$ of $G$-objects. Note that a collection $(f_i : D_i \to D)_{i \in I}$ of morphisms yields the morphism $\coprod_{i \in I} f_i : \coprod_{i \in I} D_i \to D$ by the universal property of coproducts, i.e. the morphism which acts as $f_i$ on each $D_i$.
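For intuition, the coproduct and the induced morphism can be spelled out in Set, where the coproduct is the disjoint union. In $G$ the same construction is carried out pointwise for every carrier set. The sketch below tags elements with their index to keep the summands disjoint; the function names `coproduct` and `copair` and the sample data are assumptions for illustration only.

```python
def coproduct(family):
    """Disjoint union of a family {i: D_i} of sets, elements tagged by index."""
    return {(i, x) for i, D in family.items() for x in D}


def copair(morphisms):
    """Induced map from the coproduct to D acting as f_i on the i-th summand."""
    return lambda tagged: morphisms[tagged[0]][tagged[1]]


# Two sets that share the element "a"; the tags keep the copies apart.
family = {1: {"a", "b"}, 2: {"a"}}
morphisms = {
    1: {"a": "x", "b": "y"},   # f_1 : D_1 -> D
    2: {"a": "y"},             # f_2 : D_2 -> D
}

summands = coproduct(family)
h = copair(morphisms)

print(len(summands))
print(sorted((tagged, h(tagged)) for tagged in summands))
```

The two copies of `a` stay distinct in the coproduct and are sent to different elements of the target, each according to the morphism of its own summand.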

By Theorem 1, it suffices to define a functor from $G^X$ to $M$. The composition of this functor with the equivalence will yield the desired result. This functor will also be called $T$. Let a schema category $X$ for graph diagrams be given with $|X| = R \uplus N$ and let $n$ be the cardinality of $N$. Without loss of generality, we assume $N = \{1, \dots, n\}$. Let $D$ be a graph diagram, then we define a multimodel

The definition of $T$ on arrows is straightforward and we give it only informally: If $n : D \Rightarrow D'$ is an arrow between graph diagrams, then (1) $T(n)_i$ is a morphism which acts in the same way as $n_i$ on $D(i)$, if $i > 0$, (2) it amalgamates the actions of $n$ on relations, if $i = 0$, which (3) naturally restricts to the respective actions, if $i < 0$. It is then easy to see that $\nu := T(n)$ is again a natural transformation.

Theorem 2. The functor $T : G^X \to CS$ is an embedding and preserves pushouts.

For a detailed proof of this theorem consult [47]. To sketch the idea, note that we cannot rely on pointwise pushout construction alone: Given a span $(\nu, \mu)$ in $M$ as in fig. 7, the pointwise pushout may fail to belong to $M$! E.g. if $\nu$ and $\mu$ are arbitrarily given, then $M_3$ in fig. 7 may not be admissible for $M$ because the mapping $M_3(-j)$ may fail to be monic, an effect already studied in [25, Example 6]. Instead, the proof uses the fact that naturality squares in $\nu$ are pullbacks, if $\nu$ is in the image of $T$. Then hereditariness [17] of pushouts in $G$ yields admissibility of $M_3$ and nevertheless allows for pointwise pushout construction. We obtain as a consequence:

Corollary 2. Every sequence of rule applications in $G^X$ has a unique representation of corresponding rule applications in $CS$ and hence can be replayed in the general framework of comprehensive systems.

Our work can be summarized by the slogan "from many models to one model": Multimodelling is addressed by a construction that yields a single artifact, where existing means for consistency verification and restoration can be reused. For many years such global artifacts were computed via merging [38, 6, 36, 10], which poses several difficulties, especially if the verification of a global constraint depends on the knowledge of which local model the elements came from. Hence, we proposed comprehensive systems, which mitigate these issues and generalize graph diagrams and triple graphs, the alternatives to our approach. Comprehensive systems stress the utility of partial mappings in commonality specifications, which have been promoted in [46] and were also picked up in [25].

Related work on multimodel consistency management was surveyed in sect. 4. Thus, at this point we mainly want to place our contribution in this landscape. Our approach can be considered a structural one and stands in the tradition of other approaches based on traceability links. Other recent representatives of this line are [16], which uses binary links to relate different artifacts in a practical scenario, and [21], which develops a language, similar to ours, for expressing commonalities for global consistency restoration. All these works share the requirement for a common meta-metalanguage, in our case given by graph-like structures (presheaf topoi). A rather different approach is the framework proposed by Stevens [45]: It considers consistency restoration to be performed locally by a builder. The concrete implementation of the builder is up to the user and thus there is no requirement for a common meta-metalanguage. The global coordination of multiple builders is handled by the framework, controlled by an orientation model. Comparing Stevens' approach to structural approaches, the former is more abstract and thus allows more directions for tooling implementation, whereas structural approaches allow formal analysis of the nature of consistency rules. It will be worthwhile to investigate the relationship between both approaches in the future.

This paper provides the framework for performing multi model consistency management by reusing existing restoration techniques. We plan to address the momentary lack of practical evidence by investigating model repair [28] as the next step. Being conceptually close to TGGs, grammar-based approaches seem a natural fit but we plan to experiment with solver-based approaches as well, further taking into account: Human interaction and learning.

Model traceability

Benchmarking bidirectional transformations: theory, implementation, application, and assessment. Software and Systems Modeling

Orthographic Software Modeling: A Practical Approach to View-Based Development

Category theory for computing science

Uniform resource identifiers (uri): Generic syntax

A Manifesto for Model Merging

Multidirectional Transformations and Synchronisations (Dagstuhl Seminar 18491)

Bidirectional Transformations: A Cross-Discipline Perspective

Multiple Model Synchronization with Multiary Delta Lenses

Specifying Overlaps of Heterogeneous Models for Global Consistency Checking

Fixing inconsistencies in UML design models

Fundamentals of algebraic graph transformation

Information Preserving Bidirectional Model Transformations

Proceedings

From Model Transformation to Model Integration based on the Algebraic Approach to Triple Graph Grammars

Ontology Matching

Managing intermodel inconsistencies in model-based systems engineering: Application in automated production systems engineering

On pushouts of partial maps

Concurrent Model Synchronization with Conflict Resolution Based on Triple Graph Grammars

Correctness of model synchronization based on triple graph grammars

Alloy: A Lightweight Object Modelling Notation

Commonalities for Preserving Consistency of Multiple Models

Multi-view Consistency in UML: A Survey

Different models for model matching: An analysis of approaches to support model differencing

Efficient Consistency Checking of Interrelated Models

Adhesive Subcategories of Functor Categories with Instantiation to Partial Triple Graphs

Matters of (Meta-) Modeling. Software & Systems Modeling

Leveraging incremental pattern matching techniques for model synchronisation

A Feature-Based Classification of Model Repair Approaches

Least-change bidirectional model transformation with QVT-R and ATL

OMG: Business Process Model And Notation (BPMN) v

OMG: Object Constraint Language (OCL) v.2.3.1

Basic Category Theory for Computer Scientists

A Survey of Approaches to Automatic Schema Matching

N-way Model Merging

A Diagrammatic Formalisation of MOF-Based Modelling Languages

An Algebraic Framework for Merging Incomplete and Inconsistent Views

EVL+Strace: a novel bidirectional model transformation approach

Specification of Graph Translators with Triple Graph Grammars

The Dictionary of Modern Medicine

Model-driven engineering: A survey supported by the unified conceptual model

Inconsistency Management in Software Engineering: Survey and Open Research Issues

Bidirectional Transformations In The Large

Towards Sound, Optimal, and Flexible Building from Megamodels

Multimodel correspondence through inter-model constraints

Towards multiple model synchronization with comprehensive systems: Extended version

Extending model to model transformation results from triple graph grammars to multiple models

Extending Model Synchronization Results from Triple Graph Grammars to Multiple Models

Categories and Computer Science

Pragmatic Interoperability for Ehealth Systems: The Fallback Workflow Patterns

The State of Practice in Model-Driven Engineering

Variability Mining of Technical Architectures