key: cord-0046557-t8wjfo7o
authors: Bhayat, Ahmed; Reger, Giles
title: A Polymorphic Vampire (Short Paper)
date: 2020-06-06
journal: Automated Reasoning
DOI: 10.1007/978-3-030-51054-1_21
sha: 74ee2afc0300771ee736c40f7a03c5fcd91c77c0
doc_id: 46557
cord_uid: t8wjfo7o

We have modified the Vampire theorem prover to support rank-1 polymorphism. In this paper we discuss the changes required to enable this and compare the performance of polymorphic Vampire against other polymorphic provers. We also compare its performance on monomorphic problems against standard Vampire. Finally, we discuss how polymorphism can be used to support theory reasoning and present results related to this.

Vampire is a well-known automated theorem prover for first-order logic with equality [14]. For a long period, Vampire supported only untyped first-order logic. Around 2011 it was extended to support monomorphic first-order logic (FOL). As part of recent work on supporting higher-order logic, reported on elsewhere [1], Vampire has been extended to support rank-1 polymorphic first-order logic.

Polymorphic types have a number of advantages over their monomorphic counterparts. Firstly, they provide the user with a more succinct language for describing their problem. Secondly, they provide an elegant solution to dealing with theories. For example, when dealing with the theory of arrays, rather than having to provide separate sets of axioms for arrays of different sorts, polymorphism allows us to provide a single set of axioms. Thirdly, polymorphism permits higher-order logic to be finitely axiomatised in first-order logic by introducing polymorphic axioms for the SK-combinators.

There are several ways of encoding polymorphism. However, many of these are cumbersome and some are even unsound [8]. Blanchette et al. [2] list a number of common translation methods, including the use of type tags, type guards and type arguments. Of these, the last is unsound and the first two are cumbersome. As type tags and guards are ubiquitous in the literature, we provide a comparison between native handling of polymorphism and the use of these encodings in Sect. 5.

Given the issues with encodings, it makes sense to deal natively with polymorphism where possible. We are certainly not the first to attempt to do so. Bobot et al. [5] have introduced polymorphism into their SMT solver Alt-Ergo. Similarly, the first-order provers ZenonModulo [11] and Zipperposition [9] support some form of polymorphism. However, it remains the case that few first-order provers can handle polymorphism.

In this short paper we begin by describing the relatively modest changes that had to be made to Vampire to support polymorphism (Sect. 2). We then present experimental results demonstrating that these changes are useful (Sect. 3). Finally, we discuss work in progress to use these polymorphic extensions to improve theory reasoning in Vampire, both in terms of proof search and of implementation (Sect. 4). Before this, we give a brief (and informal) reminder of what rank-1 polymorphism is.

A polymorphic type is either a type variable or an n-ary type constructor applied to n types. The type of all types is represented as $tType in TPTP syntax [17], which is used throughout this paper. Terms, in polymorphic FOL, are either a variable or a function symbol applied to m type arguments and n term arguments. Rank-1 polymorphism allows type and term variables to be quantified, with the restriction that an existentially quantified type variable may not occur underneath a universal term quantifier. On skolemisation, such a construct would become a dependent type and require superposition into types.
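As a small illustration of the syntax (our own sketch; the symbol names and signatures are assumptions reconstructed from the worked examples later in the text, not the paper's original Problem 1), the following TF1 problem declares a polymorphic map type with polymorphic update and lookup operations, and uses them at the concrete types $int and $i:

  % binary type constructor for polymorphic maps
  tff(map_type, type, map: ($tType * $tType) > $tType).
  % polymorphic operations: update a map at a key, look a key up
  tff(update_type, type, update: !>[A: $tType, B: $tType]: ((map(A, B) * A * B) > map(A, B))).
  tff(lookup_type, type, lookup: !>[A: $tType, B: $tType]: ((map(A, B) * A) > B)).
  % a single polymorphic axiom, quantifying over the type variables A and B
  tff(update_lookup, axiom, ![A: $tType, B: $tType, M: map(A, B), K: A, V: B]:
      (lookup(A, B, update(A, B, M, K, V), K) = V)).
  % a conjecture instantiating the type arguments explicitly
  tff(conj, conjecture, ?[M: map($int, $i), Y: $i]:
      (lookup($int, $i, update($int, $i, M, 1, Y), 1) = Y)).

Here the axiom quantifies over type variables as well as term variables, while the conjecture supplies explicit type arguments, as rank-1 polymorphism requires.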
To support polymorphism, modifications had to be carried out in three main areas. Firstly, changes had to be made to the representation of types in Vampire. Secondly, some inferences had to be modified slightly. Finally, preprocessing required consideration. We describe the work undertaken in this order, using a small problem in TPTP TF1 syntax [3, 17], such as the one sketched above, to illustrate our implementation.

The major change undertaken was to replace types with terms. In monomorphic Vampire, each type in the input problem is stored as an unsigned integer. Function symbols are then assigned a type signature, which is merely a list of unsigned integers representing the argument and return types. In polymorphic first-order logic, types have all the structure of terms. Therefore, it made sense to replace types with terms; type signatures then become lists of terms. The type of a term of the form func_sym(arg_1, ..., arg_n) can be calculated by substituting the type arguments that the head symbol is applied to into its result type. For example, the type of update($int, $i, map, 1, X) would be map($int, $i). The expected type of the i-th term argument of such a term can be calculated in the same way. The one problem that arises is with two-variable literals such as X = Y. In this case, the type of X and Y has to be stored as a separate field in the literal.

The elegance of treating types as terms can be gauged when attention is turned to unification. Had types and terms been kept separate, unifying terms would have become an involved process requiring the unification of term and type arguments separately. As it is, type unification comes for 'free', with one caveat, as shall be seen. Consider unifying the terms update($int, $i, map, 1, X) and update(Y, Z, map, Z', a). The existing unification procedure in Vampire can handle this and returns the combined type and term unifier {Y → $int, Z → $i, Z' → 1, X → a}. The one hitch occurs when unifying a term with a variable. As variables carry no type information, a second call must be made to the unification procedure to ensure that the type of the variable and the type of the term are unifiable.

As far as changes to inferences are concerned, no updates were required for inferences that do not work on subterms, such as resolution or equality factoring. For inferences that do work on subterms, such as superposition and demodulation, we modified the iterators that return candidate subterms so that they do not return type arguments, since superposition into types is unnecessary. We mentioned that the modifications required to support polymorphism were light. They also (in theory; see the later experiment) add no overhead when dealing with monomorphic problems. In this case all types are constants, and the unifiability check on types in the variable case degenerates to a syntactic equality check.

Finally, regarding preprocessing, implementing skolemisation posed a subtle issue. A skolem function must be applied to the free term and type variables in its context. For example, the skolemisation of ![X: $int, Y: $tType]: ?[Z: $i]: (func_sym(Y, X, Z)) would be ![X: $int, Y: $tType]: (func_sym(Y, X, sk(Y, X))), where the fresh skolem symbol sk is applied to the free type variable Y and the free term variable X. This required us to update the notion of free variable within the code (e.g., when iterating over the free variables of a formula).
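For concreteness, a minimal sketch of the declarations involved in this example could look as follows (the signatures are our assumption; the point is only the shape of the skolem symbol's type):

  % the predicate from the example: one type argument, an $int and an $i term argument
  tff(func_sym_type, type, func_sym: !>[Y: $tType]: (($int * $i) > $o)).
  % the skolem symbol receives a rank-1 polymorphic type over the free type variable Y
  tff(sk_type, type, sk: !>[Y: $tType]: ($int > $i)).
  % the skolemised formula
  tff(skolemised, axiom, ![X: $int, Y: $tType]: func_sym(Y, X, sk(Y, X))).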
To test our implementation we ran two experiments. All experiments were carried out with a CPU time limit of 300 s on StarExec [16] nodes. The results themselves, and a link to the Vampire executable that produced them, are available at https://github.com/vprover/vampire_publications/tree/master/experimental_data/IJCAR-2020-POLY-VAMP. Polymorphism is not yet supported in the main branch of Vampire but is available in the polymorphic_vampire branch, which may be merged in the future.

Firstly, we ran Vampire on the set of 539 TF1 (rank-1 polymorphic) problems in the TPTP library. We compared the results against those of the two other provers able to parse TPTP syntax and handle polymorphism that we are aware of, Leo-III [15] and ZenonModulo [11]. (At a late stage we realised that Zipperposition [10] can also parse TF1 syntax; unfortunately, it was too late to incorporate it into the results.) Vampire solved 15 more problems than Leo-III and 21 problems that neither Leo-III nor ZenonModulo could solve (see Table 1), although both solvers also solved problems Vampire was unable to solve. Vampire solves 7 previously unsolved problems of rating 1.00.

We also wanted to ascertain how much overhead had been added for non-polymorphic problems, so we tested the polymorphic version of Vampire, Vampire-poly, against the previous version on the set of all 33,843 monomorphic or untyped first-order problems in the TPTP library not containing arithmetic. Note that this simply tests whether we go from solving a problem to not solving it (or vice versa) and not the time taken to find a solution, i.e., we test the impact on proof search and whether any time overhead takes us past the given time limit. The results (see Table 1) are interesting. For TF0 problems, Vampire 4.4 does indeed outperform its polymorphic sibling. However, at the time of writing, there is a bug in the polymorphic parser that resulted in 324 problems being incorrectly rejected. Even taking this into account, the performance of Vampire-poly lags behind; the cause of this remains to be fully investigated, although it is likely to be due to the fragile nature of proof search in Vampire. Note that Vampire-poly solves 88 problems unsolved by Vampire 4.4.

Vampire has built-in support for the polymorphic theory of arrays [12] and the polymorphic theory of first-class tuples [13]. Here we briefly discuss work in progress to improve the implementation of these theories (and future similar theories) using polymorphism. Both theories are currently supported by detecting instances of the polymorphic theory and adding the relevant instances of that theory's axioms to the input problem. For example, for the polymorphic theory of arrays, for each array sort array(t1, t2) detected in the input problem, we add instances of the array axioms at that sort. With support for polymorphism, as soon as we detect that arrays of any kind are used we can simply add the three polymorphic axioms.
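For illustration, the three polymorphic array axioms might be written in TF1 roughly as follows. This is a sketch following the usual select/store presentation of the theory of arrays [12]; the symbol names and the exact form of the axioms used inside Vampire are our assumptions, not taken verbatim from the paper.

  tff(array_type, type, array: ($tType * $tType) > $tType).
  tff(select_type, type, select: !>[I: $tType, V: $tType]: ((array(I, V) * I) > V)).
  tff(store_type, type, store: !>[I: $tType, V: $tType]: ((array(I, V) * I * V) > array(I, V))).
  % reading a just-written index returns the written value
  tff(read_over_write_same, axiom, ![I: $tType, V: $tType, A: array(I, V), X: I, Y: V]:
      (select(I, V, store(I, V, A, X, Y), X) = Y)).
  % reading any other index is unaffected by the write
  tff(read_over_write_other, axiom, ![I: $tType, V: $tType, A: array(I, V), X: I, Z: I, Y: V]:
      ((X != Z) => (select(I, V, store(I, V, A, X, Y), Z) = select(I, V, A, Z)))).
  % arrays agreeing on all indices are equal
  tff(extensionality, axiom, ![I: $tType, V: $tType, A: array(I, V), B: array(I, V)]:
      ((![X: I]: (select(I, V, A, X) = select(I, V, B, X))) => (A = B))).

Under the monomorphic scheme, a fresh copy of such axioms, with I and V replaced by concrete types, is generated for every array sort occurring in the input.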
This change has only a minor impact on proof search. Instead of adding 3n clauses when we have n different instances of the polymorphic theory, we add only 3 clauses. As n is usually small, this is unlikely to have a significant impact. At the same time, we should not see any negative impact: the polymorphic axioms act in the same way as the 3n instances did. The main impact is on the implementation of theories within Vampire. A non-trivial amount of complexity is required within the Vampire codebase to support the current mechanisms for the two supported polymorphic theories. Adding a new polymorphic theory of this kind involves a lot of boilerplate code and the updating of various definitions. Replacing this machinery with polymorphic theory axioms will simplify the code significantly. For example, if the SMT-LIB language is extended to support polymorphism in the future (this has been discussed, e.g., in [6], but not implemented), internal support for polymorphism would make supporting the polymorphic theory of term algebras straightforward.

Moreover, not all polymorphic theories are supported by the mechanism described above; our current approach of adding instantiated axioms based on the input is complete for the theory of arrays, but cannot be complete in general, as shown by Bobot and Paskevich [4]. For the theory of combinatory logic, for example, no decision procedure can exist for selecting a set of monomorphic combinator axioms to add to a problem that ensures completeness (even though such a set must exist).

The polymorphism of TPTP's TF1 language is inspired by ML-style polymorphism but differs in its use of type quantifiers. As pointed out by Blanchette et al. [3], ML-style polymorphism avoids explicit type quantifiers, choosing to determine type signatures from the types of arguments, results or additional annotations (which are sometimes needed to guide Hindley-Milner type inference). Comparatively, type checking is more straightforward in TF1 due to explicit signatures and explicit type quantifiers.

As mentioned earlier, there are two main methods for reasoning in polymorphic logic: natively or via translations. We discuss related work in each direction.

Zipperposition [9] was built using explicit polymorphism: types are explicitly represented in terms and inferences perform unification on both terms and types. The main difference from our approach is that we are 'retro-fitting' polymorphism into a monomorphic theorem prover. Additionally, our 'types as terms' internal representation (mostly) removes the additional book-keeping required when performing separate term and type unification.

There are three main approaches to translation: type tags, type guards, and type arguments. The purpose of both the type tag and the type guard encoding is to ensure that unsound inferences violating typing constraints cannot occur in the untyped problem. We do not provide details of the encodings here, but refer readers to [2], with a further example given by Brown et al. [7] in their work on translating between different TPTP formalisms. Consider the following satisfiable polymorphic problem with a polymorphic predicate p:

  tff(a, type, p: !>[X: $tType]: (X > $o)).
  tff(b, conjecture, ?[X: $i, Y: $int]: (p($int, Y) => p($i, X))).

The negated conjecture becomes the two clauses ~p($i, X) and p($int, Y). Clearly, if we drop the types (i.e., via type erasure) then this satisfiable problem becomes unsatisfiable, as we can no longer differentiate between the two versions of p. Using type tags we would get ~p(ti(X, $i)) and p(ti(Y, $int)), and with type guards we would get ~isi(X) | ~p(X) and ~isint(Y) | p(Y); both prevent the unsatisfiability arising from type erasure at the expense of introducing extra functions or predicates. We achieve the same through type inference and unification. The type argument translation looks similar to our internal representation of types, i.e., types are encoded as terms. However, without being aware of the types of equalities in which at least one side is a variable (as we are in our internal representation), this encoding can be unsound, as equalities can capture cardinality constraints between types.
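As a minimal illustration of this last point (our own example, in the spirit of [2], not taken from the paper), the following typed problem is satisfiable: unit may be interpreted by a singleton domain while $i contains the two distinct elements a and b.

  tff(unit_type, type, unit: $tType).
  tff(a_decl, type, a: $i).
  tff(b_decl, type, b: $i).
  % unit has at most one element
  tff(unit_at_most_one, axiom, ![X: unit, Y: unit]: (X = Y)).
  % $i has at least two elements
  tff(a_neq_b, axiom, a != b).

If the variable types are erased, as in a naive type argument translation, the first axiom becomes ![X, Y]: (X = Y) and forces the whole untyped domain to collapse to a single element, contradicting a != b; the encoded problem is unsatisfiable even though the original is satisfiable.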
We have successfully extended a state-of-the-art first-order prover to support prenex polymorphism and shown that the difficulty in doing so is not as great as might be expected. We hope to encourage other researchers to do the same. Theoretically, extending Vampire to polymorphic FOL should be graceful, in the sense that no degradation of performance should be seen on non-polymorphic problems. Our experimental results do not bear this out. In future work, we hope to achieve two objectives: firstly, to fix and refine our implementation of polymorphism such that no degradation on monomorphic or untyped problems is experienced; secondly, as outlined above, to utilise polymorphism to simplify and extend theory reasoning in Vampire for polymorphic theories such as arrays.

[1] A combinator-based superposition calculus for higher-order logic
[2] Encoding monomorphic and polymorphic types
[3] TFF1: the TPTP typed first-order form with rank-1 polymorphism
[4] Expressing polymorphic types in a many-sorted language
[5] Implementing polymorphism in SMT solvers
[6] Extending SMT-LIB v2 with λ-terms and polymorphism
[7] GRUNGE: a grand unified ATP challenge
[8] Handling polymorphism in automated deduction
[9] Extending superposition with integer arithmetic, structural induction, and beyond
[10] Superposition with structural induction
[11] Zenon Modulo: when Achilles outruns the tortoise using deduction modulo
[12] The Vampire and the FOOL
[13] A FOOLish encoding of the next state relations of imperative programs
[14] First-order theorem proving and Vampire
[15] The higher-order prover Leo-III
[16] StarExec: a cross-community infrastructure for logic solving
[17] The TPTP problem library and associated infrastructure