Report No. 452

A RESOLUTION STYLE PROOF PROCEDURE FOR HIGHER-ORDER LOGIC*

by

Lawrence Joseph Henschen

June, 1971

Department of Computer Science
University of Illinois at Urbana-Champaign
Urbana, Illinois 61801

* This work was supported in part by the National Science Foundation under Grant No. US NSF GJ-812 and was submitted in partial fulfillment for the Doctor of Philosophy degree in Mathematics, 1971.

Lawrence Joseph Henschen, Ph.D.
Department of Computer Science
University of Illinois, 1971

The purpose of this thesis is to suggest an alternative to the λ-calculus for automating higher-order logic. It is proposed that higher-order problems be formulated using an n-sorted logical language. Such a language is a first-order language in which there are a finite number of distinct types of individuals. The motivation for this can be seen as follows. Let there be two types of objects, say type 1 and type 2. Let a be an object of type 1, b an object of type 2, and P a two-place relation symbol whose first argument is of type 1 and second argument is of type 2. In a natural way, a can be associated with a predicate over objects of type 2 as follows: the associated predicate holds for b if and only if P(a, b) holds.
Then, if one quantifies over a type 1 variable, one is essentially quantifying over the names of predicates. By adding more types, one can extend this idea to the full hierarchy of higher-order logic.

The main body of the work consists of essentially three sections. In the first section, the two languages dealt with, the λ-calculus and n-sorted logic, are described, and the relevant facts about each are mentioned. It is also noted that trivial modifications to the proofs of standard theorems on resolution make those proofs hold for n-sorted resolution as well.

In the second section, it is shown how to translate a sentence A of a λ-calculus into a well-formed expression B of an associated n-sorted logic. The translation preserves validity, or equivalently, inconsistency; that is, if A is valid (in the Henkin sense), then B is valid. Finally, it is shown that by adding a set T of axioms, called Henkin axioms, to B one has that A is valid if and only if B ∪ T is valid. The Henkin axioms include, among others, a set of axioms which are analogues of the standard comprehension axioms of logic.

In the last section, it is shown that the Henkin axioms can be put into a special form which is compatible with the structure of data used by resolution theorem-provers. A new rule of inference, called naming, is proposed to eliminate the need for actually storing the Henkin axioms in memory.

The work contains an appendix with some examples of higher-order problems solved using an n-sorted representation and the resolution and naming rules.

ACKNOWLEDGMENT

The author would like to express his deep appreciation to Professor W. W. Boone for his encouragement and inspiration. While Professor Boone was unable to serve on the author's final committee because of ill health, he was certainly there in spirit. The author would also like to thank Professor D. B.
Gillies for his aid in preparing this thesis and his financial support. This work was supported in part by the National Science Foundation under Grant No. GJ-812. Finally, the author wishes to thank his mother and Mrs. Linda Bridges for their aid in preparing the manuscript and the Offset Department of the Department of Computer Science for printing the final copy.

TABLE OF CONTENTS

1. INTRODUCTION
   1.1 Historical Background
   1.2 Outline of the Work
2. THE λ-CALCULUS AND n-SORTED LOGIC
   2.1 Preliminary Remarks on Defining a Logical System
   2.2 The λ-Calculus
   2.3 n-Sorted Logic
   2.4 Resolution for n-Sorted Logic
3. EQUIVALENCE OF λ-CALCULI TO n-SORTED THEORIES
   3.1 Locally Henkin Frames
   3.2 Representation of the λ-Calculus in an n-Sorted System
   3.3 The Associated Realization of K
4. COMPUTER IMPLEMENTATION OF n-SORTED LOGIC FOR HIGHER-ORDER LOGIC
   4.1 Preliminary Comments
   4.2 The Naming Rule
   4.3 Another Look at the Closure Axioms
   4.4 Some Completeness Results
5. SUGGESTIONS FOR FUTURE RESEARCH
LIST OF REFERENCES
APPENDIX
VITA

1. INTRODUCTION

1.1 Historical Background

In this section, a brief history of automatic theorem-proving is given.

One approach to the general topic of artificial intelligence is to use the language of mathematical logic to express one's problems. One can then apply the well-developed theories of mathematical logic to obtain algorithms for actually solving problems. These problems are almost always of the form, "Determine if a particular statement S of the language is always true". For one of the basic languages of mathematical logic called first-order logic (see Section 2.3 for a precise definition of first-order logic), the property of a statement S always being true is equivalent (Church [2]) to S being a formal theorem (again see Chapter 2 for a precise definition of formal theorem).
Thus, algorithms that handle such problems are called theorem-proving algorithms or theorem-provers, and computer programs that embody such algorithms are called automatic theorem-provers.

Of course, it is well known that there does not exist an algorithm which will determine, for an arbitrary first-order statement S, whether or not S is a formal theorem (Church [2]). That is, first-order logic is undecidable. However, there do exist algorithms which, if given a first-order statement S which is always true, will demonstrate that S is always true. Such an algorithm is called a proof procedure. Determining that S is always true is equivalent to determining that the negation or denial of S is always false or inconsistent. Most algorithms for handling first-order logic actually demonstrate that the denial of S is inconsistent and are called refutation procedures. However, because S is always true if and only if the denial of S is always false, the two terms, proof procedure and refutation procedure, are used interchangeably.

All recent proof procedures (except the simple-minded procedure of listing all formal proofs, and hence, all theorems) are based on a fundamental theorem of Herbrand [9] and a method proposed by Skolem [19] in 1928 (see Prawitz [13]). Herbrand's Theorem showed that one could demonstrate the inconsistency of a first-order statement S by generating a sequence S_1, S_2, ... of statements of Boolean algebra. The Boolean statements are calculated from S itself, and Herbrand's Theorem states that S is inconsistent if and only if some S_i is inconsistent. However, Boolean algebra is simpler than first-order logic in that it can be determined for each Boolean statement S_i whether or not S_i is inconsistent; that is, Boolean algebra is decidable. Skolem then proposed as a proof procedure calculating the sequence S_1, S_2, ... specified by Herbrand's Theorem and examining each S_i for inconsistency.
Unfortunately, this method was of little practical use at that time. The calculations for most problems were too long to be carried out by hand. Moreover, if the statement S was not inconsistent, no S_i in the sequence (which is usually infinite) is inconsistent. Thus, theoretically, a person would continue calculating forever. (This last problem is still present today, of course.)

In the late 1950's, after the development of high-speed digital computers, mathematicians again considered Skolem's procedure for automating first-order logic. While the problem of calculating, or in this case computing, forever when S was not inconsistent was still present, researchers developed faster ways of generating and testing the S_i's (see, for example, Gilmore [6] or Davis and Putnam [5]) in the hopes of overcoming the magnitude of the calculations. However, even these programs required times in the neighborhood of half an hour on an IBM 7090 to prove some elementary theorems from group theory. The size of the Boolean expressions S_i increased tremendously. Thus, even the fastest methods for checking inconsistency were still not fast enough to make a practical automatic theorem-prover.

Then in 1965, J. A. Robinson [15] published a simple method consisting of a single operation called the resolution rule. The proof that the resolution rule was sufficient to demonstrate inconsistency was still based on Herbrand's Theorem. However, the resolution rule operates on S itself, thus eliminating the need to generate the rapidly expanding sequence of Boolean expressions. Since 1965, researchers have developed many refinements of the resolution rule and many strategies for applying these rules to the subexpressions of S (see, for example, Andrews [1], Loveland [11], Luckham [12], Robinson [16], and Wos, Robinson, and Carson [21]). The problems of group theory that previously required several minutes to a half-hour on an IBM 7090 can now be solved in a few seconds on an IBM 360.
There are two directions being taken by researchers in automatic theorem-proving today. First, work is being done to develop special-purpose theorem-provers for handling particular kinds of problems. Work on question-answering and information retrieval programs (see, for example, Green [7] or Darlington [3]), theorem-provers that handle equality (Robinson and Wos [14]), and theorem-provers incorporating induction schemata fall in this category.

The second direction is the attempt to develop theorem-provers for a class of logical systems grouped together under the heading higher-order logic (Robinson [17]). It has been pointed out (Davis [4]) that these higher-order systems can be expressed in first-order logic. However, until now no such representation has been given which would result in first-order statements that were not beyond the capabilities of any automatic theorem-prover available today. On the other hand, many mathematical problems can be expressed very easily and naturally in these higher-order languages. It is the purpose of this thesis to show how higher-order logic can be expressed, in a natural way, in a first-order-like language called n-sorted logic. The translation from higher-order logic to n-sorted logic yields statements which are of a reasonable size for current theorem-proving programs. Finally, some suggestions as to the possible implementation of a resolution style automatic theorem-prover for this representation of higher-order logic will be made.

1.2 Outline of the Work

The introduction gave a brief history of automatic theorem-proving. The remainder of this work is divided into five sections: four more Chapters and an Appendix.

Chapter 2 contains descriptions of the two languages to be used: the λ-calculus and n-sorted logic. Some basic theorems about these languages will be stated. Finally, it will be shown that the resolution rule for first-order logic is extendable to n-sorted logic.
Chapter 3 contains the major results of this work. It is first shown how to translate the λ-calculus into an axiomatic theory of n-sorted logic. It is then shown that this translation preserves the property of being inconsistent.

In Chapter 4, the naming rule is introduced as a possible alternative to including certain of the axioms given in Chapter 3. Chapter 4 also contains some comments about the completeness of resolution and naming for higher-order theorem-proving in n-sorted logic.

Chapter 5 contains some concluding remarks and suggestions for further research. Finally, the Appendix contains examples of higher-order problems solved by the method of Chapter 4.

2. THE λ-CALCULUS AND n-SORTED LOGIC

2.1 Preliminary Remarks on Defining a Logical System

This chapter will describe the two logical systems to be considered, the λ-calculus and n-sorted logic, and the relevant facts about each one. In describing a logical system it is necessary to give:
1. the basic symbols;
2. the way these symbols are to be combined into well-formed formulas (hereafter abbreviated wffs); and
3. how the formulas are to be interpreted or how they are to be manipulated, or both, depending on whether the mathematician is interested in the existence of models for his formulas or in rules of inference and formal proofs.

These two systems are well known to logicians, and their descriptions are given for reference only.

2.2 The λ-Calculus

This section contains a description of a logical system called the λ-calculus. Most of this material is taken directly from Robinson [18], and hence the reference will be made only here at the beginning of this section.

The symbols of this language consist of type symbols, identifiers, the symbol "λ", the symbol "→", and the open and close parentheses, "(" and ")".

Definition 2.2.1
The set of types, usually denoted by T, is defined inductively:
1.
a finite set of symbols including the symbol TRUTHVALUE (abbreviated as "TV") is given as the set of simple types;
2. if α and β are types, then (α → β) is also a type and is called a complex type.
Finally, a type is either a simple type or a complex type.

There are infinitely many types, for given any two types, say α and β, a third type (α → β) must be present. Thus, starting with the finite set of simple types, one can construct inductively an infinite set of complex types which are also types.

NOTE: The language to be described in Section 2.3 also contains types. When possible confusion may arise, types of the λ-calculus will be referred to as λ-types.

Definition 2.2.2
The set of identifiers consists of
1. the special identifiers "OR", "AND", and "IMPLIES", all of type (TV → (TV → TV)), "NOT" of type (TV → TV), "TRUE" and "FALSE" of type TV, for each type α the symbol EQUAL_α of type (α → (α → TV)), and for each type α the symbol CHOICE_α of type ((α → TV) → α), and
2. for each type α, infinitely many non-special identifiers of type α.

The symbols used for the non-special identifiers can be chosen freely for their mnemonic value, subject only to the restrictions that they be distinct from each other and from the special identifiers. Each identifier is associated with a type. Identifiers of type (α → TV) are also called predicate symbols of type α.

Definition 2.2.3
A wff is a sequence of symbols defined inductively as follows:
1. an identifier of type α is a wff of type α, where α is a type;
2. if X is a non-special identifier of type α and B is a wff of type β, then (λXB) is a wff of type (α → β), and is called an abstraction;
3. if F is a wff of type (α → β) and B is a wff of type α, then (FB) is a wff of type β and is called an application.

As with identifiers, each wff has an associated type. A wff of type (α → TV) is called a predicate wff of type α, and a wff of type TV is called a sentence.
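
Definitions 2.2.1 through 2.2.3 can be read as a small type-assignment procedure. The following Python sketch is an illustration only and is not part of the original text: the class names (Simple, Arrow, Ident, Abs, App) and the function type_of are hypothetical, and the per-type subscripts on EQUAL and CHOICE are omitted for brevity.

```python
from dataclasses import dataclass
from typing import Union


# Types (Definition 2.2.1): simple types and complex types (alpha -> beta).
@dataclass(frozen=True)
class Simple:
    name: str


@dataclass(frozen=True)
class Arrow:
    arg: "Type"
    res: "Type"


Type = Union[Simple, Arrow]
TV = Simple("TV")


# Wffs (Definition 2.2.3): identifiers, abstractions, applications.
@dataclass(frozen=True)
class Ident:
    name: str
    ty: Type
    special: bool = False  # True for OR, AND, NOT, TRUE, ... (Definition 2.2.2)


@dataclass(frozen=True)
class Abs:
    var: Ident
    body: "Wff"


@dataclass(frozen=True)
class App:
    fn: "Wff"
    arg: "Wff"


Wff = Union[Ident, Abs, App]


def type_of(w: Wff) -> Type:
    """Return the type of a wff, or raise TypeError if it is not well formed."""
    if isinstance(w, Ident):
        return w.ty
    if isinstance(w, Abs):
        if w.var.special:
            raise TypeError("only non-special identifiers may be abstracted")
        return Arrow(w.var.ty, type_of(w.body))
    if isinstance(w, App):
        f, b = type_of(w.fn), type_of(w.arg)
        if not isinstance(f, Arrow) or f.arg != b:
            raise TypeError(f"cannot apply a wff of type {f} to one of type {b}")
        return f.res
    raise TypeError("not a wff")


def is_sentence(w: Wff) -> bool:
    """A sentence is a wff of type TV."""
    return type_of(w) == TV


# Special identifiers from Definition 2.2.2 and one non-special identifier X.
NOT = Ident("NOT", Arrow(TV, TV), special=True)
OR = Ident("OR", Arrow(TV, Arrow(TV, TV)), special=True)
TRUE = Ident("TRUE", TV, special=True)
X = Ident("X", TV)
```

For instance, type_of(App(OR, TRUE)) is (TV → TV), and Abs(X, App(NOT, X)) is a predicate wff of type TV under this sketch.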
The reader may have noticed that all functions appear to be unary when, in fact, not all functions the mathematician deals with are unary. The method of handling this discrepancy can be illustrated using the standard "logical or" function, ∨, usually considered to be binary. Given two truth values a and b, a∨b is another truth value. Suppose only one truth value, say a, is given, and write a∨. If now a second truth value, say b, is supplied, the result a∨b is indeed a truth value. Hence, the object "a∨" produces a truth value when given a truth value; that is, a∨ is itself a unary function from truth values to truth values. Thus the binary function can be considered as two unary functions: the unary function ∨, which yields the second function a∨ when applied to an argument a. Thus ∨ is of type (TV → (TV → TV)). In a similar manner, any n-ary function can be considered as a sequence of n unary functions. This is the point of view taken in the λ-calculus.

Example 2.2.1
Suppose INTEGER and TV are the only two simple types. Then the set of complex types includes, among others,
1. (INTEGER → INTEGER),
2. (INTEGER → (INTEGER → INTEGER)), and
3. (INTEGER → TV).

Suppose the non-special identifiers include 0, 1, 2, ... all of type INTEGER, S and UMINUS (denoting unary minus) both of type (INTEGER → INTEGER), PLUS, MINUS, and TIMES all of type (INTEGER → (INTEGER → INTEGER)), and GREATER-THAN-ZERO of type (INTEGER → TV). Then the following are some examples of wffs and their types:
1. (UMINUS 2) of type INTEGER;
2. (MINUS 5) of type (INTEGER → INTEGER);
3. ((MINUS 5) 7) of type INTEGER;
4. (TIMES 1) of type (INTEGER → INTEGER);
5. ((TIMES 1) 1) of type INTEGER;
6. (λ1((TIMES 1) 1)) of type (INTEGER → INTEGER);
7. ((IMPLIES (GREATER-THAN-ZERO 1)) (GREATER-THAN-ZERO (S 1))) of type TV.

In the above example, the reader has probably already guessed the intended meaning of the identifiers. The numbers 0, 1, 2, ...
were meant to name the integers, S the successor function, UMINUS the unary minus function, etc. Of course, one could have some other assignment of values in mind. In general, it is intended that identifiers corresponding to a simple type stand for individual objects, while identifiers of a complex type, say (α → β), stand for functions from objects of type α to objects of type β. An application (FB) stands for the value of the function F when applied to the argument B. An abstraction, say (λXB), defines a new function by listing the argument of the function, in this case X, and the value of the function for the argument, in this case B.

The special identifiers are meant to stand for the functions normally associated with them. That is, OR is meant to stand for disjunction, AND for conjunction, IMPLIES for implication, NOT for negation, EQUAL_α for equality between objects of type α, and CHOICE_α for a choice function of type α. Of course, these functions are now to be considered as sequences of unary functions as described above for disjunction. Finally, TRUE is meant to stand for the truth value "true" and FALSE for the truth value "false". Note that the symbol "TRUE" is an identifier of the λ-calculus, while "true" is the value of that identifier. Thus the symbol "TRUE" is used in constructing a wff, while the value "true" may be the value of a sentence. Similarly for "FALSE" and "false", and the other special identifiers and their intended meanings.

The only function mentioned above which may not be familiar to the reader is the choice function. Its definition is given here for reference.

Definition 2.2.4
A choice function of type α is a function c which takes as an argument an object s of type (α → TV) and yields as its value an object c(s) with the property that s(c(s)) is true if there are any objects of type α at all for which s has the value true.

The above remarks about the intended meanings of identifiers will now be made precise.
Definition 2.2.5
An interpretation or Henkin frame is a pair (g, D) satisfying the following conditions:
1. D is a set of domains, one for each type in T; if α is a type, the domain associated with type α is denoted by D_α;
2. g is a map of identifiers into D such that if X is an identifier of type α, g(X) is an object in D_α;
3. D_TRUTHVALUE = {true, false};
4. D_(α → β) consists of functions from D_α into D_β;
5. g maps the special identifiers onto their regular meanings; that is, g(NOT) is negation, g(OR) is disjunction, g(AND) is conjunction, g(IMPLIES) is implication, g(EQUAL_α) is equality between objects of type α, and g(CHOICE_α) is a choice function for objects of type α;
6. D is closed under g; that is, g(A) is in D for every wff A, where g is extended to the set of all wffs inductively as follows:
   A. g(FB) is the value when the function g(F) is applied to the argument g(B);
   B. g(λXB) is that function which yields g'(B) when applied to g'(X), where g' = g everywhere except possibly at X and g'(X) is allowed to range over the domain in D associated with the type of X.

The intent of property 6 is to ensure that the domain D contains all the objects that the language is able to "talk about". That is, given the usual interpretation of application and abstraction, the interpretation of every wff (not just every identifier) is in D.

Definition 2.2.6
A sentence A is said to be valid if g(A) = true for every interpretation (g, D). A sentence A is said to be consistent if g(A) = true for some interpretation (g, D); in this case, (g, D) is called a model of A. A sentence A is inconsistent if g(A) = false for every interpretation (g, D).

Clearly a sentence A is valid if and only if the sentence (NOT A) is inconsistent.
Example 2.2.1 (continued)
Let (g, D) be an interpretation of the λ-calculus described in this example in which D_INTEGER is the set of integers, g(0) = 0, g(1) = 1, ..., g(UMINUS) is the unary minus function, g(PLUS) is addition, g(MINUS) is subtraction, g(TIMES) is multiplication, g(S) is the successor function, and g(GREATER-THAN-ZERO) is the predicate which is true of an integer i if and only if i is positive. Then the evaluations of the formulas listed above are as follows:
1. g(UMINUS 2) is the integer -2;
2. g(MINUS 5) is the function which yields 5 - a when applied to an integer a;
3. g((MINUS 5) 7) is the integer -2;
4. g(TIMES 1) is the function which yields 1·a when applied to an integer a;
5. g((TIMES 1) 1) is the integer 1;
6. g(λ1((TIMES 1) 1)) is the square function; to see this, note that g(λ1((TIMES 1) 1)) is that function f which yields g'((TIMES 1) 1) when applied to g'(1), where g' = g except at 1 and g'(1) varies over the integers; now g'(TIMES) is still multiplication, so that g'((TIMES 1) 1) is just (g'(1))²; that is, f applied to an integer a yields a²;
7. g((IMPLIES (GREATER-THAN-ZERO 1)) (GREATER-THAN-ZERO (S 1))) is true.

Note that the values of the wffs in 2, 4, and 6 are functions from integers to integers. Because (g, D) was assumed to be an interpretation, it must satisfy condition 6 of Definition 2.2.5; thus, whatever other functions are contained in D_(INTEGER → INTEGER), the ones listed in 2, 4, and 6 above must be present.

Formulas 1-5 illustrate application, formula 6 illustrates the definition of a new function by abstraction, and formula 7 shows the evaluation of one of the standard logical connectives. In formula 6, the identifier 1 appears to play the role of a variable, while in formulas 4, 5, and 7 it stands for a fixed integer. It is the case in the λ-calculus that an identifier is used both as a constant and as a variable.
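
The inductive extension of g to all wffs (parts A and B of condition 6 in Definition 2.2.5) can be traced mechanically on the formulas of Example 2.2.1. The Python sketch below is illustrative only: the tuple encoding of wffs, the helper names app, lam, and evaluate, and the use of a dictionary for g are assumptions made here, and the sketch does not represent the domains D or check the closure condition itself.

```python
# Wffs are encoded (by assumption) as: an identifier name (str),
# ("app", F, B) for an application (FB), or ("lam", X, B) for (λXB).

def app(f, b):
    return ("app", f, b)


def lam(x, body):
    return ("lam", x, body)


def evaluate(wff, g):
    """Extend the identifier map g to wffs, following Definition 2.2.5(6)."""
    if isinstance(wff, str):
        return g[wff]
    tag = wff[0]
    if tag == "app":
        _, f, b = wff
        # Part A: apply the value of F to the value of B.
        return evaluate(f, g)(evaluate(b, g))
    if tag == "lam":
        _, x, body = wff
        # Part B: the function sending a value v to g'(B), where g' agrees
        # with g everywhere except possibly at X, and g'(X) = v.
        return lambda v: evaluate(body, {**g, x: v})
    raise ValueError(f"not a wff: {wff!r}")


# The interpretation of Example 2.2.1; n-ary functions are curried.
g = {
    "0": 0, "1": 1, "2": 2, "5": 5, "7": 7,
    "UMINUS": lambda a: -a,
    "S": lambda a: a + 1,
    "PLUS": lambda a: lambda b: a + b,
    "MINUS": lambda a: lambda b: a - b,
    "TIMES": lambda a: lambda b: a * b,
    "GREATER-THAN-ZERO": lambda a: a > 0,
    "IMPLIES": lambda p: lambda q: (not p) or q,
}

formula_1 = app("UMINUS", "2")
formula_3 = app(app("MINUS", "5"), "7")
formula_6 = lam("1", app(app("TIMES", "1"), "1"))
formula_7 = app(app("IMPLIES", app("GREATER-THAN-ZERO", "1")),
                app("GREATER-THAN-ZERO", app("S", "1")))

square = evaluate(formula_6, g)
```

Here evaluate(formula_1, g) and evaluate(formula_3, g) both give -2, square(4) gives 16, and evaluate(formula_7, g) gives True. Note how formula 6 rebinds the identifier 1 inside the abstraction while g itself is left unchanged, which is the constant-versus-variable point made above.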
This is not true in n-sorted logic, to be described in the next section, where constants and variables are represented by distinct symbols. This fundamental difference in the two languages will arise often in the theory to be presented in Chapters 3 and 4.

The next three definitions single out wffs of particular forms which will play an important role in the material in Chapters 3 and 4.

Definition 2.2.7
An atom is a wff in one of the following two forms:
1. ((EQUAL A) B); or
2. (FB), where F is a wff of type (α → TV) not of the form (OR A), (AND A) or (IMPLIES A) for some A.

Definition 2.2.8
A literal is an atom or a wff of the form (NOT A) where A is an atom.

Definition 2.2.9
A clause is a wff which is defined inductively as follows:
1. a literal is a clause;
2. if C_1 and C_2 are clauses, then ((OR C_1) C_2) is also a clause.

The reader may also have noticed the absence of the universal and existential quantifiers, ∀ and ∃. In fact, in the λ-calculus these two concepts are expressed as follows: ∀XA is ((λXA) (CHOICE (λX(NOT A)))) and ∃XA is ((λXA) (CHOICE (λXA))), where A is a sentence. The first assertion will now be proved. The second one can be proved in a similar manner.

Suppose a Henkin frame is given in which ∀XA is true. CHOICE (λX(NOT A)) is an object of the same type as X, and so when λXA is applied to that object, the result must be true. Conversely, suppose ((λXA) (CHOICE (λX(NOT A)))) is true and suppose it were possible that ∀XA were false. Then λXA would be false for some object of the type of X, and hence λX(NOT A) would be true of some such object. But, by the definition of CHOICE, CHOICE (λX(NOT A)) would be such an object (that is, an object for which λX(NOT A) were true); hence (λXA) applied to CHOICE (λX(NOT A)) would yield false, contrary to the hypothesis. Thus ∀XA is true if and only if ((λXA) (CHOICE (λX(NOT A)))) is true.
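
The argument above can be checked exhaustively over a small finite domain. The Python sketch below is a hedged illustration and not the author's construction: make_choice builds one particular choice function in the sense of Definition 2.2.4 (it returns the first witness, or an arbitrary element when there is none), and the names make_choice, forall_via_choice, exists_via_choice, and check_all_predicates are hypothetical.

```python
from itertools import product


def make_choice(domain):
    """Build a choice function over a finite, non-empty domain."""
    domain = list(domain)
    if not domain:
        raise ValueError("the domain must be non-empty")

    def choice(pred):
        # Return an object satisfying pred if one exists (Definition 2.2.4);
        # otherwise the value is arbitrary, here the first element.
        for obj in domain:
            if pred(obj):
                return obj
        return domain[0]

    return choice


def forall_via_choice(pred, choice):
    """∀X A expressed as ((λX A) (CHOICE (λX (NOT A))))."""
    return pred(choice(lambda x: not pred(x)))


def exists_via_choice(pred, choice):
    """∃X A expressed as ((λX A) (CHOICE (λX A)))."""
    return pred(choice(pred))


def check_all_predicates(domain):
    """Compare both encodings with all()/any() for every predicate on domain."""
    domain = list(domain)
    choice = make_choice(domain)
    for table in product([False, True], repeat=len(domain)):
        truth = dict(zip(domain, table))
        pred = truth.__getitem__
        if forall_via_choice(pred, choice) != all(table):
            return False
        if exists_via_choice(pred, choice) != any(table):
            return False
    return True
```

Calling check_all_predicates(range(4)) runs through all sixteen predicates on a four-element domain. The check is only a finite spot test; the general statement rests on the proof given above.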
The language described above may be supplemented by singling out some set of wffs, which are then called axioms, and including rules of inference for transforming one or more formulas into new formulas. Different choices of axioms and rules of inference lead to different inference systems.

Definition 2.2.10
A proof is a sequence of wffs such that each wff is either an axiom or is derivable from previous wffs of the sequence by some rule of inference.

Definition 2.2.11
A theorem is a wff which is the last wff of a proof.

While the notions of proof and theorem are natural ones for humans, as mentioned in Section 1.1, almost all automatic theorem-provers (including the one proposed in this work) attempt to demonstrate that a wff A is valid, or equivalently, that (NOT A) is inconsistent. However, there is an important relation between the concepts of validity and proof which was given by Henkin [8].

Theorem 2.2.1 (Henkin's Completeness Theorem)
A sentence A of the λ-calculus is a theorem (in a suitable inference system) if and only if A is valid.

As a final note on the λ-calculus in this section, it should be pointed out that the domains D_(α → β) need not contain all functions from D_α to D_β. A Henkin frame (g, D) in which D_(α → β) does contain all functions from D_α to D_β for every complex type (α → β) is called a standard frame. Obviously, to be valid in all Henkin frames is a stronger condition than being valid in all standard frames. In fact, validity in all standard frames is insufficient to guarantee provability. This explains the apparent inconsistency between Henkin's Completeness Theorem and Gödel's Incompleteness Theorem for Second Order Logic. Gödel showed how to construct a sentence J which, when interpreted in a standard frame, had the meaning that J itself was not a theorem. Thus, if J were a theorem, it would be valid in all standard frames (see Church [2]), and so by the previous sentence, J would not be a theorem, a contradiction.
Thus J is not a theorem, and so it is valid in all standard frames. Then J is a sentence which is valid in the standard sense, but not a theorem. This is Gödel's Incompleteness Theorem. In constructing J, Gödel defined equality between two objects A and B as (∀F((IMPLIES (FA)) (FB))); that is, A and B are the same object if for every property F, if F holds for A it also holds for B. This is not an adequate definition of equality in some non-standard interpretations. In these interpretations, then, J will not necessarily have the meaning that J is not a theorem. In fact, J fails to be true in some non-standard interpretation. Thus, J is not valid in the general or Henkin sense, and Henkin's Completeness Theorem is not violated.

While it may be argued philosophically which notion of validity to accept, it seems clear that for the purposes of automatic theorem-proving the correct point of view is to take validity in the Henkin sense, since this is the view that links validity with provability (see Robinson [17]).

2.3 n-Sorted Logic

The reference for this material is Kreisel and Krivine [10], Chapter 5. The notation is slightly different, but the concepts are the same.

In an n-sorted logic, there are exactly n types, where n is some integer; hence the name n-sorted logic. Moreover, unlike the λ-calculus, all types are simple types. When n is 1, the language is called first-order.

Definition 2.3.1
An n-sorted language, K, consists of the following symbols:
1. a set of n type symbols;
2. for each type i, an infinite set V_i of variables of type i; the set of all variables is denoted by V;
3. for each type i, a (possibly empty) set C_i of constant symbols of type i;
4. for each integer k ≥ 1 and each (k+1)-tuple of types (i_1, ..., i_k, i), a (possibly empty) set of k-ary function symbols;
5. for each integer k > 0 and each k-tuple of types (i_1, ...,
i_k), a (possibly empty) set of relation symbols.

Definition 2.3.2
A term of type i is a sequence of symbols of K defined inductively as follows:
1. a variable or constant symbol of type i is a term of type i;
2. if f is a function symbol of type (i_1, ..., i_k, i) and t_1, ..., t_k are terms of type i_1, ..., i_k respectively, then f(t_1, ..., t_k) is a term of type i.

The set of terms of type i is denoted by T_i, and the set of all terms is denoted by T.

Definition 2.3.3
A wff is a sequence of symbols of K defined inductively as follows:
1. if P is a relation symbol of type (i_1, ..., i_k) and t_1, ..., t_k are terms of type i_1, ..., i_k respectively, then P(t_1, ..., t_k) is a wff and is called an atom;
2. if A and B are wffs, then so are (¬A), (A ∨ B), (A ∧ B), (A → B), (∀x A) and (∃x A), where x is a variable.

When the relation symbol P in 1 above is =, the atom will be written (t_1 = t_2) to correspond to normal usage of the equality symbol.

Definition 2.3.4
The concept of a bound occurrence of a variable is defined inductively as follows:
1. in an atom, no occurrence of a variable is bound;
2. an occurrence of a variable in (A ∨ B), (A ∧ B), (A → B), and (¬A) is bound if and only if the corresponding occurrence in A or in B is bound;
3. every occurrence of x in (∀x A) or (∃x A) is bound; an occurrence of a variable distinct from x is bound in (∀x A) or (∃x A) if and only if the corresponding occurrence in A is bound.

An occurrence of a variable that is not a bound occurrence is called a free occurrence.

Definition 2.3.5
A closed wff is a wff which has no free occurrence of a variable. An open wff is a wff that is not closed.

Example 2.3.1
Suppose there are two types, 1 and 2. Let SQRT and * be function symbols of types (1,2) and (2,2,1) respectively. Let P and = be relation symbols of type (1) and (1,1) respectively. Suppose the variables of type 1 are x_1, x_2, ..., and the variables of type 2 are y_1, y_2, ... .
Then the following are terms of the indicated types:
1. SQRT(x_1) of type 2; and
2. *(SQRT(x_1), SQRT(x_1)) of type 1.

The two expressions P(x_1) and (x_1 = *(SQRT(x_1), SQRT(x_1))) are atoms, and the two expressions ∀x_1(P(x_1) → (∃y_1(x_1 = *(y_1, y_1)))) and ((∀x_1 P(x_1)) → (x_1 = y_1)) are wffs.

Intuitively, the different types of the language are meant to indicate distinct kinds of objects. Variables are meant to range over an appropriate type of object, while constant symbols represent fixed objects of the appropriate type. Function symbols and relation symbols are meant to stand for actual functions and relations; the associated k- or (k+1)-tuple determines the type of object allowable in each argument position and, in the case of function symbols, the type of the value of the function. These intuitive remarks will now be made clear as the concept of a realization of the language K is introduced. A realization is the analogue of an interpretation for the λ-calculus.

Definition 2.3.6
A realization of an n-sorted language, K, consists of a pair (h, E) satisfying the following conditions:
1. E is a set of n non-empty domains, E_1, E_2, ..., E_n, one for each type;
2. h(c) ∈ E_i if c is a constant symbol of type i;
3. h(f) is a map of E_(i_1) × ... × E_(i_k) into E_i if f is a function symbol of type (i_1, ..., i_k, i);
4. h(P) is a map of E_(i_1) × ... × E_(i_k) into {true, false} if P is a relation symbol of type (i_1, ..., i_k).

The following notation, used by Kreisel and Krivine [10], has two advantages. First, it allows rigorous definition of the meaning of the quantifiers without introducing names of objects into the language and without resorting to intuition. Second, it allows interpretation of open formulas.

Let E^V denote the set of all maps δ of V into E that preserve type, i.e. such that δ(V_i) ⊆ E_i for each type i. Each such δ assigns a value δ(t) to every term t, and h is extended to wffs so that h(F) is a subset of E^V for each wff F; in particular, h(A → B) = (E^V − h(A)) ∪ h(B) (so that A → B is equivalent to ((¬A) ∨ B), as one normally would expect).
For the quantifiers, $h(\exists x A)$ is the projection of $h(A)$ along $x$, that is
$h(\exists x A) = \{\delta \in E^V \mid \text{there is some } \delta' \in h(A) \text{ and } \delta' = \delta \text{ except possibly at } x\}$.
Finally,
$h(\forall x A) = \{\delta \in E^V \mid \text{every } \delta' \text{ such that } \delta' = \delta \text{ except possibly at } x \text{ satisfies } \delta' \in h(A)\}$.
Again, one can verify that $h(\exists x A) = h(\neg \forall x \neg A)$ as would be expected. Kreisel and Krivine [10] show that $h(F)$ depends only on the variables which have free occurrences in $F$. In particular, if $F$ is closed, then $h(F) = E^V$ or $h(F) = \emptyset$. The property $h(F) = E^V$ corresponds intuitively to $F$ being true in the interpretation $(h, E)$.

Definition 2.3.7 If $A$ is a set of wffs of K and $(h, E)$ is a realization of K such that $h(F) = E^V$ for every wff $F$ in $A$, then $(h, E)$ is called a model of $A$. A model of $A$ in which the binary relation symbol $=$ is mapped onto equality between objects of type $\alpha$ and objects of type $\beta$ is called a normal model. If every realization of K is a model of $A$, then $A$ is said to be valid. If $A$ has at least one model, then $A$ is called consistent. If $A$ has no model, $A$ is said to be inconsistent.

Example 2.3.1 (continued) Let a realization $(h, E)$ for the language K in Example 2.3.1 be defined as follows: $E_1$ = integers; $E_2$ = real numbers; $h(\mathrm{SQRT})$ is the function which yields the real square root when applied to a non-negative integer and 0 otherwise, $h(*)$ the function which yields the greatest integer $\le$ the product of two real numbers; $h(P)$ is that relation which is true of an integer $i$ if and only if $i \ge 0$, and $h(=)$ that relation which is true of an integer $i$ and an integer $r$ if and only if $i$ and $r$ are the same number. Then for any $\delta \in E^V$, $\delta(\mathrm{SQRT}(x_1))$ is the real number $\sqrt{\delta(x_1)}$ if $\delta(x_1) \ge 0$ and is the real number 0 if $\delta(x_1) < 0$. Next, $\delta(*(\mathrm{SQRT}(x_1), \mathrm{SQRT}(x_1)))$ is the integer $\delta(x_1)$ if $\delta(x_1) \ge 0$, the integer 0 if $\delta(x_1) < 0$. For the wffs, $h(P(x_1))$ is the set of all $\delta \in E^V$ such that $\delta(x_1) \ge 0$. $h(x_1 = *(\mathrm{SQRT}(x_1), \mathrm{SQRT}(x_1)))$ is also the set of all $\delta \in E^V$ such that $\delta(x_1) \ge 0$.
For, if $\delta(x_1) \ge 0$, by the above, $\delta(*(\mathrm{SQRT}(x_1), \mathrm{SQRT}(x_1)))$ is $\delta(x_1)$, while if $\delta(x_1) < 0$, $\delta(*(\mathrm{SQRT}(x_1), \mathrm{SQRT}(x_1)))$ is $0 \ne \delta(x_1)$. Finally, $h(\forall x_1 (P(x_1) \to (\exists y_1 (x_1 = *(y_1, y_1))))) = E^V$. To see this, consider an arbitrary $\delta$. If $\delta(x_1) \ge 0$, then $\delta \in h(P(x_1))$ and so $\delta \notin (E^V - h(P(x_1)))$. However, $\sqrt{\delta(x_1)}$ is a real number. Let $\delta' = \delta$ except at $y_1$ where $\delta'(y_1) = \sqrt{\delta(x_1)}$. Then clearly $\delta' \in h(x_1 = *(y_1, y_1))$, and so by the definition of the evaluation of an existential quantifier, $\delta \in h(\exists y_1 (x_1 = *(y_1, y_1)))$ and so this $\delta$ is in $h(P(x_1) \to (\exists y_1 (x_1 = *(y_1, y_1))))$. Suppose then that $\delta(x_1) < 0$. Then $\delta \notin h(P(x_1))$ so that $\delta \in (E^V - h(P(x_1)))$. Thus, every $\delta$ is in $h(P(x_1) \to (\exists y_1 (x_1 = *(y_1, y_1))))$. By the definition of the evaluation of a universal quantifier, $h(\forall x_1 (P(x_1) \to (\exists y_1 (x_1 = *(y_1, y_1))))) = E^V$.

The definition of atom in an n-sorted language has already been given. Definitions of literal and clause for such languages (the analogues of Definitions 2.2.8 and 2.2.9) will now be given.

Definition 2.3.8 A literal is an atom or a wff $(\neg A)$ where $A$ is an atom.

Definition 2.3.9 Clause is defined inductively as follows:
1. a literal is a clause;
2. if $c_1$ and $c_2$ are clauses, then so is $(c_1 \vee c_2)$.

Definition 2.3.10 The wff $(A \to B) \wedge (B \to A)$ will be abbreviated $(A \leftrightarrow B)$.

Definition 2.3.11 A unit is a clause containing exactly one literal.

The next three theorems from Kreisel and Krivine [10] are central to the results later.

Theorem 2.3.1 (The Finiteness or Compactness Theorem) Let $S$ be a set of closed wffs. Then $S$ has a model if and only if every finite subset of $S$ has a model.

Definition 2.3.12 A wff $B$ is in Skolem normal form if it is of the form $\forall x_1 \ldots \forall x_k (F(x_1, \ldots, x_k))$ where $F(x_1, \ldots, x_k)$ is a quantifier free wff. $F(x_1, \ldots, x_k)$ is called the matrix of $B$.
Theorem 2.3.2 (The Uniformity Theorem) A wff $B$ in Skolem normal form with matrix $F(x_1, \ldots, x_k)$ is inconsistent if and only if there is a finite set of $k$-tuples of terms $(t_1^1, \ldots, t_k^1), (t_1^2, \ldots, t_k^2), \ldots, (t_1^m, \ldots, t_k^m)$ such that $F(t_1^1, \ldots, t_k^1) \wedge \ldots \wedge F(t_1^m, \ldots, t_k^m)$ is an inconsistent wff of the propositional calculus.

Theorem 2.3.3 (The Skolem Normal Form Theorem) There is an effective procedure which will transform a closed wff $B$ into a wff $C$ in Skolem normal form so that $B$ has a model if and only if $C$ has a model.

2.4 Resolution for n-Sorted Logic

This section describes the extension of the work of Robinson [15] to n-sorted logic. The proofs of the n-sorted analogues of the theorems in that paper are almost word for word the same as Robinson's original proofs for first-order resolution. Therefore, no proofs will be given here.

Definition 2.4.1 A substitution component is a pair $U/V$ where $U$ is a term and $V$ is a variable of the same type as $U$.

Definition 2.4.2 A substitution is a set of substitution components $\{U_1/V_1, \ldots, U_k/V_k\}$ satisfying $V_i \ne V_j$ if $i \ne j$.

Definition 2.4.3 The application of a substitution $\sigma$ to a string of symbols $E$, denoted by $E\sigma$, is to simultaneously replace each occurrence of a variable $V_i$ by the corresponding term $U_i$.

The restriction that $U$ and $V$ be of the same type for a substitution component $U/V$ guarantees that the application of a substitution to a wff yields another wff.

Definition 2.4.4 Let $\sigma = \{U_1/x_1, \ldots, U_j/x_j, T_1/y_1, \ldots, T_k/y_k\}$ and $\lambda = \{S_1/y_1, \ldots, S_k/y_k, W_1/z_1, \ldots, W_m/z_m\}$. Then the composition $\sigma\lambda$ is $\{U_1\lambda/x_1, \ldots, U_j\lambda/x_j, T_1\lambda/y_1, \ldots, T_k\lambda/y_k, W_1/z_1, \ldots, W_m/z_m\}$; if $U_i\lambda = x_i$ or $T_i\lambda = y_i$ for some $i$, that component is not included.

Because n-sorted substitutions are substitutions, the elementary facts corresponding to 5.5.1-5.5.4 in [15] hold. That is, $(E\sigma)\lambda = E(\sigma\lambda)$ for all strings $E$; if $E\sigma = E\lambda$ for all strings $E$, then $\sigma = \lambda$; $(\sigma\lambda)\mu = \sigma(\lambda\mu)$ for all substitutions $\sigma$, $\lambda$, and $\mu$; and $(A \cup B)\sigma = A\sigma \cup B\sigma$.
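The sorted substitutions of Definitions 2.4.1 to 2.4.4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the classes `Var` and `Fn` and the helpers `make_subst`, `apply_subst` and `compose` are hypothetical names that do not appear in the text, and a substitution is represented as a dictionary from variables to terms.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str
    sort: str  # the type (sort) of the variable


@dataclass(frozen=True)
class Fn:
    name: str
    args: Tuple["Term", ...]
    sort: str  # the type (sort) of the value of the term


Term = Union[Var, Fn]
Subst = Dict[Var, Term]


def make_subst(components) -> Subst:
    """Build a substitution from (U, V) pairs, each read as a component U/V."""
    sub: Subst = {}
    for term, var in components:
        if term.sort != var.sort:  # Definition 2.4.1: U and V share a type
            raise TypeError(f"{term.name} and {var.name} differ in type")
        if var in sub:  # Definition 2.4.2: the variables are distinct
            raise ValueError(f"variable {var.name} occurs twice")
        sub[var] = term
    return sub


def apply_subst(term: Term, sub: Subst) -> Term:
    """Simultaneously replace each variable by its term (Definition 2.4.3)."""
    if isinstance(term, Var):
        return sub.get(term, term)
    new_args = tuple(apply_subst(a, sub) for a in term.args)
    return Fn(term.name, new_args, term.sort)


def compose(sigma: Subst, lam: Subst) -> Subst:
    """Composition sigma-lambda in the sense of Definition 2.4.4."""
    result: Subst = {}
    for var, term in sigma.items():
        image = apply_subst(term, lam)
        if image != var:  # trivial components V/V are not included
            result[var] = image
    for var, term in lam.items():
        if var not in sigma:  # components of lambda on new variables
            result[var] = term
    return result
```

Because every component is checked for agreement of types when the substitution is built, applying it to a well-typed term always yields a well-typed term, which is the point of the restriction noted after Definition 2.4.3.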
Definition 2.4.5 The disagreement set, $B$, of a set of well-formed expressions, $A$, is the set of well-formed subexpressions of expressions of $A$ which begin at the first symbol position in which not all expressions of $A$ agree.

Definition 2.4.6 If $\sigma$ is a substitution such that $A\sigma$ is a singleton set, then $\sigma$ is called a unifier of $A$ and $A$ is said to be unifiable. Further, if $\sigma$ is such that for every unifier $\mu$ of $A$, $\mu = \sigma\lambda$ for some substitution $\lambda$, then $\sigma$ is called a most general unifier, abbreviated MGU. The empty substitution is denoted by $\epsilon$.

Definition 2.4.7 (The Unification Algorithm)
1. set $\sigma_0 = \epsilon$, $k = 0$, and go to 2;
2. if $A\sigma_k$ is a singleton, set $\sigma_A = \sigma_k$ and terminate; otherwise go to 3;
3. if the disagreement set of $A\sigma_k$ contains a variable $x$ and a term $t$ (necessarily of the same type since the expressions of $A\sigma_k$ are all well-formed) such that $x$ does not occur in $t$, then set $\sigma_{k+1} = \sigma_k\{t/x\}$, $k = k + 1$ and return to step 2; otherwise terminate.

Theorem 2.4.1 (The Unification Theorem) If $A$ is unifiable, then the unification algorithm halts in step 2 and $\sigma_A$ is a MGU.

Definition 2.4.8 Let $C = L \vee C'$ and $D = M \vee D'$ be two clauses with $L$ and $M$ subsets of $C$ and $D$ respectively. If $L$ and $M$ are literals and one is the negation of the other, then $C$ and $D$ are said to be ground resolvable with ground resolvent $C' \vee D'$. Let $C_1$ and $D_1$ be $C$ and $D$ with variable names changed so that $C_1$ and $D_1$ have no common variables. Let $N$ be the set of atoms that occur in $L_1$ and $M_1$. If $N$ is unifiable, and $\sigma$ is a MGU of $N$, and if $L_1\sigma$ and $M_1\sigma$ are complements, then $C$ and $D$ are said to be generally resolvable, or just resolvable, with resolvent $C_1'\sigma \vee D_1'\sigma$.

Theorem 2.4.2 (The Lifting Lemma) Let $A$ and $B$ be two clauses with instances $C = P \vee C'$ and $D = \neg P \vee D'$ respectively. Let $E$ be the ground resolvent $C' \vee D'$. Then $A$ and $B$ are resolvable and have a resolvent $F$ such that $E$ is an instance of $F$.

Theorem 2.4.3 (The Resolution Theorem for n-Sorted Logic) Let $\forall x_1 \ldots \forall x_k (C_1 \wedge \ldots \wedge C_n)$ be a closed inconsistent wff in Skolem normal form with matrix clauses $C_1, C_2, \ldots, C_n$. Then the empty clause can be deduced from $C_1, C_2, \ldots, C_n$ using only the resolution rule of inference.

Chapter 3 presents a certain equivalence between the $\lambda$-calculus and a corresponding theory in an n-sorted system. Chapter 4 deals with practical applications of this equivalence using the resolution rule.

3. EQUIVALENCE OF $\lambda$-CALCULI TO n-SORTED THEORIES

In this chapter it will be shown that the $\lambda$-calculus is equivalent to a theory in an associated n-sorted logic.

3.1 Locally Henkin Frames

Definition 3.1.1 Let $E$ be a subset of the set of types in a $\lambda$-calculus. The type closure of $E$ is the smallest set of types which contains $E$, the type TV, and which contains $\alpha$ and $\beta$ whenever it contains $(\alpha \to \beta)$. Let $S$ be a set of expressions of a $\lambda$-calculus. The notation $T(S)$ will denote the type closure of the set of types occurring in $S$.

Definition 3.1.2 Let $S$ be a set of expressions of a $\lambda$-calculus. Then a local Henkin frame relative to $T(S)$ consists of a set of domains $H$ and a mapping $g$ of identifiers satisfying the following:
1. $H$ contains a domain $H_\alpha$ for each type $\alpha$ in $T(S)$ and no others;
2. $g$ is a map of identifiers whose types are in $T(S)$ into $H$ satisfying $g(X) \in H_\alpha$ if $X$ is of type $\alpha$;
3. $H_{TV} = \{\text{true}, \text{false}\}$;
4. $H_{(\alpha \to \beta)}$ consists of functions from $H_\alpha$ to $H_\beta$;
5. $g$ maps the special identifiers onto their regular meanings;
6. $H$ is closed under $g$ relative to $T(S)$; that is, if $A$ is an expression which is built from expressions all of whose types are in $T(S)$, then $g(A) \in H$ where $g$ is extended to the set of all formulas over $T(S)$ in the same way as in Definition 2.2.5.

The notions of local validity, local consistency, and local inconsistency are the analogues of validity, consistency and inconsistency given in Definition 2.2.6. Where the context can cause no ambiguity, the phrase "relative to $T(S)$" will be omitted. Note the similarity to the definition of Henkin frame.
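The type closure of Definition 3.1.1 is easy to compute for a finite set of types. The following is a minimal sketch under assumed conventions that are not part of the text: a simple type is a Python string, a type $(\alpha \to \beta)$ is the tuple `("->", alpha, beta)`, and `type_closure` is a hypothetical helper name.

```python
def type_closure(types):
    """Return the smallest set containing `types` and TV that also contains
    alpha and beta whenever it contains ("->", alpha, beta)."""
    closure = {"TV"}          # the type TV always belongs to the closure
    pending = list(types)
    while pending:
        t = pending.pop()
        if t in closure:
            continue
        closure.add(t)
        if isinstance(t, tuple):  # a type of the form (alpha -> beta)
            _, alpha, beta = t
            pending.extend((alpha, beta))
    return closure
```

For instance, starting from the single type ((INTEGER -> INTEGER) -> (INTEGER -> INTEGER)), the closure also contains (INTEGER -> INTEGER), INTEGER and TV, so a local Henkin frame relative to it needs exactly four domains.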
As the name implies, a local Henkin frame merely restricts one's attention to a limited set of domains. The following theorem shows that it is sufficient for demonstrating inconsistency to consider only local Henkin frames.

Theorem 3.1.1 Let $S$ be a set of sentences of a $\lambda$-calculus. Then
1. if $S$ is locally valid, $S$ is valid, and
2. if $S$ is consistent, $S$ is locally consistent.

Proof Suppose $S$ is consistent and $(g, H)$ is a Henkin frame for which $g(S)$ = true. The restriction of $(g, H)$ to $T(S)$ is a locally Henkin model of $S$, proving 2. To prove 1, suppose $S$ is not valid. Then (NOT $S$) is consistent. By 2, (NOT $S$) is locally consistent, contradicting the assumption that $S$ is locally valid.

Theorem 3.1.1 shows that for the purpose of determining validity or inconsistency, one can just as well restrict one's attention to locally Henkin frames. This is the approach taken in the remainder of this work.

3.2 Representation of the $\lambda$-Calculus in an n-Sorted System

A translation of the $\lambda$-calculus into n-sorted logic will now be given.

Definition 3.2.1 Let $S$ be a set of $\lambda$-calculus sentences and $T(S)$ the type closure of $S$. Then the associated n-sorted language, K, contains the following symbols and types:
1. for each type $\alpha \in T(S)$, K contains a type, also denoted by $\alpha$; note that while some types of $T(S)$ are complex, all types in K are simple;
2. for each type $\alpha = (\beta \to \text{TRUTHVALUE})$ in $T(S)$, one specialized binary relation symbol, $P^{(\alpha,\beta)}$;
3. for each type $\alpha = (\beta \to \delta)$ in $T(S)$, $\delta \ne \text{TRUTHVALUE}$, one specialized binary function symbol $f^{(\alpha,\beta,\delta)}$, called the function associated with $(\beta \to \delta)$;
4. for each $\lambda$-calculus identifier $X$ of type $\alpha$, a constant symbol $c_X$ and a variable $v_X$, both of K-type $\alpha$;
5. for each type $\alpha$ in $T(S)$, the binary equality relation $=^{(\alpha,\alpha)}$;
6. a set of function symbols to be described below.

In 2 and 3 above, the types $\beta$ and $\delta$ will be K-types because they will be in $T(S)$, a type closure in which $(\beta \to \text{TRUTHVALUE})$ or $(\beta \to \delta)$ occurs. Superscripts, when written, will indicate types.
Where no ambiguity may arise superscripts will be omitted.

We may assume without loss of generality that in any formula of a $\lambda$-calculus, no two bound identifiers are equal and, if $\lambda Y B$ is a subformula of $A$, that $Y$ occurs in $A$ only within $\lambda Y B$.

The following definitions are made with respect to a particular wff $A$, in preparation for the actual translation of $A$.

Definition 3.2.1 An identifier $X$ which occurs immediately to the right of a $\lambda$ in a subexpression $(\lambda X B)$ of $A$ is called a marked identifier of $A$.

Recall that $(\forall X A)$ and $(\exists X A)$ are abbreviations for formulas containing $\lambda$, so that in each of the above cases, $X$ is a marked identifier.

Definition 3.2.2 The term associated with a subexpression $B$ of $A$ not of type TV is defined inductively as follows:
1. the term associated with an identifier $X$ is $v_X$ if $X$ is a marked identifier, $c_X$ otherwise;
2. the term associated with an application $(GD)$ is $f(t_G, t_D)$ where $t_G$ is the term associated with $G$, $t_D$ is the term associated with $D$, and $f$ is the function symbol associated with the type of the wff $G$;
3. the term associated with $(\lambda X B)$ is $d(u_1, \ldots, u_m)$ where $d$ is a new function symbol (allowed for in 6 of Definition 3.2.1) and $u_1, \ldots, u_m$ are the variables of K associated with the marked identifiers of $(\lambda X B)$ other than $X$.

Definition 3.2.3 The translation $\tau_1$ of a subexpression $B$ of $A$ of type TV is calculated as follows:
1. if $B$ is an atom $(FC)$ of $A$, $\tau_1(FC)$ is $P(t_F, t_C)$ where $t_F$ and $t_C$ are the terms associated with $F$ and $C$ respectively and $P$ is the relation symbol of K of type $((\alpha \to TV), \alpha)$ where $F$ is of type $(\alpha \to TV)$;
2. $\tau_1$ maps the logical connectives and $\mathrm{EQUAL}_\alpha$ onto their analogues of K; thus, for example, $\tau_1((\mathrm{OR}\ B_1)\ B_2)$ is $(\tau_1(B_1) \vee \tau_1(B_2))$ and $\tau_1((\mathrm{EQUAL}_\alpha\ B_1)\ B_2)$ is $t_{B_1} =^{(\alpha,\alpha)} t_{B_2}$.

Definition 3.2.4 The defining axiom for a subexpression $(\lambda X B)$ of $A$ is one of the following two wffs of K:
1. $\forall v_X \forall u_1 \ldots \forall u_m (\tau_1(B) \leftrightarrow P(d(u_1, \ldots, u_m), v_X))$ if $B$ is of type TV;
2. $\forall v_X \forall u_1 \ldots \forall u_m (t_B = f(d(u_1, \ldots, u_m), v_X))$, where $t_B$ is the term associated with $B$ and $f$ is the function symbol associated with the type of $(\lambda X B)$, if $B$ is not of type TV.
In both cases, $d(u_1, \ldots, u_m)$ is the term associated with $\lambda X B$.

The translation of $A$ can now be given.

Definition 3.2.5 The translation $\tau$ of $A$ is $\tau_1(A) \wedge C_1 \wedge \ldots \wedge C_k$ where $C_1, C_2, \ldots, C_k$ are the defining axioms for the subformulas of $A$ beginning with $\lambda$.

While CHOICE is used mainly to define the quantifiers, it can be translated as $\forall z \forall x (\exists y P(x, y) \to (z = f(\mathrm{CHOICE}, x) \to P(x, z)))$.

As was noted on page 13, the $\lambda$-calculus and n-sorted logic differ in their treatment of marked identifiers. In the $\lambda$-calculus, an unmarked identifier acts as a constant while that same identifier when marked plays the role of a variable. In n-sorted logic there are no such dual symbols, and care must be taken to use the associated variable or constant as necessary.

Example 3.2.1 Suppose $A$ is the formula
(∀m((λn((EQUAL((TIMES TWO)((SUM ID) n)))((TIMES n)((PLUS n) ONE)))) m))
The types and their associated identifiers are:
1. INTEGER - n, m, ONE, TWO;
2. (INTEGER → INTEGER) - ID;
3. (INTEGER → (INTEGER → INTEGER)) - TIMES, PLUS; and
4. ((INTEGER → INTEGER) → (INTEGER → INTEGER)) - SUM.
The marked identifiers of $A$ are n and m. Let $f_1$ be the function associated with (INTEGER → INTEGER), $f_2$ the function associated with (INTEGER → (INTEGER → INTEGER)), and $f_3$ the function associated with ((INTEGER → INTEGER) → (INTEGER → INTEGER)). Let $P$ be the relation symbol of type ((INTEGER → TV), INTEGER). The term $t_1$ associated with ((TIMES TWO)((SUM ID) n)) is
$f_1(f_2(\mathrm{TIMES}, \mathrm{TWO}), f_1(f_3(\mathrm{SUM}, \mathrm{ID}), n))$;
and the term $t_2$ associated with ((TIMES n)((PLUS n) ONE)) is
$f_1(f_2(\mathrm{TIMES}, n), f_1(f_2(\mathrm{PLUS}, n), \mathrm{ONE}))$.
The symbol EQUAL applied to the two subexpressions is translated by $\tau_1$ onto $(t_1 = t_2)$. The defining axiom for the $\lambda$-expression is $(\forall n ((t_1 = t_2) \leftrightarrow P(d, n)))$.
The wff $A$ has one atom, namely the $\lambda$-abstraction applied to m. Then $\tau(A)$ is
$\tau_1(A) \wedge (\forall n ((t_1 = t_2) \leftrightarrow P(d, n)))$
which is
$(\forall m P(d, m)) \wedge (\forall n ((t_1 = t_2) \leftrightarrow P(d, n)))$.

The only variables occurring in $\tau(A)$ are variables associated with marked identifiers. Such variables occurring in a defining axiom are all universally quantified. For a variable to occur in $\tau_1(A)$, the corresponding identifier must be quantified by $\forall$ or $\exists$. These occurrences of variables are bound when $\forall$ or $\exists$ is translated by $\tau_1$. All other subexpressions beginning with $\lambda$ are assigned associated terms as in Definition 3.2.2 before the calculation of $\tau_1(A)$. Thus, all occurrences of variables in $\tau(A)$ are bound occurrences, and $\tau(A)$ is a closed wff of K.

It is possible to treat the $\lambda$-type TV as another type in K. This is necessary, for example, if the formula being translated includes an identifier of type $(TV \to \alpha)$ distinct from the special identifiers. In this case, the logical symbols of the $\lambda$-calculus are also treated as constant symbols, and certain wffs in the language of K, which will be described later, must be added to ensure that these constant symbols are interpreted correctly. However, in most of this work, it is assumed unnecessary to treat TV in this way.

3.3 The Associated Realization of K

Throughout this section, all Henkin frames are considered to be local.

Definition 3.3.1 Let $(g, H)$ be a (local) Henkin frame for the $\lambda$-calculus. The associated realization $(h, M)$ of K is defined as follows:
1. $M_\alpha = H_\alpha$ for each type $\alpha$;
2. $h(c_X) = g(X)$ for each constant symbol $c_X$ of K;
3. for the relation symbol $P$ of type $((\alpha \to TV), \alpha)$, $h(P)$ holds for a pair $(a, b)$ if and only if $a(b)$ holds; that is, for a predicate $a$ in $H_{(\alpha \to TV)}$ and an object $b$ in $H_\alpha$, $h(P)(a, b)$ is true if and only if the predicate $a$ applied to the object $b$ is true;
4. for the function symbol $f$ of type $((\alpha \to \beta), \alpha)$, $\beta \ne TV$, $h(f)$ is the function defined by the equation $h(f)(a, b) = a(b)$;
5. for the function $d$ associated with the wff $\lambda Y B$ with marked identifiers $Y, Y_1, \ldots, Y_k$, $h(d)$ is the function defined by the equation $h(d)(a_1, \ldots, a_k) = g'(\lambda Y B)$ where $g' = g$ except at $Y_1, \ldots, Y_k$ where $g'(Y_i) = a_i$.

In the following, $(g, H)$ and $(h, M)$ are as above.

Lemma 3.3.1 Let $A$ be a formula of the $\lambda$-calculus and $B$ a subformula of $A$. Let $t$ be the term associated with $B$. Let $Y_1, \ldots, Y_m$ be the marked identifiers of $A$ and $u_1, \ldots, u_m$ the associated variables of K. Let $\delta$ be in $M^V$. If $\delta(u_i) = g(Y_i)$, $i = 1, \ldots, m$, then $\delta(t) = g(B)$.

Proof By induction on the construction of $B$ (or $t$).

Case 1 $B$ is an identifier. If $B$ is marked, then $B$ is $Y_i$ for some $i$ and $t$ is $u_i$. By hypothesis, $\delta(u_i) = g(Y_i)$. If $B$ is not marked, then $t$ is $c_B$. By the definition of $h$ and $\delta$, $\delta(c_B) = h(c_B) = g(B)$.

Case 2 $B$ is $\lambda Y B_1$, $B$ not of type TRUTHVALUE. The term associated with $B$ is $d(u_1, \ldots, u_m)$. By the definition of $h(d)$, and the fact that $\delta(u_i) = g(Y_i)$, where $Y_i$ are the marked variables of $B$,
$\delta(t) = h(d)(g(Y_1), \ldots, g(Y_m)) = g(\lambda Y B_1)$.

Case 3 $B$ is an application $(FD)$ not of type TV and the terms $t_F$, $t_D$ associated with $F$ and $D$ satisfy $\delta(t_F) = g(F)$ and $\delta(t_D) = g(D)$. Then $t$ is $f(t_F, t_D)$ and
$\delta(t) = h(f)(\delta(t_F), \delta(t_D)) = h(f)(g(F), g(D))$.
By part 4 of the definition of $M$,
$h(f)(g(F), g(D)) = g(F)(g(D)) = g(FD) = g(B)$.
This completes the proof of the lemma.

Clearly, by the way $h(d)$ is defined, the result holds as $g$ is allowed to vary over assignments of identifiers.

Lemma 3.3.2 Let $A$, $B$, $Y_1, \ldots, Y_m$, $u_1, \ldots, u_m$ be as in Lemma 3.3.1, and suppose $B$ is of type TV. Then for each $\delta \in M^V$, $\delta \in h(\tau_1(B))$ if and only if $g'(B)$ = true where $g' = g$ except at $Y_1, \ldots, Y_m$ and $g'(Y_i) = \delta(u_i)$.

Proof Suppose $B$ is an atom $(FD)$. Let $t_F$ and $t_D$ be the terms associated with $F$ and $D$ respectively. By Lemma 3.3.1, $\delta(t_F) = g'(F)$ and $\delta(t_D) = g'(D)$. Now, $\tau_1(B)$ is the n-sorted atom $P(t_F, t_D)$. Suppose $\delta \in h(P(t_F, t_D))$.
Then, by 3 of Definition 3.3.1, $\delta(t_F)(\delta(t_D))$ holds; that is, $g'(F)(g'(D))$ holds, or $g'(F)(g'(D)) = g'(FD)$ = true. Conversely, suppose $g'(FD)$ = true. Then, because $g'(F) = \delta(t_F)$ and $g'(D) = \delta(t_D)$, $\delta(t_F)(\delta(t_D))$ holds. So, by 3 of Definition 3.3.1, $h(P)(\delta(t_F), \delta(t_D))$ holds and $\delta \in h(P(t_F, t_D))$.

Suppose $B$ is $((\mathrm{EQUAL}\ F_1)\ F_2)$. Then $\tau_1(B)$ is $(t_{F_1} = t_{F_2})$ where $t_{F_i}$ is the term associated with $F_i$. Because $(h, M)$ is a normal realization of K, $h(=)$ is equality. The proof for this form of $B$ now proceeds exactly as above.

Suppose $B$ is $((\mathrm{OR}\ B_1)\ B_2)$, and suppose the lemma holds for $B_1$ and $B_2$. $\tau_1((\mathrm{OR}\ B_1)\ B_2)$ is $(\tau_1(B_1) \vee \tau_1(B_2))$. Then
$\delta \in h(\tau_1(B_1) \vee \tau_1(B_2)) \iff \delta \in h(\tau_1(B_1))$ or $\delta \in h(\tau_1(B_2))$
$\iff g'(B_1)$ = true or $g'(B_2)$ = true
$\iff g'((\mathrm{OR}\ B_1)\ B_2)$ = true.
The cases for the other logical connectives and quantifiers are handled analogously.

In the above lemma, note that if $\tau_1(B)$ is a closed wff of K, then $h(\tau_1(B))$ is either $M^V$ or $\emptyset$. If $g(B)$ is true, then there is at least one $\delta$ in $h(\tau_1(B))$, namely the $\delta$ such that $\delta(u_i) = g(Y_i)$. Then $h(\tau_1(B))$ is not empty and therefore must be $M^V$.

Theorem 3.3.1 With $(g, H)$ and $(h, M)$ as above, for every sentence $A$ of the $\lambda$-calculus, $g(A)$ = true if and only if $h(\tau(A)) = M^V$.

Proof $\tau(A)$ is a conjunction $\tau_1(A) \wedge C_1 \wedge \ldots \wedge C_k$ where $C_i$ is the defining axiom for a $\lambda$-subexpression of $A$, say $(\lambda Y_i B_i)$. Recall that $\tau_1(A)$ is a closed wff of K, so by the remark above, $h(\tau_1(A)) = M^V$ if and only if $g(A)$ = true. Then it is sufficient to prove the result for each $C_i$. There are two cases.

Case 1 $B_i$ is of type TV. Then $C_i$ has the form
$\forall v_{Y_i} \forall u_1 \ldots \forall u_m (\tau_1(B_i) \leftrightarrow P(d(u_1, \ldots, u_m), v_{Y_i}))$.
Consider the matrix of this wff,
$\tau_1(B_i) \leftrightarrow P(d(u_1, \ldots, u_m), v_{Y_i})$,
and let $\delta$ be in $M^V$. Now, $d(u_1, \ldots, u_m)$ is the term associated with $(\lambda Y_i B_i)$. By Lemma 3.3.1,
$\delta(d(u_1, \ldots, u_m)) = g'(\lambda Y_i B_i)$
where $g'(Z_j) = \delta(u_j)$ for the marked identifiers $Z_j$ of $\lambda Y_i B_i$ other than $Y_i$ itself and $g' = g$ elsewhere.
Let $g'' = g'$ except at $Y_i$ where $g''(Y_i) = \delta(v_{Y_i})$. Clearly $g''(\lambda Y_i B_i) = g'(\lambda Y_i B_i)$ because in the evaluation of each, the particular value of $g'(Y_i)$ or $g''(Y_i)$ is immaterial (see 6 of Definition 2.2.5). Then $g''(\lambda Y_i B_i) = \delta(d(u_1, \ldots, u_m))$. Furthermore, $g''((\lambda Y_i B_i) Y_i) = g''(B_i)$.

Now suppose $\delta \in h(\tau_1(B_i))$. By Lemma 3.3.2, $g''(B_i)$ must be true. That is, $g''((\lambda Y_i B_i) Y_i)$ = true. But
$g''((\lambda Y_i B_i) Y_i) = g''(\lambda Y_i B_i)(g''(Y_i)) = \delta(d(u_1, \ldots, u_m))(\delta(v_{Y_i}))$.
Then by the definition of $h(P)$ (3 of Definition 3.3.1), $h(P)(\delta(d(u_1, \ldots, u_m)), \delta(v_{Y_i}))$ is true, and so $\delta \in h(P(d(u_1, \ldots, u_m), v_{Y_i}))$. Thus, $\delta \in h(\tau_1(B_i) \to P(d(u_1, \ldots, u_m), v_{Y_i}))$. If $\delta \notin h(\tau_1(B_i))$, then the above also holds. Since $\delta$ was arbitrary, every $\delta$ in $M^V$ satisfies $\delta \in h(\tau_1(B_i) \to P(d(u_1, \ldots, u_m), v_{Y_i}))$. Similar remarks show that for any $\delta$ in $M^V$, $\delta \in h(P(d(u_1, \ldots, u_m), v_{Y_i}) \to \tau_1(B_i))$. Then $h(\tau_1(B_i) \leftrightarrow P(d(u_1, \ldots, u_m), v_{Y_i})) = M^V$, and by the definition of the evaluation of the universal quantifier, $h(C_i) = M^V$ also.

Case 2 $B_i$ is not of type TV. Then $C_i$ has the form
$\forall v_{Y_i} \forall u_1 \ldots \forall u_m (t_{B_i} = f(d(u_1, \ldots, u_m), v_{Y_i}))$.
Again, consider the matrix of this wff,
$t_{B_i} = f(d(u_1, \ldots, u_m), v_{Y_i})$.
Let $\delta$ be in $M^V$. If $g'$ and $g''$ are defined as in Case 1, then it follows that
$\delta(d(u_1, \ldots, u_m)) = g'(\lambda Y_i B_i) = g''(\lambda Y_i B_i)$
and $g''((\lambda Y_i B_i) Y_i) = g''(B_i)$. $f$ is the function symbol associated with the type of $(\lambda Y_i B_i)$. Therefore, by 4 of Definition 3.3.1,
$h(f)(\delta(d(u_1, \ldots, u_m)), \delta(v_{Y_i})) = g''(\lambda Y_i B_i)(g''(Y_i)) = g''((\lambda Y_i B_i) Y_i) = \delta(t_{B_i})$.
The last equality follows from Lemma 3.3.1. The first and last items in the above string of equalities state that
$h(f)(\delta(d(u_1, \ldots, u_m)), \delta(v_{Y_i})) = \delta(t_{B_i})$.
Then $\delta \in h(t_{B_i} = f(d(u_1, \ldots, u_m), v_{Y_i}))$. As in Case 1, $\delta$ was arbitrary, so that $h(t_{B_i} = f(d(u_1, \ldots, u_m), v_{Y_i})) = M^V$, and hence $h(C_i) = M^V$ also. This completes the proof of the theorem.
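The agreement $\delta(t) = g(B)$ asserted by Lemma 3.3.1 can be checked on a small abstraction-free expression. The sketch below is illustrative only and rests on several assumptions that are not in the text: an application $(GD)$ is encoded as a Python pair `(G, D)`, identifiers are strings, the helper names `associated_term`, `eval_lambda` and `eval_term` are hypothetical, TIMES and PLUS are interpreted as curried integer functions, and all the function symbols $f$ are collapsed into a single tag since each is interpreted as application.

```python
def associated_term(expr, marked):
    """Term associated with an abstraction-free expression
    (parts 1 and 2 of Definition 3.2.2)."""
    if isinstance(expr, str):
        # marked identifier -> variable v_X, unmarked -> constant c_X
        return ("var", expr) if expr in marked else ("const", expr)
    g_part, d_part = expr
    return ("f", associated_term(g_part, marked),
            associated_term(d_part, marked))


def eval_lambda(expr, g):
    """Value g(B) of an abstraction-free expression under the assignment g."""
    if isinstance(expr, str):
        return g[expr]
    fn, arg = expr
    return eval_lambda(fn, g)(eval_lambda(arg, g))


def eval_term(term, h_const, delta):
    """Value delta(t) in the associated realization (Definition 3.3.1)."""
    tag = term[0]
    if tag == "var":
        return delta[term[1]]
    if tag == "const":
        return h_const[term[1]]   # h(c_X) = g(X)
    _, t_g, t_d = term
    a = eval_term(t_g, h_const, delta)
    b = eval_term(t_d, h_const, delta)
    return a(b)                   # h(f)(a, b) = a(b)


# ((TIMES n)((PLUS n) ONE)) from Example 3.2.1, with n marked.
expr = (("TIMES", "n"), (("PLUS", "n"), "ONE"))
g = {
    "TIMES": lambda x: lambda y: x * y,
    "PLUS": lambda x: lambda y: x + y,
    "ONE": 1,
    "n": 3,
}
marked = {"n"}
term = associated_term(expr, marked)
h_const = {name: value for name, value in g.items() if name not in marked}
delta = {"n": g["n"]}             # hypothesis of the lemma: delta(u_i) = g(Y_i)
value_in_k = eval_term(term, h_const, delta)
value_in_lambda = eval_lambda(expr, g)
```

Both evaluations give 12 for this assignment, as the lemma predicts. Abstractions and the associated symbols $d$ are deliberately left out of the sketch.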
In the proofs of the two lemmas and the theorem it was assumed that the type TV never occurred as an argument type, i.e., there were no symbols of type $TV \to \alpha$ other than the standard logical connectives. If this assumption is not valid, TV is treated just as any other type, the two-place $P$ relation symbols are replaced by three-place relation symbols, and $P(x, y)$ becomes $P(\mathrm{TRUE}, x, y)$, etc. Then the proofs of Lemmas 3.3.1 and 3.3.2 and Theorem 3.3.1 still hold. Also note that, in the proof of Lemma 3.3.1, no mention was made of the translation of CHOICE. This is because CHOICE does not seem to be used except for quantifiers, which are handled separately.

Theorem 3.3.1 shows that if $A$ is a sentence of the $\lambda$-calculus which has a Henkin model, i.e. a Henkin realization $(g, H)$ for which $g(A)$ = true, then $\tau(A)$ has a model. In particular, one model of $\tau(A)$ is the realization of K associated with $(g, H)$. The converse is not true. That is, there may exist a model $(h, M)$ of $\tau(A)$, but the Henkin frame $(g, H)$ which one would expect to correspond to $(h, M)$ is not a model of $A$ itself. There are two ways this situation might arise:
1. the objects of $M_{(\alpha \to \beta)}$ are not actual functions from $M_\alpha$ into $M_\beta$, and
2. the pair $(g, H)$ which would be expected to correspond to $(h, M)$ does not satisfy closure, that is, there may be a formula $B$ of the $\lambda$-calculus for which $g(B) \notin H$.
These will be illustrated in the following example.

Example 3.3.1 Let the $\lambda$-calculus contain the two types REAL and (REAL → TV), and let $A$ be the formula
∀X((IMPLIES ((EQUAL X)Y))(∀Z(NOT(ZX))))
where X, Y and Z are identifiers. Then $\tau(A)$ is
$\forall v_X (v_X = c_Y \to \forall v_Z \neg P(v_Z, v_X))$.
The wff $A$ can have no Henkin model $(g, H)$ because the predicate "equal to $g(Y)$", which is the interpretation of (λX((EQUAL X)Y)), must be in $H_{(REAL \to TV)}$. Clearly then, if $g(X) = g(Y)$, there is a predicate in $H_{(REAL \to TV)}$ which when applied to $g(X)$ yields true, namely $g$(λX((EQUAL X)Y)). Hence, $g(A)$ must be false.
Consider two realizations of K. First, let $(h, M)$ be a realization of the associated n-sorted logic such that $M_{REAL} = M_{(REAL \to TV)}$ = integers, $h(c_Y) = 0$, and let $h(P)$ be the binary relation which is identically false. In this case $h(\tau(A)) = M^V$ but the objects in $M_{(REAL \to TV)}$ are not predicates over the objects in $M_{REAL}$.

For the second realization, let $M_{REAL}$ = the real numbers, $M_{(REAL \to TV)}$ = {"less than 0", "greater than 0"}, and $h(c_Y) = 0$. Let $h(P)$ be identically false. Again, $h(\tau(A)) = M^V$, but the pair $(g, H)$ that one would expect to correspond to $(h, M)$ is not a Henkin frame, because $M_{(REAL \to TV)}$ does not contain the predicate "equal 0". In the realization $(h, M)$, this predicate corresponds to the open wff $\forall v_Z \neg P(v_Z, x)$.

In order to guarantee that models of $\tau(A)$ will correspond to Henkin models of $A$, a set of axioms is added to $\tau(A)$. This is strictly a theoretical device. It has already been pointed out by Robinson and Wos [14] how the addition of axioms can severely decrease the efficiency of theorem-provers. Later it will be suggested that new rules of inference replace these axioms in the same manner that Robinson and Wos replaced the equality axioms by paramodulation. These axioms are called the Henkin axioms and are denoted by $I$.

Definition 3.3.2 The set of Henkin axioms $I$ consists of the following:
1. one axiom for every formula $\lambda Y A$ of the $\lambda$-calculus, namely the defining axiom for $\lambda Y A$ given in Definition 3.2.4;
2. for every type $(\alpha \to \beta)$, the axiom of extensionality
$\forall x_1 \forall x_2 (\forall z (f(x_1, z) = f(x_2, z)) \to x_1 = x_2)$
where $f$ is the function symbol associated with $(\alpha \to \beta)$, or
$\forall x_1 \forall x_2 (\forall z (P(x_1, z) \leftrightarrow P(x_2, z)) \to x_1 = x_2)$
where $P$ is the relation symbol associated with $(\alpha \to \beta)$, depending on whether or not $\beta$ = TV;
3. the TV axioms, in the case the type TV is needed as a separate type of K; these consist of the wff
$\forall v (v = c_{TRUE} \vee v = c_{FALSE})$
and the obvious set of wffs which define the constants $c_{OR}$, $c_{AND}$, $c_{IMPLIES}$, $c_{NOT}$, $c_{\forall}$, $c_{\exists}$.

The defining axioms guarantee the closure of the corresponding Henkin frame $(g, H)$. Clearly, an application of a function to an object in its domain does yield an object in the range, so that application alone cannot produce an object outside the set of domains. Since all the identifiers also represent objects in the domain, the only way that entirely new objects could be defined is through abstraction. By including $\tau_1(\lambda Y A)$, one is including in the language K the term associated with $\lambda Y A$, whose interpretation will be just the object represented by $\lambda Y A$. The axioms of extensionality ensure that only one representation of a particular function or predicate is present. The TV axioms are needed only when TV is needed as a type in K (e.g. when there is a type $(TV \to \alpha)$, other than the standard logical connectives, where arguments of type TV are present). It will be assumed for the rest of this work that these are not needed and axioms of type 3 are not included in $I$.

Lemma 3.3.3 To any normal model $(h_1, N)$ of the axioms of extensionality there corresponds a normal model $(h_2, M)$ of these axioms satisfying the following conditions:
1. there is a 1-1 map $g$ from $N$ onto $M$;
2. the objects of $M_{(\alpha \to \beta)}$ are actual functions from $M_\alpha$ to $M_\beta$ if $\beta \ne$ TV, or actual predicates over $M_\alpha$ if $\beta$ = TV;
3. for any wff $A$ of K, $h_2(A) = g(h_1(A))$ where $g(\delta)$ is the map in $M^V$ such that $g(\delta)(x) = g(\delta(x))$;
4. $h_2(f)(a, b)$ is $a(b)$ for $a \in M_{(\alpha \to \beta)}$ and $b \in M_\alpha$ and $f$ the function symbol associated with $(\alpha \to \beta)$; $h_2(P)(d, e)$ is true if and only if $d(e)$ is true for $d \in M_{(\alpha \to TV)}$ and $e \in M_\alpha$ and $P$ the relation symbol associated with $(\alpha \to TV)$.

Proof The domains $M$ are defined by induction. If $\mu$ corresponds to a simple type of the $\lambda$-calculus, i.e. $\mu$ is not of the form $(\alpha \to \beta)$, then $M_\mu = N_\mu$. That is, the domains of individuals are the same in $M$ and $N$. If $g$ is taken as the identity map here, then $g$ is 1-1. Suppose $M_\alpha$ and $M_\beta$ have already been given and are in 1-1 correspondence with $N_\alpha$ and $N_\beta$ respectively by $g$. Suppose $\beta \ne$ TV. Let $a \in N_{(\alpha \to \beta)}$. Then for every $b \in N_\alpha$ there is exactly one $c \in N_\beta$ such that $c = h_1(f)(a, b)$. Define $g(a)$ to be that function such that $g(a)(g(b)) = g(c)$ if and only if $c = h_1(f)(a, b)$. Suppose $g$ as defined so far is 1-1. Since $(h_1, N)$ satisfies the axioms of extensionality, given any object $a \in N_{(\alpha \to \beta)}$, no other object of $N_{(\alpha \to \beta)}$ will have the same value for every argument in $N_\alpha$. Therefore, since $g$ is 1-1 between $N_\alpha$ and $M_\alpha$ and 1-1 between $N_\beta$ and $M_\beta$, two distinct objects $a$ and $b$ in $N_{(\alpha \to \beta)}$ yield distinct functions in $M_{(\alpha \to \beta)}$. Of course, each object of $N_{(\alpha \to \beta)}$ does give rise to a function, and thus $g$ is also 1-1 between $N_{(\alpha \to \beta)}$ and $M_{(\alpha \to \beta)}$. The analogous process is carried out in the case $\beta$ = TV. By finite induction, the result of 1 is proved. Clearly 2 holds due to the way objects of $M$ were defined.

Finally, it remains to define $h_2$ for the symbols $c$, $P$, and $f$. Define $h_2(c)$ to be $g(h_1(c))$. $h_2(P)$ is that relation which holds for a pair $(d, e)$ in $M_{(\alpha \to TV)} \times M_\alpha$ if and only if $h_1(P)$ holds for $(g^{-1}(d), g^{-1}(e))$. $h_2(f)$ is that function whose value for a pair $(a, b)$ in $M_{(\alpha \to \beta)} \times M_\alpha$ is $g((h_1(f))(g^{-1}(a), g^{-1}(b)))$. Now let $a \in M_{(\alpha \to \beta)}$. By the definition of the objects in $M_{(\alpha \to \beta)}$, the function $a$ applied to an argument $b \in M_\alpha$ yields $g((h_1(f))(g^{-1}(a), g^{-1}(b)))$.
This is just $h_2(f)(a, b)$, and so condition 4 for the function symbols holds. Similar remarks prove condition 4 for relation symbols.

There remains only the verification of property 3, and this is done by induction. Clearly, if $A$ is an atomic formula, property 3 holds by the definition of $h_2(P)$ and $h_2(f)$. Suppose $A$ is the formula $B \vee C$ and property 3 holds for $B$ and for $C$. Then $\delta \in h_1(A)$ if and only if $\delta \in h_1(B)$ or $\delta \in h_1(C)$, if and only if $g(\delta) \in h_2(B)$ or $g(\delta) \in h_2(C)$ by the induction hypothesis, if and only if $g(\delta) \in h_2(A)$. Similar remarks prove 3 for $B \wedge C$, $B \to C$ and $\neg B$. Suppose $A$ is the formula $\exists x B$. Then $\delta \in h_1(A)$ if and only if there is a $\delta' \in h_1(B)$ such that $\delta' = \delta$ everywhere except possibly at $x$. By the induction hypothesis, $\delta' \in h_1(B)$ if and only if $g(\delta') \in h_2(B)$. Clearly $g(\delta') = g(\delta)$ everywhere except possibly at $x$, so that $g(\delta) \in h_2(A)$. Thus $\delta \in h_1(A)$ implies $g(\delta) \in h_2(A)$. An identical argument shows $\delta \in h_2(A)$ implies $g^{-1}(\delta) \in h_1(A)$. Since $g$ is 1-1, this gives the result. Finally, $\forall x B$ is $\neg \exists x \neg B$ and the result holds for $\neg$ and $\exists x$. This proves the lemma.

The necessity for including the axioms of extensionality can be seen by the following example.

Example 3.3.2 Suppose there are two types, $\alpha$ and $(\alpha \to \alpha)$. Let $f$ be the function associated with $(\alpha \to \alpha)$. Let $=$ be the only relation symbol. Define a normal realization of K as follows:
$N_\alpha = N_{(\alpha \to \alpha)}$ = integers;
$h(f)(j, k) = k + 1$ for all $j$ and $k$.
Thus each integer $j$ names the successor function. Because all integers in $N_{(\alpha \to \alpha)}$ name the same function, there can be no realization $M$ where $M_{(\alpha \to \alpha)}$ contains actual functions which is 1-1 with $N_{(\alpha \to \alpha)}$. Also, no such realization can satisfy exactly the same closed wffs as $(h, N)$. In particular, the formula
$\exists x_1 \exists x_2 (\forall y (f(x_1, y) = f(x_2, y)) \wedge x_1 \ne x_2)$
is satisfied by $(h, N)$. But for any realization $(h_2, M)$ for which $M_{(\alpha \to \alpha)}$ [...] $h(\tau(A)) = M^V$, that is, $(h, M)$ is a model of $A$. Clearly $(h, M)$ is normal.
Also, $M_{(\alpha \to \beta)} = H_{(\alpha \to \beta)}$ and so $M_{(\alpha \to \beta)}$ does consist of actual functions or predicates. This guarantees that the axioms of extensionality are satisfied. Finally, recall that in the proof of Theorem 3.3.1 it was shown that $h(C) = M^V$ for any defining axiom $C$ associated with a wff $\lambda Y B$. Thus, $(h, M)$ satisfies all wffs in $I$ and so $(h, M)$ is a model of $\tau(A) \cup I$.

Conversely, suppose $(h, M)$ is a normal model of $\tau(A) \cup I$. $(h, M)$ may be assumed to satisfy conditions 2 and 4 of Lemma 3.3.3. Define a local frame $(g, H)$ for the $\lambda$-calculus as follows:

1. $H_\alpha = M_\alpha$ for all types under consideration; $H_{TV} = \{\text{true}, \text{false}\}$;
2. $g(X) = h(c_X)$ for all identifiers $X$; $g$ maps the special identifiers onto their regular meanings.

It must be shown that $(g, H)$ is a Henkin frame, that is, that conditions 1-6 in Definition 3.1.1 are satisfied. Conditions 1-5 are obviously satisfied. To show condition 6, closure, it is only necessary to show $g(E) \in H$ for formulas $E$ of the form $(\lambda Y E')$. For, recalling a previous remark, an application of a function to an argument cannot produce an object outside $H$ if the function and argument are both in $H$. By induction on the construction of the defining axiom for $E$ it will now be shown that the name $d$ associated with $E$ satisfies $h(d) = g(E)$, so that $g(E)$ is in $H$.

Case 1 $B$ is a subformula of $E$ consisting of a sequence of applications and containing no $\lambda$'s or special identifiers. Let $t$ be the term associated with $B$ and let $Y_1, \ldots, Y_m$ be the marked identifiers and $u_1, \ldots, u_m$ the associated variables of $K$. Then $\delta(t) = g'(B)$ where $g' = g$ except possibly at the marked identifiers, where $g'(Y_i) = \delta(u_i)$. For if $B$ is a marked identifier, the condition obviously holds. If $B$ is an unmarked identifier, $t$ is $c_B$ and $\delta(c_B) = h(c_B) = g(B) = g'(B)$. Suppose $B$ is $(FD)$ and the condition holds for $F$ and $D$. Then $t$ is $f(t_F, t_D)$, and

$\delta(t) = h(f)(\delta(t_F), \delta(t_D)) = \delta(t_F)(\delta(t_D)) = g'(F)(g'(D)) = g'(FD) = g'(B)$.
The second equality follows from condition 4 of Lemma 3.3.3.

Case 2 $B$ is a subformula $\lambda Y B_1$, $B_1$ not of type $TV$, and suppose $B_1$ contains no $\lambda$'s or special identifiers. Then the defining axiom is

$\forall v_Y \forall u_1 \ldots \forall u_m (t_{B_1} = f(d(u_1, \ldots, u_m), v_Y))$.

Because this axiom is satisfied, it must be that $\delta(t_{B_1}) = \delta(f(d(u_1, \ldots, u_m), v_Y))$ for any $\delta$ in $M^V$. But recall that $\delta(f(d(u_1, \ldots, u_m), v_Y)) = \delta(d(u_1, \ldots, u_m))(\delta(v_Y))$. Also, by Case 1, $\delta(t_{B_1}) = g'(B_1)$ where $g'$ is the assignment of identifiers associated with $\delta$. Then $\delta(d(u_1, \ldots, u_m))(\delta(v_Y)) = g'(B_1)$. Thus, as $\delta(v_Y)$ varies, the function $\delta(d(u_1, \ldots, u_m))$ maps out the same set of values that $g'(\lambda Y B_1)$ does as $g'(Y)$ varies, i.e. $\delta(d(u_1, \ldots, u_m)) = g'(\lambda Y B_1)$. Clearly, if there are subformulas of $B$ beginning with $\lambda$ and the above condition holds for the terms associated with these subformulas, the analogous remarks of Cases 1 and 2 hold.

Case 3 $B$ is an application $(FD)$ of type $TV$. Suppose $\delta(t_F) = g'(F)$ and $\delta(t_D) = g'(D)$ for all $\delta \in M^V$, $g'$ as in Case 1. $\tau_1(B)$ is $P(t_F, t_D)$. By condition 4 of Lemma 3.3.3, $\delta \in h(P(t_F, t_D))$ if and only if $\delta(t_F)(\delta(t_D))$ is true, i.e. $g'(F)(g'(D)) = \text{true}$. Again, because the special identifiers mean the same in both systems, this result holds when $B$ contains these as well.

Case 4 $B$ is of the form $\lambda Y B_1$, $B_1$ of type $TV$. The defining axiom is

$\forall v_Y \forall u_1 \ldots \forall u_m (\tau_1(B_1) \leftrightarrow P(d(u_1, \ldots, u_m), v_Y))$.

The proof of Case 2 can be used with minor modifications to show that $\delta(d(u_1, \ldots, u_m)) = g'(\lambda Y B_1)$ if $\delta$ and $g'$ are related as above.

In Case 2 or 4, if $B$ is $E$ itself, the result is that $\delta(d) = g(E)$, that is, $h(d)$ is $g(E)$, thus showing $g(E)$ is in $H$, and condition 6 is satisfied. Then $(g, H)$ is a Henkin frame. Clearly $(h, M)$ is the realization of $K$ associated with $(g, H)$. By Theorem 3.3.1, $g(A) = \text{true}$ if and only if $h(\tau(A)) = M^V$. But it is assumed in this case that $h(\tau(A))$ is $M^V$, and so $g(A) = \text{true}$. That is, $(g, H)$ is a Henkin model of $A$.
In Chapter 4, some practical uses of the material in Chapter 3 are considered.

4. COMPUTER IMPLEMENTATION OF n-SORTED LOGIC FOR HIGHER-ORDER LOGIC

4.1 Preliminary Comments

The results of Chapter 3 show that n-sorted logic is an alternative to the $\lambda$-calculus as a vehicle for automating higher-order logic. The ease of altering existing first-order theorem-provers for use with n-sorted languages and the vast experience of researchers with first-order languages are among the advantages of the use of n-sorted logic. Moreover, the completeness of most of the strategies used for first-order theorem-provers can be proved by showing completeness for the ground case and then applying a lifting lemma. But the lifting lemma holds for n-sorted logic as well, and the ground metatheorems are the same for both the classical first-order and n-sorted problems. Thus, strategies whose first-order completeness was proved in this way will also be complete for n-sorted problems. For example, set of support, merging, subsumption, linearity, deletion of tautologies, and hyper-resolution are all complete for n-sorted problems [see references cited on page 3].

However, there are some difficulties which arise when this approach is taken for automating higher-order logic. It will be the purpose of this chapter to consider two of these problems as well as to suggest a practical implementation of this method.

The first of these problems has to do with the inclusion of the Henkin axioms $I$ with the negation $S$ of the theorem to be proved. Robinson and Wos [14] have considered this problem in conjunction with their work on problems which include the equality axioms. They point out that many resolutions are usually required to produce the effect of applying one axiom. In addition, the intermediate resolvents are added to the memory even though it is only the last resolvent that is germane to the theorem being proved.
A more natural approach is to replace the axiom by a new operational rule whose effect is just the application of the axiom. Robinson and Wos introduced paramodulation to replace the equality axioms; a new rule of inference, called naming, will be introduced below as a possible alternative to the inclusion of the defining axioms of $I$.

Although the set of defining axioms is infinite, Theorem 2.3.1 says that if $S \cup I$ is inconsistent, there will be a finite subset of $I$, say $I'$, such that $S \cup I'$ is inconsistent. This raises the second problem: while it is necessary to consider only some finite subset of closure axioms, the mathematician does not know beforehand which finite subset will be sufficient for his problem. One solution to this would be to generate expanding subsets of $I$ (which is recursively enumerable). But this produces inefficient algorithms analogous to those in use before resolution was introduced, which generated initial segments of the Herbrand Universe. Fortunately, the method below partially solves this problem. It will be shown that, in a sense, $S$ itself and its resolvents contain all the information needed to determine which axioms of $I$ to include in the problem.

4.2 The Naming Rule

As motivation for the introduction of the naming rule and the theorems in the next section, consider the following example from Robinson [18].

Example 4.2.1 Show that the wff

((IMPLIES (Qa))(Sa))

is a logical consequence of the two wffs

($\forall$R((IMPLIES (RP))(RQ)))

and

((IMPLIES (Pa))(Sa));

that is, for any Henkin frame $(g, H)$ in which $g$ maps the last two wffs onto true, $g$ maps the first wff onto true. Call these wffs A, B, and C respectively.

There are three types in this problem:

1. individuals, say type $\alpha$ -- a is the only symbol of this type in the above wffs;
2. predicates, that is, type $(\alpha \to TV)$ -- P, Q, and S are of this type;
3. predicates of predicates, type $((\alpha \to TV) \to TV)$ -- R is of this type.
(Of course, the types of the logical connectives are present, as well as the symbol CHOICE because of the quantification over R. However, these are not translated into separate types and symbols of the associated n-sorted language.)

First, it will be shown that ((IMPLIES (Qa))(Sa)) is a logical consequence of the other two wffs. Let $(g, H)$ be a Henkin frame in which $g$ assigns the value true to the wffs B and C. The wff C is a wff of type $TV$ involving the identifier P. It can be considered to be expressing a property of P. Formally, this property is given by the wff

($\lambda$X((IMPLIES (Xa))(Sa))).

Call this D. Then clearly $g(DP)$ is true. However, substituting D for R in B, one concludes that $g(DQ)$ is true also. But then $g$((IMPLIES (Qa))(Sa)) is also true.

Now, consider the negation of the above problem and its translation into a 3-sorted system. This negation asserts that B and C hold while A fails; that is, B, C, and (Qa) hold but (Sa) is false. The translation yields the wff

$(\forall v_R (F(v_R, c_P) \to F(v_R, c_Q))) \wedge (G(c_P, c_a) \to G(c_S, c_a)) \wedge G(c_Q, c_a) \wedge \neg G(c_S, c_a)$

where $F$ and $G$ are 3-sorted relation symbols of types $((\alpha \to TV) \to TV, (\alpha \to TV))$ and $(\alpha \to TV, \alpha)$ respectively. The Skolem normal form with matrix in conjunctive normal form is

$\forall v_R ((\neg F(v_R, c_P) \vee F(v_R, c_Q)) \wedge (\neg G(c_P, c_a) \vee G(c_S, c_a)) \wedge G(c_Q, c_a) \wedge \neg G(c_S, c_a))$.

Only one pair of matrix clauses is resolvable, namely $\neg G(c_P, c_a) \vee G(c_S, c_a)$ and $\neg G(c_S, c_a)$. They yield the resolvent $\neg G(c_P, c_a)$. In order to get a refutation, it is necessary to include the defining axiom for the $\lambda$-wff ($\lambda$X((IMPLIES (Xa))(Sa))). This axiom is

$\forall v_X ((G(v_X, c_a) \to G(c_S, c_a)) \leftrightarrow F(d, v_X))$

where $d$ is a new constant symbol. In clause form, this axiom is

$(G(v_X, c_a) \vee F(d, v_X)) \wedge (\neg G(c_S, c_a) \vee F(d, v_X)) \wedge (\neg F(d, v_X) \vee \neg G(v_X, c_a) \vee G(c_S, c_a))$.
The first two clauses are equivalent to

$((G(v_X, c_a) \to G(c_S, c_a)) \to F(d, v_X))$

and the third clause is equivalent to

$(F(d, v_X) \to (G(v_X, c_a) \to G(c_S, c_a)))$.

Adding these three clauses to the negation of the theorem allows the following refutation:

1. $\neg F(v_R, c_P) \vee F(v_R, c_Q)$ -- original clause
2. $\neg G(c_P, c_a) \vee G(c_S, c_a)$ -- original clause
3. $G(c_Q, c_a)$ -- original clause
4. $\neg G(c_S, c_a)$ -- original clause
5. $G(v_X, c_a) \vee F(d, v_X)$ -- axiom
6. $\neg G(c_S, c_a) \vee F(d, v_X)$ -- axiom
7. $\neg F(d, v_X) \vee \neg G(v_X, c_a) \vee G(c_S, c_a)$ -- axiom
8. $G(c_S, c_a) \vee F(d, c_P)$ -- from 2 and 5
9. $F(d, c_P)$ -- from 4 and 8
10. $F(d, c_Q)$ -- from 1 and 9
11. $\neg G(c_Q, c_a) \vee G(c_S, c_a)$ -- from 7 and 10
12. $G(c_S, c_a)$ -- from 3 and 11
13. empty clause -- from 4 and 12

Note that it took two steps to apply the axiom. One would prefer to deduce $F(d, c_P)$ directly from $\neg G(c_P, c_a) \vee G(c_S, c_a)$. This leads to the concept of naming.

Definition 4.2.1 Let $C$ be a clause. Let $C'$ be any other clause such that $C$ and $C'$ have no variables in common but do have a common instance. Let $\sigma$ be a most general substitution such that $C\sigma = C'\sigma$. Let the variables of $C'$ be $x, u_1, \ldots, u_m$. The wff

$\forall x \forall u_1 \ldots \forall u_m (C' \leftrightarrow P(d(u_1, \ldots, u_m), x))$,

where $d$ is a new function symbol, is called a naming axiom for $C$. The naming rule states that from $C$ one can deduce $(P(d(u_1, \ldots, u_m), x))\sigma$. The unnaming rule states that from $D \vee P(d(t_1, \ldots, t_m), s)$ one can deduce $D \vee C''$ where $C'' = C'(s/x, t_1/u_1, \ldots, t_m/u_m)$.

Note that a naming axiom is just the defining axiom for some wff $\lambda X C'$ (recall Section 3.3). In this case, however, the wff $C'$ is of a special form, namely a clause. As in Definition 3.2.4, the term $d(u_1, \ldots, u_m)$ names the predicate, defined by the clause $C'$, being applied to the variable $x$. Also note that there is more than one naming axiom associated with a clause $C$. In fact, there are generally infinitely many.
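The thirteen-step refutation of Example 4.2.1 can be checked mechanically at the ground level. The following is only a minimal sketch, not a procedure taken from the thesis: literals are plain strings, `~` marks negation, and the names `ORIGINAL`, `AXIOM`, `resolvents` and `refutable` are illustrative assumptions. Only the ground instances actually used in the refutation are listed ($v_R := d$ in clause 1, $v_X := c_P$ in clauses 5 and 6, $v_X := c_Q$ in clause 7), so the sketch says nothing about instances that are not listed.

```python
from itertools import combinations


def neg(lit):
    """Return the complement of a literal written as 'A' or '~A'."""
    return lit[1:] if lit.startswith("~") else "~" + lit


def resolvents(c1, c2):
    """All binary resolvents of two ground clauses (frozensets of literals)."""
    return {frozenset((c1 - {lit}) | (c2 - {neg(lit)}))
            for lit in c1 if neg(lit) in c2}


def refutable(clauses):
    """Saturate under ground resolution; True if the empty clause is derived."""
    known = {frozenset(c) for c in clauses}
    while True:
        new = set()
        for c1, c2 in combinations(known, 2):
            for r in resolvents(c1, c2):
                if not r:          # empty clause: the set is inconsistent
                    return True
                new.add(r)
        if new <= known:           # nothing new: saturated without refutation
            return False
        known |= new


# Ground instances of clauses 1-4 (negation of the theorem).
ORIGINAL = [
    {"~F(d,cP)", "F(d,cQ)"},      # clause 1 with v_R := d
    {"~G(cP,ca)", "G(cS,ca)"},    # clause 2
    {"G(cQ,ca)"},                 # clause 3
    {"~G(cS,ca)"},                # clause 4
]

# Ground instances of clauses 5-7 (the defining axiom).
AXIOM = [
    {"G(cP,ca)", "F(d,cP)"},                   # clause 5 with v_X := cP
    {"~G(cS,ca)", "F(d,cP)"},                  # clause 6 with v_X := cP
    {"~F(d,cQ)", "~G(cQ,ca)", "G(cS,ca)"},     # clause 7 with v_X := cQ
]

print(refutable(ORIGINAL + AXIOM))
print(refutable(ORIGINAL))
```

Under these assumptions the first call finds the empty clause, while the second saturates without one, which agrees with the remark that the defining axiom has to be included to obtain a refutation.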
Later it will be suggested that future research be directed toward finding suitable refinements of the naming axioms. Finally, the unnaming rule is completely equivalent to resolution with the clause

$\neg P(d(u_1, \ldots, u_m), x) \vee C'$.

This corresponds to the implication $\leftarrow$ of the axiom.

Returning to Example 4.2.1, naming allows the following shorter refutation:

1. - 4. -- these clauses are as in Example 4.2.1
5. $F(d, c_P)$ -- name 2 using $\forall x ((\neg G(x, c_a) \vee G(c_S, c_a)) \leftrightarrow F(d, x))$
6. $F(d, c_Q)$ -- from 5 and 1
7. $\neg G(c_Q, c_a) \vee G(c_S, c_a)$ -- unname 6
8. $G(c_S, c_a)$ -- from 7 and 3
9. empty clause -- from 8 and 4

In the above deduction, a number of other wffs could also have been used for the naming axiom. For example, the wff

$\forall x \forall u ((\neg G(x, c_a) \vee G(u, c_a)) \leftrightarrow F(d_1(u), x))$

could have been used. In any realization $(h, M)$ of $K$ in which $M_{(\alpha \to TV)}$ consisted of actual predicates, $h(d)$ and $h(d_1(c_S))$ would be exactly the same predicate.

4.3 Another Look at the Closure Axioms

In Chapter 3, the emphasis was on developing a relationship between the $\lambda$-calculus and an n-sorted language. The axioms introduced there were used to establish the relationship. There was no need to consider their particular form. The emphasis here is on the development of actual theorem-proving programs. Thus, the form of the logical statements involved is a major consideration.

One would guess that the axioms of extensionality would be used in relatively few problems. Even so, their form is known ahead of time, and they can always be included in the problem. A similar remark holds for the $TV$ axioms in case they are needed. In a computer program, these axioms would probably be handled in a special way, separate from the clauses of the theorem itself. This leaves the defining axioms. The difficulty with these is that it is generally not known beforehand which axioms to include with a particular theorem. The intent of this section is to show that these axioms can be put into a special form.
Then, in the next section it will be shown that this special form allows one to deduce from the theorem itself which axioms to include.

Consider the defining axiom associated with a wff $\lambda Y A$ of type $(\alpha \to \beta)$, $\beta \neq TV$. This axiom has the form

$\forall v_Y \forall u_1 \ldots \forall u_m (t = f(d(u_1, \ldots, u_m), v_Y))$.

This axiom, when added to the Skolem normal form of a theorem $S$, yields just the equality unit

$t = f(d(u_1, \ldots, u_m), v_Y)$.

This unit can resolve only with a literal of the form

$\neg (s_1 = f(x, s_2))$ or $\neg (s_1 = f(d(t_1, \ldots, t_m), s_2))$.

Suppose the function $d$ has not yet been introduced, that is, the program is contemplating the addition of the above axiom. Then only literals of the form $\neg (s_1 = f(x, s_2))$ are relevant because, of course, the function symbol $d$ cannot occur. Thus, in considering which axioms of this form to include, a program can look for literals of the form $\neg (s_1 = f(x, s_2))$ during the theorem-proving process. The appropriate equality unit defining the new function symbol can be added at that time. If the theorem-prover does this for all such literals, then all the necessary axioms of this form will be included. For, if yet another axiom of this form were added, it would either not be resolvable with any other literal or it would resolve with a literal $\neg (s_1 = f(x, s_2))$, and an axiom already naming this function would have been introduced. These remarks prove the following lemma.

Lemma 4.3.1 The defining axioms for wffs of the form $\lambda Y B$ not of type $(\alpha \to TV)$ can be added to the theorem during the theorem-proving process.

Finally, there are the axioms associated with formulas of the form $\lambda Y B$ of type $(\alpha \to TV)$. It will be shown that these can be put into a form similar to the form of a naming axiom described in Section 4.2. Some preliminary results are given first.

Lemma 4.3.2 In any wff of the $\lambda$-calculus of type $TV$, it can be assumed that OR, NOT, and $\forall$ are the only logical connectives that appear.
Proof In the standard way (see Church [2]) one can express the special identifiers AND and IMPLIES in terms of OR and NOT, and the existential quantifier $\exists$ in terms of NOT and $\forall$. That is, ((AND A)B) is equivalent to NOT((OR(NOT A))(NOT B)), ((IMPLIES A)B) is equivalent to ((OR(NOT A))B), and ($\exists$XA) is equivalent to (NOT($\forall$X(NOT A))).

Lemma 4.3.3 Let $A$ be a wff and $B$ a subwff of $A$ containing an identifier $Y$. Let $Y'$ be an identifier not occurring in $A$, and let $B'$ be obtained from $B$ by replacing all occurrences of $Y$ by $Y'$. Then $B$ can be replaced by $((\lambda Y' B')Y)$ in $A$ without affecting the value of $A$ in any Henkin frame $(g, H)$.

Proof To evaluate $g(A)$, one must first evaluate $g(B)$. But $g(B) = g((\lambda Y' B')Y)$. To see this, note that $g(\lambda Y' B')$ is that function which yields $g'(B')$ when applied to $g'(Y')$, where $g' = g$ for all identifiers except possibly $Y'$, and $g'(Y')$ is allowed to vary over the appropriate domain. If $g'(Y') = g(Y)$, then clearly $g'(B') = g(B)$. Thus, the function $g(\lambda Y' B')$ applied to $g(Y)$ yields $g(B)$. That is, $g(B) = g((\lambda Y' B')Y)$.

Theorem 4.3.1 Given a wff $\lambda Y B$ of type $(\alpha \to TV)$, there exists a formula $\lambda Y C$ satisfying

1. $g(\lambda Y B) = g(\lambda Y C)$ for any Henkin frame $(g, H)$, and
2. for every subformula of the form $\lambda Z D$ of $\lambda Y C$ of the type $(\beta \to TV)$, $D$ is either a clause, NOT applied to an atom, or $\forall X$ applied to an atom.

Proof By Lemma 4.3.2, it may be assumed that OR, NOT, and $\forall$ are the only logical connectives. Next, note that sequences of more than one NOT may first be simplified to have at most one application of NOT by using the fact that $g(\text{NOT}(\text{NOT } F)) = g(F)$. Suppose condition 2 fails for $\lambda Y B$. There must be a subformula $\lambda Z D$ where $D$ fails to satisfy one of the three conditions. $D$ is of type $TV$, so if $D$ is not a clause it must be NOT $D_1$ or $\forall X D_1$, where $D_1$ is of type $TV$. In either case, if $D_1$ is a clause, the subformula $D_1$ can be replaced by $(\lambda Z' D_1')Z_1$ where $Z'$ is a new identifier not occurring in $\lambda Y B$, $Z_1$ is some identifier in $D_1$, and $D_1'$ is obtained from $D_1$ by replacing $Z_1$ by $Z'$.
By Lemma 4.3.3, $g(\lambda Y B)$ will not be changed. However, $(\lambda Z' D_1')Z_1$ is an atom, so that after the replacement, the NOT or $\forall$ is applied to an atom. Similar replacements can be made if $D_1$ is NOT $D_2$ or $\forall X D_2$. By repeating the process a sufficient, but finite, number of times, one obtains a formula $\lambda Y C$ satisfying the conditions of the theorem.

Some properties will now be established in preparation for the final theorem of this section. Consider the defining axiom of a subformula $\lambda Z D$ where $D$ is a clause. Because the logical connectives OR and NOT are mapped onto the logical connectives $\vee$ and $\neg$ of $K$, the axiom is

$(\forall v_Z \forall u_1 \ldots \forall u_m (C \leftrightarrow Q(d(u_1, \ldots, u_m), v_Z)))$

where $C$ is a clause. The atoms of $C$ are $P_i(t_F, t_G)$ where $(FG)$ is an atom of $D$. Similarly, if $D$ is NOT $D_1$, $D_1$ an atom $(FG)$, the defining axiom is

$(\forall v_Z \forall u_1 \ldots \forall u_m ((\neg P(t_F, t_G)) \leftrightarrow Q(d(u_1, \ldots, u_m), v_Z)))$.

If $D$ is $\forall X D_1$, $D_1$ an atom $(FG)$, the axiom is

$(\forall v_Z \forall u_1 \ldots \forall u_m ((\forall v_X Q(t_F, t_G)) \leftrightarrow P(d(u_1, \ldots, u_m), v_Z)))$.

Before these axioms are added to a set $S$ of clauses to be given to a theorem-prover, they are put in Skolem normal form and the universal quantifiers are dropped. The first two axioms with quantifiers deleted are of the form $C \leftrightarrow a$ where $C$ is a clause and $a$ is a unit whose first argument starts with a new function symbol. The third axiom is

$\forall v_Z \forall u_1 \ldots \forall u_m ((\forall v_X C) \leftrightarrow a)$

which is equivalent to (see Kreisel and Krivine [10])

$(\forall v_Z \forall u_1 \ldots \forall u_m \forall v_X (a \to C)) \wedge (\forall v_Z \forall u_1 \ldots \forall u_m \exists v_X (C \to a))$.

In taking the Skolem normal form of this formula, the variable $v_X$ in the second conjunct would be replaced by a new Skolem function $f(v_Z, u_1, \ldots, u_m)$. The result after this replacement and the deletion of quantifiers is the wff $(C \to a)'$, obtained from $(C \to a)$ by replacing $v_X$ by $f(v_Z, u_1, \ldots, u_m)$. However, the problem is to determine if $S$ is inconsistent. If $J$ is a clause and $I$ is an instance of $J$, then $S \cup \{I\}$ inconsistent implies $S \cup \{J\}$ inconsistent.
Then, in the above, one can just as well add $(C \to a)$ to $S$ instead of $(C \to a)'$. Thus, for the third axiom one can again add the quantifier free formula $(C \leftrightarrow a)$ to $S$ as the representative of that axiom.

In all three cases, then, the defining axiom gives rise to a formula of the form $(C \leftrightarrow a)$, $C$ a clause, to be included with the set of clauses given to the theorem-prover. The form of this formula is that of the matrix of a naming axiom.

The results of this section on the closure axioms are collected into

Theorem 4.3.2 One can take as a set of quantifier free formulas of $K$ representing the defining axioms a set of formulas each of which is an equality unit $t = f(d(u_1, \ldots, u_m), x)$ or of the form $C \leftrightarrow P(d(u_1, \ldots, u_m), x)$, $C$ a clause.

4.4 Some Completeness Results

Definition 4.4.1 A procedure is called refutation complete, or just complete, for n-sorted logic if, given a quantifier free set $S$ of formulas which is inconsistent, that procedure demonstrates that $S$ is inconsistent in a finite number of steps. A procedure which is refutation complete is called a proof procedure.

Resolution and most of the strategies used with resolution are known to be complete refutation procedures. In this section a proof procedure will be mentioned for n-sorted logic when used to represent higher-order logic. Given the translation $S$ of a $\lambda$-calculus sentence which is inconsistent, this procedure will demonstrate that $S$ is inconsistent in a finite number of steps. The procedure itself will generate any defining axioms needed. This particular proof procedure is not computationally feasible and is mentioned for theoretical purposes only. It will be suggested that alternate procedures based on some refinement of the naming rule be developed and investigated. Unfortunately, the question of the completeness of such procedures is still open.

The basis for the following is Corollary 1 of Slagle [20]. This result is stated here for reference.
Definition 4.4.2 A clause $D$ subsumes a clause $C$ if there is a substitution $\sigma$ such that $D\sigma$ is contained in $C$.

Note that it is not necessary that $D\sigma$ equal $C$, but only that every literal of $D\sigma$ occur in $C$.

Theorem 4.4.1 (Corollary 1, Slagle [20]) Let $E$ be a set of clauses and let $F$ be a finite set of ground clauses. If $E \wedge (\neg F)$ is inconsistent, then there is a finite set $I$ of clauses such that

1. each clause of $I$ is in $E$ or is deducible by resolution from $E$;
2. each clause $C$ of $F$ is subsumed by some clause $D$ of $I$;
3. $I \wedge (\neg F)$ is inconsistent;
4. every predicate symbol occurring unnegated (negated) in $I$ occurs unnegated (negated) in both $E$ and $F$; and
5. every function symbol and constant symbol occurring in $I$ occurs in both $E$ and $F$.

In the above, $\neg F$ is the negation of the conjunction of the clauses in $F$.

Some properties will now be developed in order to apply Slagle's work to higher-order logic. Let $A$ be a sentence of the $\lambda$-calculus which is locally inconsistent. By Theorems 3.3.2 and 2.3.1, there is a finite set $I$ of defining axioms such that $\tau(A) \cup I \cup \{\text{axioms of extensionality}\}$ is inconsistent. Let $S$ be the set of clauses obtained from $\tau(A)$ and the axioms of extensionality, and let $I_1$ be the set of quantifier free formulas obtained from $I$. By Theorem 4.3.2, it can be assumed that $I_1$ contains only formulas of the form $t = f(d(u_1, \ldots, u_m), x)$ and $C \leftrightarrow M$.

Lemma 4.4.1 With the notation above, there is a finite set $I_1'$ of ground instances of formulas in $I_1$ such that $S \cup I_1'$ is inconsistent.

Proof Herbrand's Theorem states that a set $T$ of quantifier free formulas is inconsistent if and only if there is a finite set $T'$ of instances of formulas in $T$ which is inconsistent in the Boolean sense. $S \cup I_1$ is inconsistent, so there is a finite set of Boolean inconsistent instances. Call this set $S' \cup I_1'$, where $S'$ is the set of ground instances of formulas in $S$ and similarly for $I_1'$.
But then $S \cup I_1'$ is inconsistent, for it does have a finite set of Boolean inconsistent instances, namely $S' \cup I_1'$.

In the following, $I_1'$ will be used for $\neg F$ in Theorem 4.4.1. Then $\neg I_1'$ corresponds to the $F$ of that theorem. However, $\neg I_1'$ must first be put into the form of a set of clauses, that is, in conjunctive normal form. Now, $I_1'$ is a conjunction of wffs of one of the forms

$t = f(d(t_1, \ldots, t_m), s)$, $(C \to M)$, or $(M \to C)$

where $C$ is a clause and $M$ is a unit. By elementary Boolean algebra, $\neg I_1'$ can be written as a disjunction of wffs of one of the forms

$\neg (t = f(d(t_1, \ldots, t_m), s))$, $\neg (C \to M)$, or $\neg (M \to C)$.

The last two are equivalent to

$(L_1 \wedge \neg M) \vee \ldots \vee (L_n \wedge \neg M)$ and $M \wedge \neg L_1 \wedge \ldots \wedge \neg L_n$ ...

Example 4.4.2 Suppose $I_1'$ consists of one instance of each implication $\to$ and $\leftarrow$ of an axiom $((L_1 \vee L_2) \leftrightarrow M)$, say

$(A_1 \vee A_2) \to C$ and $C' \to (A_1' \vee A_2')$.

Then, using elementary Boolean algebra, $\neg I_1'$ is

$\neg (((A_1 \vee A_2) \to C) \wedge (C' \to (A_1' \vee A_2')))$
$\equiv \neg ((A_1 \vee A_2) \to C) \vee \neg (C' \to (A_1' \vee A_2'))$
$\equiv ((A_1 \vee A_2) \wedge \neg C) \vee (C' \wedge \neg (A_1' \vee A_2'))$
$\equiv (A_1 \wedge \neg C) \vee (A_2 \wedge \neg C) \vee (C' \wedge \neg A_1' \wedge \neg A_2')$.

Finally, the conjunctive normal form $F$ of $\neg I_1'$ consists of twelve clauses, for there are two choices of literals from each of the first two disjuncts and three choices from the last disjunct. The twelve clauses are

1. $(A_1 \vee A_2 \vee C')$,
2. $(A_1 \vee A_2 \vee \neg A_1')$,
3. $(A_1 \vee A_2 \vee \neg A_2')$,
4. $(A_1 \vee \neg C \vee C')$,
5. $(A_1 \vee \neg C \vee \neg A_1')$,
6. $(A_1 \vee \neg C \vee \neg A_2')$,
7. $(\neg C \vee A_2 \vee C')$,
8. $(\neg C \vee A_2 \vee \neg A_1')$,
9. $(\neg C \vee A_2 \vee \neg A_2')$,
10. $(\neg C \vee C')$,
11. $(\neg C \vee \neg A_1')$,
12. $(\neg C \vee \neg A_2')$.

Note that in 10, 11, and 12, the $\neg C$ was not repeated. Also, clauses 4-9 are subsumed by clauses 10-12. Thus, a smaller, but still suitable, set of clauses to use for $F$ is 1-3, 10-12.

Theorem 4.3.2 and Lemma 4.4.2 suggest the following sort of algorithm for theorem-proving in higher-order logic.
Starting with the original set $S$ of clauses, one proceeds as in the usual resolution algorithm, except that occasionally a naming axiom is added to the set of clauses. The combination of resolution and addition of naming axioms is made in such a way that every possible naming axiom of each subclause of each clause generated is eventually added to the set of clauses. By Theorem 4.3.2, each axiom needed for the program to prove the theorem is either an equality unit or of the form $(C \leftrightarrow M)$, $C$ a clause. Equality units have already been discussed. For the other case, by Lemma 4.4.2, one can deduce from $A$ a clause $D$ such that for some subclause $E$ of $D$, $E$ and $C$ have a common instance. Then, the axiom $(C \leftrightarrow M)$ is a naming axiom for $E$, and the above algorithm will eventually add it to $S$. Thus, such an algorithm discovers which axioms to add to the theorem and is complete.

One could give an algorithm that guaranteed that every such naming axiom was eventually added to $S$. However, such an algorithm would not be computationally feasible because there are infinitely many such axioms for each clause. Also, the above algorithm adds all the clauses representing these axioms to $S$ rather than applying each axiom in some single operation. Thus, such an algorithm is of purely theoretical interest.

It will now be shown how certain applications of the naming rule alleviate the second problem above, i.e. the addition of axioms. Consider the application of a naming rule for a clause $C$ whose literals are $L_1, \ldots, L_n$. Let

$\forall x \forall u_1 \ldots \forall u_m (C' \leftrightarrow M)$

be the naming axiom used, where the literals of $C'$ are $L_1', \ldots, L_n'$, and let $\sigma$ be the substitution such that $C\sigma = C'\sigma$. Suppose $C'\sigma = C'$, so that $C\sigma = C'\sigma = C'$ and $C'$ is an instance of $C$. The clauses associated with the axiom are

$(\neg L_1' \vee M), \ldots, (\neg L_n' \vee M)$ and $(\neg M \vee L_1' \vee \ldots \vee L_n')$.

The above algorithm would have added all of these to the set $S$. However, the naming rule states that one can deduce $M\sigma = M$ directly.
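The clause form of a naming axiom described above can be sketched for the ground (variable-free) case. This is an illustrative sketch only: the function names `naming_axiom_clauses` and `subsumes_ground` are assumptions, literals are strings with `~` for negation, and because everything is ground the substitution of Definition 4.4.2 is the identity, so subsumption reduces to set inclusion.

```python
def neg(lit):
    """Return the complement of a literal written as 'A' or '~A'."""
    return lit[1:] if lit.startswith("~") else "~" + lit


def naming_axiom_clauses(literals, m):
    """Clauses of the ground axiom (L1 v ... v Ln) <-> M.

    Returns the n clauses (~Li v M) for the direction C' -> M and the
    single clause (~M v L1 v ... v Ln) for the direction M -> C'.
    """
    forward = [frozenset({neg(lit), m}) for lit in literals]
    backward = frozenset({neg(m)} | set(literals))
    return forward, backward


def subsumes_ground(d, c):
    """Ground subsumption: every literal of d occurs in c."""
    return frozenset(d) <= frozenset(c)


# Ground instance (x := cP) of the naming axiom used in the shorter
# refutation of Example 4.2.1: (~G(x,ca) v G(cS,ca)) <-> F(d,x).
forward, backward = naming_axiom_clauses(["~G(cP,ca)", "G(cS,ca)"], "F(d,cP)")
unit = frozenset({"F(d,cP)"})

for clause in forward:
    print(sorted(clause), subsumes_ground(unit, clause))
print(sorted(backward), subsumes_ground(unit, backward))
```

In this ground instance the unit $M$ is contained in each clause $(\neg L_i' \vee M)$ but not in $(\neg M \vee L_1' \vee \ldots \vee L_n')$, so once $M$ has been deduced only the latter clause carries information that is not already present.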
Note that $M$ can be deduced by resolution from $(L_1 \vee \ldots \vee L_n)$, that is, $C$, and $(\neg L_1' \vee M), \ldots, (\neg L_n' \vee M)$, using $\sigma$ as a unifier of $L_i$ and $L_i'$. Thus, the naming rule is equivalent to a sequence of resolutions. Now note that $M$ subsumes each clause $(\neg L_i' \vee M)$. Thus, having deduced $M$, the clauses $(\neg L_i' \vee M)$ can be discarded. That is, with the application of the naming rule to $C$, only the clause $\neg M \vee L_1' \vee \ldots \vee L_n'$, corresponding to the second half $(M \to (L_1' \vee \ldots \vee L_n'))$ of the naming axiom, has to be added to $S$. Adding this clause to $S$ is tantamount to saving a pattern of the definition of the new function symbol occurring in $M$ for future unnamings.

The above analysis fails if $C'\sigma \neq C'$, for in this case the naming rule yields $M\sigma$. If $M\sigma \neq M$, then $M\sigma$ does not subsume $(\neg L_i' \vee M)$. (Recall, for $M\sigma$ to subsume $(\neg L_i' \vee M)$, $(M\sigma)\lambda$ must be a literal of $(\neg L_i' \vee M)$. In this case $(M\sigma)\lambda$ must be $M$. But $\lambda$ cannot undo the effects of $\sigma$, unless $\sigma$ only changes variable names. Therefore, $(M\sigma)\lambda$ cannot be $M$.) Without the subsumption property, the clauses $(\neg L_i' \vee M)$ cannot be discarded. The analysis also cannot be carried out if one tries to apply naming to a subclause. For example, if the clause $C \vee D$ is deduced, one might deduce by some altered naming rule $M\sigma \vee D\sigma$. Again, this will not subsume any clause $(\neg L_i' \vee M)$. Thus, the naming rule achieves its advantage over adding all the clauses associated with the naming axiom when it is applied to a whole clause $C$ and the naming axiom $(C' \leftrightarrow M)$ used satisfies $C' = C'\sigma$.

In this chapter it was shown that the Henkin axioms $I$ could be put into a form compatible with the naming axioms and the naming rule. Further, it was shown that complete proof procedures can be developed using resolution and the naming axioms. It was pointed out that such a procedure is not efficient enough to be of practical interest. Finally, it was suggested that the naming rule could lead to practical algorithms.

5. SUGGESTIONS FOR FUTURE RESEARCH

The purpose of this thesis is theoretical in nature: to demonstrate that n-sorted logic is suitable for representing higher-order logic. This was accomplished in Chapter 3. This work offers an alternative to the use of the $\lambda$-calculus for automating higher-order logic. It was not intended to develop a computer program for higher-order logic, only to lay the foundation for the development of such programs in the future. The author feels that the material in Chapter 4 will be of great use in this area. While no specific, practical algorithm was given in that chapter, some examples of problems solved by use of the naming rule are given in the Appendix.

There are two directions along which future research should proceed. The first of these is to determine if resolution and the naming rule, or some suitable restriction of it, is complete for higher-order logic. Recall (Section 4.4) that when the naming rule is applied in certain cases, it is not necessary to add a number of the clauses representing the associated naming axiom to the set of clauses being generated. One would expect a theorem-prover which takes advantage of this to be more efficient than a theorem-prover which merely adds all the clauses of the associated axiom (Robinson and Wos [14]). Thus, the more efficient theorem-provers using n-sorted representations of higher-order logic will most likely be the ones which implement some form of the naming rule. Then, from a practical as well as a theoretical standpoint, it is important to know if resolution and naming are complete.

The second area for future work is the implementation of resolution and naming. After resolution was introduced for first-order theorem-proving, research was carried out for several years to develop strategies for the use of resolution.
That is, rules for guiding the choice of pairs of clauses to resolve were developed; restrictions on when resolution could be applied were introduced; strategies for eliminating certain resolvents were used; and even new operational rules based on resolution were developed. All of these tended to increase the efficiency of resolution-style first-order theorem-provers. Similar work needs to be done now for resolution and naming for higher-order logic. Many of the strategies for resolution mentioned in the introduction may simply be carried over to resolution and naming theorem-provers. But new strategies for handling the naming rule should also be developed. Analogues of some of the resolution strategies (particularly set of support) should be considered, as well as entirely new strategies designed especially for the naming rule.

LIST OF REFERENCES

1. Andrews, Peter B., "Resolution with Merging," Journal of the Association for Computing Machinery, Vol. 15 (1968), pp. 367-381.
2. Church, Alonzo, Introduction to Mathematical Logic, Princeton University Press, Princeton, New Jersey, 1956.
3. Darlington, J. L., "Theorem Proving and Information Retrieval," Machine Intelligence, Vol. 4, American Elsevier Publishing Company, Inc., New York, 1969.
4. Davis, M., "Invited Commentary on New Directions in Mechanical Theorem-Proving," Proceedings of the International Federation of Information Processing Congress, 1968, North-Holland Publishing Company, Amsterdam.
5. Davis, M. and Putnam, H., "A Computing Procedure for Quantification Theory," Journal of the Association for Computing Machinery, Vol. 7 (1960), pp. 201-215.
6. Gilmore, P. C., "A Proof Method for Quantification Theory," IBM Journal of Research and Development, Vol. 4 (1960), pp. 28-35.
7. Green, C. Cordell, "Theorem-Proving by Resolution as a Basis for Question-Answering," Machine Intelligence, Vol. 4, American Elsevier Publishing Company, Inc., New York, 1969.
8. Henkin, L., "Completeness in the Theory of Types," Journal of Symbolic Logic, Vol. 15 (1950), pp. 81-91.
9. Herbrand, Jacques, "Recherches sur la Théorie de la Démonstration," Travaux de la Société des Sciences et des Lettres de Varsovie, Classe III, Sciences Mathématiques et Physiques, no. 33, 128 pp.
10. Kreisel, G. and Krivine, J. L., Elements of Mathematical Logic (Model Theory), North-Holland Publishing Company, Amsterdam, 1967.
11. Loveland, D. W., "A Linear Format for Resolution," Proceedings of the IRIA Symposium on Automatic Demonstration, 1968, Versailles, France, Springer-Verlag, 1970, pp. 147-162.
12. Luckham, D., "Refinement Theorems in Resolution Theory," ibid., pp. 163-190.
13. Prawitz, D., "Advances and Problems in Mechanical Proof Procedures," Machine Intelligence, Vol. 4, American Elsevier Publishing Company, Inc., New York, 1969.
14. Robinson, G. and Wos, L., "Paramodulation and Theorem-Proving in First-Order Theories with Equality," Machine Intelligence, Vol. 4, American Elsevier Publishing Company, Inc., New York, 1969.
15. Robinson, J. A., "A Machine Oriented Logic Based on the Resolution Principle," Journal of the Association for Computing Machinery, Vol. 12 (1965), pp. 23-41.
16. Robinson, J. A., "Automatic Deduction with Hyper-Resolution," International Journal of Computer Mathematics, Vol. 1 (1965), pp. 227-234.
17. Robinson, J. A., "New Directions in Mechanical Theorem-Proving," Proceedings of the International Federation of Information Processing Congress, 1968, North-Holland Publishing Company, Amsterdam, 1969.
18. Robinson, J. A., "Mechanizing Higher-Order Logic," Machine Intelligence, Vol. 4, American Elsevier Publishing Company, Inc., New York, 1969.
19. Skolem, T., "Über die Mathematische Logik," Norsk Matematisk Tidsskrift, Vol. 10 (1928), pp. 125-142.
20. Slagle, James R., "Interpolation Theorems for Resolution in Lower Predicate Calculus," Journal of the Association for Computing Machinery, Vol. 17 (1970), pp. 535-542.
21. Wos, L., Robinson, G. and Carson, D., "Efficiency and Completeness of the Set of Support Strategy in Theorem-Proving," Journal of the Association for Computing Machinery, Vol. 12 (1965), pp. 536-541.

APPENDIX

Some examples of theorems from higher-order logic and their proofs in an n-sorted system will now be given. These examples show the ease and naturalness with which n-sorted logic can express mathematical problems. In the last example, the problem is stated directly in a 4-sorted logic without ever resorting to higher-order logic. The reader should also note that in some examples it was not necessary to apply the naming rule. Thus, resolution alone is sufficient for some, and possibly many, higher-order problems.

Example 1

If F and G are predicates, then so is their disjunction. That is,

$\forall X\,\forall F\,\forall G\,\exists H\,(((FX) \lor (GX)) \leftrightarrow HX)$.

The negation is

$\exists X\,\exists F\,\exists G\,\forall H\,(((FX \lor GX) \land \neg HX) \lor (HX \land \neg FX \land \neg GX))$.

This leads to the 2-sorted wff

$\exists x\,\exists f\,\exists g\,\forall h\,(((P(f,x) \lor P(g,x)) \land \neg P(h,x)) \lor (P(h,x) \land \neg P(f,x) \land \neg P(g,x)))$.

The Skolem normal form of this is

$\forall h\,(((P(F,a) \lor P(G,a)) \land \neg P(h,a)) \lor (P(h,a) \land \neg P(F,a) \land \neg P(G,a)))$

where F, G, and a are constant symbols. The matrix yields the clauses

1.1. $P(F,a) \lor P(G,a) \lor P(h,a)$
1.2. $\neg P(h,a) \lor \neg P(F,a)$
1.3. $\neg P(h,a) \lor \neg P(G,a)$

The following is a refutation of the denial of the theorem.

1.4. $P(d_1(h),a)$ -- name 1.2 using $(\neg P(h,x) \lor \neg P(F,x)) \leftrightarrow P(d_1(h),x)$
1.5. $\neg P(d_2(d_1(h)),a)$ -- take negation of $d_1(h)$ using $P(x,y) \leftrightarrow \neg P(d_2(x),y)$
1.6. $P(F,a) \lor P(G,a)$ -- from 1.1 and 1.5
1.7. $\neg P(F,a)$ -- from 1.2 and 1.4
1.8. $\neg P(G,a)$ -- from 1.3 and 1.4
1.9. $P(G,a)$ -- from 1.6 and 1.7

Formulas 1.8 and 1.9 yield the empty clause.

Example 2

This example is taken from Church [2].
He defines equality between objects, say $x = y$, as $\forall F\,((Fx) \to (Fy))$. He then proves some properties about equality. Three of these are given here.

A. $\forall x\,(x = x)$, i.e. $\forall x\,\forall F\,((Fx) \to (Fx))$.

The negation is $\exists x\,\exists F\,((Fx) \land \neg (Fx))$. The 2-sorted translation in Skolem normal form is $P(F,a) \land \neg P(F,a)$, where F and a are constant symbols. Obviously, one resolution yields the empty clause.

B. $\forall x\,\forall y\,(x = y \to y = x)$, i.e. $\forall x\,\forall y\,(\forall F\,((Fx) \to (Fy)) \to \forall F\,((Fy) \to (Fx)))$.

The negation is $\exists x\,\exists y\,(\forall F\,((Fx) \to (Fy)) \land \exists F\,((Fy) \land \neg (Fx)))$. The 2-sorted translation is $\exists x\,\exists y\,(\forall f\,(P(f,x) \to P(f,y)) \land \exists f\,(P(f,y) \land \neg P(f,x)))$. The Skolem normal form is

$\forall f\,((\neg P(f,a) \lor P(f,b)) \land P(G,b) \land \neg P(G,a))$

where a, b, and G are constant symbols. The three matrix clauses are

2.B.1. $\neg P(f,a) \lor P(f,b)$
2.B.2. $P(G,b)$
2.B.3. $\neg P(G,a)$

The following is a refutation.

2.B.4. $\neg P(d(G),b)$ -- take negation of G in 2.B.2 using $P(x,y) \leftrightarrow \neg P(d(x),y)$
2.B.5. $P(d(G),a)$ -- same as 2.B.4 using 2.B.3 instead of 2.B.2
2.B.6. $P(d(G),b)$ -- from 2.B.1 and 2.B.5
2.B.7. empty clause -- from 2.B.4 and 2.B.6

C. $\forall x\,\forall y\,\forall z\,(x = y \land y = z \to x = z)$

The negation is $\exists x\,\exists y\,\exists z\,(\forall F\,((Fx) \to (Fy)) \land \forall F\,((Fy) \to (Fz)) \land \exists F\,((Fx) \land \neg (Fz)))$. The Skolem normal form of the 2-sorted translation is

$\forall f\,\forall g\,((\neg P(f,a) \lor P(f,b)) \land (\neg P(g,b) \lor P(g,c)) \land P(F,a) \land \neg P(F,c))$

where a, b, c, and F are constant symbols. The matrix clauses are

2.C.1. $\neg P(f,a) \lor P(f,b)$
2.C.2. $\neg P(g,b) \lor P(g,c)$
2.C.3. $P(F,a)$
2.C.4. $\neg P(F,c)$

The refutation is

2.C.5. $P(F,b)$ -- from 2.C.1 and 2.C.3
2.C.6. $P(F,c)$ -- from 2.C.2 and 2.C.5
2.C.7. empty clause -- from 2.C.4 and 2.C.6

Note that in A and C it was unnecessary to apply the naming rule. Resolution alone was enough to derive a contradiction.

Example 3

Show that the factor group of a homomorphism is isomorphic to the image group. This theorem would probably be handled as several lemmas. One such lemma states that there is a homomorphism of cosets of the kernel onto the image of the original homomorphism. This lemma will now be proved.
It is stated directly in a 4-sorted system, rather than stated first in the $\lambda$-calculus and then translated. There are four types. Type 1 is the domain group, type 2 the image group, type 3 a set of maps from type 1 to type 2, and type 4 a set of subsets of type 1 objects. The predicate symbols are P of type (1,1,1), Q of type (2,2,2), R of type (1,3,2), and S of type (1,4). The function symbols are f of type (2,1), $i_1$ of type (1,1), $i_2$ of type (2,2), and g, h, and k, all of type (3,1). The clauses needed for the lemma and their meanings are now given.

$3.1_P$. $\neg P(x_1,x_2,x_3) \lor \neg P(x_2,x_4,x_5) \lor \neg P(x_1,x_5,x_6) \lor P(x_3,x_4,x_6)$
$3.2_P$. $\neg P(x_1,x_2,x_3) \lor \neg P(x_2,x_4,x_5) \lor \neg P(x_3,x_4,x_6) \lor P(x_1,x_5,x_6)$

$P(x,y,z)$ means $x \cdot y = z$. The above two formulas express the fact that multiplication of type 1 objects is associative.

$3.3_P$. $P(e_1,x,x)$
$3.4_P$. $P(x,e_1,x)$ -- $3.3_P$ and $3.4_P$: $e_1$ is the identity
$3.5_P$. $P(x,i_1(x),e_1)$ -- $i_1(x)$ is the inverse of x

There is a similar set of axioms $3.1_Q$ - $3.5_Q$ using Q, $e_2$, and $i_2$ stating that type 2 objects form a group.

3.6. $\neg P(x,y,z) \lor \neg R(x,H,u) \lor \neg R(y,H,v) \lor \neg R(z,H,w) \lor Q(u,v,w)$ -- $R(x,H,u)$ means $H(x) = u$. This formula states that H is a homomorphism, i.e. $H(x \cdot y) = H(x) \cdot H(y)$
3.7. $R(f(u),H,u)$ -- f(u) is an inverse of u under H
3.8. $\neg S(x,N) \lor R(x,H,e_2)$
3.9. $\neg R(x,H,e_2) \lor S(x,N)$ -- 3.8 and 3.9: N is the kernel of H
3.10. $S(h(j),N)$ -- given a map j, h(j) is an element of the kernel of H
3.11. $P(g(j),h(j),k(j))$
3.12. $R(g(j),j,a)$
3.13. $R(k(j),j,b)$
3.14. $\neg Q(a,e_2,b)$

Formulas 3.11-3.14 express the fact that given any map j, g(j) and k(j) are in the same coset of N, but their images are not equal. The refutation follows.

3.15. $R(h(j),H,e_2)$ -- from 3.8 and 3.10
3.16. $\neg R(g(j),H,u) \lor \neg R(h(j),H,v) \lor \neg R(k(j),H,w) \lor Q(u,v,w)$ -- from 3.6 and 3.11
3.17. $\neg R(h(H),H,v) \lor \neg R(k(H),H,w) \lor Q(a,v,w)$ -- from 3.12 and 3.16
3.18. $\neg R(k(H),H,w) \lor Q(a,e_2,w)$ -- from 3.15 and 3.17
3.19. $Q(a,e_2,b)$ -- from 3.13 and 3.18
3.20. empty clause -- from 3.14 and 3.19

VITA

Lawrence Joseph Henschen was born on October 11, 19^, in Joliet, Illinois. He received the Bachelor of Arts degree from the University of Illinois, Urbana, in 1966. He pursued graduate study at the University of Illinois from 1966 to 1971 and was granted the Master of Arts degree in 1968. While an undergraduate, he was employed as a grading assistant in the Department of Computer Science, and he held a research assistantship in that department during his graduate studies.
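The refutation of Example 2.C in the Appendix uses resolution alone, so its three steps can be replayed mechanically. The following Python sketch is offered only as an illustration under stated assumptions: the term encoding (variables written with a leading `?`, atoms as tuples), the helper names `unify`, `resolvents`, and `replay_example_2c`, and the restriction to binary resolution without factoring are all choices made for this example and do not come from the thesis. The naming rule is not modeled.

```python
def is_var(t):
    """Assumed encoding: variables are strings starting with '?'."""
    return isinstance(t, str) and t.startswith("?")

def walk(t, s):
    """Follow variable bindings in substitution s."""
    while is_var(t) and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    """Occurs check: does variable v appear in term t under s?"""
    t = walk(t, s)
    return t == v or (isinstance(t, tuple) and any(occurs(v, a, s) for a in t[1:]))

def unify(a, b, s):
    """Return a substitution extending s that unifies a and b, or None."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return None if occurs(a, b, s) else {**s, a: b}
    if is_var(b):
        return unify(b, a, s)
    if isinstance(a, tuple) and isinstance(b, tuple) and a[0] == b[0] and len(a) == len(b):
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def subst(t, s):
    """Apply substitution s to term t."""
    t = walk(t, s)
    if isinstance(t, tuple):
        return (t[0],) + tuple(subst(a, s) for a in t[1:])
    return t

def rename(clause, tag):
    """Rename variables apart by appending a tag to each variable name."""
    def r(t):
        if is_var(t):
            return f"{t}_{tag}"
        if isinstance(t, tuple):
            return (t[0],) + tuple(r(a) for a in t[1:])
        return t
    return frozenset((sign, r(atom)) for sign, atom in clause)

def resolvents(c1, c2):
    """All binary resolvents of two clauses (no factoring)."""
    c1, c2 = rename(c1, "1"), rename(c2, "2")
    out = set()
    for s1, a1 in c1:
        for s2, a2 in c2:
            if s1 != s2:
                s = unify(a1, a2, {})
                if s is not None:
                    rest = (c1 - {(s1, a1)}) | (c2 - {(s2, a2)})
                    out.add(frozenset((sg, subst(at, s)) for sg, at in rest))
    return out

def P(x, y, sign=True):
    """Literal P(x, y), negated when sign is False."""
    return (sign, ("P", x, y))

c1 = frozenset({P("?f", "a", False), P("?f", "b")})  # 2.C.1
c2 = frozenset({P("?g", "b", False), P("?g", "c")})  # 2.C.2
c3 = frozenset({P("F", "a")})                        # 2.C.3
c4 = frozenset({P("F", "c", False)})                 # 2.C.4
c5 = frozenset({P("F", "b")})                        # 2.C.5
c6 = frozenset({P("F", "c")})                        # 2.C.6

def replay_example_2c():
    """Check each step: the stated conclusion is a resolvent of its parents."""
    steps = [(c1, c3, c5), (c2, c5, c6), (c4, c6, frozenset())]
    return all(goal in resolvents(x, y) for x, y, goal in steps)
```

Calling `replay_example_2c()` checks that 2.C.5, 2.C.6, and the empty clause each appear among the binary resolvents of the parent clauses cited in the refutation. Checking a given derivation step by step, rather than searching for one, keeps the sketch free of any search strategy.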