On the Versatility of Open Logical Relations: Continuity, Automatic Differentiation, and a Containment Theorem
Gilles Barthe, Raphaëlle Crubillé, Ugo Dal Lago, Francesco Gavazzo
Programming Languages and Systems, 2020-04-18. DOI: 10.1007/978-3-030-44914-8_3

Logical relations are among the most powerful techniques in the theory of programming languages, and have been used extensively for proving properties of a variety of higher-order calculi. However, there are properties that cannot be immediately proved by means of logical relations, for instance program continuity and differentiability in higher-order languages extended with real-valued functions. Informally, the problem stems from the fact that these properties are naturally expressed on terms of non-ground type (or, equivalently, on open terms of base type), and there is no apparent good definition for a base case (i.e. for closed terms of ground types). To overcome this issue, we study a generalization of the concept of a logical relation, called open logical relation, and prove that it can be fruitfully applied in several contexts in which the property of interest is about expressions of first-order type. Our setting is a simply-typed λ-calculus enriched with real numbers and real-valued first-order functions from a given set, such as the set of continuous or differentiable functions. We first prove a containment theorem stating that for any collection of real-valued first-order functions including projection functions and closed under function composition, any well-typed term of first-order type denotes a function belonging to that collection. Then, we show by way of open logical relations the correctness of the core of a recently published algorithm for forward automatic differentiation. Finally, we define a refinement-based type system for local continuity in an extension of our calculus with conditionals, and prove the soundness of the type system using open logical relations.

Logical relations have been extremely successful as a way of proving equivalence between concrete programs as well as correctness of program transformations. In their "unary" version, they are also a formidable tool for proving termination of typable programs, through the so-called reducibility technique. The class of programming languages in which these techniques have been instantiated includes not only higher-order calculi with simple types, but also calculi with recursion [3, 2, 23], various kinds of effects [14, 12, 25, 36, 10, 11, 34], and concurrency [56, 13]. Without aiming at full precision, let us see how reducibility works in the setting of a simply-typed calculus. The main idea is to define, by induction on the structure of types, the concept of a well-behaved program: in the base case one simply refers to the underlying notion of observation (e.g. being strongly normalizing), while the more interesting higher-order case is handled by stipulating that reducible terms are those which map reducible terms to reducible terms, thus exploiting the inductive nature of simple types. One can even go beyond the basic setting of simple types, and extend reducibility to, e.g., languages with recursive types [23, 2] or even untyped languages [44] by means of techniques such as step-indexing [3].
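For reference, the recipe just sketched corresponds to the standard, textbook definition of reducibility sets by induction on types (with strong normalization as the base observation; this is the classical formulation, not a definition specific to this paper):

  Red τ       = { t | t is strongly normalizing }                          (τ a base type)
  Red τ1×τ2   = { t | t.1 ∈ Red τ1 and t.2 ∈ Red τ2 }
  Red τ1→τ2   = { t | for every s ∈ Red τ1, the application t s is in Red τ2 }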
The same kind of recipe works in a relational setting, where one wants to compare programs rather than merely proving properties about them. Again, two terms are equivalent at base types if they have the same observable behaviour, while at higher types one wants that equivalent terms are those which maps equivalent arguments to equivalent results. There are cases, however, in which the property one observes, or the property in which the underlying notion of program equivalence or correctness is based, is formulated for types which are not ground (or equivalently, it is formulated for open expressions). As an example, one could be interested in proving that in a higher-order type system all first-order expressions compute numerical functions of a specific kind, for example, continuous or derivable ones. We call such properties first-order properties 5 . As we will describe in Section 3 below, logical relations do not seem to be applicable off-the-shelf to these cases. Informally, this is due to the fact that we cannot start by defining a base case for ground types and then build the relation inductively. In this paper, we show that logical relations and reducibility can deal with first-order properties in a compositional way without altering their nature. The main idea behind the resulting definition, known as open logical relations [59] , consists in parameterizing the set of related terms of a certain type (or the underlying reducibility set) on a ground environment, this way turning it into a set of pairs of open terms. As a consequence, one can define the target first-order property in a natural way. Generalizations of logical relations to open terms have been used by several authors, and in several (oftentimes unrelated) contexts (see, for instance, [15, 39, 47, 30, 53] ). In this paper, we show how open logical relations constitute a powerful technique to systematically prove first-order properties of programs. In this respect, the paper's technical contributions are applications of open logical relations to three distinct problems. • In Section 4, we use open logical relations to prove a general Containment Theorem. Such a theorem serves as a vehicle to introduce open logical relations but is also of independent interest. The theorem states that given a collection F of real-valued functions including projections and closed under function composition, any first-order term of a simply-typed λ-calculus endowed with primitives for real numbers and operators computing functions in F, computes itself a function in F. As an instance of such a result, we see that any first-order term in a simply-typed λ-calculus extended with primitives for continuous functions, computes a continuous function. Although the Containment Theorem can be derived from previous results by Lafont [41] (see Section 7), our proof is purely syntactical and consists of a straightforward application of open logical relations. • In Section 5, we use open logical relations to prove correctness of a core algorithm for forward automatic differentiation of simply-typed terms. The algorithm is a fragment of the one presented in [50] . More specifically, any first-order term is proved to be mapped to another first-order term computing its derivative, in the usual sense of mathematical analysis. This goes beyond the Containment Theorem by dealing with relational properties. • In Section 6, we consider an extended language with an if-then-else construction. 
When dealing with continuity, the introduction of conditionals invalidates the Containment Theorem, since conditionals naturally introduce discontinuities. To overcome this deficiency, we introduce a refinement type system ensuring that first-order typable terms are continuous functions on some intended domain, and use open logical relations to prove the soundness of the type system. Due to space constraints, many details have to be omitted, but can be found in an Extended Version of this work [7] . In order to facilitate the communication of the main ideas behind open logical relations and their applications, this paper deals with several vehicle calculi. All such calculi can be seen as derived from a unique calculus, denoted by Λ ×,→,R , which thus provides the common ground for our inquiry. The calculus Λ ×,→,R is obtained by adding to the simply typed λ-calculus with product and arrow types (which we denote by Λ ×,→ ) a ground type R for real numbers and constants r of type R, for each real number r. Given a collection F of real-valued functions, i.e. functions f : R n → R (with n ≥ 1), we endow Λ ×,→,R with an operator f , for any f ∈ F, whose intended meaning is that whenever t 1 , . . . , t n compute real numbers r 1 , . . . , r n , then f (t 1 , . . . , t n ) compute f (r 1 , . . . , r n ). We call the resulting calculus Λ ×,→,R F . Depending on the application we are interested in, we will take as F specific collections of real-valued functions, such as continuous or differentiable functions. The syntax and static semantics of Λ ×,→,R F are defined in Figure 1 , where f : R n → R belongs to F. The static semantics of Λ ×,→,R F is based on judgments of the form Γ ⊢ t : τ , which have the usual intended meaning. We adopt standard syntactic conventions as in [6] , notably the so-called variable convention. In particular, we denote by F V (t) the collection of free variables of t and by s[t/x] the capture-avoiding substitution of the expression t for all free occurrences of x in s. We do not confine ourselves with a fixed operational semantics (e.g. with a callby-value operational semantics), but take advantage of the simply-typed nature of Λ ×,→,R F and opt for a set-theoretic denotational semantics. The category of sets and functions being cartesian closed, the denotational semantics of Λ ×,→,R F is standard and associates to any judgment x 1 : τ 1 , . . . , x n : τ n ⊢ t : τ , a function x 1 : τ 1 , . . . , x n : τ n ⊢ t : τ : i τ i → τ , where τ -the semantics of τ -is thus defined: Due to space constraints, we omit the definition of Γ ⊢ t : τ and refer the reader to any textbook on the subject (such as [43] ). In this section, we will look informally at a problem which, apparently, cannot be solved using vanilla reducibility or logical relations. This serves both as a motivating example and as a justification of some of the design choices we had to do when designing open logical relations. Consider the simply-typed λ-calculus Λ ×,→ , the prototypical example of a well-behaved higher-order functional programming language. As is well known, Λ ×,→ is strongly normalizing and the technique of logical relations can be applied on-the-nose. The proof of strong normalization for Λ ×,→ is structured around the definition of a family of reducibility sets of closed terms {Red τ } τ , indexed by types. At any atomic type τ , Red τ is defined as the set of terms (of type τ ) having the property of interest, i.e. as the collection of strongly normalizing terms. 
The set Red τ1→τ2 , instead, contains those terms which, when applied to a term in Red τ1 , returns a term in Red τ2 . Reducibility sets are afterwards generalised to open terms, and finally all typable terms are shown to be reducible. Let us now consider the calculus Λ ×,→,R F , where F contains the addition and multiplication functions only. This language has already been considered in the literature, under the name of higher-order polynomials [22, 40] , which are crucial tools in higher-order complexity theory and resource analysis. Now, let us ask ourselves the following question: can we say anything about the nature of those functions R n → R which are denoted by (closed) terms of type R n → R? Of course, all the polynomials on the real field can be represented, but can we go beyond, thanks to higher-order constructions? The answer is negative: terms of type R n → R represent all and only the polynomials [5, 17] . This result is an instance of the general containment theorem mentioned at the end of Section 1. Let us now focus on proofs of this containment result. It turns out that proofs from the literature are not compositional, and rely on"heavyweight" tools, including strong normalization of Λ ×,→ and soundness of the underlying operational semantics. In fact, proving the result using usual reducibility arguments would not be immediate, precisely because there is no obvious choice for the base case. If, for example, we define Red R as the set of terms strongly normalizing to a numeral, Red R n →R as the set of polynomials, and for any other type as usual, we soon get into troubles: indeed, we would like the two sets of functions Red R→(R→R) ; to denote essentially the same set of functions, modulo the adjoint between R 2 → R and R → (R → R). But this is clearly not the case: just consider the function f in R → (R → R) thus defined: Clearly, f turns any fixed real number to a polynomial, but when curried, it is far from being a polynomial. In other words, reducibility seems apparently inadequate to capture situations like the one above, in which the "base case" is not the one of ground types, but rather the one of first-order types. Before proceeding any further, it is useful to fix the boundaries of our investigation. We are interested in proving that (the semantics of) programs of first-order type R n → R enjoy first-order properties, such as continuity or differentiability, under their standard interpretation in calculus and real analysis. More specifically, our results do not cover notions of continuity and differentiability studied in fields such as (exact) real-number computation [57] or computable analysis [58] , which have a strong domain-theoretical flavor, and higher-order generalizations of continuity and differentiability (see, e.g., [26, 27, 32, 29] ). We leave for future work the study of open logical relations in these settings. What this paper aims to provide, is a family of lightweight techniques that can be used to show that practical properties of interest of real-valued functions are guaranteed to hold when programs are written taking advantage of higher-order constructors. We believe that the three case studies we present in this paper are both a way to point to the practical scenarios we have in mind and of witnessing the versatility of our methodology. In this section we introduce open logical relations in their unary version (i.e. open logical predicates). We do so by proving the following Containment Theorem. Theorem 1 (Containment Theorem). 
Let F be a collection of real-valued functions including projections and closed under function composition. Then, As already remarked in previous sections, notable instances of Theorem 1 are obtained by taking F as the collection of continuous functions, or as the collection of polynomials. Our strategy to prove Theorem 1 consists in defining a logical predicate, denoted by F , ensuring the denotation of programs of a first-order type to be in F, and hereditary preserving this property at higher-order types. However, F being a property of real-valued functions-and the denotation of an open term of the form x 1 : R, . . . , x n : R ⊢ t : R being such a function-we shall work with open terms with free variables of type R and parametrize the candidate logical predicate by types and environments Θ containing such variables. This way, we obtain a family of logical predicates F Θ τ acting on terms of the form Θ ⊢ t : τ . As a consequence, when considering the ground type R and an environment Θ = x 1 : R, . . . , x n : R, we obtain a predicate F Θ R on expressions Θ ⊢ t : R which naturally corresponds to functions from R n to R, for which belonging to F is indeed meaningful. Definition 1 (Open Logical Predicate). Let Θ = x 1 : R, . . . , x n : R be a fixed environment. We define the type-indexed family of predicates F Θ τ by induction on τ as follows: We extend F Θ τ to the predicate F Γ ,Θ τ , where Γ ranges over arbitrary environments (possibly containing variables of type R) as follows: Here, γ ranges over substitutions 6 and γ ∈ F Γ Θ holds if the support of γ is Γ and γ(x) ∈ F Θ τ , for any (x : τ ) ∈ Γ . Notice that Definition 1 ensures first-order real-valued functions to be in F, and asks for such a property to be hereditary preserved at higher-order types. Lemma 1 states that these conditions are indeed sufficient to guarantee any Λ ×,→,R F term Θ ⊢ t : R to denote a function in F. For all environments Γ , Θ as above, and for any expression Γ , Θ ⊢ t : τ , we have t ∈ F Γ ,Θ τ . Proof. By induction on t, observing that F Θ τ is closed under denotational semantics: if s ∈ F Θ τ and Θ ⊢ t : τ = Θ ⊢ s : τ , then t ∈ F Θ τ . The proof follows the same structure of Lemma 3, and thus we omit details here. Finally, a straightforward application of Lemma 1 gives the desired result, namely Theorem 1. In this section, we show how we can use open logical relations to prove the correctness of (a fragment of) the automatic differentiation algorithm of [50] (suitably adapted to our calculus). Automatic differentiation [8, 9, 35] (AD, for short) is a family of techniques to efficiently compute the numerical (as opposed to symbolical ) derivative of a computer program denoting a real-valued function. Roughly speaking, AD acts on the code of a program by letting variables incorporate values for their derivative, and operators propagate derivatives according to the chain rule of differential calculus [52] . Due to its vast applications in machine learning (backpropagation [49] being an example of an AD technique) and, most notably, in deep learning [9] , AD is rapidly becoming a topic of interest in the programming language theory community, as witnessed by the new line of research called differentiable programming (see, e.g., [28, 50, 16, 1] for some recent results on AD and programming language theory developed in the latter field). AD comes several modes, the two most important ones being the forward mode (also called tangent mode) and the backward mode (also called reverse mode). 
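Before turning to the formal transformation, it may help to recall the dual-number intuition behind the forward mode with a small, self-contained Python sketch (ours, for illustration only: the names Dual, d_sin, d_cos are our own, and this is not the source-to-source algorithm of [50], which operates on λ-terms rather than on run-time values):

  import math

  class Dual:
      # a pair (value, derivative), propagated by the chain rule
      def __init__(self, val, der):
          self.val, self.der = val, der
      def __add__(self, other):
          return Dual(self.val + other.val, self.der + other.der)
      def __mul__(self, other):
          # product rule: (u * v)' = u' * v + u * v'
          return Dual(self.val * other.val,
                      self.der * other.val + self.val * other.der)

  def d_sin(d):
      # chain rule for a unary primitive: sin(u)' = cos(u) * u'
      return Dual(math.sin(d.val), math.cos(d.val) * d.der)

  def d_cos(d):
      return Dual(math.cos(d.val), -math.sin(d.val) * d.der)

  x = Dual(2.0, 1.0)            # the variable of interest, seeded with derivative 1
  y = d_sin(x) + d_cos(x)       # y.val = sin(2) + cos(2), y.der = cos(2) - sin(2)

Constants enter as pairs (r, 0), the distinguished variable is seeded with derivative 1, and every primitive propagates derivatives by the chain rule; the same bookkeeping underlies both the forward and the backward mode.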
These can be seen as different ways to compute the chain rule, the former by traversing the chain rule from inside to outside, while the latter from outside to inside. Here we are concerned with forward mode AD. More specifically, we consider the forward mode AD algorithm recently proposed in [50] . The latter is based on a source-to-source program transformation extracting out of a program t a new program Dt whose evaluation simultaneously gives the result of computing t and its derivative. This is achieved by augmenting the code of t in such a way to handle dual numbers 7 . The transformation roughly goes as follows: expressions s of type R are transformed into dual numbers, i.e. expressions s ′ of type R×R, where the first component of s ′ gives the original value of s, and the second component of s ′ gives the derivative of s. Real-valued function symbols are then extended to handle dual numbers by applying the chain rule, while other constructors of the language are extended pointwise. The algorithm of [50] has been studied by means of benchmarks and, to the best of the authors' knowledge, the only proof of its correctness available in the literature 8 has been given at the time of writing by Huot et al. in [37] . However, the latter proof relies on denotational semantics, and no operational proof of correctness has been given so far. Differentiability being a first-order concept, open logical relations are thus a perfect candidate for such a job. In the rest of this section, given a differentiable function f : R n → R, we denote by ∂ x f : R n → R its partial derivative with respect to the variable x. Let D be the collection of (real-valued) differentiable functions, and let us fix a collection F of real-valued functions such that, for any f ∈ D, both f and ∂ x f belong to F. We also assume F to contain functions for real number arithmetic. Notice that since ∂ x f is not necessarily differentiable, in general ∂ x f ∈ D. We begin by recalling how the program transformation of [50] works on Λ ×,→,R D , the extension of Λ ×,→,R with operators for functions in D. In order to define the derivative of a Λ ×,→,R D expression, we first define an intermediate The action of D on types, environments, and expressions is defined in Figure 2 . Let us comment the definition of D, beginning with its action on types. Following the rationale behind forward-mode AD, the map D associates to the type R the product type R × R, the first and second components of its inhabitants being the original expression and its derivative, respectively. The action of D on non-basic types is straightforward and it is designed so that the automatic differentiation machinery can handle higher-order expressions in such a way to guarantee correctness at real-valued function types. The action of D on the usual constructors of the λ-calculus is pointwise, although it is worth noticing that D associates to any variable x of type τ a new variable, which we denote by dx, of type Dτ . As we are going to see, if τ = R, then dx acts as a placeholder for a dual number. More interesting is the action of D on real-valued constructors. To any numeral r, D associates the pair Dr = (r, 0), the derivative of a number being zero. Let us now inspect the action of D on an operator f associated to f : R n → R (we treat f as a function in the variables x 1 , . . . , x n ). 
The interesting part is the where n i=1 and * denote the operators (of Λ ×,→,R F ) associated to summation and (binary) multiplication (for readability we omit the underline notation), and ∂ xi f is the operator (of Λ ×,→,R F ) associated to partial derivative ∂ xi f of f in the variable x i . It is not hard to recognize that the above expression is nothing but an instance of the chain rule. Finally, we notice that if Example 1. Let us consider the binary function f (x 1 , x 2 ) = sin(x 1 ) + cos(x 2 ). For readability, we overload the notation writing f in place of f (and similarly for ∂ xi f ). Given expressions t 1 , t 2 , we compute D(sin(t 1 ) + cos(t 2 )). Recall that ∂ x1 f (x 1 , x 2 ) = cos(x 1 ) and ∂ x2 f (x 1 , x 2 ) = − sin(x 2 ). We have: As a consequence, we see that D(λx.λy. sin(x) + cos(y)) is We now aim to define the derivative of an expression x 1 : R, . . . , x n : R ⊢ t : R with respect to a variable x (of type R). In order to do so we first associate to any variable y : R its dual expression dual x (y) : R × R defined as: Next, we define for Let us clarify this passage with a simple example. We first of all compute Dt, obtaining: Observing that dual x (x) = (x, 1) and dual x (y) = (y, 0), we indeed obtain the desired derivative as x : R, y : R ⊢ Dt[dual x (x)/dx, dual x (y)/dy].2 : R. For we have: Dτ , for any variable y and Θ ⊢ s : τ . Open Logical relations for AD We have claimed that the operation deriv performs automatic differentiation of Λ ×,→,R D expressions. By that we mean that once applied to expressions of the form x 1 : R, . . . , x n : R ⊢ t : R, the operation deriv can be used to compute the derivative of x 1 : R, . . . , x n : R ⊢ t : R . We now show how we can prove such a statement using open logical relations, this way providing a proof of correctness of our AD program transformation. We begin by defining a logical relations R between Λ ×,→,R D and Λ ×,→,R F expressions. We design R in such a way that (i) tRDt and (ii) if tRs and t inhabits a first-order type, then indeed s corresponds to the derivative of t. While (ii) essentially holds by definition, (i) requires some efforts in order to be proved. Definition 2 (Open Logical Relation). Let Θ = x 1 : R, . . . , x n : R be a fixed, arbitrary environment. Define the family of relations (R Θ τ ) Θ,τ between Λ ×,→,R D and Λ ×,→,R F expressions by induction on τ as follows: where Γ ranges over arbitrary environments (possibly containing variables of type R), as follows: where γ, δ range over substitutions, and: Obviously, Definition 2 satisfies condition (ii) above. What remains to be done is to show that it satisfies condition (i) as well. In order to prove such a result, we first need to show that the logical relation respects the denotational semantics of Λ ×,→,R D . Lemma 2. Let Θ = x 1 : R, . . . , x n : R. Then, the following hold: We are now ready to state and prove the main result of this section. For all environments Γ , Θ and for any expression Γ , Proof. We prove the following statement, by induction on t: We show only the most relevant cases. Suppose t is a variable x. We distinguish whether x belongs to Γ or Θ. for any variable y (of type R). The first identity obviously holds as For the second identity we distinguish whether y = x or y = x. In the former case we have dual y (x) = (x, 1), and thus: In the latter case we have dual y (x) = (x, 0), and thus: , for all substitutions γ, δ such that γ R Γ Θ δ. Since x belongs to Γ , we are trivially done. 
Suppose t is λx.s, so that we have for some types τ 1 , τ 2 . As x is bound in λx.s, without loss of generality we can assume (x : It is easy to see that γ ′ R ∆ Θ δ ′ , so that by s R ∆,Θ τ2 Ds (recall that the latter follows by induction hypothesis) we infer sγ ′ R Θ τ2 (Ds)δ ′ , by the very definition of open logical relation. As a consequence, the thesis is proved if we show The above identities hold if x ∈ F V (γ(y)) and dx ∈ F V (δ(dy)), for any (y : τ ) ∈ Γ . This is indeed the case, since γ(y) R Θ τ δ(dy) implies Θ ⊢ γ(y) : τ and DΘ ⊢ δ(dy) : Dτ , and x ∈ Θ (and thus dx ∈ DΘ). A direct application of Lemma 3 allows us to conclude the correctness of the program transformation D. In fact, given a first-order term Θ ⊢ t : R, with Θ = x 1 : R, . . . , x n : R, by Lemma 3 we have t R Θ R Dt, and thus for any real-valued variable y, meaning that Dt indeed computes the partial derivative of t. Theorem 2. For any term Θ ⊢ t : R as above, the term DΘ ⊢ Dt : DR computes the partial derivative of t, i.e., for any variable y we have In Section 4, we exploited open logical relations to establish a containment theorem for the calculus Λ ×,→,R F , i.e. the calculus Λ ×,→,R extended with real-valued functions belonging to a set F including projections and closed under function composition. Since the collection C of (real-valued) continuous functions satisfies both constraints, Theorem 1 allows us to conclude that all first order terms of Λ ×,→,R C represent continuous functions. The aim of the present section is the development of a framework to prove continuity properties of programs in a calculus that goes beyond Λ ×,→,R C . More specifically, (i) we do not restrict our analysis to calculi having operators representing continuous real-valued functions only, but consider operators for arbitrary real-valued functions, and (ii) we add to our calculus an if-then-else construct whose static semantics is captured by the following rule: The intended dynamic semantics of the term if t then s else p is the same as the one of s whenever t evaluates to any real number r = 0 and the same as the one of p if it evaluates to 0. Notice that the crux of the problem we aim to solve is the presence of the if-then-else construct. Indeed, independently of point (i), such a construct breaks the global continuity of programs, as illustrated in Figure 3a . As a consequence we are forced to look at local continuity properties, instead: for instance we can say that the program of Figure 3a is continuous both on R <0 and R ≥0 . Observe that guaranteeing local continuity allows us (up to a certain point) to recover the ability of approximating the output of a program by approximating its input. Indeed, if a program t : R × . . . × R → R is locally continuous on a subset X of R n , then the value of ts (for some input s) can be approximated For this reason we will work with the notion of sequential continuity, instead of the usual topological notion of continuity. It must be observed, however, that these two notions coincide as soon as the continuity domain X is actually an open set. ). Let f : R n → R, and X be any subset of R n . We say that f is (sequentially) continuous on X if for every x ∈ X, and for every sequence (x n ) n∈N of elements of X such that lim n→∞ x n = x, it holds that lim n→∞ f (x n ) = f (x). In [18] , Chaudhuri et al. introduced a logical system designed to guarantee local continuity properties on programs in an imperative (first-order) programming language with conditional branches and loops. 
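Before proceeding, let us make the definition of sequential continuity concrete on a program in the spirit of Figure 3a (our own stand-in for the figure): the function f with f(x) = 0 for x < 0 and f(x) = 1 for x ≥ 0. On X = R the function is not sequentially continuous: the sequence x_n = −1/n lies in X and converges to 0 ∈ X, yet f(x_n) = 0 for every n, while f(0) = 1. On X = R_{<0} or X = R_{≥0}, instead, f is constant, hence sequentially continuous; note that this tells us nothing about the union R_{<0} ∪ R_{≥0} = R, which is precisely why continuity domains have to be tracked explicitly, as in the logical system of [18].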
In this section, we develop a similar system in the setting of a higher-order functional language with an if-then-else construct, and we use open logical relations to prove the soundness of our system. This witnesses, on yet another situation, the versatility of open logical relations. Compared to [18] , we somehow generalize from a result on programs built from only first-order constructs and primitive functions, to a containment result for programs built using also higher-order constructs. We however mention that, although our system is inspired by the work of Chaudhuri at al., there are significant differences between the two, even at the first-order level. The consequences these differences have on the expressive power of our systems are twofold: • On the one hand, while inferring continuity on some domain X of a program of the form if t then s else p, we have more flexibility than [18] for the domains of continuity of s and p. To be more concrete, let us consider the program λx.(if (x > 0) then 0 else (if x = 4 then 1 else 0)), which is continuous on R even though the second branch is continuous on R ≤0 , but not on R. We are able to show in our system that this program is indeed continuous on the whole domain R, while Chaudhuri et al. cannot do the same in their system for the corresponding imperative program: they ask the domain of continuity of each of the two branches to coincide with the domain of continuity of the whole program. • On the other hand, the system of Chaudhuri at al. allows one to express continuity along a restricted set of variables, which we cannot do. To illustrate this, let us look at the program: λx, y.if (x = 0) then (3 * y) else (4 * y): along the variable y, this program is continuous on the whole of R. Chaudhuri et al. are able to express and prove this statement in their system, while we can only say that for every real a, this program is continuous on the domain {a} × R. For the sake of simplicity, it is useful to slightly simplify our calculus; the ideas we present here, however, would still be valid in a more general setting, but that would make the presentation and proofs more involved. As usual, let F be a collection of real-valued functions. We consider the restriction of the calculus Λ ×,→,R F obtained by considering types of the form τ ::= R | ρ; ρ ::= ρ 1 × · · · × ρ n × R × · · · × R m-times → τ ; only. For the sake of readability, we employ the notation (ρ 1 . . . , ρ n , R, . . . , R) → τ in place of ρ 1 × · · · × ρ n × R × · · · × R → τ . We also overload the notation and keep indicating the resulting calculus as Λ ×,→,R . Nonetheless, the reader should keep in mind that from now on, whenever referring to a Λ ×,→,R F term, we are tacitly referring to a term typable according to the restricted type system, but that can indeed contain conditionals. Since we want to be able to talk about composition properties of locally continuous programs, we actually need to talk not only about the points where a program is continuous, but also about the image of this continuity domain. In higher-order languages, a well-established framework for the latter kind of specifications is the one of refinement types, that have been first introduced by [31] in the context of ML types: the basic idea is to annotate an existing type system with logical formulas, with the aim of being more precise about the underlying program's behaviors than in simple types. 
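As a purely illustrative example of the classical flavor of refinement types (written in a generic syntax, not in the one introduced below), a square-root primitive could be given the type {ν : R | ν ≥ 0} → {ν : R | ν ≥ 0}, constraining both the real argument it accepts and the image of its result.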
Here, we are going to adapt this framework by replacing the image annotations provided by standard refinement types with continuity annotations. Our refinement type system is developed on top of the simple types system of Section 2 (actually, on the simplification of such a system we are considering in this section). We first need to introduce a set of logical formulas which talk about n-uples of real numbers, and which we use as annotations in our refinement types. We consider a set V of logical variables, and we construct formulas as follows: ψ, φ ∈ L ::= ⊤ | (e ≤ e) | ψ ∧ φ | ¬ψ, e ∈ E ::= α | a | f (e, . . . , e) with α ∈ V, a ∈ R, f : R n → R. Recall that with the connectives in our logic, we are able to encode logical disjunction and implication, and as customary, we write φ ⇒ ψ for ¬φ ∨ ψ. A real assignment is a partial map σ : V → R. When σ has finite support, we sometimes specify σ by writing (α 1 → σ(α 1 ), . . . , α n → σ(α n )). We note σ |= φ when σ is defined on the variables occurring in φ, and moreover the real formula obtained when replacing along σ the logical variables of φ is true. We write |= φ when σ |= φ always holds, independently on σ. We can associate to every formula the subset of R n consisting of all points where this formula holds: more precisely, if φ is a formula, and X = α 1 , . . . , α n is a list of logical variables such that Vars(φ) ⊆ X, we call truth domain of φ w.r.t. X the set: (a 1 , . . . , a n ) ∈ R n | (α 1 → a 1 , . . . , α n → a n ) |= φ}. We are now ready to define the language of refinement types, which can be seen as simple types annotated by logical formulas. The type R is annotated by logical variables: this way we obtain refinement real types of the form {α ∈ R}. The crux of our refinement type system consists in the annotations we put on the arrows. We introduce two distinct refined arrow constructs, depending on the shape of the target type: more precisely we annotate the arrow of a type (T 1 , . . . , T n ) → R with two logical formulas, while we annotate (T 1 , . . . , T n ) → H (where H is an higher-order type) with only one logical formula. This way, we obtain refined arrow types of the form (T 1 , . . . , T n ) ψ → H has its real arguments used in a continuous way on the domain specified by ψ, but it is not possible anymore to specify an image domain, because H is higher-order. The general form of our refined types is thus as follows: T ::= H | F ; F ::= {α ∈ R}; , and the (α i ) 1≤i≤n are distinct. We take refinement types up to renaming of logical variables. If T is a refinement type, we write T for the simple type we obtain by forgetting about the annotations in T . Example 3. We illustrate in this example the intended meaning of our refinement types. • We first look at how to refine R → R: those are types of the form {α 1 ∈ R} φ1 φ2 → {α 2 ∈ R}. The intended inhabitants of these types are the programs t : R → R such that i) t is continuous on the truth domain of φ 1 ; and ii) t sends the truth domain of φ 1 into the truth domain of φ 2 . As an example, φ 1 could be (α 1 < 3), and φ 2 could be (α 2 ≥ 5). An example of a program having this type is t = λx. 3−a when a < 3 0 otherwise , and moreover we assume that {f , +} ⊆ F. • We look now at the possible refinements of R → (R → R): those are of the form . The intended inhabitants of these types are the programs t : R → (R → R) whose interpretation function (x, y) ∈ R 2 → t (x)(y) sends continously Dom(θ 1 ) α1 × Dom(θ 2 ) α2 into Dom(θ 3 ) α3 . 
As an example, consider θ 1 = (α 1 < 1), θ 2 = (α 2 ≤ 3), and θ 3 = (α 3 > 0). An example of a program having this type is λx 1 .λx 2 .f (x 1 * x 2 ) where we take f as above. A refined typing context Γ is a list x 1 : T 1 , . . . , x n : T n , where each T i is a refinement type. In order to express continuity constraints, we need to annotate typing judgments by logical formulas, in a similar way as what we do for arrow types. More precisely, we consider two kinds of refined typing judgments: one for terms of ground type, and one for terms of higher-order type: We first consider refinement typing rules for the fragment of our language which excludes conditionals: they are given in Figure 4 . We illustrate them by way of a series of examples. Example 4. We first look at the typing rule var-F: if θ implies θ ′ , then the variable x-that, in semantics terms, does the projection of the context Γ to one of its component-sends continuously the truth domain of θ into the truth domain of θ ′ . Using this rule we can, for instance, derive the following judgment: Example 5. We now look at the Rf rule, that deals with functions from F. Using this rule, we can show that: Before giving the refined typing rule for the if-then-else construct, we also illustrate on an example how the rules in Figure 4 allow us to exploit the continuity informations we have on functions in F, compositionally. ⊢r λ(x1, . . . , xn).t : (T1, . . . , Tn) ⊢r t(s1, . . . , sm, p1, . . . , pm) : T The formula ψ(η) should be read as ψ when T is a higher-order type, and as ψ η when T is a ground type. Example 6. Let f : R → R be the function defined as: Observe that we can actually regard f as represented by the program in Figure 3a -but we consider it as a primitive function in F for the time being, since we have not introduced the typing rule for the if-then-else construct, yet. Consider the program: t = λ(x, y).f (min(x, y)). We see that t : R 2 → R is continuous on the set {(x, y) | x ≥ 0 ∧ y ≥ 0}, and that, moreover, the image of f on this set is contained on [1, +∞). Using the rules in Figure 4 , the fact that f is continuous on R ≥0 , and that min is continuous on R 2 , we see that our refined type system allows us to prove t to be continuous in the considered domain, i.e.: We now look at the rule for the if-then-else construct: as can be seen in the two programs in Figure 3 , the use of conditionals may or may not induce discontinuity points. The crux here is the behaviour of the two branches at the discontinuity points of the guard function. In the two programs represented in Figure 3 , we see that the only discontinuity point of the guard is in x = 0. However, in Figure 3b the two branches return the same value in 0, and the resulting program is thus continuous at x = 0, while in Figure 3a the two branches do not coincide in 0, and the resulting program is discontinuous at x = 0. We can generalize this observation: for the program if t then s else p to be continuous, we need the branches s and p to be continuous respectively on the domain where t is 1, and on the domain where t is 0, and moreover we need s and p to be continuous and to coincide on the points where t is not continuous. Similarly to the logical system designed by Chaudhuri et al [18] , the coincidence of the branches in the discontinuity points is expressed as a set of logical rules by way of observational equivalence. 
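The following small Python sketch (with our own stand-ins p_a and p_b for the two programs of Figure 3, not the paper's exact terms) makes the dichotomy tangible:

  def p_a(x):
      # branches 0 and 1 do not agree where the test changes value (x = 0)
      return 0.0 if x < 0 else 1.0

  def p_b(x):
      # branches -x and x agree at x = 0: p_b computes |x|
      return -x if x < 0 else x

  eps = 1e-9
  print(p_a(-eps), p_a(0.0))   # 0.0 versus 1.0: a jump at the critical point
  print(p_b(-eps), p_b(0.0))   # both (approximately) 0.0: no jump

Requiring the two branches to be observationally equivalent at the guard's discontinuity points is exactly what rules out programs like p_a while accepting programs like p_b.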
It should be observed that such an equivalence check is less problematic for first-order programs than it is for higher-order one (the authors of [18] are able to actually check observational equivalence through an SMT solver). On the other hand, various notions of equivalence which are included in contextual equivalence and sometimes coincide with it (e.g., applicative bisimilarity, denotational semantics, or logical relations themselves) have been developed for higher-order languages, and this starts to give rise to actual automatic tools for deciding contextual equivalence [38] . We give in Figure 5 the typing rule for conditionals. The conclusion of the rule guarantees the continuity of the program if t then s else p on a domain specified by a formula θ. The premises of the rule ask for formulas θ q for q ∈ {t, s, p} that specify continuity domains for the programs t, s, p, and ask also for two additional formulas θ (t,0) and θ (t,1) that specify domains where the value of the guard t is 0 and 1, respectively. The target formula θ, and the formulas (θ q ) q∈{t,s,p,(t,1),(t,0)} are related by two side-conditions. Side-condition (1) consists of the following four distinct requirements, that must hold for every point a in the truth domain of θ: i) a is in the truth domain of at least one of the two formulas θ t , θ s ; ii) if a is not in θ (t,1) (i.e., we have no guarantee that t will return 1 at point a, meaning that the program p may be executed) then a must be in the continuity domain of p; iii) a condition symmetric to the previous one, replacing 1 by 0, and p by s; iv) all points of possible discontinuity (i.e. the points a such that θ t does not hold) must be in the continuity domain of both s and p, and as a consequence both θ s and θ p must hold there. The side-condition (2) uses typed contextual equivalence ≡ ctx between terms to express that the two programs s and p must coincide on all inputs such that θ t does not hold-i.e. that are not in the continuity domain of t. Observe that typed context equivalence here is defined with respect to the system of simple types. Notation 1. We use the following notations in Figure 5 . When Γ is a typing environement, we write GΓ and HΓ for the ground and higher-order parts of Γ , respectively. Moreover, suppose we have a ground refined typing environment Θ = x 1 : {α 1 ∈ R}, . . . , x n : {α n ∈ R}: we say that a logical assignment σ is compatible with Θ when {α i | 1 ≤ i ≤ n} ⊆ supp(σ). When it is the case, we build in a natural way the substitution associated to σ along Θ by taking ⊢r if t then s else p : T Again, the formula ψ(η) should be read as ψ when T is a higher-order type, and as ψ η when T is a ground type. The side-conditions (1), (2) are given as: 2. For all logical assignment σ compatible with GΓ , σ |= θ ∧ ¬θt implies HΓ ⊢ sσ GΓ ≡ ctx pσ GΓ . Our goal in this section is to show the correctness of our refinement type systems, that we state below. Theorem 3. Let t be any program such that: Then it holds that: • t (Dom(θ) α1,...,αn ) ⊆ Dom(θ ′ ) β ; • t is sequentially continuous on Dom(θ) α1,...,αn . As a first step, we show that our if-then-else rule is reasonable, i.e. that it behaves well with primitive functions in F. More precisely, if we suppose that the functions f , g 0 , g 1 are such that the premises of the if-then-else rule hold, then the program if f (x 1 , . . . , x n ) then g 1 (x 1 , . . . , x n ) else g 0 (x 1 , . . . , x n ) is indeed continuous in the domain specified by the conclusion of the rule. 
This is precisely what we prove in the following lemma. Lemma 4. Let f , g 0 , g 1 : R n → R be functions in F, and Θ = x 1 : {α 1 ∈ R}, . . . , x n : {α n ∈ R}. We denote α the list of logical variables α 1 , . . . , α n . We consider logical formulas θ and θ f , θ (f ,0) , θ (f ,1) , φ g0 , φ g1 that have their logical variables in α, and such that: {b} for b ∈ {0, 1}. 2. g 0 and g 1 are continuous on Dom(φ g0 ) α , and Dom(φ g1 ) α respectively, and (α 1 → a 1 , . . . , α n → a n ) |= θ ∧ ¬θ f implies g 0 (a 1 , . . . , a n ) = g 1 (a 1 , . . . , a n ); Then it holds that: Proof. The proof can be found in the extended version [7] . Similarly to what we did in Section 4, we are going to show Theorem 3 by way of a logical predicate. Recall that the logical predicate we defined in Section 4 consists actually of three kind of predicates-all defined in Definition 1 of Section 4: where Θ ranges over ground typing environments, Γ ranges over arbitrary environments, and τ is a type. The first predicate F Θ τ contains admissible terms t of type Θ ⊢ t : τ , the second predicate F Θ Γ contains admissible substitutions γ that associate to every (x : τ ) in Γ a term of type τ under the typing context Θ, and the third predicate F Θ,Γ τ contains admissible terms t of type Γ , Θ ⊢ t : τ . Here, we need to adapt the three kinds of logical predicates to a refinement scenario: first, we replace τ and Θ, Γ with refinement types and refined typing contexts respectively. Moreover, for technical reasons, we also need to generalize our typing contexts, by allowing them to be annotated with any subset of R n instead of restricting ourselves to those subsets generated by logical formulas. Due to this further complexity, we split our definition of logical predicates into two: we first define the counterpart of the ground typing context predicate F Θ τ in Definition 4, then the counterpart of the predicate for substitutions F Θ Γ and the counterpart of the predicates F Θ,Γ τ for higher-order typing environment in Definition 5. Let us first see how we can adapt the predicates F Θ τ to our refinement types setting. Recall that in Section 4, we defined the predicate F Θ R as the collection of terms t such that Θ ⊢ t : R, and its semantics Θ ⊢ t : R belongs to F. As we are interested in local continuity properties, we need to build a predicate expressing local continuity constraints. Moreover, in order to be consistent with our two arrow constructs and our two kinds of typing judgments, we actually need to consider also two kinds of logical predicates, depending on whether the target type we consider is a real type or an higher-order type. We thus introduce the following logical predicates: where Θ is a ground typing environment, X is a subset of R n , φ is a logical formula, and, as usual, F ranges over the real refinements types, while H ranges over the higher-order refinement types. As expected, X and φ are needed to encode continuity constraints inside our logical predicates. Definition 4. Let Θ be a ground typing context of length n, F and H refined ground type and higher-order type, respectively. We define families of predicates on terms C(Θ, Y φ, F ) and C(Θ, Y , H), with Y ⊆ R n and φ a logical formula, as specified in Figure 6 . • For F = {α ∈ R} we take: where as usual we should read η when T is an annnotated real type. • We look now at an example when the target type T is higher-order. We take {β 2 ∈ R}, and we look at the logical predicate C (Θ, B • , H) . 
We are going to show that the latter contains, for instance, the program: Looking at Figure 6 , we see that it is enough to check that for any Y ⊆ R 2 and any s ∈ C(Θ, Y (β 1 ≥ 0), {β 1 ∈ R}), it holds that: Our overall goal-in order to prove Theorem 3-is to show the counterpart of the Fundamental Lemma from Section 4 (i.e. Lemma 1), which states that the logical predicate F Θ R contains all well-typed terms. This lemma only talks about the logical predicates for ground typing contexts, so we can state it as of now, but its proof is based on the fact that we dispose of the three predicates. Observe that from there, Theorem 3 follows just from the definition of the logical predicates on base types. Similarly to what we did for Lemma 1 in Section 4, proving it requires to define the logical predicates for substitutions and higherorder typing contexts. We do this in Definition 5 below. As before, they consist in an adaptation to our refinement types framework of the open logical predicates F Γ Θ and F Θ,Γ τ of Section 4: as usual, we need to add continuity annotations, and distinguish whether the target type is a ground type or an higher-order type. Notation 2. We need to first introduce the following notation: let Γ , Θ be two ground non-refined typing environments of length m and n respectively-and with disjoint support. Let γ : supp(Γ ) → {t | Θ ⊢ t : R} be a substitution. We write γ for the real-valued function: Definition 5. Let Θ be a ground typing environment of length n, and Γ an arbitrary typing environment. We note n and m the lengths of respectively Θ and GΓ . • Let Z ⊆ R n , W ⊆ R n+m . We define C(Θ, Z W , Γ ) as the set of those substitutions γ : supp(Γ ) → {t | Θ ⊢ t : R} such that: • ∀(x : H) ∈ HΓ , γ(x) ∈ C(Θ, Z, H), • γ | GΓ : R n → R n+m sends continuously Z into W ; • Let W ⊆ R n+m , F = {α ∈ R} an annotated real type, and ψ a logical formula with Vars(ψ) ⊆ {α}. We define: • Let W ⊆ R n+m , and H an higher-order refined type. We define : Example 9. We illustrate Definition 5 on an example. We consider the same context Θ as in Example 8, i.e. Θ = x 1 : {α 1 ∈ R}, x 2 : {α 2 ∈ R}, and we take We are interested in the following logical predicate for substitution: where the norm of the couple (a, b) is taken as: |(a, b)| = √ a 2 + b 2 . We are going to build a substitution γ : {x 3 , z} → Λ ×,→,R F that belongs to this set. We take: . We can check that the requirements of Definition 5 indeed hold for γ: Looking at our definition of the semantics of a substitution, we see that γ | GΓ (a, b) = (a, b, |(a, b)|), thus the requirements above hold. Lemma 5 (Fundamental Lemma). Let Θ be a ground typing context, and Γ an arbitrary typing context-thus Γ can contain both ground type variables and non-ground type variables. • Suppose that Γ , Θ θ η ⊢ r t : F : then t ∈ C(Γ ; Θ, Dom(θ) η, F ). • Suppose that Γ , Θ θ ⊢ r t : H: then t ∈ C(Γ ; Θ, Dom(θ), H). Proof Sketch. The proof is by induction on the derivation of the refined typing judgment. Along the lines, we need to show that our logical predicates play well with the underlying denotational semantics, but also with logic. The details can be found in the extended version [7] . From there, we can finally prove the main result of this section, i.e. Theorem 3, that states the correctness of our refinement type system. Indeed, Lemma 5 has Theorem 3 as a corollary: from there it is enough to look at the definition of the logical predicate for first-order programs to finally show the correctness of our type system. 
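To close the section, here is a small Python sanity check of what the soundness theorem gives us operationally (a sketch under our own assumptions: prog is a stand-in for a first-order conditional term which is continuous on the whole real line, since both branches are continuous and agree at the guard's critical point):

  def prog(x):
      # if x < 0 then 0 else x * x  -- the branches agree at x = 0
      return 0.0 if x < 0 else x * x

  # approximating the input approximates the output on a continuity domain:
  inputs = [1.0 + 10 ** (-k) for k in range(1, 8)]   # inputs converging to 1.0
  print([prog(a) for a in inputs])                   # outputs converging to prog(1.0) = 1.0

No such guarantee is available at points lying outside the continuity domain stated in the refined type, which is why the judgments of this section track that domain explicitly.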
Logical relations are certainly one of the most well-studied concepts in higher-order programming language theory. In their unary version, they have been introduced by Tait [54], and further exploited by Girard [33] and Tait [55] himself in giving strong normalization proofs for second-order type systems. The relational counterpart of realizability, namely logical relations proper, has been introduced by Plotkin [48], and further developed along many different axes, in particular towards calculi with fixpoint constructs or recursive types [3, 4, 2], probabilistic choice [14], or monadic and algebraic effects [34, 11]. Without any hope of being exhaustive, we refer to Mitchell's textbook on programming language theory for a comprehensive account of the earlier, classic definitions [43], or to the aforementioned papers for more recent developments. Extensions of logical relations to open terms have been introduced by several authors [39, 47, 30, 53, 15] and were explicitly referred to as open logical relations in [59]. However, to the best of the authors' knowledge, all the aforementioned works use open logical relations for specific purposes, and do not investigate their applicability as a general methodology. Special cases of our Containment Theorem can be found in many papers, typically as auxiliary results. As already mentioned, an example is the one of higher-order polynomials, whose first-order terms are proved to compute proper polynomials in many ways [40, 5], none of them in the style of logical relations. The Containment Theorem itself can be derived from a previous result by Lafont [41] (see also Theorem 4.10.7 in [24]). Contrary to that result, however, our proof of the Containment Theorem is entirely syntactic and consists of a straightforward application of open logical relations. Algorithms for automatic differentiation have recently been extended to higher-order programming languages [50, 46, 51, 42, 45], and have been investigated from a semantic perspective in [16, 1], relying on insights from linear logic and denotational semantics. In particular, the work of Huot et al. [37] provides a denotational proof of correctness of the program transformation of [50] that we have studied in Section 5. Continuity and robustness analysis of imperative first-order programs by way of program logics is the topic of a series of papers by Chaudhuri and co-authors [19, 18, 20]. None of them, however, deals with higher-order programs.

We have shown how a mild variation on the concept of a logical relation can be fruitfully used for proving both predicative and relational properties of higher-order programming languages, when such properties have a first-order, rather than a ground, "flavor". As such, the added value of this contribution is not so much in the technique itself, but in showing how it is extremely useful in heterogeneous contexts, thus witnessing the versatility of logical relations. The three case studies, and in particular the correctness of automatic differentiation and refinement type-based continuity analysis, are given as proofs of concept, but this does not mean they do not deserve to be studied in more depth. An example of an interesting direction for future work is the extension of our correctness proof from Section 5 to backward propagation (reverse mode) differentiation algorithms. Another one consists in adapting the refinement type system of Section 6.1 to deal with differentiability.
That would of course require a substantial change in the typing rule for conditionals, which should take care of checking not only continuity, but also differentiability at the critical points. It would also be interesting to implement the refinement type system using standard SMT-based approaches. Finally, the authors plan to investigate extensions of open logical relations to non-normalizing calculi, as well as to non-simply typed calculi (such as calculi with polymorphic or recursive types).

References

A simple differentiable programming language. PACMPL 4(POPL)
Step-indexed syntactic logical relations for recursive and quantified types
An indexed model of recursive types for foundational proof-carrying code
A very modal model of a modern, major, general type system
Higher-order interpretations and program complexity
The lambda calculus: its syntax and semantics
On the versatility of open logical relations: continuity, automatic differentiation, and a containment theorem (long version)
Automatic differentiation of algorithms
Automatic differentiation in machine learning: a survey
Abstract effects and proof-relevant logical relations
Handle with care: relational interpretation of algebraic effects and handlers
A Kripke logical relation for effect-based program transformations
A concurrent logical relation
Step-indexed logical relations for probability
Noninterference for free
Backpropagation in the simply typed lambda-calculus with linear negation
Church => Scott = PTIME: an application of resource sensitive realizability
Continuity analysis of programs
Continuity and robustness of programs
Proving programs robust
Preliminary Sketch of Biquaternions
Characterizations of the basic feasible functionals of finite type (extended abstract)
Syntactic logical relations for polymorphic and recursive types
Categories for types. Cambridge Mathematical Textbooks
The impact of higher-order state and control effects on local relational reasoning
The domain of differentiable functions
Domain theory and differential calculus (functions of one variable)
The simple essence of automatic differentiation
Operational domain theory and topology of sequential programming languages
Semantic analysis of normalisation by evaluation for typed lambda calculus
Refinement types for ML
A language for differentiable functions
Une extension de l'interpretation de Gödel a l'analyse, et son application a l'elimination des coupures dans l'analyse et la theorie des types
Logical relations for monadic types
Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation
Logical relations and nondeterminism. In: Software, Services, and Systems - Essays Dedicated to Martin Wirsing on the Occasion of His Retirement from the Chair of Programming and Software Engineering
Correctness of automatic differentiation via diffeologies and categorical gluing (2020), to appear in Proc.
Syteci: automating contextual equivalence for higher-order programs with references
A new characterization of lambda definability
A new characterization of type-2 feasibility
Logiques, catégories & machines: implantation de langages de programmation guidée par la logique catégorique
Perturbation confusion in forward automatic differentiation of higher-order functions
Foundations for programming languages. Foundation of computing series
Functional big-step semantics
Lazy multivariate higher-order forward-mode AD
Reverse-mode AD in a functional framework: Lambda the ultimate backpropagator
Observable properties of higher order functions that dynamically create local names, or what's new?
Lambda-definability and logical relations
Neurocomputing: Foundations of Research, chap. Learning Representations by Back-propagating Errors
Efficient differentiable programming in a functional array-processing language
Nesting forward-mode AD in a functional framework. Higher-Order and Symbolic Computation
Calculus On Manifolds: A Modern Approach To Classical Theorems Of Advanced Calculus
Semantics for probabilistic programming: higher-order functions, continuous distributions, and soft constraints
Intensional interpretations of functionals of finite type I
A realizability interpretation of the theory of species
Logical relations for fine-grained concurrency
Exact real computer arithmetic with continued fractions
Computable Analysis: An Introduction. Texts in Theoretical Computer Science
Relational parametricity for a polymorphic linear lambda calculus