key: cord-0046335-mjj0e5fd
authors: Pshenitsyn, Tikhon
title: Hypergraph Basic Categorial Grammars
date: 2020-05-31
journal: Graph Transformation
DOI: 10.1007/978-3-030-51372-6_9
sha: 2f0ce3a4175d571b06282e8bc4a45831939dab4e
doc_id: 46335
cord_uid: mjj0e5fd

This work is an attempt to generalize categorial grammars, which deal with string languages, to hypergraphs. We consider a particular approach called basic categorial grammar (BCG) and introduce its natural extension to hypergraphs: hypergraph basic categorial grammar (HBCG). We show that BCGs can be naturally embedded in HBCGs. It turns out that, just as BCGs are equivalent to context-free grammars, HBCGs are equivalent to hyperedge replacement grammars in generalized Greibach normal form. We also present several structural properties of HBCGs. From a practical point of view, we show that HBCGs can be used to describe the semantics of sentences of natural languages. We incorporate the lambda semantics into the new mechanism in the same way as is done for BCGs and show that such an embedding allows one to describe the semantics of sentences with cross-serial dependencies.

Formal mechanisms for describing formal (string) languages fall into two large classes: generative grammars and categorial grammars. The former generate strings using rewriting rules (productions): a string is correct if it can be produced by a grammar. The best-known example of such a formalism is the context-free grammar (CFG). Categorial grammars, in contrast, take the whole string first and then check whether it is correct as follows: there is a set of types and a uniform mechanism that defines which sequences of types are correct; a particular grammar contains a lexicon, i.e. a correspondence between symbols of an alphabet and types of the system. In order to check whether a string a_1 ... a_n is correct, one chooses types T_1, ..., T_n such that a_i corresponds to T_i in the grammar and then checks whether T_1, ..., T_n is correct with respect to the uniform rules of the formalism.

One of the most fundamental examples of categorial grammars is the basic categorial grammar (BCG). It was introduced in the works of Ajdukiewicz [2] and Bar-Hillel [3]. Types in BCGs are built from primitive types Pr using left and right divisions \, /. There are two uniform rules of interaction between types: given A, (A\B) or (B/A), A standing next to each other within a sequence of types, one can replace them by B. A sequence of types is said to be correct iff it can be reduced to some distinguished s ∈ Pr. It was proved by Gaifman [4] that this approach has the same generative power as context-free grammars.

BCGs can serve to describe natural languages. E.g. the sentence Tim thinks Helen is smart corresponds to the sequence of types NP, (NP\S)/S, NP, (NP\S)/ADJ, ADJ, which can be reduced to S; thus this sentence is grammatically correct. Moreover, it is possible to combine BCGs with the λ-calculus and to model the semantics of this sentence. Namely, if the λ-term λx.smart(x) is assigned to the adjective smart, the λ-term λs.λx.think(s)(x) is assigned to the verb thinks, and the λ-term λf.λx.f(x) is assigned to is, then the reductions that are done in order to obtain S from the sequence above can be treated as applications in the λ-calculus; hence, the sentence above is described by the meaning think(smart(Helen))(Tim).

Let us return to generative grammars. The principles underlying them can be extended to graphs; the class of resulting formalisms is called graph grammars. In this paper we focus on a particular approach to generating graphs named hyperedge replacement grammar (HRG for short). An overview of HRGs can be found in [9]. We are interested in HRGs because they are closely related to CFGs: the definitions of these formalisms are similar to each other; consequently, they share many crucial properties, e.g. the pumping lemma and the fixed-point theorem. Moreover, HRGs represent a natural extension of CFGs, since strings can be represented by string graphs and CFGs can be modeled using HRGs.

The question we are going to discuss in this paper is how to generalize basic categorial grammars to hypergraphs and to obtain a categorial mechanism related to HRGs. We present such a generalization: hypergraph basic categorial grammars (HBCGs). We extend the notions of types, of reduction laws, and of semantics to hypergraphs. As expected, the resulting mechanism is closely related both to BCGs and to HRGs, which is shown in Sects. 5 and 6. In Sect. 7 several structural properties of HBCGs are studied. In Sect. 8 we show how to enrich our mechanism with the lambda semantics. In Sect. 9 we show an application of our theory to linguistics.

A survey of categorial grammars, including basic categorial grammars, can be found in [8]. Here we introduce the main definitions to show connections with the new formalism.

Let us fix a countable set Pr = {p_i}_{i=1}^∞ of primitive types.

Definition 2.1. The set Tp of types is defined inductively as follows: it is the least set such that Pr ⊆ Tp and, for each A, B ∈ Tp, the types B\A and A/B are also in Tp.

Throughout this paper small letters p, q, . . . and strings composed of them (e.g. np, cp) range over primitive types. Capital letters A, B, . . . usually range over types (however, graphs are often referred to as G and H).

There are two rules of BCGs:

1. Γ, A, (A\B), Δ → Γ, B, Δ
2. Γ, (B/A), A, Δ → Γ, B, Δ

Here Γ, Δ are finite (possibly empty) sequences of types. Thus → is a relation on Tp^+ × Tp^+. We denote by →* its reflexive transitive closure. Γ →^k Δ denotes that Δ is obtained from Γ in k steps (the same notation is used for all the relations in this work).
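As a worked illustration of the relation →^k, the type sequence from Sect. 1 can be reduced to S in four steps. The derivation below is only a sketch: the order in which adjacent pairs are contracted is one possible choice, and the capitalised names NP, ADJ, S follow the introductory example rather than the lowercase convention for primitive types.

```latex
\begin{align*}
NP,\ (NP\backslash S)/S,\ NP,\ (NP\backslash S)/ADJ,\ ADJ
  &\to NP,\ (NP\backslash S)/S,\ NP,\ NP\backslash S
  && \text{by } (B/A),\, A \to B \\
  &\to NP,\ (NP\backslash S)/S,\ S
  && \text{by } A,\, (A\backslash B) \to B \\
  &\to NP,\ NP\backslash S
  && \text{by } (B/A),\, A \to B \\
  &\to S
  && \text{by } A,\, (A\backslash B) \to B
\end{align*}
```

Hence NP, (NP\S)/S, NP, (NP\S)/ADJ, ADJ →^4 S.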

A basic categorial grammar is a tuple Gr = ⟨Σ, s, ▷⟩, where Σ is a finite set (alphabet), s is a distinguished primitive type, and ▷ ⊆ Σ × Tp is a finite binary relation, i.e. it assigns a finite number of types to each symbol in the alphabet.

The language L(Gr) generated by Gr is the set of all strings a_1 ... a_n for which there are types T_1, ..., T_n such that a_i ▷ T_i for each i, and T_1, ..., T_n →* s.
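The definition of L(Gr) can be turned into a naive membership test. The following Python sketch is illustrative only: the tuple encoding of types, the helper names (`under`, `over`, `step`, `reduces_to`, `in_language`) and the toy lexicon are assumptions made for this example, and the exhaustive search is exponential in the worst case.

```python
from itertools import product

# Assumed encoding: a primitive type is a string;
# ("\\", A, B) stands for A\B and ("/", B, A) stands for B/A.
def under(a, b):
    return ("\\", a, b)   # A\B: consumes an A on its left, yields B

def over(b, a):
    return ("/", b, a)    # B/A: consumes an A on its right, yields B

def step(seq):
    """Yield every sequence obtained from seq by one rule application."""
    for i in range(len(seq) - 1):
        left, right = seq[i], seq[i + 1]
        # Rule: A, A\B -> B
        if isinstance(right, tuple) and right[0] == "\\" and right[1] == left:
            yield seq[:i] + (right[2],) + seq[i + 2:]
        # Rule: B/A, A -> B
        if isinstance(left, tuple) and left[0] == "/" and left[2] == right:
            yield seq[:i] + (left[1],) + seq[i + 2:]

def reduces_to(seq, goal):
    """Exhaustive search for a reduction of seq to the one-element sequence (goal,)."""
    seen, stack = set(), [tuple(seq)]
    while stack:
        cur = stack.pop()
        if cur == (goal,):
            return True
        if cur in seen:
            continue
        seen.add(cur)
        stack.extend(step(cur))
    return False

def in_language(word, lexicon, goal):
    """Try every assignment of lexicon types to the symbols of word."""
    choices = [lexicon.get(symbol, []) for symbol in word]
    return any(reduces_to(types, goal) for types in product(*choices))

# Toy lexicon taken from the introductory example (hypothetical as a full grammar).
LEXICON = {
    "Tim": ["NP"],
    "Helen": ["NP"],
    "smart": ["ADJ"],
    "thinks": [over(under("NP", "S"), "S")],
    "is": [over(under("NP", "S"), "ADJ")],
}
```

With this lexicon, `in_language("Tim thinks Helen is smart".split(), LEXICON, "S")` holds, while a scrambled word order such as `"thinks Tim smart"` is rejected.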

This section is concerned with definitions related to hypergraphs. All the notions except for compression are well known and widely accepted (see [9] ). Note that we use a slightly different notation from that in [9] .

N includes 0. The set Σ* is the set of all strings over the alphabet Σ including the empty string ε. The length |w| of the word w is the number of symbols in w. Σ^+ denotes the set of all nonempty strings. The set Σ^⊛ is the set of all strings consisting of distinct symbols. The set of all symbols contained in the word w is denoted by [w]. If f : Σ → Δ is a function from one set to another, then it is naturally extended to a function f : Σ* → Δ* by f(σ_1 ... σ_k) = f(σ_1) ... f(σ_k).

Let C be some fixed set of labels for which the function type : C → N is considered.

Definition 3.1. A hypergraph G over C is a tuple G = ⟨V, E, att, lab, ext⟩, where V is the set of nodes, E is the set of hyperedges, att : E → V^⊛ assigns a string (i.e. an ordered set) of attachment nodes to each edge, lab : E → C labels each edge by some element of C in such a way that type(lab(e)) = |att(e)| whenever e ∈ E, and ext ∈ V^⊛ is a string of external nodes.

Components of a hypergraph G are denoted by V_G, E_G, att_G, lab_G, ext_G resp.
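The conditions in the definition of a hypergraph can be checked mechanically. The sketch below uses an assumed dictionary encoding (keys `nodes`, `edges`, `ext`, with each edge mapped to a pair of its label and its attachment string); the function names are hypothetical, and the convention type(G) = |ext_G| is the usual one for hyperedge replacement and is assumed here.

```python
def graph_type(g):
    """type(G), taken here to be the number of external nodes (assumed convention)."""
    return len(g["ext"])

def is_hypergraph(g, label_type):
    """Check the well-formedness conditions for a given function type : C -> N.

    g          -- {"nodes": set, "edges": {edge: (label, att_tuple)}, "ext": tuple}
    label_type -- dict mapping every label to its type
    """
    def is_string_of_distinct_nodes(s):
        # att(e) and ext must be strings of pairwise distinct nodes of g
        return len(set(s)) == len(s) and set(s) <= g["nodes"]

    if not is_string_of_distinct_nodes(g["ext"]):
        return False
    return all(
        is_string_of_distinct_nodes(att) and label_type.get(lab) == len(att)
        for lab, att in g["edges"].values()
    )

# A small example with one edge of each of the types 2, 1 and 3.
LABEL_TYPE = {"p": 2, "q": 1, "r": 3}
G = {
    "nodes": {1, 2, 3},
    "edges": {"e1": ("p", (1, 2)), "e2": ("q", (2,)), "e3": ("r", (3, 1, 2))},
    "ext": (1, 3),
}
```

Here `is_hypergraph(G, LABEL_TYPE)` succeeds, whereas a graph in which a p-labeled edge has only one attachment node, or attaches twice to the same node, is rejected.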

In the remainder of the paper, hypergraphs are simply called graphs, and hyperedges are simply called edges. The set of all graphs with labels from C is denoted by H(C). In drawings of graphs, black dots correspond to nodes, labeled squares correspond to edges, att is represented by numbered lines, and external nodes are depicted by numbers in brackets. If an edge has exactly two attachment nodes, it can be denoted by an arrow (which goes from the first attachment node to the second one).

[Example figure omitted: a graph containing edges of type 2, one edge of type 1, and one edge of type 3.]

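If a graph H consists of a single edge e_0 whose attachment nodes are all the nodes of H, with att(e_0) = ext_H = v_1 ... v_n and lab(e_0) = a, then H is called a handle. It is denoted by (a).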

In this work, we do not distinguish between isomorphic graphs.

Certain graph transformations are used in graph formalisms. To generalize categorial grammars, we present the following operation, called compression.

Compression. Let G be a graph, and let H be a subgraph of G. Compression of H into an a-labeled edge within G is a procedure of transformation of G, which can be done under the following conditions:

then v has to be external in

Then the procedure is the following:

Let G a/H (or G a, e/H ) denote the resulting graph. Formally,

Replacement. This procedure is defined in [9]. In short, the replacement of an edge e_0 in G with a graph H can be done if type(e_0) = type(H), as follows:

1. Remove e_0;
2. Insert an isomorphic copy of H (namely, H and G have to consist of disjoint sets of nodes and edges);
3. For each i, fuse the i-th external node of H with the i-th attachment node of e_0.

To be more precise, the set of edges in the resulting graph is (E_G \ {e_0}) ∪ E_H.
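The three steps of replacement can be sketched in Python as follows. This is a minimal illustration under assumptions: graphs use a dictionary encoding (keys `nodes`, `edges`, `ext`, each edge mapped to a pair of label and attachment tuple), the function name `replace` is hypothetical, and the disjoint copy of H is obtained by tagging its node and edge names, which presumes that G does not already use names of that tagged shape.

```python
def replace(g, e0, h):
    """Return the graph obtained from g by replacing its edge e0 with the graph h."""
    _, att0 = g["edges"][e0]
    if len(att0) != len(h["ext"]):
        raise ValueError("replacement requires type(e0) = type(H)")

    def fresh(x):
        # Step 2: names of the inserted copy of h are tagged to keep them apart from g.
        return ("copy", e0, x)

    # Step 3: the i-th external node of h is fused with the i-th attachment node of e0.
    fuse = {v: att0[i] for i, v in enumerate(h["ext"])}

    def rename(v):
        return fuse.get(v, fresh(v))

    nodes = set(g["nodes"]) | {rename(v) for v in h["nodes"]}
    # Step 1: e0 is removed; all other edges of g are kept unchanged.
    edges = {e: la for e, la in g["edges"].items() if e != e0}
    for e, (lab, att) in h["edges"].items():
        edges[fresh(e)] = (lab, tuple(rename(v) for v in att))
    return {"nodes": nodes, "edges": edges, "ext": g["ext"]}

# Example: replace the X-labeled edge of a two-edge path by a two-edge path.
G = {
    "nodes": {0, 1, 2},
    "edges": {"e1": ("a", (0, 1)), "e2": ("X", (1, 2))},
    "ext": (0, 2),
}
H = {
    "nodes": {"u", "m", "w"},
    "edges": {"f1": ("b", ("u", "m")), "f2": ("c", ("m", "w"))},
    "ext": ("u", "w"),
}
R = replace(G, "e2", H)
```

In the example, `R` has four nodes and three edges labeled a, b, c; the inner node of H is the only new node, since both external nodes of H are fused with attachment nodes of the removed edge.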

Definition 3.7. A hyperedge replacement grammar is a tuple Gr = ⟨N, Σ, P, S⟩, where N is a finite alphabet of nonterminal symbols, Σ is a finite alphabet of terminal symbols (N ∩ Σ = ∅), P is a set of productions, and S ∈ N. Each production is of the form X → H, where X ∈ N and H is a graph over N ∪ Σ such that type(X) = type(H).

If G is a graph,

The corresponding sequence of production applications is called a derivation. 

In this section, we present definitions needed to extend BCGs to graphs. Firstly, we introduce the notion of a type; then we define a rewriting rule, which operates on graphs labeled by types; finally, we introduce the definitions of a hypergraph basic categorial grammar and of a language generated by it.

We fix a countable set Pr of primitive types and a function type : Pr → N such that for each n ∈ N there are infinitely many p ∈ Pr for which type(p) = n. Types are constructed from primitive types using division. Simultaneously, we define the function type on types.

Let us fix some symbol $ that is not included in any of the sets considered. NB! This symbol is allowed to label edges with different numbers of attachment nodes. To be consistent with Definition 3.1, one can assume that there are countably many symbols $_n such that type($_n) = n.

Definition 4.1. The set Tp_χ of types is the least set satisfying the following conditions:

In types, $ serves to "connect" a denominator and a numerator.

Example 4.1. The following structure is a type:

Here p, q belong to Pr, type(p) = 2, type(q) = 3; type(E_0) = 2.

In order to generalize the rules A/B, B → A and B, B\A → A, denominators of types are going to be "overlaid" on subgraphs of graphs. This idea is formalized by the notion of a d-isomorphism. 

The concept of hypergraph basic categorial grammars (HBCGs) is based on the mechanism of reduction of hypergraphs labeled by types. There is an inference rule, which is denoted by (÷), generalizing two rules for BCGs presented earlier.

The following dramatis personae participate in the rule (÷): 

Definition 4.5. A hypergraph basic categorial grammar (HBCG for short) Gr is a tuple Gr = ⟨Σ, s, ▷⟩, where Σ is a finite alphabet, s is a primitive type, and ▷ ⊆ Σ × Tp_χ is a binary relation (called a lexicon) which assigns a finite number of types to each symbol in the alphabet. Additionally, we require that the function type is defined on Σ in such a way that a ▷ T implies type(a) = type(T).

Definition 4.6. The language L(Gr) generated by an HBCG Gr = ⟨Σ, s, ▷⟩ is the set of all hypergraphs G ∈ H(Σ) for which a function f_G : E_G → Tp_χ exists such that:

All the definitions presented above are slightly more complicated and technical than those for BCGs; however, the concept of HBCGs is closely related both to HRGs and to BCGs.

Example 4.2. Let us consider an example of an HRG from [11] (slightly modified) generating abstract meaning representations, which contains four rules:

Here S is the initial symbol. This HRG can be converted into an equivalent HBCG as follows. Let s, x, y, i, a_0, a_1 be primitive types. Then the following lexicon defines an HBCG that generates the same language as the HRG above:

To be more precise, the HBCG is of the form ⟨{want, need, go, I, arg_0, arg_1}, s, ▷⟩.

All the primitive types have type 2. The conversion can be done since there is a terminal edge in the right-hand side of each production (see more in Sect. 6). A more thorough example of an HBCG is given in Sect. 9.

To justify that HBCGs appropriately extend BCGs, we present a natural and simple embedding of the latter into the former.

A function tr : Tp → Tp_χ presented below embeds string types into graph types:

- tr(p) := p, where p ∈ Pr and type(p) = 2;

Recall that a string graph induced by a word w = a_1 ... a_n is a graph of the form ⟨{v_0, ..., v_n}, {e_1, ..., e_n}, att, lab, v_0 v_n⟩, where att(e_i) = v_{i−1} v_i and lab(e_i) = a_i; it is denoted by w^•.

It is proved by a straightforward conversion of the reduction process for strings into the reduction process for graphs. These propositions yield

Theorem 5.1. If Gr is a BCG, then L(tr(Gr)) = {w^• | w ∈ L(Gr)}.
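The string graph w^• that appears in Theorem 5.1 can be constructed mechanically. The sketch below assumes the standard definition from the hyperedge replacement literature (a path of n edges on nodes v_0, ..., v_n whose external nodes are the two endpoints); the dictionary encoding and the name `string_graph` are illustrative choices.

```python
def string_graph(word):
    """Build the string graph of a nonempty word.

    Nodes are 0..n, the i-th edge goes from node i-1 to node i and carries
    the i-th symbol, and the external nodes are the first and the last node.
    """
    n = len(word)
    if n == 0:
        raise ValueError("a nonempty word is assumed")
    nodes = set(range(n + 1))
    edges = {i: (word[i - 1], (i - 1, i)) for i in range(1, n + 1)}
    return {"nodes": nodes, "edges": edges, "ext": (0, n)}

ABC = string_graph("abc")
```

For the word abc the result has four nodes, three edges of type 2 labeled a, b, c in path order, and external nodes 0 and 3.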

It is well known that CFGs and BCGs are equivalent; one of the simplest proofs involves Greibach normal form for CFGs. In this section, we show that this proof can be generalized to a wide class of graph grammars in a natural way.

Firstly, one has to extend the notion of the (weak) Greibach normal form. There are a few works in which variants of such an extension are introduced, see [10, 12]. However, the normal forms presented in these works are stricter than is needed for our purposes. In this paper, we use the following definition: an HRG is in (weak) Greibach normal form (GNF) if the right-hand side of each of its productions contains exactly one terminal edge.

Note that not every language generated by some HRG can be generated by an HRG in GNF. This follows from an example of an HRG Gr over the terminal alphabet {a} that produces graphs having exactly one edge labeled by a and arbitrarily many isolated nodes. If there is an equivalent Gr' = ⟨N', {a}, P', S'⟩ in GNF, then the right-hand side of each production in P' contains exactly one terminal edge. Note that if S' ⇒^k G, G ∈ H({a}) in Gr', then G has k terminal edges; hence k has to equal 1 and therefore S' → G ∈ P'. However, there are infinitely many graphs in L(Gr) while |P'| < ∞.

The characterization of languages generated by HRGs in WGNF is a subject for further study.

It turns out that HBCGs generate the same class of languages as HRGs in the normal form presented. This is proved below.

Definition 6.2. The set st(T) of subtypes of a type T is defined inductively as follows:

The "if" part is proved similarly: one has to transform applications of (÷) in Gr into productions in Gr'. Example 4.2 provides an example of an application of the theorem above.

Proof. Let Gr = ⟨N, Σ, P, S⟩. Consider elements of N as elements of Pr with the same function type defined on them. Since Gr is in GNF, each production in P is of the form π = X → G, where G contains exactly one terminal edge e_0 (say lab_G(e_0) = a ∈ Σ). We convert this production into the type T_π := ÷(X/G[e_0 := $]). Then we introduce the HBCG Gr' = ⟨Σ, S, ▷⟩, where ▷ is defined as follows: a ▷ T_π. Finally, note that if one applies the transformation described in Theorem 6.1 to Gr', one obtains Gr, which implies that L(Gr) = L(Gr').
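The conversion of a single production into a lexicon entry, as used in the proof, can be sketched as follows. The encoding is an assumption made for illustration: graphs are dictionaries with keys `nodes`, `edges`, `ext`, a type ÷(N/D) is represented by the tuple `("div", N, D)`, nonterminal names double as primitive types, and the function name `production_to_entry` is hypothetical.

```python
def production_to_entry(lhs, rhs, terminals):
    """Turn a GNF production lhs -> rhs into a pair (a, T) meaning 'a is assigned T'.

    The unique terminal edge e0 of rhs is relabeled by "$"; the resulting graph
    becomes the denominator, and lhs becomes the numerator of the type.
    """
    terminal_edges = [e for e, (lab, _) in rhs["edges"].items() if lab in terminals]
    if len(terminal_edges) != 1:
        raise ValueError("production is not in GNF: exactly one terminal edge is required")
    e0 = terminal_edges[0]
    a, _ = rhs["edges"][e0]
    denominator = {
        "nodes": set(rhs["nodes"]),
        "edges": {
            e: (("$" if e == e0 else lab), att)
            for e, (lab, att) in rhs["edges"].items()
        },
        "ext": rhs["ext"],
    }
    return a, ("div", lhs, denominator)

# Hypothetical production S -> (an a-edge followed by an S-edge).
RHS = {
    "nodes": {0, 1, 2},
    "edges": {"e0": ("a", (0, 1)), "e1": ("S", (1, 2))},
    "ext": (0, 2),
}
SYMBOL, TYPE = production_to_entry("S", RHS, {"a"})
```

In this example the symbol a receives a type whose numerator is S and whose denominator is the right-hand side with the a-labeled edge relabeled by $; the input graph itself is left unchanged.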

Proof. This problem is in NP since, if the answer is "YES", there is a certificate of polynomial size that justifies this; namely, a sequence of applications of (÷) (a derivation). Another explanation is that an HBCG can be converted in polynomial time into an equivalent HRG, for which the membership problem is in NP.

In [9], an NP-complete graph language generated by some HRG ERG is introduced. One notices that there is at least one terminal edge in each production in ERG; by adding nonterminal symbols corresponding to terminal ones, one transforms ERG into an equivalent grammar in GNF, and then into an HBCG using Theorem 6.2 (all of this takes polynomial time).

In this section, we study some structural properties of HBCGs.

The set of primitive types Pr is countably infinite; i.e. we are allowed to use as many primitive types as we want. However, the following theorem shows that it suffices to have one primitive type only.

The idea behind this definition is simple: F(p_k) has a huge edge in the denominator, which is larger than any edge existing in the lexicon.

Let F(T) stand for a type obtained from T ∈ T by substituting each p_k with F(p_k) (we do not change s). Now, if a ▷ T, then let a ▷' F(T). No more relations in ▷' exist. We argue that Gr' = ⟨Σ, s, ▷'⟩ is a desired grammar. It follows from its definition that it contains only s as a primitive type. Clearly, L(Gr) ⊆ L(Gr'): each derivation in Gr can be remade in Gr' if one considers F(p_k) as an atomic, indivisible type corresponding to p_k.

To prove the reverse inclusion, let us consider the set T' = {F(T) | T ∈ T}. Assume that for a graph G ∈ H(T) the graph F(G) is reducible to (s) (here F(G) is obtained from G by changing each label a to the label F(a)). At each step of the reduction process the rule (÷) is applied either to a type of the form F(÷(N/D)), ÷(N/D) ∈ T, or to F(p_k). However, note that no edges in F(G) have type exceeding M, whereas D_k has to be overlaid on an edge of type M + k > M. Consequently, (÷) cannot be applied to F(p_k), and G →*_χ (s).


One of the features HBCGs inherit from BCGs is so-called counters. 

The λ-calculus is a formal tool, which has a number of applications in functional programming and in formal semantics. In this paper, we do not provide the definitions of this mechanism and refer the reader to the paper [5] , which is an overview of the λ-calculus.

In basic categorial grammars, one can assign λ-terms to types. The notation α : A denotes a λ-term α assigned to a type A (i.e. this is the pair (α; A)). The rules of reduction then have the following form:

1. Γ, α : A, β : (A\B), Δ → Γ, (βα) : B, Δ
2. Γ, β : (B/A), α : A, Δ → Γ, (βα) : B, Δ

Here βα stands for the application of β to α. A linguistic example that shows how λ-terms describe semantics of a natural language was given in Sect. 1.
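The interplay of reduction and application can be sketched on the example from Sect. 1. In the Python fragment below, λ-terms are modelled by Python functions that build strings, types use an assumed tuple encoding (`("\\", A, B)` for A\B and `("/", B, A)` for B/A), and `derive` is a hypothetical helper that searches for any reduction to the goal type and returns the term obtained along the way.

```python
def derive(items, goal):
    """items is a tuple of (term, type) pairs; return the term of a reduction to goal, or None."""
    if len(items) == 1:
        term, typ = items[0]
        return term if typ == goal else None
    for i in range(len(items) - 1):
        (a, ta), (b, tb) = items[i], items[i + 1]
        candidates = []
        # alpha : A, beta : A\B  ->  (beta alpha) : B
        if isinstance(tb, tuple) and tb[0] == "\\" and tb[1] == ta:
            candidates.append((b(a), tb[2]))
        # beta : B/A, alpha : A  ->  (beta alpha) : B
        if isinstance(ta, tuple) and ta[0] == "/" and ta[2] == tb:
            candidates.append((a(b), ta[1]))
        for new in candidates:
            result = derive(items[:i] + (new,) + items[i + 2:], goal)
            if result is not None:
                return result
    return None

VP = ("\\", "NP", "S")  # the type NP\S

# Tim thinks Helen is smart, with the lambda-terms given in Sect. 1.
SENTENCE = (
    ("Tim", "NP"),
    (lambda s: lambda x: f"think({s})({x})", ("/", VP, "S")),  # thinks : (NP\S)/S
    ("Helen", "NP"),
    (lambda f: lambda x: f(x), ("/", VP, "ADJ")),               # is : (NP\S)/ADJ
    (lambda x: f"smart({x})", "ADJ"),                           # smart : ADJ
)
MEANING = derive(SENTENCE, "S")
```

Under these assumptions, `MEANING` is the string `think(smart(Helen))(Tim)`, which agrees with the meaning stated in Sect. 1.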

This approach can be generalized to hypergraphs and HBCGs. Let T be a type, i.e. let it belong to Tp_χ. By τ : T we denote a pair containing a λ-term τ and the type T. Now we are going to incorporate the λ-calculus into the rule (÷). Let the objects involved in this rule be denoted as in Sect. 4.3. We additionally require that edges in E_D are numbered: E_D = {e_0, e_1, ..., e_k} (and this numbering is fixed for a given type). If a λ-term τ is assigned to ÷(N/D) and for i > 0 lab( (e_i)) = α_i : T_i, then the rule (÷) is of the form

Here τ α_1 α_2 ... α_k = (((τ α_1) α_2) ...) α_k. This means that λ-terms written on edges that are consumed by the denominator D are treated as arguments of the λ-term assigned to ÷(N/D). An example of an application of HBCGs enriched with the λ-calculus to linguistics is presented in the next section.
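The left-nested application τ α_1 α_2 ... α_k is a left fold of function application over the list of arguments. A minimal Python sketch, with curried Python functions standing in for λ-terms and `apply_all` as a hypothetical helper name:

```python
from functools import reduce

def apply_all(tau, args):
    """Compute tau a_1 a_2 ... a_k = (((tau a_1) a_2) ...) a_k for curried tau."""
    return reduce(lambda f, a: f(a), args, tau)

# A curried three-argument term that records the order of its arguments.
TAU = lambda a: lambda b: lambda c: f"t({a})({b})({c})"
RESULT = apply_all(TAU, ["x", "y", "z"])
```

The arguments are consumed in the order fixed by the numbering e_1, ..., e_k of the denominator edges, which is why that numbering has to be fixed for a given type.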

It is well known that context-free languages in the usual sense fail to describe certain linguistic phenomena. One of them is the so-called cross-serial dependencies (CSD), a class of phenomena that can be described by the language {ww | w ∈ Σ*} of reduplicated strings.

We focus here on the following example of CSD from the Russian language:

The meaning of this sentence is: Olya was the first who finished, Petya was the second who finished, and Vasya was the third who finished (e.g. when speaking about a competition). In Russian, the ordinal numerals the first, the second, the third agree with nouns in gender, which leads to CSD (note that Olya is a female name, and Petya, Vasya are male ones). Below we show how to generate such Russian sentences using an HBCG and how to model their semantics using the λ-calculus. In order to simplify the example, we ignore some features of the Russian language. We denote by pd_m (pd_f) a primitive type which stands for masculine (feminine resp.) predicate phrases in singular form in the instrumental case (such as ordinal numbers); we denote by np_m (np_f) a primitive type corresponding to masculine (feminine) noun phrases in singular form in the nominative case (such as proper nouns); np_p (pd_p) denotes nouns (predicate phrases resp.) in a plural form; s denotes sentences (it is a distinguished type). Then the grammar generating sentences of the above form is the following:

Here Z equals Note that tr is defined in Sect. 5.

Let us denote the first type in the first row above as I_m, and the second one as T_m; by analogy, types in the second row are denoted by I_f and T_f resp. In addition, we assign semantic types to the syntactic ones:

Then the sentence (1) belongs to the language generated by this HBCG:

There are a few papers devoted to combining categorial grammars with graph tools. E.g. there is a recent work of Sebastian Beschke and Wolfgang Menzel [7] where it is shown how to enrich the Lambek calculus, which is another categorial approach (see [13]), with graph semantics. In the paper [14] (which is more linguistic than mathematical) an extension of some concepts of the Lambek calculus to graphs is presented; namely, sentences are considered to be graph structures (functor-argumentor-structures), and then categorial graph grammars are introduced, which deal with these structures. Hypergraph basic categorial grammars introduced in our work, however, do not seem to be closely related to any of these approaches. Possibly, this is because our motivation for HBCGs is rather logical and mathematical: our main purpose was to directly combine the concepts of BCGs and HRGs, so the resulting mechanism satisfies these requirements. Nevertheless, we hope that HBCGs can also be used (possibly, with some further modifications) in practical applications, e.g. in linguistics.

Note that there is a work [6] where HRGs are used to describe CSD in the Dutch language. Actually, the examples in Sect. 9 are similar in structure to the examples in [6]. Comparing [6] with this paper, we conclude that one of the crucial features distinguishing HRGs from HBCGs from a linguistic point of view is the λ-semantics, which can be naturally built into the latter.

There are a number of questions that remain open; we hope to study them in future works.

- We showed that the membership problem for HBCGs is NP-complete. How can HBCGs be restricted in order to obtain efficient parsing algorithms?
- We introduced some applications of HBCGs to linguistics. We are interested in further developing a theory that would use HBCGs and the λ-calculus to model visual structures related to natural languages. In particular, we would like to consider, from the point of view of our approach, the syntactic trees that linguists deal with.
- How can other string categorial approaches be generalized to hypergraphs?

There is a term τ_1 = λf.f(first)(Olya) assigned to Z after the first step.

The arguments ζ = τ_1 and P = second are applied to λζ.λP.λg.(g(P)(Petya) ∧ ζg) in the second step; the result is λg.(g(second)(Petya) ∧ g(first)(Olya)).

Similarly, after the third step the following λ-term is assigned to Z: λh.(h(third)(Vasya) ∧ (h(second)(Petya) ∧ h(first)(Olya))).

Finally, one obtains the following term: finish(third)(Vasya) ∧ (finish(second)(Petya) ∧ finish(first)(Olya)).
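The chain of β-reductions in this derivation can be replayed in Python. The sketch below makes several assumptions: conjunctions are represented as nested tuples `("and", left, right)` so that the bracketing stays visible, `sym` builds a curried predicate symbol that renders itself as text, the term used for Vasya is taken to have the same shape as the one given for Petya, and the final term is taken to arise by supplying finish as the last argument.

```python
def sym(name):
    """A curried two-place predicate symbol that renders itself as text."""
    return lambda p: lambda x: f"{name}({p})({x})"

def extend(person):
    """lambda zeta. lambda P. lambda g. (g(P)(person) AND zeta g)."""
    return lambda zeta: lambda P: lambda g: ("and", g(P)(person), zeta(g))

# First step: tau_1 = lambda f. f(first)(Olya)
tau1 = lambda f: f("first")("Olya")
# Second step: zeta = tau_1 and P = second
tau2 = extend("Petya")(tau1)("second")
# Third step (assumed to be analogous): zeta = tau_2 and P = third
tau3 = extend("Vasya")(tau2)("third")

# Final step (assumption): the accumulated term is applied to the predicate finish.
MEANING = tau3(sym("finish"))
```

The nested tuple in `MEANING` corresponds to finish(third)(Vasya) ∧ (finish(second)(Petya) ∧ finish(first)(Olya)), with each ordinal paired with the right name although the names and the ordinals occur in two separate series.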

Thus, this HBCG not only generates sentences of the form (1) (HRGs can deal

[1] The Theory of Parsing, Translation, and Compiling

[2] Die syntaktische Konnexität (Syntactic connexion)

[3] A quasi-arithmetical notation for syntactic description

[4] On categorial and phrase structure grammars

[5] Introduction to lambda calculus

[6] Hyperedge replacement and nonprojective dependency structures

[7] Graph algebraic combinatory categorial grammar

[8] Categorial grammars and their logics

[9] Hyperedge replacement graph grammars

[10] A Greibach normal form for context-free graph grammars

[11] Parsing graphs with regular graph grammars

[12] A local Greibach normal form for hyperedge replacement grammars

[13] The mathematics of sentence structure

[14] Categorial graph grammar: a direct approach to functor-argumentor-structure

Acknowledgments. I thank my scientific advisor Prof. Mati Pentus for his careful attention to my study and the anonymous reviewers for their valuable advice.