Abstract
The discrete Choquet Integral (CI) and its generalizations have been successfully applied in many different fields, with particularly good results when considered in Fuzzy Rule-Based Classification Systems (FRBCSs). One of those functions is the CC-integral, where the product operations in the expanded form of the CI are generalized by copulas. Recently, some new Choquet-like operators were developed by generalizing the difference operation by a Restricted Dissimilarity Function (RDF) in either the usual or the expanded form of the original CI, also providing good results in practical applications. Motivated by such developments, in this paper we propose the generalization of the CC-integral by means of RDFs, resulting in a function that we call the d-CC-integral. We study some relevant properties of this new definition, focusing on its monotonicity-like behavior. Then, we apply d-CC-integrals in a classification problem, comparing different d-CC-integrals with one another. The classification accuracy of the best d-CC-integral surpasses the one achieved by the best CC-integral and is statistically equivalent to the state-of-the-art in FRBCSs.
This research was funded by FAPERGS/Brazil (Proc. 19/2551-0001660-3, 23/2551-0000126-8), CNPq/Brazil (301618/2019-4, 305805/2021-5, 150160/2023-2).
1 Introduction
The discrete Choquet Integral (CI) is an interesting aggregation operator that can capture the relationship between the aggregated data by means of a fuzzy measure (FM) [10]. It has been successfully applied in many different fields, such as decision-making [13, 29], classification [13, 28] and image processing [24]. This yielded the theoretical development of several generalizations [13] of the CI, such as the CC-integral [22] (a generalization by copulas [3]) and \(C_{F_1F_2}\)-integral [19] (a generalization by fusion functions \(F_1\) and \(F_2\) under some constraints), all of them providing at least competitive results when applied in practical problems.
In particular, Fuzzy Rule-Based Classification Systems (FRBCSs) [17], which are notable for their high interpretability while still achieving good classification results, seem to benefit from the application of the CI and its generalizations in the aggregation process that occurs in the reasoning method. This can be seen in the works of Lucca et al. [19], where the application of such integrals produced results that rival the state of the art in FRBCSs.
Bustince et al. [8] generalized the CI using Restricted Dissimilarity Functions (RDF), in the form of d-integrals. Following that, Wieczynski et al. studied \(dC_{F}\)-integrals [28] and d-XC-integrals [29], obtaining promising results not only in decision making and classification but also in signal processing, based on motor-imagery brain-computer interface. Recently, Boczek and Kaluszka [4] introduced the preliminary notion of the extended Choquet-Sugeno-like (eCS) operator, which generalizes most of the modifications of the CI known in the literature (briefly discussed above), and also generates some new CI-type operators. This inspired us to further research their properties and applications.
Then, the objectives of this paper are: (1) to study some important theoretical properties of an instance of the eCS-operator, introducing a generalization of CC-integrals by RDFs, called d-CC-integrals (Sect. 3); (2) to apply d-CC-integrals in FRBCSs in an experimental study, where we compare the classification accuracy of different d-CC-integrals based on combinations of RDFs and copulas with the best results obtained by other approaches in the literature (Sect. 4 and 5). Additionally, Sect. 2 recalls preliminary concepts and Sect. 6 is the Conclusion.
2 Preliminaries
Consider \(N = \{1, \dots , n\}\), with \(n > 0\), and \(\textbf{x}=(x_1,\ldots , x_n)\). A function \(A :[0,1]^n\rightarrow [0,1]\) is an aggregation function if (A1) it is increasing, and (A2) \(A(0,\ldots , 0)=0\) and \(A(1,\ldots , 1)=1\). Copulas are a special type of aggregation function that, in the context of the theory of metric spaces, link (2-dimensional) probability distribution functions to their 1-dimensional margins. A bivariate function \(C :[0,1]^2 \rightarrow [0,1]\) is a copula if, for all \(x,x',y, y' \in [0,1]\) with \( x \le x'\) and \(y \le y'\): (C1) \(C(x,y) + C(x',y') \ge C(x,y') + C(x', y)\); (C2) \(C(x,0) = C(0,x) = 0\); (C3) \(C(x,1) = C(1,x) = x\) [3].
Proposition 1
[3]. For each copula C it holds that: (i) C is increasing; (ii) C satisfies the Lipschitz property with constant 1, that is, for all \(x_1,x_2,y_1,y_2 \in [0,1]\), one has that \(\mid C(x_1,y_1) - C(x_2,y_2) \mid \le \mid x_1 - x_2 \mid + \mid y_1 - y_2\mid \).
Table 1 shows examples of copulas, which are used in the rest of the paper. We divided them into three groups: t-norms [3], overlap functions [6], and copulas that are neither t-norms nor overlap functions.
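To make conditions (C1)-(C3) concrete, the following minimal Python sketch checks them on a finite grid for three classical copulas: the minimum, the product and the Łukasiewicz t-norm. The function names, the grid resolution and the tolerance are illustrative assumptions, and the three functions are only assumed to be among the examples listed in Table 1. A grid check can refute the axioms but cannot prove them.

```python
from itertools import product
from typing import Callable

Copula = Callable[[float, float], float]


def t_min(x: float, y: float) -> float:
    """Minimum t-norm."""
    return min(x, y)


def t_prod(x: float, y: float) -> float:
    """Product t-norm."""
    return x * y


def t_luk(x: float, y: float) -> float:
    """Lukasiewicz t-norm."""
    return max(0.0, x + y - 1.0)


def satisfies_copula_axioms(c: Copula, steps: int = 10, tol: float = 1e-12) -> bool:
    """Check (C1)-(C3) on a uniform grid of [0,1]; a necessary test only."""
    grid = [i / steps for i in range(steps + 1)]
    for x in grid:
        # (C2): 0 is an annihilator.
        if abs(c(x, 0.0)) > tol or abs(c(0.0, x)) > tol:
            return False
        # (C3): 1 is a neutral element.
        if abs(c(x, 1.0) - x) > tol or abs(c(1.0, x) - x) > tol:
            return False
    # (C1): 2-increasingness on every grid rectangle [x, x2] x [y, y2].
    for x, x2, y, y2 in product(grid, repeat=4):
        if x <= x2 and y <= y2:
            if c(x, y) + c(x2, y2) < c(x, y2) + c(x2, y) - tol:
                return False
    return True
```

For instance, `satisfies_copula_axioms(t_luk)` passes, while the maximum fails already at (C2).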
A Restricted Dissimilarity Function (RDF) [7] \(\delta : [0,1]^2 \rightarrow [0,1]\) is a function such that, for all \(x,y,z \in [0,1]\): (d1) \(\delta (x,y) = \delta (y,x)\); (d2) \(\delta (x,y) = 1\) if and only if \(\{x, y\} = \{0,1\}\); (d3) \(\delta (x,y) = 0\) if and only if \(x = y\); (d4) if \(x \le y \le z\), then \(\delta (x, y) \le \delta (x, z)\) and \(\delta (y, z) \le \delta (x, z)\). One convenient way of constructing RDFs is through automorphisms (strictly increasing bijections) of the unit interval:
Proposition 2
[7]. Let \(\varphi _1, \varphi _2: [0,1] \rightarrow [0,1]\) be two automorphisms. Then, the function \(\delta ^{\varphi _1, \varphi _2}: [0,1]^2 \rightarrow [0,1]\), given, for all \(x,y \in [0,1]\), by \(\delta ^{\varphi _1, \varphi _2}(x,y)=\varphi _1(|\varphi _2(x)-\varphi _2(y)|)\), is a Restricted Dissimilarity Function.
All functions from Table 2 are examples of RDFs, constructed via Proposition 2, which are considered in our experiments presented in Sect. 5.
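The construction of Proposition 2 can be sketched in a few lines of Python. The helper `make_rdf`, the chosen automorphisms and the grid-based axiom check are illustrative assumptions; the three resulting functions are examples of the construction and are not claimed to reproduce the exact entries of Table 2.

```python
import math
from typing import Callable

Automorphism = Callable[[float], float]
RDF = Callable[[float, float], float]


def make_rdf(phi1: Automorphism, phi2: Automorphism) -> RDF:
    """Build delta(x, y) = phi1(|phi2(x) - phi2(y)|) as in Proposition 2."""
    def delta(x: float, y: float) -> float:
        return phi1(abs(phi2(x) - phi2(y)))
    return delta


def identity(t: float) -> float:
    return t


def square(t: float) -> float:
    return t * t


delta_abs = make_rdf(identity, identity)     # |x - y|
delta_sq = make_rdf(square, identity)        # (x - y)^2
delta_sqrt = make_rdf(identity, math.sqrt)   # |sqrt(x) - sqrt(y)|


def check_rdf_axioms(delta: RDF, steps: int = 20, tol: float = 1e-12) -> bool:
    """Check (d1)-(d4) on a uniform grid; a necessary test only."""
    grid = [i / steps for i in range(steps + 1)]
    for x in grid:
        for y in grid:
            d = delta(x, y)
            if abs(d - delta(y, x)) > tol:                      # (d1)
                return False
            if (abs(d - 1.0) <= tol) != ({x, y} == {0.0, 1.0}):  # (d2)
                return False
            if (abs(d) <= tol) != (x == y):                     # (d3)
                return False
    for x in grid:
        for y in grid:
            for z in grid:
                if x <= y <= z:                                 # (d4)
                    if delta(x, y) > delta(x, z) + tol:
                        return False
                    if delta(y, z) > delta(x, z) + tol:
                        return False
    return True
```

A scaled distance such as `0.5 * abs(x - y)` fails the check, because it never reaches the value 1 required by (d2).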
A function \(m :2^{N} \rightarrow [0,1]\) is a fuzzy measure (FM) [10] if, for all \(X,Y \subseteq N\), it satisfies the following properties: (m 1) Increasing: if \(X\subseteq Y\), then \(m(X)\le m(Y)\); (m 2) Boundary conditions: \(m(\emptyset ) =0\) and \(m(N) =1\). An example of FM is the power measure \(m_{PM} :2^{N} \rightarrow [0,1]\), which is defined, for all \(X \subseteq N\), by \(m_{PM}(X) = \left( \frac{|X|}{n} \right) ^{q}\), where the exponent \(q > 0\) can be learned genetically from the data. It is the only FM considered in this paper, since it provides excellent results in classification problems, as discussed in [21].
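As a small illustration, the power measure can be written directly from its formula. The function name and the explicit subset argument are assumptions made for this sketch; in the experiments the exponent \(q\) is learned genetically, whereas here it is simply passed as a parameter.

```python
from itertools import combinations
from typing import FrozenSet


def power_measure(subset: FrozenSet[int], n: int, q: float) -> float:
    """m_PM(X) = (|X| / n) ** q for X a subset of N = {1, ..., n}, q > 0."""
    return (len(subset) / n) ** q


def is_fuzzy_measure(n: int, q: float) -> bool:
    """Check (m1) monotonicity and (m2) boundary conditions by enumeration."""
    universe = range(1, n + 1)
    subsets = [frozenset(c) for r in range(n + 1) for c in combinations(universe, r)]
    if power_measure(frozenset(), n, q) != 0.0:
        return False
    if power_measure(frozenset(universe), n, q) != 1.0:
        return False
    return all(
        power_measure(a, n, q) <= power_measure(b, n, q)
        for a in subsets for b in subsets if a <= b
    )
```

Since \(m_{PM}\) depends only on the cardinality of \(X\), it is a symmetric measure, which is why later sketches can pass just the subset size.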
The discrete CI [10] was generalized in many forms [4, 13]. One of them, based on its expanded form, considers copulas instead of the product operation:
Definition 1
[22]. Let \(m :2^{N} \rightarrow [0,1]\) be a fuzzy measure and \(C :[0,1]^2 \rightarrow [0,1]\) be a bivariate copula. The CC-integral is defined as a function \(\mathfrak {C}_{m}^C :[0,1]^n \rightarrow [0,1]\), given, for all \({\boldsymbol{x}} \in [0,1]^n\), by
\[\mathfrak {C}_{m}^{C}(\boldsymbol{x}) = \sum _{i=1}^{n} \left( C\left( x_{(i)}, m\left( A_{(i)}\right) \right) - C\left( x_{(i-1)}, m\left( A_{(i)}\right) \right) \right) ,\]
where \(\left( x_{(1)}, \ldots , x_{(n)}\right) \) is an increasing permutation of the input \(\boldsymbol{x}\), that is, \(0 \le x_{(1)} \le \ldots \le x_{(n)}\), with the convention \(x_{(0)} = 0\), and \(A_{(i)} = \{(i), \dots , (n) \}\) is the subset of indices corresponding to the \(n-i+1\) largest components of \(\boldsymbol{x}\).
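A minimal Python sketch of the CC-integral follows, assuming the usual expanded form in which the \(i\)-th term is \(C(x_{(i)}, m(A_{(i)})) - C(x_{(i-1)}, m(A_{(i)}))\) and using the power measure as the FM, so that \(m(A_{(i)})\) depends only on the cardinality \(n-i+1\). The function names and the default exponent are illustrative assumptions.

```python
from typing import Callable, Sequence

Copula = Callable[[float, float], float]


def power_measure(size: int, n: int, q: float = 1.0) -> float:
    """Power measure of any subset with `size` elements out of n."""
    return (size / n) ** q


def cc_integral(x: Sequence[float], copula: Copula, q: float = 1.0) -> float:
    """CC-integral of x with respect to the power measure with exponent q."""
    n = len(x)
    xs = sorted(x)            # increasing permutation x_(1) <= ... <= x_(n)
    total = 0.0
    previous = 0.0            # convention x_(0) = 0
    for i, xi in enumerate(xs, start=1):
        m_i = power_measure(n - i + 1, n, q)   # m(A_(i)), |A_(i)| = n - i + 1
        total += copula(xi, m_i) - copula(previous, m_i)
        previous = xi
    return total


def t_prod(a: float, b: float) -> float:
    return a * b
```

With the product copula the sketch reduces to the standard CI; with \(q = 1\) the power measure is additive, so `cc_integral([0.2, 0.5, 0.9], t_prod, 1.0)` coincides with the arithmetic mean of the inputs.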
In another direction, the CI in its expanded form was generalized considering RDFs in the place of the subtraction operation:
Definition 2
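[29]. The generalization of the CI expanded form by RDFs \(\delta :[0,1]^2 \rightarrow [0,1]\) with respect to an FM \(m :2^N \rightarrow [0,1]\), named d-XChoquet integral (d-XC), is a mapping \(X\mathfrak {C}_{\delta ,m} :[0,1]^n \rightarrow [0,n]\), defined, for all \(\boldsymbol{x} \in [0,1]^n\), by:
\[X\mathfrak {C}_{\delta ,m}(\boldsymbol{x}) = x_{(1)} + \sum _{i=2}^{n} \delta \left( x_{(i)} \cdot m\left( A_{(i)}\right) ,\ x_{(i-1)} \cdot m\left( A_{(i)}\right) \right) ,\]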
where \( x_{(i)} \) and \(A_{(i)}\) are defined according to Definition 1.
3 d-CC Integrals
In this section, we introduce the definition of d-CC-integrals, by combining the concepts of CC-integral (Definition 1) and d-XC-integral (Definition 2). Following that, we study some aspects of d-CC-integrals that are relevant to our application in classification, such as different forms of monotonicity.
Definition 3
Let \(C :[0,1]^2 \rightarrow [0,1]\) be a copula. The generalization of the CC-integral by RDFs \(\delta :[0,1]^2 \rightarrow [0,1]\) with respect to an FM \(m :2^N \rightarrow [0,1]\), named d-CC-integral (d-CC), is a mapping \(\mathfrak {CC}_{\delta ,m}^C :[0,1]^n \rightarrow [0,n]\), defined by:
\[\mathfrak {CC}_{\delta ,m}^{C}(\boldsymbol{x}) = x_{(1)} + \sum _{i=2}^{n} \delta \left( C\left( x_{(i)}, m\left( A_{(i)}\right) \right) ,\ C\left( x_{(i-1)}, m\left( A_{(i)}\right) \right) \right) ,\]
for all \(\boldsymbol{x} \in [0,1]^n\), where the ordered \(x_{(i)}\), and \(m(A_{(i)})\), with \(0\le i\le n\), are as stated in Definition 1.
From now on, we denote \(m(A_{(i)})\) simply by \(m_{(i)}\). Observe that d-CC-integrals can also be obtained by generalizing d-XC-integrals (Definition 2), where the product is replaced by a copula, or as a restriction of \(C_{F_1F_2}\)-integrals [19], by putting \(F_1=F_2 = C\).
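The definition can be sketched in Python as follows, assuming the form \(x_{(1)} + \sum_{i=2}^{n} \delta(C(x_{(i)}, m_{(i)}), C(x_{(i-1)}, m_{(i)}))\) and the power measure as the FM. The names `dcc_integral`, `delta_abs` and `delta_root` are illustrative, and the two RDFs are examples rather than a reproduction of Table 2.

```python
import math
from typing import Callable, Sequence

Copula = Callable[[float, float], float]
RDF = Callable[[float, float], float]


def dcc_integral(x: Sequence[float], delta: RDF, copula: Copula, q: float = 1.0) -> float:
    """d-CC integral of x w.r.t. the power measure m(X) = (|X| / n) ** q."""
    n = len(x)
    xs = sorted(x)                          # x_(1) <= ... <= x_(n)
    total = xs[0]                           # first term is x_(1) itself
    for i in range(2, n + 1):
        m_i = ((n - i + 1) / n) ** q        # m_(i) = m(A_(i))
        total += delta(copula(xs[i - 1], m_i), copula(xs[i - 2], m_i))
    return total


def delta_abs(a: float, b: float) -> float:
    """|a - b|: with this RDF the d-CC integral is a CC-integral."""
    return abs(a - b)


def delta_root(a: float, b: float) -> float:
    """sqrt(|a - b|): phi1 = square root, phi2 = identity in Proposition 2."""
    return math.sqrt(abs(a - b))


def t_prod(a: float, b: float) -> float:
    return a * b
```

For `x = [0.2, 0.5, 0.9]`, the product copula and \(q = 1\), `delta_abs` gives the standard CI (here the arithmetic mean), while `delta_root` gives a value above 1, which illustrates why the codomain is \([0,n]\) rather than \([0,1]\).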
Proposition 3
Under the conditions given in Definition 3, \(\mathfrak {CC}_{\delta ,m}^C\) is well defined, for every RDF \(\delta \), FM m and copula C.
Proof
For \(i = 1, \ldots , n\), one has that \(0 \le x_{(i)} \le 1\), and for \(i = 2, \ldots , n\), we have that \( 0\le m_{(i)} \le 1\), \( 0\le C(x_{(i)},m_{(i)}) \le 1\) and \(0 \le \delta (C(x_{(i)},m_{(i)}),C(x_{(i-1)},m_{(i)})) \le 1\). It is immediate that, for \(\boldsymbol{x} \in [0,1]^n\), \(0 \le \mathfrak {CC}_{\delta ,m}^C (\boldsymbol{x}) \le n\), for any RDF \(\delta \), FM m and copula C. Consider an input vector \(\boldsymbol{x} \in [0,1]^n\), for which there may be different increasing permutations, meaning that \(\boldsymbol{x} \) has repeated elements. For the sake of simplicity, but without loss of generality, consider that there exist \(r,s \in \{1, \ldots , n\}\) such that \(x_r = x_s = z \in [0,1]\) and, for all \(i \in \{1, \ldots , n\}\), with \(i \ne r,s\), it holds that \(x_i \ne x_r, x_s\). The only two possible increasing permutations are:
Denote by \(m_{(i)}^{(1)}= m^{(1)}(A_{(i)})\) and \(m_{(i)}^{(2)}=m^{(2)}(A_{(i)})\), with \(i \in \{1, \ldots , n\}\), the fuzzy measure values of the subsets \(A_{(i)}\) of indices corresponding to the \(n - i + 1\) largest components of \(\boldsymbol{x}\) with respect to the permutations (4) and (5), respectively. Observe that
hold, meaning it may be the case that \( m_{(k)}^{(1)} \ne m_{(k)}^{(2)}\). Now, denote by \(\mathfrak {CC}_{\delta , m}^C{(1)}\) and \(\mathfrak {CC}_{\delta , m}^C{(2)}\) the d-CC integrals with respect to the permutations (4) and (5), respectively, and suppose that
From Eqs. (6) and (7), whenever \(k \ne 1\), it follows that:
which is in contradiction to Eq. (8). Similarly, there is also a contradiction for \(k = 1\). The result can be easily generalized for any subsets of repeated elements in the input \(\boldsymbol{x}\). The conclusion is that for any different increasing permutations of the same input \(\boldsymbol{x}\), one always obtains the same output value of \(\mathfrak {CC}_{\delta , m}^C (\boldsymbol{x})\).
Remark 1
Note that, in the definition of \(\mathfrak {CC}_{\delta ,m}^C \), the first term of the sum is just \(x_{(1)}\) instead of \(\delta \left( C(x_{(1)}, m_{(1)}),\ C(x_{(0)},m_{(1)} )\right) \). According to [19, 29], this can be used to avoid a discrepant behavior of non-averaging functions (see Note 1) in the initial phase of the aggregation process.
Example 1
It is immediate that any choices of RDF \(\delta \) from Table 2 and copula C from Table 1 can be combined to obtain an example of d-CC-integral \(\mathfrak {CC}_{\delta , m}^C\).
In the following, we first study some general properties and then monotonicity-like properties, for the RDFs of Table 2, any fuzzy measure m and copula C.
3.1 Some Important Properties for Aggregation-Like Processes
The next proposition shows that whenever the adopted RDF is constructed using Proposition 2, then, for some choices of the automorphism \(\varphi _2\), the result obtained is bounded above in terms of the dissimilarities (by \(\delta _0\)) of the inputs:
Proposition 4
Let \(\delta \) be an RDF constructed by Proposition 2, for any automorphism \(\varphi _1\) and the identity as \(\varphi _2\). Let \(\mathfrak {CC}_{\delta , m}^C :[0, 1]^n \rightarrow [0,1]\) be the derived d-CC integral for any fuzzy measure \(m\) and copula C. Then, for all \(\boldsymbol{x} \in [0,1]^n\), it holds that:
Proof
Let \(\delta \) be an RDF constructed by Proposition 2. Then, since \(\varphi _1\) is an automorphism, for any \(\boldsymbol{x} \in [0,1]^n\), FM \(m\) and copula C, we have that:
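One way to obtain a bound of this kind is sketched below. It is a derivation under stated assumptions, not necessarily the form in which the bound is stated in the proposition: it assumes \(\delta _0(x,y) = |x-y|\), takes \(\varphi _2\) as the identity, and uses only the Lipschitz property of copulas (Proposition 1(ii)) together with the fact that \(\varphi _1\) is increasing.

```latex
\begin{align*}
\mathfrak{CC}_{\delta,m}^{C}(\boldsymbol{x})
  &= x_{(1)} + \sum_{i=2}^{n}
     \varphi_1\left( \left| C\left(x_{(i)}, m_{(i)}\right)
                          - C\left(x_{(i-1)}, m_{(i)}\right) \right| \right) \\
  &\le x_{(1)} + \sum_{i=2}^{n}
     \varphi_1\left( \left| x_{(i)} - x_{(i-1)} \right|
                   + \left| m_{(i)} - m_{(i)} \right| \right) \\
  &= x_{(1)} + \sum_{i=2}^{n}
     \varphi_1\left( \delta_0\left(x_{(i)}, x_{(i-1)}\right) \right).
\end{align*}
```

The inequality holds term by term, since both arguments of each copula difference share the same second coordinate \(m_{(i)}\).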
The following three results are immediate:
Proposition 5
For any RDF \(\delta \), FM m and copula C, \(\mathfrak {CC}_{\delta ,m}^C (\boldsymbol{x}) \ge \min (\boldsymbol{x})\), for all \(\boldsymbol{x} \in [0,1]^n\).
Proposition 6
\(\mathfrak {CC}_{\delta ,m}^C(\boldsymbol{x}) \le \max (\boldsymbol{x})\) if and only if, for all \(0 \le a_1 \le \ldots \le a_n \) and FM \(m\), the RDF \(\delta \) satisfies: \( \sum _{i=2}^n \delta (C(a_i, m_{i}),\ C(a_{i-1}, m_{i})) \le a_n - a_1, \) where \(m_i = m(A_{i})\), for \(A_i = \{i, \ldots , n\}\).
Corollary 1
\(\mathfrak {CC}_{\delta ,m}^C\) is averaging if and only if it satisfies the condition of Proposition 6.
Proposition 7
For any RDF \(\delta \), FM m and copula C, \(\mathfrak {CC}_{\delta ,m}^C\) is idempotent.
Proof
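If \(\boldsymbol{x} = (x, \ldots , x)\), then, by (d3), one has that:
\[\mathfrak {CC}_{\delta ,m}^{C}(\boldsymbol{x}) = x + \sum _{i=2}^{n} \delta \left( C(x, m_{(i)}),\ C(x, m_{(i)})\right) = x + 0 = x.\]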
Since the range of the d-CC integral is [0, n], we avoid the term “boundary condition” when referring to condition (A2) in this context. Instead, we simply call it the 0, 1-condition. The same term was also adopted in [29].
From Proposition 7, the following result is immediate:
Corollary 2
The d-CC integral satisfies the 0, 1-condition for any m, \(\delta \) and C.
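The properties above can be sanity-checked numerically. The sketch below is an illustration under assumptions: it uses the power measure, the Łukasiewicz and product copulas, and the RDF \(\sqrt{|x-y|}\) as one example of Proposition 2; none of these choices is claimed to be the configuration used in the experiments, and random sampling can only support, not prove, the statements.

```python
import math
import random
from typing import Callable, Sequence


def dcc_integral(x: Sequence[float],
                 delta: Callable[[float, float], float],
                 copula: Callable[[float, float], float],
                 q: float = 1.0) -> float:
    """d-CC integral w.r.t. the power measure with exponent q."""
    n = len(x)
    xs = sorted(x)
    total = xs[0]
    for i in range(2, n + 1):
        m_i = ((n - i + 1) / n) ** q
        total += delta(copula(xs[i - 1], m_i), copula(xs[i - 2], m_i))
    return total


def delta_root(a: float, b: float) -> float:
    return math.sqrt(abs(a - b))


def t_luk(a: float, b: float) -> float:
    return max(0.0, a + b - 1.0)


def t_prod(a: float, b: float) -> float:
    return a * b


rng = random.Random(0)
samples = [[rng.random() for _ in range(4)] for _ in range(500)]

# Proposition 5: the output never falls below the minimum input.
lower_bound_ok = all(dcc_integral(x, delta_root, t_luk, 2.0) >= min(x) for x in samples)

# Proposition 7 and Corollary 2: idempotency, hence the 0, 1-condition.
idempotent_ok = all(
    dcc_integral([v] * 4, delta_root, t_luk, 2.0) == v for v in (0.0, 0.3, 1.0)
)

# The upper bound max(x) of Proposition 6 can fail (non-averaging behavior).
exceeds_max = dcc_integral([0.2, 0.5, 0.9], delta_root, t_prod, 1.0) > 0.9
```

The last check shows a non-averaging instance: the RDF \(\sqrt{|x-y|}\) inflates small differences, so the condition of Proposition 6 does not hold for it.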
3.2 Monotonicity-Like Properties of d-CC Integrals
As explained in [13], one important property of aggregation-like operators is to present some kind of “increasingness property” to guarantee that the more information is provided the higher the aggregated value is in the considered direction. From the discussions in [29], it is immediate that d-CC integrals cannot be fully monotonic, in general, since full monotonicity would require them to satisfy the following conditions: (i) For \(z_1, z_2, z_3, z_4 \in [0,1]\), with \(z_1 \le z_2 \le z_3 \le z_4\), \(w_1, w_2 \in [0,1]\), with \(w_1 \ge w_2\), we have: \(\delta (C(z_1, w_1), C(z_3, w_1)) + \delta (C(z_3, w_2), C(z_4, w_2)) \ge \delta (C(z_1,w_1), C(z_2, w_1))\) \(+ \delta (C(z_2, w_2), C(z_4, w_2))\); (ii) For all \(z_1, z_2, z_3 \in [0, 1]\), with \(z_1 \le z_2 \le z_3\), \(w \in [0,1]\), we have: \(z_2 + \delta (C(z_3, w) , C(z_2, w)) \ge z_1 + \delta (C(z_3, w), C(z_1, w)).\)
Considering the RDFs of Table 2, for any copula C and FM m, only \(\mathfrak {CC}_{\delta _0,m}^C\) is increasing, since it satisfies both conditions. Observe that \(\mathfrak {CC}_{\delta _0,m}^C\) is, in fact, a CC-integral [22], which is fully monotonic, for any copula C. When C is the product, both coincide with the d-XC integral (Definition 2) for \(\delta _0\), which is the standard CI in the expanded form. Nevertheless, d-CC integrals do satisfy weaker forms of monotonicity, as we discuss in this section.
Directional Monotonicity of d-CC Integrals. Directional monotonicity is one of the most adopted weaker notions of monotonicity, which enlarged the scope of aggregation processes in applications (e.g., [13, 18, 24, 25]).
Definition 4
[5]. Let \(\boldsymbol{r} = (r_1, \dots , r_n)\) be a real \(n\)-dimensional vector such that \(\boldsymbol{r} \ne \boldsymbol{0} = (0, \ldots , 0)\). A function \(F :[0,1]^n \rightarrow [0,1]\) is said to be \(\boldsymbol{r}\)-increasing if, for all \(\boldsymbol{x}= (x_1,\dots , x_n) \in [0,1]^n\) and \(c > 0\) such that \(\boldsymbol{x} + c\boldsymbol{r} = (x_1 + c r_1, \dots , x_n + c r_n) \in [0,1]^n\), it holds that \(F(\boldsymbol{x} + c\boldsymbol{r}) \ge F(\boldsymbol{x})\).
Theorem 1
Let \(m :2^N \rightarrow [0,1]\), \(\delta :[0,1]^2 \rightarrow [0,1]\) and \(C :[0,1]^2 \rightarrow [0,1]\) be an FM, an RDF, and a copula, respectively. \(\mathfrak {CC}_{\delta ,m}^C\) is \(\boldsymbol{1}\)-increasing if and only if one of the following conditions holds: (i) the RDF \(\delta \) is \(\boldsymbol{1}\)-increasing; (ii) for all \(0 \le z_1 \le \ldots \le z_n \le 1\) and \(c>0\), such that \( z_i + c \in [0,1]\), for all \(i = 1, \ldots , n\):
\[\sum _{i=2}^{n} \delta \left( C(z_i + c, m_i),\ C(z_{i-1} + c, m_i)\right) \ge \sum _{i=2}^{n} \delta \left( C(z_i, m_i),\ C(z_{i-1}, m_i)\right) - c,\]
where \(m_i = m(A_{i})\), for \(A_i = \{i, \ldots , n\}\).
Proof
(\(\Leftarrow \))(i) Suppose that \(\delta \) is \(\boldsymbol{1}\)-increasing and let \(\boldsymbol{c} = (c, \ldots , c)\), \(c>0\), such that \(\boldsymbol{x}, \boldsymbol{x} + \boldsymbol{c}\in [0,1]^n\). Then:
Now suppose that (ii) holds. Then, for all \( \boldsymbol{x} \in [0,1]^n\) and \(\boldsymbol{c} = (c, \ldots , c)\), with \(c > 0\), such that \(x_i + c \in [0,1]\), \(\forall i = 1, \ldots , n\), it follows that \(\sum _{i=2}^{n} \delta (C(x_{(i)} + c, m_{(i)}), C(x_{(i-1)} + c, m_{(i)})) \ge \sum _{i=2}^{n}\delta (C(x_{(i)}, m_{(i)}), C( x_{(i-1)}, m_{(i)})) - c\). This implies that \((x_{(1)} + c) + \sum _{i=2}^{n} \delta (C(x_{(i)} + c, m_{(i)}), C(x_{(i-1)} + c, m_{(i)})) \ge x_{(1)} + \sum _{i=2}^{n}\delta (C(x_{(i)}, m_{(i)}), C( x_{(i-1)}, m_{(i)}))\). Then, \(\mathfrak {CC}_{\delta ,m}^C(\boldsymbol{x} + \boldsymbol{c}) \ge \mathfrak {CC}_{\delta ,m}^C(\boldsymbol{x})\). Therefore, if (i) or (ii) holds, then \(\mathfrak {CC}_{\delta ,m}^C\) is \(\boldsymbol{1}\)-increasing.
(\(\Rightarrow \)) Suppose that \(\mathfrak {CC}_{\delta ,m}^C\) is \(\boldsymbol{1}\)-increasing, that is \(\mathfrak {CC}_{\delta ,m}^C(\boldsymbol{x} + \boldsymbol{c}) \ge \mathfrak {CC}^C_{\delta ,m}(\boldsymbol{x})\). Then it follows that \((x_{(1)} + c) + \sum _{i=2}^{n} \delta ( C(x_{(i)} + c, m_{(i)}), C (x_{(i-1)} + c, m_{(i)})) \) \(\ge x_{(1)} + \sum _{i=2}^{n}\delta ( C(x_{(i)},m_{(i)}), C(x_{(i-1)}, m_{(i)}))\) which implies that condition (ii) holds.
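Directional monotonicity in the sense of Definition 4 can be probed empirically. The following sketch samples points \(\boldsymbol{x}\) and steps \(c > 0\) with \(\boldsymbol{x} + c\boldsymbol{r} \in [0,1]^n\) and looks for a violation of \(F(\boldsymbol{x} + c\boldsymbol{r}) \ge F(\boldsymbol{x})\). The helper names, the sampling scheme and the tolerance are assumptions; a passing run is evidence, not a proof.

```python
import random
from typing import Callable, Sequence


def dcc_integral(x: Sequence[float],
                 delta: Callable[[float, float], float],
                 copula: Callable[[float, float], float],
                 q: float = 1.0) -> float:
    """d-CC integral w.r.t. the power measure with exponent q."""
    n = len(x)
    xs = sorted(x)
    total = xs[0]
    for i in range(2, n + 1):
        m_i = ((n - i + 1) / n) ** q
        total += delta(copula(xs[i - 1], m_i), copula(xs[i - 2], m_i))
    return total


def is_r_increasing(f: Callable[[Sequence[float]], float],
                    r: Sequence[float],
                    trials: int = 2000,
                    seed: int = 0,
                    tol: float = 1e-12) -> bool:
    """Return False as soon as a sampled pair violates r-increasingness."""
    rng = random.Random(seed)
    n = len(r)
    for _ in range(trials):
        x = [rng.random() for _ in range(n)]
        c = rng.random()
        shifted = [xi + c * ri for xi, ri in zip(x, r)]
        # Keep only admissible steps: c > 0 and x + c*r inside [0,1]^n.
        if c <= 0.0 or any(v < 0.0 or v > 1.0 for v in shifted):
            continue
        if f(shifted) < f(x) - tol:
            return False
    return True


def choquet_like(x: Sequence[float]) -> float:
    """d-CC integral with |a - b| and the product copula (a standard CI)."""
    return dcc_integral(x, lambda a, b: abs(a - b), lambda a, b: a * b, 1.0)
```

For example, `is_r_increasing(choquet_like, (1.0, 1.0, 1.0))` finds no violation, as expected for a CC-integral, whereas a decreasing function such as `lambda x: -sum(x)` is rejected.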
Ordered Directional Monotonicity of d-CC Integrals. Any d-CC integral is ordered directionally monotonic [9]. Such functions are monotonic along different directions according to the ordinal size of the coordinates of each input.
Definition 5
[9]. Consider a function \(F :[0, 1]^n \rightarrow [0, 1]\) and let \(\boldsymbol{r}=(r_1,\dots ,r_n)\) be a real n-dimensional vector, \(\boldsymbol{r} \ne \boldsymbol{0}\). F is said to be ordered directionally (OD) \(\boldsymbol{r}\)-increasing if, for each \(\boldsymbol{x} \in [0, 1]^n\), any permutation \(\sigma : \{1, \ldots , n\} \rightarrow \{1, \ldots , n\}\) with \(x_{\sigma (1)} \ge \ldots \ge x_{\sigma (n)}\), and \(c>0\) such that \(1 \ge x_{\sigma (1)} + cr_1 \ge \ldots \ge x_{\sigma (n)} + cr_n\), it holds that \(F(\boldsymbol{x} + c \boldsymbol{r}_{\sigma ^{-1}} ) \ge F(\boldsymbol{x})\), where \(\boldsymbol{r}_{\sigma ^{-1}} = (r_{\sigma ^{-1}(1)}, \ldots , r_{\sigma ^{-1}(n)})\).
Theorem 2
For any FM m, RDF \(\delta \), copula C and \(k>0\), the d-CC integral \(\mathfrak {CC}_{\delta ,m}^C\) is an (OD) \((k,0, \ldots , 0)\)-increasing function.
Proof
For all \(\boldsymbol{x} \in [0, 1]^n\) and permutation \(\sigma :\{1, \ldots , n\} \rightarrow \{1, \ldots , n\}\), with \(x_{\sigma (1)} \ge \ldots \ge x_{\sigma (n)}\), and \(c>0\) s.t. \(x_{\sigma (i)} + cr_i \in [0,1]\), for \(i \in \{1, \ldots , n\}\), and \(1 \ge x_{\sigma (1)} + cr_1 {\ge } {\ldots } {\ge } x_{\sigma (n)} + cr_n\), for \(\boldsymbol{r}_{\sigma ^{-1}} = (r_{\sigma ^{-1}(1)}, {\ldots }, r_{\sigma ^{-1}(n)})\), one has that:
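The statement of Theorem 2 can also be explored numerically: for \(\boldsymbol{r} = (k, 0, \ldots , 0)\), Definition 5 amounts to increasing only the largest input while keeping it in \([0,1]\). The sketch below is a sampling-based check under assumptions (power measure, minimum copula, the example RDF \(\sqrt{|x-y|}\), hypothetical helper names); it does not replace the proof.

```python
import math
import random
from typing import Callable, Sequence


def dcc_integral(x: Sequence[float],
                 delta: Callable[[float, float], float],
                 copula: Callable[[float, float], float],
                 q: float = 1.0) -> float:
    """d-CC integral w.r.t. the power measure with exponent q."""
    n = len(x)
    xs = sorted(x)
    total = xs[0]
    for i in range(2, n + 1):
        m_i = ((n - i + 1) / n) ** q
        total += delta(copula(xs[i - 1], m_i), copula(xs[i - 2], m_i))
    return total


def od_increasing_in_largest(f: Callable[[Sequence[float]], float],
                             n: int,
                             k: float = 1.0,
                             trials: int = 2000,
                             seed: int = 0,
                             tol: float = 1e-12) -> bool:
    """Sampled check of OD (k, 0, ..., 0)-increasingness."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.random() for _ in range(n)]
        top = max(range(n), key=x.__getitem__)      # position of the largest input
        c = rng.uniform(0.0, (1.0 - x[top]) / k)    # keeps x[top] + c*k <= 1
        if c <= 0.0:
            continue
        y = list(x)
        y[top] = min(1.0, x[top] + c * k)           # only the largest input moves
        if f(y) < f(x) - tol:
            return False
    return True


def example_dcc(x: Sequence[float]) -> float:
    return dcc_integral(x, lambda a, b: math.sqrt(abs(a - b)), min, 0.5)
```

Only the last term of the sum depends on the largest input, and it cannot decrease by (d4) and the monotonicity of copulas, which is consistent with the check finding no violation for `example_dcc`.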
4 Application of d-CC Integrals in Classification
A classification problem consists of P training examples \(\vec {x_p} = (x_{p1},\ldots ,x_{pn})\), \(p \in \{1,\ldots ,P\}\) where \(x_{pi}\) is the value of the i-th variable of the p-th example and each example belongs to one of M classes in \(C {=} \{C_1, \ldots , C_M\}\). The goal of the learned classifier is to identify the class of new/unknown examples.
A FRBCS [17] is a type of classification system that is based on rules with linguistic labels, modeled by fuzzy sets. The inference process, known as the Fuzzy Reasoning Method (FRM), is determined by four sequential steps:
(1) Matching degree (\({A_j}\)) - It measures, for the example \(x_p\) to be classified, the strength of the IF-part of a rule \(R_j\). \({A_j}\) is calculated by a fuzzy conjunction operator \(\mathfrak {c}\), as follows: \({A_j}(x_p) = \mathfrak {c}(A_{j1}(x_{p1}),\cdots , A_{jn}(x_{pn}))\).
(2) Association degree (\(b_j^k\)) - It weights the matching degree \({A_j}\) by the rule weight \(RW_j\), through a product operation: \(b_j^k(x_p) = {A_j}(x_p) \cdot RW_j\).
(3) Example classification soundness degree for all classes (\(Y_k\)) - For each class \(C_k \in C\), we aggregate all the positive association degrees \(b_j^k\) that were obtained in the previous step with respect to \(C_k\), through an aggregation function A: \(Y_k = A(b_j^k)\), with \(j = 1, \ldots , L\), and \(b_j^k>0\).
(4) Classification - The final decision is made in this step. For that, a function \(F: [0,1]^M \rightarrow C\) is applied over all example classification soundness degrees calculated in the previous step: \(F(Y_1,\ldots ,Y_M)= \arg \max \limits _{k=1,\ldots ,M}(Y_k)\).
In this paper, we will apply d-CC integrals as the aggregation operator A in the third step of the FRM to analyze their effect on the classification process.
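The four steps of the FRM, with a d-CC integral in the third step, can be sketched as follows. Everything specific in this sketch is an assumption made for illustration: the `Rule` structure, the triangular membership functions, the product as conjunction operator, the rule weights and the toy rule base. It is not the FARC-HD implementation used in the experiments.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Rule:
    antecedents: List[Callable[[float], float]]  # membership functions A_j1..A_jn
    label: int                                   # index k of the rule's class
    weight: float                                # rule weight RW_j


def triangular(a: float, b: float, c: float) -> Callable[[float], float]:
    """Triangular membership function with support (a, c) and peak at b."""
    def mu(v: float) -> float:
        if v <= a or v >= c:
            return 0.0
        return (v - a) / (b - a) if v <= b else (c - v) / (c - b)
    return mu


def dcc_integral(x: Sequence[float], delta, copula, q: float = 1.0) -> float:
    """d-CC integral w.r.t. the power measure with exponent q."""
    n = len(x)
    xs = sorted(x)
    total = xs[0]
    for i in range(2, n + 1):
        m_i = ((n - i + 1) / n) ** q
        total += delta(copula(xs[i - 1], m_i), copula(xs[i - 2], m_i))
    return total


def classify(x: Sequence[float], rules: List[Rule], n_classes: int,
             delta, copula, q: float = 1.0) -> Tuple[int, List[float]]:
    scores = []
    for k in range(n_classes):
        degrees = []
        for rule in rules:
            if rule.label != k:
                continue
            # Step 1: matching degree (product as conjunction, an assumption).
            match = math.prod(mu(v) for mu, v in zip(rule.antecedents, x))
            # Step 2: association degree.
            b = match * rule.weight
            if b > 0:
                degrees.append(b)
        # Step 3: soundness degree via the d-CC integral of positive degrees.
        scores.append(dcc_integral(degrees, delta, copula, q) if degrees else 0.0)
    # Step 4: the class with the largest soundness degree wins.
    return max(range(n_classes), key=scores.__getitem__), scores


low, medium, high = triangular(-0.5, 0.0, 0.5), triangular(0.0, 0.5, 1.0), triangular(0.5, 1.0, 1.5)
toy_rules = [Rule([low, low], 0, 0.9), Rule([high, high], 1, 0.8), Rule([medium, high], 1, 0.5)]
```

Since the number of fired rules differs per class and per example, the measure is rebuilt for each aggregation with \(n\) equal to the number of positive association degrees.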
4.1 Experimental Framework
The application of our new family of functions is based on a benchmark composed of 33 different public datasets found in the KEEL [2] dataset repository. It is important to mention that this benchmark has been used for all the most important generalizations of the CI discussed before. The selection of the same datasets and partitions allows us to directly compare the methods and consequently provide a more complete analysis. In Table 3, we provide the information of the datasets along with their identification (ID), Number of instances (#NoI), Number of Attributes (#NoA), and the Total of Classes (#ToC).
The experimental framework follows the same setting as other works in the literature (see [19, 20, 23]), i.e., a 5-fold cross-validation approach. Consequently, the results provided in Sect. 5 are the Accuracy Mean (AM) over these folds. Moreover, following the same approach seen in [22], the hyperparameters used by the model are the same as those originally used by FARC-HD [1].
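As a rough illustration of how an Accuracy Mean over 5 folds is computed, consider the sketch below. It is an assumption-laden simplification: the folds are drawn at random here, whereas the study relies on the predefined benchmark partitions, and `fit` and `predict` are hypothetical callables standing in for the learning algorithm.

```python
import random
from statistics import mean
from typing import Any, Callable, List, Sequence


def k_fold_indices(n: int, k: int = 5, seed: int = 0) -> List[List[int]]:
    """Split range(n) into k disjoint folds after a seeded shuffle."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def accuracy_mean(X: Sequence[Any], y: Sequence[Any],
                  fit: Callable[[list, list], Any],
                  predict: Callable[[Any, Any], Any],
                  k: int = 5, seed: int = 0) -> float:
    """Average test accuracy over k folds (the AM reported per dataset)."""
    accuracies = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        hits = sum(predict(model, X[i]) == y[i] for i in test_idx)
        accuracies.append(hits / len(test_idx))
    return mean(accuracies)


def fit_majority(X_train: list, y_train: list) -> Any:
    """Toy learner: remember the most frequent training label."""
    return max(set(y_train), key=y_train.count)


def predict_majority(model: Any, x: Any) -> Any:
    return model
```

The majority-class learner is only a placeholder that makes the sketch runnable; any classifier exposing the same two callables could be plugged in.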
5 Analysis of the Experimental Results
This section analyzes the application of d-CC integrals. First, we present the performance of the d-CC integrals over the datasets. Then, the best function is identified and compared against other operators found in the literature, in order to assess the efficiency of the best-performing method.
5.1 The Performance of the d-CC Integrals in Classification
This section provides the results obtained by the application of the d-CC integrals in the FRM. To ease comprehension, in Table 4, we present the combination of the RDF and d-CC integral per dataset that led to the largest accuracy. Moreover, we show in Table 5 the mean accuracy of our approach over the 33 different datasets (see Note 4). In order to provide a better analysis, we also highlight, per RDF, the largest mean in boldface.
The combination of the RDF \(\delta _5\) and copula \(O_{mM}\) achieves the largest AM in the study. Moreover, considering \(\delta _5\), for more than half of the considered d-CC integrals, the AM exceeds 80%. The RDF \(\delta _2\) is the next one to achieve satisfactory performance. In fact, only for two copulas, \(T_M\) and \(T_L\), does this RDF present an AM below 80%. Observe that \(\delta _5\) and \(\delta _2\) also performed well in [27, 28]. Additionally, both \(\delta _0\) and \(\delta _3\) presented similar AMs, around 79%. On the other hand, the remaining cases (\(\delta _1\) and \(\delta _4\)) presented the worst scenarios in this study, with AMs around 78%.
Up to this point, the superiority of \(\delta _2\) and \(\delta _5\) among the RDFs can be perceived. However, to complete the analysis, we conducted a set of statistical tests to reinforce this conclusion. Since the conditions necessary to perform parametric tests are not fulfilled, we considered non-parametric statistical tests. Hence, the analysis was made by a group comparison using the aligned Friedman rank test [14], and the results were analyzed in terms of the adjusted p-value (APV) computed by Holm’s post hoc test [15].
The analysis was done by performing a statistical test per RDF, with a local statistical analysis among the different functions. After that, we chose the control variables of each group, the ones having the lowest obtained rank, and once again we statistically compared them. This approach allowed us to point out the best d-CC integral, used to compare against other aggregation operators.
The results, in Table 6, sort the methods by ranks per RDF. The last column shows a comparison among the selected control variables. If the obtained APV is smaller than 0.1 (10% of significance level), it means the methods are statistically different and, so, we underline them. Among the groups of RDFs, in general, the number of statistical differences is low. The test rejected the null hypothesis when comparing the control variable against three different approaches for \(\delta _4\), two for \(\delta _0\) and \(\delta _2\), one for \(\delta _5\) and none for \(\delta _1\) and \(\delta _2\).
The last column performs a group comparison among the control variables of each family based on an RDF. See that the d-CC integral with \(\delta _5\) and \(O_{mM}\), when compared against almost all approaches, is statistically superior. But this does not hold when \(\delta _2\) and \(O_{\alpha }\) are considered. Finally, \(\delta _5-O_{mM}\) provides the winning combination in this first analysis and so, it is the representative method of the class of d-CC integrals.
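For readers unfamiliar with the post hoc step, the following sketch computes Holm-adjusted p-values from a list of unadjusted p-values, using the step-down rule in which the \(j\)-th smallest of \(m\) p-values is multiplied by \(m - j + 1\) and monotonicity is enforced. The function name is an assumption, the aligned Friedman ranking itself is not implemented here, and any p-values fed to it in examples are hypothetical rather than results from this study.

```python
from typing import List, Sequence


def holm_adjusted(p_values: Sequence[float]) -> List[float]:
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])   # ascending p-values
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[i])
        running_max = max(running_max, candidate)          # keep APVs non-decreasing
        adjusted[i] = running_max
    return adjusted


def significantly_different(p_values: Sequence[float], alpha: float = 0.1) -> List[bool]:
    """Flag comparisons whose APV is below the significance level."""
    return [apv < alpha for apv in holm_adjusted(p_values)]
```

With the 10% significance level adopted in the text, a comparison is marked as statistically different exactly when its APV falls below 0.1.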
5.2 Comparing the Best d-CC Integral Against Classical Operators
This subsection states the position of the d-CC integrals among the generalizations of the CI as well as known FRMs found in the literature. We have selected averaging and non-averaging operators:
- Averaging operators: [WR] (Winning Rule [11], based on the maximum as aggregation function), [Cho] (CI), [CC] (the best CC-integral [22]), [\(C_T\)] (the best \(C_T\)-integral, a CI generalization by t-norms [20]), [\(C^F_{AVG}\)] (the best averaging \(C_F\)-integral [23]), [d-CF] (the best averaging d-CF integral [28]).
- Non-averaging functions: [AC] (the Additive Combination [12], based on the normalized sum, FARC-HD algorithm basis), [PS] (the Probabilistic Sum [12]), [\(C_{F1F2}\)] (the \(C_{F1F2}\)-integral [19], being the state-of-the-art), [\(C^F_{N-AVG}\)] (the best non-averaging \(C_F\)-integral [23]).
The obtained results are in Table 7, detailed per dataset. We highlight in boldface the largest obtained AM among all results.
We observed that the \(C_{F1F2}\)-integral is the approach having the largest AM in the study. However, the d-CC integral (our new approach) is the second one, followed by the \(C_F\) and the d-CF integrals. It is also noticeable that the d-CC integral achieves a higher mean than the classical FRMs of AC and PS.
Considering the methods that present averaging characteristics and their obtained AM, it is easy to see that their performance is, in general, inferior to the non-averaging ones. The accuracy in these cases is around 79% which is inferior to the best d-CC integral, with the exception of the d-CF integral.
Once again, we performed a set of analyses to statistically compare the approaches. Therefore, we have compared the best d-CC integral (based on \(O_{mM}\) and \(\delta _5\)) against the averaging and non-averaging approaches. The results are presented in Table 8, following the same structure used in the previous analysis.
Regarding the results obtained when comparing the d-CC integral against the averaging approaches, we see that it is statistically superior to all other approaches, with the exception of the d-CF integral, considered equivalent. It is noteworthy that the CC-integral is among the surpassed methods, pointing out that its generalization by RDFs resulted in better performances. If the non-averaging approaches are considered, it is noticeable that our new method is the only one that is able to be considered statistically equivalent to the state-of-the-art. For all the other cases, the \(C_{F1F2}\)-integral is still superior.
6 Conclusions
New generalizations of the discrete CI have been developed in recent years, by generalizing the product and/or the difference operations from the original expression or its expanded form. The study of those generalizations has proven to be worthwhile since they can provide promising results in applications. In this paper, we continue this trend by focusing on the generalization of the CC-integral by means of RDFs, calling them d-CC-integrals. Then, we studied some relevant properties for application purposes.
The final part of our work applied the d-CC-integrals in a classification problem, followed by a two-step analysis. First, we compared different d-CC-integrals and identified that the one based on the copula \(O_{mM}\) and the RDF \(\delta _5\) had the best overall performance. Then, we compared this particular d-CC-integral with other methods from the literature, showing that it is statistically equivalent to the state-of-the-art \(C_{F1F2}\)-integral and surpasses most of the other considered approaches, including the CC-integral.
Notes
1. A function F is said to be averaging if \(\min \le F \le \max \).
4. To analyze the particular cases and results per fold, please check the following link: https://github.com/Giancarlo-Lucca/d-CC-Integrals-generalizing-CC-integrals-by-restricted-dissimilarity-functions.
References
1. Alcala-Fdez, J., Alcala, R., Herrera, F.: A fuzzy association rule-based classification model for high-dimensional problems with genetic rule selection and lateral tuning. IEEE Trans. Fuzzy Syst. 19(5), 857–872 (2011)
2. Alcalá-Fdez, J., et al.: KEEL: a software tool to assess evolutionary algorithms for data mining problems. Soft. Comput. 13(3), 307–318 (2009)
3. Alsina, C., Frank, M.J., Schweizer, B.: Associative Functions: Triangular Norms and Copulas. World Scientific Publishing Company, Singapore (2006)
4. Boczek, M., Kaluszka, M.: On the extended Choquet-Sugeno-like operator. Int. J. Approx. Reason. 154, 48–55 (2023)
5. Bustince, H., Fernandez, J., Kolesárová, A., Mesiar, R.: Directional monotonicity of fusion functions. Eur. J. Oper. Res. 244(1), 300–308 (2015)
6. Bustince, H., Fernandez, J., Mesiar, R., Montero, J., Orduna, R.: Overlap functions. Nonlinear Anal. Theory Methods App. 72(3–4), 1488–1499 (2010)
7. Bustince, H., Jurio, A., Pradera, A., Mesiar, R., Beliakov, G.: Generalization of the weighted voting method using penalty functions constructed via faithful restricted dissimilarity functions. Eur. J. Oper. Res. 225(3), 472–478 (2013)
8. Bustince, H., et al.: d-Choquet integrals: Choquet integrals based on dissimilarities. Fuzzy Sets Syst. (2020)
9. Bustince, H., et al.: Ordered directionally monotone functions: justification and application. IEEE Trans. Fuzzy Syst. 26(4), 2237–2250 (2018)
10. Choquet, G.: Theory of capacities. Institut Fourier 5, 131–295 (1953–1954)
11. Cordón, O., del Jesus, M.J., Herrera, F.: A proposal on reasoning methods in fuzzy rule-based classification systems. Int. J. Approx. Reason. 20(1), 21–45 (1999)
12. Cordon, O., del Jesus, M.J., Herrera, F.: Analyzing the reasoning mechanisms in fuzzy rule based classification systems. Math. Soft Comput. 5(2–3), 321–332 (1998)
13. Dimuro, G.P., et al.: The state-of-art of the generalizations of the Choquet integral: from aggregation and pre-aggregation to ordered directionally monotone functions. Inf. Fusion 57, 27–43 (2020)
14. Hodges, J.L., Lehmann, E.L.: Ranks methods for combination of independent experiments in analysis of variance. Ann. Math. Stat. 33, 482–497 (1962)
15. Holm, S.: A simple sequentially rejective multiple test procedure. Scand. J. Stat. 6, 65–70 (1979)
16. Ishibuchi, H., Nakashima, T.: Effect of rule weights in fuzzy rule-based classification systems. Fuzzy Syst. IEEE Trans. 9(4), 506–515 (2001)
17. Ishibuchi, H., Nakashima, T., Nii, M.: Classification and Modeling with Linguistic Information Granules, Advanced Approaches to Linguistic Data Mining. Advanced Information Processing. Springer, Berlin, Heidelberg (2005). https://doi.org/10.1007/b138232
18. Ko, L., et al.: Multimodal fuzzy fusion for enhancing the motor-imagery-based brain computer interface. IEEE Comput. Intell. Mag. 14(1), 96–106 (2019)
19. Lucca, G., Dimuro, G.P., Fernandez, J., Bustince, H., Bedregal, B., Sanz, J.A.: Improving the performance of fuzzy rule-based classification systems based on a nonaveraging generalization of CC-integrals named \(C_{F_1F_2}\)-integrals. IEEE Trans. Fuzzy Syst. 27(1), 124–134 (2019)
20. Lucca, G., Sanz, J., Pereira Dimuro, G., Bedregal, B., Mesiar, R., Kolesárová, A., Bustince Sola, H.: Pre-aggregation functions: construction and an application. IEEE Trans. Fuzzy Syst. 24(2), 260–272 (2016)
21. Lucca, G., Sanz, J.A., Dimuro, G.P., Borges, E.N., Santos, H., Bustince, H.: Analyzing the performance of different fuzzy measures with generalizations of the Choquet integral in classification problems. In: 2019 FUZZ-IEEE, pp. 1–6 (2019)
22. Lucca, G., et al.: CC-integrals: Choquet-like copula-based aggregation functions and its application in fuzzy rule-based classification systems. KBS 119, 32–43 (2017)
23. Lucca, G., Sanz, J.A., Dimuro, G.P., Bedregal, B., Bustince, H., Mesiar, R.: CF-integrals: a new family of pre-aggregation functions with application to fuzzy rule-based classification systems. Inf. Sci. 435, 94–110 (2018)
24. Marco-Detchart, C., Lucca, G., Lopez-Molina, C., De Miguel, L., Pereira Dimuro, G., Bustince, H.: Neuro-inspired edge feature fusion using Choquet integrals. Inf. Sci. 581, 740–754 (2021)
25. Mesiar, R., Kolesárová, A., Bustince, H., Dimuro, G., Bedregal, B.: Fusion functions based discrete Choquet-like integrals. EJOR 252(2), 601–609 (2016)
26. Sanz, J.A., Bernardo, D., Herrera, F., Bustince, H., Hagras, H.: A compact evolutionary interval-valued fuzzy rule-based classification system for the modeling and prediction of real-world financial applications with imbalanced data. IEEE Trans. Fuzzy Syst. 23(4), 973–990 (2015)
27. Wieczynski, J., et al.: Applying d-XChoquet integrals in classification problems. In: 2022 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), pp. 1–7 (2022)
28. Wieczynski, J., et al.: \(dC_{F}\)-integrals: generalizing \(C_{F}\)-integrals by means of restricted dissimilarity functions. IEEE TFS 31(1), 160–173 (2023)
29. Wieczynski, J.C., et al.: d-XC integrals: on the generalization of the expanded form of the Choquet integral by restricted dissimilarity functions and their applications. IEEE TFS 30(12), 5376–5389 (2022)
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Sartori, J. et al. (2023). d-CC Integrals: Generalizing CC-Integrals by Restricted Dissimilarity Functions with Applications to Fuzzy-Rule Based Systems. In: Naldi, M.C., Bianchi, R.A.C. (eds) Intelligent Systems. BRACIS 2023. Lecture Notes in Computer Science(), vol 14195. Springer, Cham. https://doi.org/10.1007/978-3-031-45368-7_16
DOI: https://doi.org/10.1007/978-3-031-45368-7_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-45367-0
Online ISBN: 978-3-031-45368-7
eBook Packages: Computer Science (R0)
