1 Introduction

Artificial neural networks (ANNs) are widely applied in tasks like computer vision, speech recognition, and pattern recognition [13]. Despite their success, ANNs are often considered black-box algorithms. Such a lack of interpretability poses risks in critical domains such as medical and financial applications, where understanding model decisions is crucial. Additionally, the presence of adversarial examples highlights the need for explainability in machine learning algorithms, including neural networks. An adversarial example is an instance that is misclassified by a machine learning model while differing only slightly from another, correctly classified instance [6].

In this work, an explanation for a prediction made by an ANN is a subset of features and their values that alone suffice for the prediction. If an instance has the features in this subset, the ANN makes the same prediction, regardless of the values of other features. For example, given an instance \(\{sneeze=True\), \(weight=70 \ kg\), \(headache=True\), \(age=40 \ years\}\) and its ANN output flu, a possible explanation could be \(\{sneeze=True, headache=True\}\). That is, if an instance has the features \(sneeze=True\) and \(headache=True\), the ANN prediction is flu, regardless of weight and age values. An explanation is minimal when removing any feature breaks this guarantee, that is, it is no longer assured that every instance satisfying the explanation yields the same output. A minimal explanation therefore avoids redundancy, providing only essential information.

Heuristic methods, such as ANCHOR [15] and LIME [14], have been used to provide explanations for machine learning models. However, these approaches explore the instance space locally and therefore do not produce explanations with minimal sizes and formal guarantees of correctness. Correctness guarantees are provided when there are no instances with the values specified in the explanation such that the ANN makes a different prediction. Moreover, minimal explanations are desired since they do not contain redundancy, making them easier to understand and interpret.

Some approaches aim to provide explanations for machine learning models with formal guarantees of correctness [1, 4, 7, 8, 16, 18]. Ignatiev et al. [8] proposed a logic-based algorithm that gives minimal and correct explanations for ANNs, utilizing logical constraints originally designed by Fischetti and Jo [5] for finding adversarial examples. These constraints include linear equations, inequalities, and logical implications, solved using a Mixed Integer Linear Programming (MILP) solver. However, scalability issues arise, particularly with large ANNs, necessitating further development before deployment in large-scale production environments.

This work explores two different encodings to improve the scalability of providing correct minimal explanations for ANNs, building upon [8]. In addition to the logical constraints of [5], we adopt the encoding proposed by Tjeng et al. [17], which uses fewer variables and constraints, and excludes logical implications. By reducing variables and constraints compared to [5], our approach aims to enhance explanation computation performance. To adapt the approach of [17] for explanations, we introduce new constraints to ensure correctness. In line with the encodings proposed by Fischetti and Jo [5] and Tjeng et al. [17], we also compute lower and upper bounds for each neuron. These bounds are found through optimization using a MILP solver. Moreover, these bounds can aid the solver in computing explanations more rapidly. In this manner, we compare the time required for constructing logical constraints with lower and upper bounds of each neuron, along with the time needed for computing explanations.

We conducted experiments to evaluate both encodings. Our adaptation of the encoding proposed in [17] exhibits a better running time in building encodings for ANNs with two layers and tens of neurons, showing an improvement of up to 18%. Surprisingly, both methods exhibit similar running times for computing explanations. Furthermore, our adaptation outperforms the other encoding in the overall time, encompassing both building logical constraints and computing explanations. In this case, the results indicate an improvement of up to 16%. In summary, our main contributions are described in the following:

  • Adaptation of the encoding proposed in [17] to provide explanations for ANNs; additional constraints were incorporated to address the problem of computing explanations.

  • Comparative analysis of the running time for building the logical constraints between the two approaches. Additionally, we analyze the time for generating explanations using both encodings.

  • Publicly available implementations of both encodings for finding explanations for ANNs (Footnote 1).

In the next section, we review some concepts and terminologies about Logic, MILP and ANNs. Sections 3 and 4 show how to compute explanations with and without implications, respectively. Section 4 describes our adaptation of the encoding proposed in [17]. Experiments and results are presented in Sect. 5. Finally, conclusions and future work are described in Sect. 6.

2 Background

In this section, we introduce some initial concepts and terminology to understand the rest of this work.

2.1 First-Order Logic over LRA

In this work, we use first-order logic (FOL) to give explanations with guarantees of correctness. We use quantifier-free first-order formulas over the theory of linear real arithmetic (LRA). Then, first-order variables are allowed to take values from the real numbers \(\mathbb {R}\). For details, see [11]. Therefore, we consider formulas as defined below:

$$\begin{aligned} \begin{aligned} F, G &:= p \mid (F \wedge G) \mid (F \vee G) \mid (\lnot F) \mid (F \rightarrow G),\\ p &:= \sum ^n_{i=1} w_i x_i \le b \mid \sum ^n_{i=1} w_i x_i < b, \end{aligned} \end{aligned}$$
(1)

such that F and G are quantifier-free first-order formulas over the theory of linear real arithmetic. Moreover, p represents the atomic formulas such that \(n \ge 1\), each \(w_i\) and b are fixed real numbers, and each \(x_i\) is a first-order variable. Observe that we allow the use of other letters for variables instead of \(x_i\), such as \(s_i\), \(z_i\), \(q_i\). For example, \((2.5x_1 + 3.1x_2 \ge 6) \wedge (x_1=1 \vee x_1=2) \wedge (x_1=2 \rightarrow x_2 \le 1.1)\) is a formula by this definition. Observe that we allow standard abbreviations as \(\lnot (2.5x_1 + 3.1x_2 < 6)\) for \(2.5x_1 + 3.1x_2 \ge 6\).

Since we are assuming the semantics of formulas over the domain of real numbers, an assignment \(\mathcal {A}\) for a formula F is a mapping from the first-order variables of F to elements in the domain of real numbers. For instance, \(\{x_1 \mapsto 2.3, x_2 \mapsto 1\}\) is an assignment for \((2.5x_1 + 3.1x_2 \ge 6) \wedge (x_1=1 \vee x_1=2) \wedge (x_1=2 \rightarrow x_2 \le 1.1)\). An assignment \(\mathcal {A}\) satisfies a formula F if F is true under this assignment. For example, \(\{x_1 \mapsto 2, x_2 \mapsto 1.05\}\) satisfies the formula in the above example, whereas \(\{x_1 \mapsto 2.3, x_2 \mapsto 1\}\) does not satisfy it.

A formula F is satisfiable if there exists a satisfying assignment of F. To give an example, the formula in the above example is satisfiable since \(\{x_1 \mapsto 2, x_2 \mapsto 1.05\}\) satisfies it. As another example, the formula \((x_1 \ge 2) \wedge (x_1 < 1)\) is unsatisfiable since no assignment satisfies it. Given formulas F and G, the notation \(F \models G\) is used to denote logical consequence or entailment, i.e., each assignment that satisfies F also satisfies G. As an illustrative example, let \(F = (x_1 = 2 \wedge x_2 \ge 1)\) and \(G = (2.5x_1 + x_2 \ge 5) \wedge (x_1=1 \vee x_1=2)\). Then, \(F \models G\). The essence of entailment lies in ensuring the correctness of the conclusion G based on the given premise F. In the context of computing explanations, as presented in [8], logical consequence serves as a fundamental tool for guaranteeing the correctness of predictions made by ANNs. Therefore, our adaptation of the encoding proposed by Tjeng et al. [17] also incorporates the principles of entailment for computing explanations.

The relationship between satisfiability and entailment is a fundamental aspect of logic. It is widely known that, for all formulas F and G, it holds that \(F \models G\) iff \(F \wedge \lnot G\) is unsatisfiable. For instance, \(( x_1 = 2 \wedge x_2 \ge 1) \wedge \lnot ((2.5x_1 + x_2 \ge 5) \wedge (x_1=1 \vee x_1=2))\) has no satisfying assignment since an assignment that satisfies \((x_1 = 2 \wedge x_2 \ge 1)\) also satisfies \((2.5x_1 + x_2 \ge 5) \wedge (x_1=1 \vee x_1=2)\) and, therefore, does not satisfy \(\lnot ((2.5x_1 + x_2 \ge 5) \wedge (x_1=1 \vee x_1=2))\). Since our approach builds upon the concept of logical consequence, we can leverage this connection in the context of computing explanations for ANNs.
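This connection between entailment and unsatisfiability can be illustrated with a small Python sketch. Sampling cannot prove entailment over the reals, but it can refute it by exhibiting a satisfying assignment of \(F \wedge \lnot G\); the grid of test points below is hypothetical, chosen only to exercise the example formulas above.

```python
# Hypothetical illustration: F = (x1 = 2 and x2 >= 1),
# G = (2.5*x1 + x2 >= 5) and (x1 = 1 or x1 = 2).
def F(x1, x2):
    return x1 == 2 and x2 >= 1

def G(x1, x2):
    return 2.5 * x1 + x2 >= 5 and (x1 == 1 or x1 == 2)

# A single sample satisfying F and not G would witness that F |= G
# fails (equivalently, that F and not-G is satisfiable).
counterexamples = [
    (x1, x2)
    for x1 in [1, 2, 3]
    for x2 in [0.0, 0.5, 1.0, 2.0, 10.0]
    if F(x1, x2) and not G(x1, x2)
]
print(counterexamples)  # [] : no counterexample in the grid
```

Finding no counterexample is consistent with \(F \models G\); a full proof requires the unsatisfiability check performed by the solver.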

2.2 Mixed Integer Linear Programming

In Mixed Integer Linear Programming (MILP), the objective is to optimize a linear function subject to linear constraints, where some or all of the variables are required to be integers [2]. MILP is a crucial technique in our work for determining the lower and upper bounds of each neuron in the ANNs. For example, we utilize a minimization problem to determine the lower bound of neurons within ANNs. This process involves formulating an objective function that seeks to minimize the lower bound, subject to constraints that reflect the behaviour of ANNs. To illustrate the structure of a MILP, we provide an example below:

$$\begin{aligned} \begin{aligned} \min \quad & y_1 \\ \text {s.t.} \quad & 1 \le x_1 \le 3\\ & 3x_1 + s_1 - 2 = y_1 \\ & 0 \le y_1 \le 3x_1 - 2 \\ & 0 \le s_1 \le 3x_1 - 2\\ & z_1 = 1 \rightarrow y_1 \le 0 \\ & z_1 = 0 \rightarrow s_1 \le 0 \\ & z_1 \in \{0, 1\} \end{aligned} \end{aligned}$$
(2)

In the MILP in (2), we want to find values for variables \(x_1, y_1, s_1, z_1\) minimizing the value of the objective function \(y_1\) among all values that satisfy the constraints. Variable \(z_1\) is binary since \(z_1 \in \{0, 1\}\) is a constraint in the MILP, while variables \(x_1, y_1, s_1\) have the real numbers \(\mathbb {R}\) as their domain. The constraints in a MILP may appear as linear equations, linear inequalities, and indicator constraints. Indicator constraints can be seen as logical implications of the form \(z = v \rightarrow \sum ^n_{i=1} w_i x_i \le b\), where z is a binary variable and v is a constant 0 or 1 [3].
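The MILP in (2) is small enough to solve by inspection, which the following Python sketch illustrates by brute force; the grid over \(x_1\) is only for illustration, since a real MILP solver explores the feasible region exactly.

```python
def feasible(x1, y1, s1, z1, eps=1e-9):
    # The constraints of MILP (2), with the indicator constraints
    # expanded by case analysis on the binary variable z1.
    if not (1 <= x1 <= 3):
        return False
    if abs(3 * x1 + s1 - 2 - y1) > eps:
        return False
    if not (-eps <= y1 <= 3 * x1 - 2 + eps):
        return False
    if not (-eps <= s1 <= 3 * x1 - 2 + eps):
        return False
    if z1 == 1 and y1 > eps:   # z1 = 1 -> y1 <= 0
        return False
    if z1 == 0 and s1 > eps:   # z1 = 0 -> s1 <= 0
        return False
    return True

# With z1 = 1 the constraints force y1 = 0 and s1 = 2 - 3*x1 < 0,
# which is infeasible; with z1 = 0, s1 = 0 and y1 = 3*x1 - 2.
best = min(
    (3 * x1 - 2, x1)
    for x1 in [1 + 0.01 * k for k in range(201)]
    if feasible(x1, 3 * x1 - 2, 0.0, 0)
)
print(best)  # (1.0, 1.0): minimum y1 = 1, attained at x1 = 1
```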

An important observation is that a MILP problem without an objective function corresponds to a satisfiability problem, as discussed in Sect. 2.1. Given that the approach for computing explanations relies on logical consequence, and considering the connection between satisfiability and logical consequence, we employ a MILP solver to address explanation tasks. Additionally, throughout the construction of the MILP model, we utilize optimization, specifically employing a MILP solver, to determine tight lower and upper bounds for the neurons of ANNs.

2.3 Classification Problems and Artificial Neural Networks

In machine learning, classification problems are defined over a set of n features \(\mathcal {F} = \{x_1, ..., x_n\}\) and a set of \(\mathcal {N}\) classes \(\mathcal {K} = \{c_1, c_2,...,c_\mathcal {N}\}\). In this work, we consider that each feature \(x_i \in \mathcal {F}\) takes its values \(v_i\) from the domain of real numbers. Moreover, each feature \(x_i\) has an upper bound \(u_i\) and a lower bound \(l_i\) such that \(l_i \le x_i \le u_i\), and its domain is the closed interval \([l_i, u_i]\). This is represented as a set of domain constraints or feature space \(D = \{l_1 \le x_1 \le u_1, l_2 \le x_2 \le u_2, ..., l_n \le x_n \le u_n \}\). For example, a feature for the height of a person belongs to the real numbers and may have lower and upper bounds of 0.5 and 2.1 meters, respectively. Furthermore, \(\{x_1 = v_1, x_2 = v_2, ..., x_n = v_n\}\) represents a specific point or instance of the feature space such that each \(v_i\) is in the domain of \(x_i\).

An ANN is a function that maps elements in the feature space into the set of classes \(\mathcal {K}\). A feedforward ANN is composed of \(L+1\) layers of neurons. Each layer \(l \in \{0, 1, ..., L\}\) is composed of \(n_l\) neurons, numbered from 1 to \(n_l\). Layer 0 is fictitious and corresponds to the input of the ANN, while the last layer L corresponds to its outputs. Layers 1 to \(L-1\) are typically referred to as hidden layers. Let \(x^l_i\) be the output of the ith neuron of the lth layer, with \(i \in \{1,...,n_l\}\). The inputs to the ANN can be represented as \(x^0_i\) or simply \(x_i\). Moreover, we represent the outputs as \(x^L_i\) or simply \(o_i\).

The values \(x^l_i\) of the neurons in a given layer l are computed through the output values \(x^{l-1}_j\) of the previous layer, with \(j \in \{1,...,n_{l-1}\}\). Each neuron applies a linear combination of the output of the neurons in the previous layer. Then, the neuron applies a nonlinear function, also known as an activation function. The output of the linear part is represented as \(\sum _{j=1}^{n_{l-1}} w^{l}_{i,j} x^{l-1}_{j} + b^{l}_{i}\) where \(w^{l}_{i,j}\) and \(b^{l}_{i}\) denote the weights and bias, respectively, serving as parameters of the ith neuron of layer l. In this work, we consider only feedforward ANNs with the Rectified Linear Unit (\(\textrm{ReLU}\)) as activation function because it can be represented by linear constraints due to its piecewise-linear nature. This function is a widely used activation whose output is the maximum between its input value and zero. Then, \(x^{l}_{i} = \textrm{ReLU}(\sum _{j=1}^{n_{l-1}} w^{l}_{i,j} x^{l-1}_{j} + b^{l}_{i})\) is the output of the \(\textrm{ReLU}\).

For classification tasks, the last layer L is composed of \(n_L = \mathcal {N}\) neurons, one for each class. Moreover, it is common to normalize the output layer using a Softmax layer. Consequently, these values represent the probabilities associated with each class. The class with the highest probability is chosen as the predicted class. However, we do not need to consider this normalization transformation as it does not change the maximum value of the last layer. Thus, the predicted class is \(c_i \in \mathcal {K}\) such that \(i = \mathop {\mathrm {arg\,max}}\nolimits _{j \in \{1, ..., \mathcal {N}\}} x^L_j\).
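The forward computation just described can be sketched as follows; the 2-2-2 network and its weights are hypothetical, chosen only to illustrate the layerwise linear combination, the \(\textrm{ReLU}\), and the arg max prediction.

```python
def relu(v):
    return max(v, 0.0)

def forward(weights, biases, x):
    """Feedforward pass: weights[l][i][j] and biases[l][i] are the
    parameters of neuron i of layer l+1; ReLU is applied on every
    hidden layer, and the output layer is left linear, as in Eq. (5)."""
    for l, (W, b) in enumerate(zip(weights, biases)):
        x = [sum(wij * xj for wij, xj in zip(Wi, x)) + bi
             for Wi, bi in zip(W, b)]
        if l < len(weights) - 1:          # hidden layer: apply ReLU
            x = [relu(v) for v in x]
    return x

# Hypothetical 2-2-2 network; the predicted class is the arg max of
# the (unnormalized) outputs -- Softmax is monotone, so it is skipped.
W = [[[1.0, -1.0], [0.5, 0.5]], [[1.0, 0.0], [0.0, 1.0]]]
b = [[0.0, -0.5], [0.1, 0.0]]
outputs = forward(W, b, [2.0, 1.0])
predicted = max(range(len(outputs)), key=lambda j: outputs[j])
print(outputs, predicted)  # outputs and the arg-max class index
```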

3 Explanations for ANNs with Logical Implications

Ignatiev et al. [8] proposed an algorithm that computes minimal explanations for ANNs, yielding a subset of the input features sufficient for the prediction. This approach is based on logic with guarantees on the correctness and minimality of explanations. A flowchart for computing explanations using such an algorithm is shown in Fig. 1.

Fig. 1. Flowchart for calculating explanations.

First, the ANN and the feature space \(\{l_1 \le x_1 \le u_1, l_2 \le x_2 \le u_2, ..., l_n \le x_n \le u_n \}\) are encoded as a formula F, an instance \(\{x_1 = v_1, x_2 = v_2, ..., x_n = v_n\}\) of the feature space is encoded as a conjunction in a formula C, and the associated prediction by the ANN is encoded as a formula E. Then, it holds that \(C \wedge F \models E\). The minimal explanation \(C_m\) of C is calculated by removing feature by feature from C. For example, given a feature \(x_i\) with value v in C, if \(C \setminus \{x_i=v\} \wedge F \models E\), feature \(x_i\) is considered irrelevant to the explanation and is removed from C. Otherwise, if \(C \setminus \{x_i=v\} \wedge F \not \models E\), then \(x_i\) is kept in C since the same class cannot be guaranteed. The notation \(C \setminus \{x_i=v\}\) represents the removal of \(x_i=v\) from formula C. This process is described in Algorithm 1 and is performed for all features; \(C_m\) is the result at the end of this procedure. This means that for the values of the features in \(C_m\), the ANN makes the same classification, whatever the values of the remaining features. Since checking the entailment \(C \wedge F \models E\) is equivalent to testing whether \(C \wedge F \wedge \lnot E\) is unsatisfiable, and F, C and \(\lnot E\) are encoded as linear constraints and indicator constraints, such an entailment can be addressed by a MILP solver.

Algorithm 1. Computing a minimal explanation.
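The greedy contraction of Algorithm 1 can be sketched in a few lines of Python; here the `entails` oracle stands for the MILP unsatisfiability check of \(C' \wedge F \wedge \lnot E\), and the toy oracle below is a hypothetical stand-in for the solver call.

```python
def minimal_explanation(instance, entails):
    """Greedy contraction (sketch of Algorithm 1): instance is a dict
    {feature: value}; entails(candidate) returns True iff fixing only
    the features in candidate still guarantees the same prediction
    (in the paper, a MILP check that C' and F and not-E is unsat)."""
    explanation = dict(instance)
    for feature in list(instance):
        value = explanation.pop(feature)   # try dropping the feature
        if not entails(explanation):       # prediction no longer guaranteed:
            explanation[feature] = value   # keep the feature
    return explanation

# Toy oracle (hypothetical): the prediction is guaranteed exactly when
# both sneeze=True and headache=True are fixed.
oracle = lambda c: c.get("sneeze") is True and c.get("headache") is True
inst = {"sneeze": True, "weight": 70, "headache": True, "age": 40}
print(minimal_explanation(inst, oracle))
# {'sneeze': True, 'headache': True}
```

Note that each feature is tested exactly once, so the number of oracle (solver) calls is linear in the number of features.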

The encoding of ANNs used in [8] and originally proposed by Fischetti and Jo [5] uses implications to represent the behavior of the \(\textrm{ReLU}\) activation function. We encode an ANN with \(L+1\) layers as in Eqs. (3)–(5). In the following, we explain the notation. The encoding uses variables \(x^{l}_{i}\) and \(o_i\) with the same meaning as in the notation for ANNs. Auxiliary variables \(s^{l}_{i}\) and \(z^{l}_{i}\) control the behaviour of \(\textrm{ReLU}\) activations. Variable \(z^{l}_{i}\) is binary and if \(z^{l}_{i}\) is equal to 1, the \(\textrm{ReLU}\) output \(x^{l}_{i}\) is 0 and \(- s^{l}_{i}\) is equal to the linear part. Otherwise, the output \(x^{l}_{i}\) is equal to the linear part and \(s^{l}_{i}\) is equal to 0. The constant \(ub^{l}_{s,i}\) is the upper bound of variable \(s^{l}_{i}\), and the constant \(ub^{l}_{x,i}\) is the upper bound of variable \(x^{l}_{i}\). Each variable \(x^{0}_{i}\) has also lower and upper bounds \(l_i\), \(u_i\), respectively, defined by the domain of the features.

$$\begin{aligned} \left. \begin{aligned} &\sum _{j=1}^{n_{l-1}} w^{l}_{i,j} x^{l-1}_j + b^{l}_{i} = x^{l}_{i} - s^{l}_{i}\\ &z^{l}_{i} = 1 \rightarrow x^{l}_{i} \le 0 \\ &z^{l}_{i} = 0 \rightarrow s^{l}_{i} \le 0 \\ &z^{l}_{i} \in \{0, 1\} \\ &0 \le x^{l}_{i} \le ub^{l}_{x,i} \\ &0 \le s^{l}_{i} \le ub^{l}_{s,i} \\ \end{aligned} \right\} l = 1, ..., L-1, \ i = 1, ..., n_l \end{aligned}$$
(3)
$$\begin{aligned} &l_i \le x_{i} \le u_i,\quad i = 1, ..., n_0\end{aligned}$$
(4)
$$\begin{aligned} &o_i = \sum _{j=1}^{n_{L-1}} w^{L}_{i,j} x^{L-1}_j + b^{L}_{i},\quad i = 1, ..., n_L \end{aligned}$$
(5)

The constraints in (3)–(5) represent the formula F. The bounds \(ub^{l}_{x,i}\) are defined by isolating variable \(x^{l}_{i}\) from other constraints in subsequent layers. Then, \(x^{l}_{i}\) is maximized to find its upper bound. A similar process is applied to find the bounds \(ub^{l}_{s,i}\) for variables \(s^{l}_{i}\). This optimization is possible due to the bounds of the features. Furthermore, these bounds can assist the solver in accelerating the computation of explanations. Therefore, the time required for this process must be considered when building F.
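The intended solutions of the constraints in (3) can be checked numerically: for a pre-activation value v, setting \(x = \max(v, 0)\), \(s = \max(-v, 0)\), and z accordingly satisfies every constraint. The following sketch, with hypothetical bounds, verifies this for a single neuron.

```python
def relu_split(v):
    """Intended solution of the constraints in Eq. (3) for one neuron
    with pre-activation value v: x - s = v with x, s >= 0, and z
    selecting which of x, s is forced to zero."""
    x = max(v, 0.0)           # ReLU output
    s = max(-v, 0.0)          # slack absorbing the negative part
    z = 1 if v <= 0 else 0    # z = 1 -> x <= 0, z = 0 -> s <= 0
    return x, s, z

def satisfies_eq3(v, x, s, z, ub_x, ub_s, eps=1e-9):
    # The constraints of Eq. (3), with implications expanded by cases.
    return (abs((x - s) - v) < eps
            and -eps <= x <= ub_x + eps and -eps <= s <= ub_s + eps
            and (z != 1 or x <= eps) and (z != 0 or s <= eps))

# Hypothetical bounds ub_x = ub_s = 3 covering the test values.
for v in [-3.0, -0.5, 0.0, 0.5, 3.0]:
    x, s, z = relu_split(v)
    assert satisfies_eq3(v, x, s, z, ub_x=3.0, ub_s=3.0)
print("all ReLU splits satisfy the constraints of Eq. (3)")
```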

To check the unsatisfiability of the expression \(C \wedge F \wedge \lnot E\), we still need to take into account the formula \(\lnot E\), referring to the prediction of the ANN. Given an input C predicted as class \(c_i\) by the ANN, formula E must be equivalent to \(\bigwedge _{j=1, j \ne i}^{\mathcal {N}} o_i > o_j\). This formula asserts that the maximum value of the last layer is in output \(o_i\). Therefore, \(\lnot E\) must ensure that \(\bigvee _{j=1, j \ne i}^{\mathcal {N}} o_i \le o_j\). Since MILP solvers cannot directly represent disjunctions, we define \(\lnot E\) using the implications (6) and the linear constraint (7) over the binary variables introduced in (8).

$$\begin{aligned} &q_j = 1 \rightarrow o_i \le o_j, \quad j \in \{1, ..., \mathcal {N}\} \setminus \{i\} \end{aligned}$$
(6)
$$\begin{aligned} &\sum _{j=1, j \ne i}^{\mathcal {N}} q_j \ge 1\end{aligned}$$
(7)
$$\begin{aligned} &q_{j} \in \{0, 1\}, \quad j \in \{1, ..., \mathcal {N}\} \setminus \{i\} \end{aligned}$$
(8)

If an assignment \(\mathcal {A}\) satisfies \(\bigvee _{j=1, j \ne i}^{\mathcal {N}} o_i \le o_j\), then \(o_i \le o_j\) is true under \(\mathcal {A}\) for some j. Therefore, the assignment \(\mathcal {A} \cup \{q_j \mapsto 1\}\) satisfies the Eqs. (6)–(8). Conversely, if an assignment \(\mathcal {A}\) satisfies Eqs. (6)–(8), it clearly also satisfies \(\bigvee _{j=1, j \ne i}^{\mathcal {N}} o_i \le o_j\). We conclude this section with a proposition regarding the number of variables and constraints of the encoding discussed above.
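This equivalence can be checked by brute force over the binary variables \(q_j\) for fixed output values; the sketch below uses hypothetical outputs of a 3-class network with \(i = 0\).

```python
from itertools import product

def negE_with_implications(o, i):
    """Brute-force over the binary variables q_j: Eqs. (6)-(8) are
    satisfiable for some q iff o_i <= o_j holds for some j != i."""
    others = [j for j in range(len(o)) if j != i]
    for q in product([0, 1], repeat=len(others)):
        if sum(q) < 1:
            continue                        # Eq. (7): at least one q_j = 1
        if all(qj == 0 or o[i] <= o[j]      # Eq. (6): q_j = 1 -> o_i <= o_j
               for j, qj in zip(others, q)):
            return True
    return False

# Hypothetical output values for a 3-class network, i = 0.
assert not negE_with_implications([3.0, 1.0, 2.0], 0)  # o_0 strictly maximal
assert negE_with_implications([3.0, 3.5, 2.0], 0)      # o_0 <= o_1
print("Eqs. (6)-(8) agree with the disjunction on the test points")
```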

Proposition 1

Let C be an instance predicted as a class \(c \in \mathcal {K}\) by an ANN, the formula \(C \wedge F \wedge \lnot E\) has \(n_0 + n_L + 2\sum _{l=1}^{L-1} n_l\) real variables and \(n_L - 1 + \sum _{l=1}^{L-1} n_l\) binary variables. Also, the formula has \(n_0 + 2n_L + 5\sum _{l=1}^{L-1} n_l\) constraints.

4 Explanations for ANNs Without Implications

In this section, we present an adaptation of the encoding proposed by Tjeng et al. [17] for logic-based explainability. In that work, the encoding was originally used to find adversarial examples without using logical implications. Even more importantly, such an encoding uses fewer variables and constraints compared to [5]. Hence, we expect that our adaptation can lead to a better execution time for both building the logical constraints and computing explanations. Adapting the encoding in [17] to the context of computing explanations requires incorporating additional constraints that were not part of the original work. These new constraints represent the class predicted by the ANN as a formula E, as seen in Sect. 3. However, to maintain the concept of the original encoding, we define these additional constraints without implications.

In the following, we apply the same algorithm from Fig. 1, but replacing the encoding of F as in [5] with the one in [17]. We encode an ANN with \(L+1\) layers as in Eqs. (9), (4) and (5). The variables \(x^{l}_{i}\) and \(o_i\) have the same meaning as in Eqs. (3)–(5). Furthermore, the auxiliary variables \(s^l_i\) of the encoding by Fischetti and Jo [5] are not required. Constants \(lb^{l}_{i}\) and \(ub^{l}_{i}\) are, respectively, the lower and upper bounds of \(\sum _{j=1}^{n_{l-1}} w^{l}_{i,j} x^{l-1}_j + b^{l}_{i}\). Again, we find such bounds via a MILP solver. The behavior of \(\textrm{ReLU}\) is modeled using these bounds and binary variables \(z^{l}_{i}\). If \(z^{l}_{i}\) is equal to 0, the \(\textrm{ReLU}\) output \(x^{l}_{i}\) is 0. Otherwise, \(x^{l}_{i}\) is equal to \(\sum _{j=1}^{n_{l-1}} w^{l}_{i,j} x^{l-1}_j + b^{l}_{i}\). The bounds \(lb^{l}_{i}\) and \(ub^{l}_{i}\) ensure that the constraints remain valid for the entire feature space, regardless of the value of \(z^{l}_{i}\).

$$\begin{aligned} \left. \begin{aligned} &x^{l}_{i} \le \sum _{j=1}^{n_{l-1}} w^{l}_{i,j} x^{l-1}_j + b^{l}_{i} - lb^{l}_{i} (1 - z^{l}_{i})\\ &x^{l}_{i} \ge \sum _{j=1}^{n_{l-1}} w^{l}_{i,j} x^{l-1}_j + b^{l}_{i} \\ &x^{l}_{i} \le ub^{l}_{i} z^{l}_{i} \\ &z^{l}_{i} \in \{0, 1\} \\ &x^{l}_{i} \ge 0 \\ \end{aligned} \right\} l = 1, ..., L-1, \ i = 1, ..., n_l \end{aligned}$$
(9)
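The constraints in (9) can be checked numerically for fixed pre-activation values: the intended assignment \(x = \textrm{ReLU}(v)\) is feasible, while the wrong phase of \(z\) is cut off. The bounds and test values below are hypothetical.

```python
def satisfies_eq9(v, x, z, lb, ub, eps=1e-9):
    """The four constraints of Eq. (9) for one neuron whose linear part
    evaluates to v, with lb <= v <= ub the pre-activation bounds."""
    return (x <= v - lb * (1 - z) + eps
            and x >= v - eps
            and x <= ub * z + eps
            and x >= -eps)

lb, ub = -4.0, 4.0            # hypothetical bounds covering the test values
for v in [-3.0, -0.5, 0.0, 0.5, 3.0]:
    # The intended solution: x is the ReLU output, z indicates the phase.
    x, z = max(v, 0.0), (1 if v > 0 else 0)
    assert satisfies_eq9(v, x, z, lb, ub)
    # The wrong phase is cut off: z = 0 is infeasible whenever v > 0,
    # since it forces x <= 0 and x >= v simultaneously.
    if v > 0:
        assert not any(satisfies_eq9(v, x_try, 0, lb, ub)
                       for x_try in [0.0, v / 2, v])
print("Eq. (9) admits exactly the ReLU behaviour on the test points")
```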

In our proposal for computing explanations, constraints in Eqs. (9), (4) and (5) represent the formula F. As in Sect. 3, an instance is a conjunction C, and the associated prediction by the ANN is a formula E. Given an input C predicted as class \(c_i\) by the ANN, again formula \(\lnot E\) must ensure that \(\bigvee _{j=1, j \ne i}^{\mathcal {N}} o_i \le o_j\). Therefore, we must add new constraints to represent \(\lnot E\). Maintaining the concept of the original encoding in [17] without implications, we define these additional constraints accordingly. We employ binary variables \(q_j\) and the upper and lower bounds \(ub_j\) and \(lb_j\) of variables \(o_j\). As for \(lb^{l}_{i}\) and \(ub^{l}_{i}\), we find the bounds \(ub_j\) and \(lb_j\) through a MILP solver. We recall such elements are not originally present in [17]. However, they are necessary for the context of computing explanations for ANNs. In Equations (10)-(12) we represent our proposal for encoding formula \(\lnot E\), where the prediction associated with an input C is class \(c_i\).

$$\begin{aligned} &o_i - o_j \le (ub_i - lb_j) (1 - q_j) , \quad j \in \{1, ..., \mathcal {N}\} \setminus \{i\} \end{aligned}$$
(10)
$$\begin{aligned} &\sum _{j=1, j \ne i}^{\mathcal {N}} q_j \ge 1 \end{aligned}$$
(11)
$$\begin{aligned} &q_{j} \in \{0, 1\}, \quad j \in \{1, ..., \mathcal {N}\} \setminus \{i\} \end{aligned}$$
(12)

In what follows, we prove that Equations (10)-(12) correctly ensure that \(\bigvee _{j=1, j \ne i}^{\mathcal {N}} o_i \le o_j\).

Proposition 2

Let \(\lnot E\) be defined as in Equations (10)-(12). Let \(i \in \{1, ..., \mathcal {N}\}\) be fixed and \(ub_i\) be such that \(o_i \le ub_i\). Let \(lb_j\) be such that \(lb_j \le o_j\), for \(j \in \{1, ..., \mathcal {N}\} \setminus \{i\}\). Therefore,

$$ \lnot E \text { is satisfiable iff } \bigvee _{j=1, j \ne i}^{\mathcal {N}} o_i \le o_j \text { is satisfiable.} $$

Proof

If an assignment \(\mathcal {A}\) satisfies \(\bigvee _{j=1, j \ne i}^{\mathcal {N}} o_i \le o_j\), then \(o_i \le o_{j'}\) is true under \(\mathcal {A}\) for some \(j' \ne i\). Let \(\mathcal {A}' = \mathcal {A} \cup \{q_{j} \mapsto v \mid v=1 \text { if } j=j', \text { else } v=0, \text { for } j \in \{1, ..., \mathcal {N}\} \setminus \{i\}\}\) be an assignment. Then, \(\mathcal {A}'\) imposes that \(o_i \le o_{j'}\) in Equation (10) for \(j = j'\), which is clearly true under this assignment. For \(j \ne j'\), it follows that \(o_i - o_j \le (ub_i - lb_j)\) must hold, which is also true under \(\mathcal {A}'\) since \(lb_j \le o_j\) and \(o_i \le ub_i\).

Conversely, if an assignment \(\mathcal {A}\) satisfies Equations (10)-(12), it sets \(q_{j'} = 1\) for some \(j' \ne i\) by Equation (11). Moreover, \(\mathcal {A}\) satisfies \(o_i \le o_{j'}\) by Equation (10), for \(j = j'\). Therefore, \(\mathcal {A}\) also satisfies \(\bigvee _{j=1, j \ne i}^{\mathcal {N}} o_i \le o_j\).
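Proposition 2 can also be checked by brute force over the binary variables \(q_j\) for fixed output values and valid bounds; the sketch below uses hypothetical outputs of a 3-class network with \(i = 0\).

```python
from itertools import product

def negE_bigM(o, i, ub_i, lb, eps=1e-9):
    """Brute-force over the binary variables q_j: Eqs. (10)-(12) hold
    for some q iff o_i <= o_j for some j != i, given o_i <= ub_i and
    lb[j] <= o_j for all j."""
    others = [j for j in range(len(o)) if j != i]
    for q in product([0, 1], repeat=len(others)):
        if sum(q) < 1:
            continue                                  # Eq. (11)
        if all(o[i] - o[j] <= (ub_i - lb[j]) * (1 - qj) + eps
               for j, qj in zip(others, q)):          # Eq. (10)
            return True
    return False

# Hypothetical outputs with valid bounds: o_0 <= ub_0 = 5, lb_j = -5.
lb, ub_i = [-5.0] * 3, 5.0
assert not negE_bigM([3.0, 1.0, 2.0], 0, ub_i, lb)  # o_0 strictly maximal
assert negE_bigM([3.0, 3.5, 2.0], 0, ub_i, lb)      # o_0 <= o_1
print("Eqs. (10)-(12) agree with the disjunction on the test points")
```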

Finally, we give a proposition on the number of variables and constraints in our adaptation of the encoding in [17].

Proposition 3

Let C be an instance predicted as a class \(c \in \mathcal {K}\) by an ANN, then formula \(C \wedge F \wedge \lnot E\) has \(n_0 + n_L + \sum _{l=1}^{L-1} n_l\) real variables and \(n_L - 1 + \sum _{l=1}^{L-1} n_l\) binary variables. Moreover, the formula has \(n_0 + 2n_L + 4\sum _{l=1}^{L-1} n_l\) constraints.

Therefore, this encoding has \(\sum _{l=1}^{L-1} n_l\) fewer real variables than the one presented in Sect. 3. Additionally, this encoding has \(\sum _{l=1}^{L-1} n_l\) fewer constraints. Consequently, one would expect a reduction in running time for both building the logical constraints and computing explanations.
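The counts in Propositions 1 and 3 can be compared directly; the helper below and the 16-20-20-2 architecture are hypothetical, used only to make the difference of one hidden neuron per variable and constraint concrete.

```python
def encoding_sizes(layers):
    """Variable/constraint counts from Propositions 1 and 3 for an
    architecture [n_0, n_1, ..., n_L] (hidden layers n_1 .. n_{L-1})."""
    n0, nL, hidden = layers[0], layers[-1], sum(layers[1:-1])
    with_impl = {        # Fischetti and Jo [5], Proposition 1
        "real_vars": n0 + nL + 2 * hidden,
        "bin_vars": nL - 1 + hidden,
        "constraints": n0 + 2 * nL + 5 * hidden,
    }
    without_impl = {     # adaptation of Tjeng et al. [17], Proposition 3
        "real_vars": n0 + nL + hidden,
        "bin_vars": nL - 1 + hidden,
        "constraints": n0 + 2 * nL + 4 * hidden,
    }
    return with_impl, without_impl

# Example: 16 inputs, two hidden layers of 20 neurons, 2 classes.
a, b = encoding_sizes([16, 20, 20, 2])
print(a["real_vars"] - b["real_vars"],
      a["constraints"] - b["constraints"])
# Both differences equal the number of hidden neurons (40 here).
```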

5 Experiments

In this section, we detail the experiments conducted to compare our proposal against the encoding presented in [5]. Our evaluation consists of two main experiments. In the first one, we compare the two encodings using 12 datasets. In the second one, we conduct a detailed comparison using a single dataset. We vary the architecture of the trained ANNs to explore the effect of the number of layers and neurons. We evaluate the performance of each encoding in terms of time for building logical constraints and time for computing explanations. To compare the times for computing explanations, we explained all instances in a given dataset and calculated the average time and standard deviation. To compare the building times, given a trained ANN on a dataset, we built the logical constraints 10 times and calculated the average time and standard deviation.

Next, we present the experimental setup, describing technologies, datasets, the trained ANNs and hyperparameters. After that, we discuss the results providing a comparative analysis of the running times for building logical constraints and computing explanations. Finally, we highlight specific improvements observed in our proposal.

5.1 Experimental Setup

We used Python to implement the approaches and to run the experiments. TensorFlow was used to manipulate ANNs, including the training and testing steps. CPLEX was used as the MILP solver and accessed by the DOcplex library.

We used 12 datasets from the UCI Machine Learning Repository (Footnote 2) and Penn Machine Learning Benchmarks (Footnote 3), each with 9 to 32 features of integer, continuous, categorical or binary types. The number of instances in the selected datasets ranges from 156 to 691. The classification problems related to these datasets are binary and multi-class. The preprocessing performed on the datasets included one-hot encoding of the categorical data and normalization of the continuous features to the range [0, 1]. This normalization was not applied to the integer features to avoid transforming their space into a continuous one, which could compromise the formal guarantees on the correctness of the algorithm. As far as we know, such a methodology was not considered in earlier works.
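The preprocessing just described can be sketched as follows; the helper and the feature kinds below are hypothetical, not the implementation used in the experiments.

```python
def preprocess(rows, kinds):
    """Sketch of the preprocessing described above: min-max normalization
    of continuous features to [0, 1], one-hot encoding of categorical
    features, and integer/binary features left untouched."""
    cols = list(zip(*rows))
    out_cols = []
    for col, kind in zip(cols, kinds):
        if kind == "continuous":
            lo, hi = min(col), max(col)
            out_cols.append([(v - lo) / (hi - lo) for v in col])
        elif kind == "categorical":
            for cat in sorted(set(col)):       # one binary column per value
                out_cols.append([1 if v == cat else 0 for v in col])
        else:                                   # integer / binary: keep as-is
            out_cols.append(list(col))
    return [list(r) for r in zip(*out_cols)]

# Hypothetical 3-instance dataset with one feature of each kind.
rows = [[0.5, "a", 3], [1.5, "b", 7], [1.0, "a", 5]]
print(preprocess(rows, ["continuous", "categorical", "integer"]))
# [[0.0, 1, 0, 3], [1.0, 0, 1, 7], [0.5, 1, 0, 5]]
```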

The ANNs were trained using a batch size of 4 and a maximum of 100 epochs, applying early stopping regularization with a patience of 10 epochs based on validation loss. The optimization algorithm was Adam with a learning rate of 0.001. Each dataset was split into \(80\%\) for training and \(20\%\) for validation. The ANN architectures were limited to 2 hidden layers to reduce the total running time, because many solver calls were performed in the experiments due to the large number of instances. Each solver call deals with an NP-complete problem, therefore impacting the experiments' running time.

The first experiment compared the two presented encodings on the 12 datasets. The architecture of the trained ANNs is two hidden layers with 20 neurons each. For each dataset and the associated ANN, the explanation of each instance was obtained using Algorithm 1 and both presented encodings. The second experiment compared the two encodings using the voting dataset of the first experiment. This experiment was conducted in two cases. In the first case, the trained ANNs consist of one hidden layer with the number of neurons ranging from 10 to 100. In the second case, the ANNs consist of two hidden layers such that both layers contain the same number of neurons, ranging from 10 to 40. In both cases, the number of neurons in the layers increases in increments of 5. Again, the explanation of each instance was obtained using Algorithm 1 and both presented encodings. The objective of this experiment is to verify the influence of the number of layers and the number of neurons on both encodings.

5.2 Results

Table 1. Comparison of both encodings.

The results of the first experiment are shown in Table 1. For each dataset, its number of features is indicated in parentheses. The column Exp (s) refers to the average running time for computing explanations in seconds, and the standard deviation is also presented. The column Build (s) refers to the average running time, in seconds, for building the logical constraints of the trained ANN. The running time for finding the bounds of variables is included in the time for building the encodings.

Although the encoding by Fischetti and Jo [5] achieved a better average running time for computing explanations in 7 out of 12 cases, both encodings generally perform similarly when considering the variability of the time. For instance, in the spect dataset, the average execution time of the encoding by Fischetti and Jo [5] (2.64 seconds) falls within the range of the average minus the standard deviation of our approach (\(3.21 - 1.81\)). This pattern is observed across several datasets. In summary, the results indicate that both encodings generally perform similarly in terms of running time for computing explanations, with minor variations across different datasets.

With respect to the average running time for building the logical constraints, our adaptation generally outperformed the encoding by Fischetti and Jo [5]. However, there were exceptions noted, such as in the spect dataset. Overall, our adaptation achieved an improvement of up to \(18\%\) compared to the other one, as seen in the heart-statlog dataset. In summary, the results indicate that our adaptation is consistently more efficient than the other approach for building logical constraints. The variability in the results, as shown by the standard deviations, further supports this conclusion. For instance, the average running time of our proposal is less than the average minus the standard deviation of the other approach in 9 out of 12 datasets. These cases are highlighted in bold in Table 1. It is important to note that, in the spect dataset, the average running time of the encoding proposed by Fischetti and Jo [5] is less than the average minus the standard deviation of our adaptation. This indicates that, while our proposal appears generally more efficient, specific dataset characteristics can influence which encoding is more advantageous. Our adaptation yielded notably superior results not only in the average time for building logical constraints but also in the overall time, which includes both computing explanations and constructing logical constraints. For instance, it achieved an improvement of up to \(16\%\) compared to the other approach, as seen in the heart-statlog dataset.

Fig. 2. Comparison of average running time for computing explanations, using ANNs with one hidden layer and the voting dataset.

Fig. 3. Comparison of average running time for computing explanations, using ANNs with two hidden layers and the voting dataset.

The results of the second experiment are shown in Figs. 2 and 3. In both figures, the x-axis shows the number of neurons in each hidden layer, while the y-axis shows the average running time for computing explanations; the shaded regions indicate the standard deviation. Figure 2 refers to ANNs with one hidden layer, while Fig. 3 refers to ANNs with two hidden layers.

Figure 2 suggests that, for ANNs with one hidden layer, our proposal achieved a better average running time for computing explanations in the voting dataset, outperforming the other approach with percentage improvements ranging from approximately \(7.69\%\) to \(40.82\%\). The most significant improvements are observed with higher neuron counts: with 85 neurons per layer, our proposal shows an improvement of around \(40.82\%\), and with 95 neurons, the improvement is about \(31.65\%\). Moreover, our proposal generally exhibits similar or lower standard deviations, indicating more consistent performance across different numbers of neurons.
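The percentage improvements quoted here follow the usual definition of relative improvement over a baseline; assuming that convention, they can be computed as below (the helper name and the sample timings are illustrative, not measured values from the experiments):

```python
def improvement_pct(t_baseline: float, t_ours: float) -> float:
    """Relative improvement of t_ours over t_baseline, in percent."""
    return (t_baseline - t_ours) / t_baseline * 100.0

# Illustrative timings only: a run of 59.18 s against a 100 s baseline
# corresponds to the ~40.82% figure quoted for 85 neurons per layer.
print(round(improvement_pct(100.0, 59.18), 2))  # 40.82
```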

On the other hand, Fig. 3 depicts comparable results between both encodings for ANNs with two hidden layers. Moreover, similar standard deviations were achieved for both encodings, indicating comparable levels of consistency in performance. The variability increases significantly with more complex networks. For example, with 35 and 40 neurons per layer, both encodings exhibit significant variability. Furthermore, with 30 neurons per layer, our adaptation shows higher variability compared to the other approach.

6 Conclusions and Future Work

Explanations for the outputs of ANNs are fundamental in many scenarios, for reasons such as critical systems, data protection laws, and adversarial examples. Therefore, several heuristic methods have been developed to provide explanations for the decisions made by ANNs. However, these approaches lack guarantees of correctness and may produce redundant explanations. Logic-based approaches address these issues but often suffer from scalability problems.

In this work, we compare two logical encodings of ANNs: one that has been used in the literature to provide explanations [5, 8], and another [17] that we have adapted for our context of explainability. Our experiments indicate that both encodings have similar running times for computing explanations, even as the number of neurons and layers increases, although our adaptation is generally faster for ANNs with one hidden layer, an advantage that diminishes for ANNs with two hidden layers. Moreover, our proposal achieved a better running time for building the encoding of ANNs with two layers, showing an improvement of up to \(18\%\). This can help mitigate the scalability issues of building the logical constraints for a given ANN. Our encoding also obtained better results in the overall time, i.e., the time for computing explanations plus the time for building the logical constraints, showing an improvement of up to \(16\%\).
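For concreteness, a common way to compute a minimal explanation on top of such an encoding is the deletion-based scheme: fix all features of the instance, then try to free each one, keeping it fixed only when the prediction could otherwise change. The sketch below is our illustration of that generic scheme, not the paper's exact procedure; `prediction_is_invariant` is a hypothetical callback standing in for the solver call over the logical encoding of the ANN.

```python
from typing import Callable, Dict, Set

def minimal_explanation(
    instance: Dict[str, object],
    prediction_is_invariant: Callable[[Set[str]], bool],
) -> Dict[str, object]:
    """Deletion-based computation of a minimal explanation.

    `prediction_is_invariant(fixed)` must return True iff every instance
    agreeing with `instance` on the features in `fixed` receives the same
    prediction from the ANN (in the paper's setting, this check would be
    discharged by a solver over the logical constraints of the ANN).
    """
    fixed = set(instance)                  # start with every feature fixed
    for feature in list(instance):
        fixed.remove(feature)              # tentatively free this feature
        if not prediction_is_invariant(fixed):
            fixed.add(feature)             # feature is essential: keep it
    return {f: instance[f] for f in instance if f in fixed}

# Toy oracle for the flu example from the introduction: the prediction
# is invariant exactly when sneeze and headache are both fixed.
oracle = lambda fixed: {"sneeze", "headache"} <= fixed
patient = {"sneeze": True, "weight": 70, "headache": True, "age": 40}
print(minimal_explanation(patient, oracle))  # {'sneeze': True, 'headache': True}
```

The resulting set is minimal by construction: every feature that remains was kept only because freeing it broke the invariance check.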

In the experiments of this work, we considered all instances of the datasets used, which considerably increased the running time of the experiments. As future work, we can change the design of the experiments, using only a subset of each dataset, to allow the use of larger ANNs. More experiments are necessary, especially with additional layers and neurons, to further validate our findings and to better understand the performance of these encodings. Furthermore, other encodings [9, 10] can be evaluated for computing logic-based minimal explanations for ANNs. Moreover, to improve the scalability of computing logic-based explanations, the ANNs can be simplified, before or during the construction of their encodings, via pruning or slicing as proposed in [12]; this results in equivalent ANNs with smaller sizes.