A new approach for rule extraction of expert system based on SVM

Ai Li, Guo Chen ⇑
College of Civil Aviation, Nanjing University of Aeronautics & Astronautics, Nanjing 210016, PR China

Article info

Article history:
Received 18 October 2012
Received in revised form 9 August 2013
Accepted 19 August 2013
Available online 14 September 2013

Keywords:
Support Vector Clustering
Support Vector Machine
Rule extraction
Knowledge acquisition
Expert system
Genetic Algorithm
Feature selection

Abstract

Building on the excellent generalization performance of the SVM, a new approach is proposed
to extract knowledge rules from Support Vector Clustering (SVC). In this method, the first
step is to select the features of the sample data using a Genetic Algorithm, which improves
the comprehensibility of the knowledge rules. The SVC algorithm is then adopted to obtain the
Clustering Distribution Matrix of the sample data with the selected features.
Finally, hyper-rectangle rules are constructed using the Clustering Distribution Matrix.
To make the rules more concise and easier to explain, the hyper-rectangle rules are further
simplified by rule combination, dimension reduction and interval extension. In addition,
the SMOTE (Synthetic Minority Over-sampling Technique) algorithm is adopted to
resample fault samples in order to address the serious imbalance among the sample classes.
UCI datasets are used to validate the proposed method, and a comparison with other rule
extraction methods shows that the new approach is more effective.
The new method is also used to extract knowledge rules for an aero-engine oil monitoring
expert system; the results show that it can effectively extract knowledge rules
for the expert system and break through the bottleneck in dynamic knowledge acquisition
for expert systems.

© 2013 Elsevier Ltd. All rights reserved.

1. Introduction

At present, knowledge acquisition through data mining
[1,2] occurs mainly through machine learning or statistics.
Correlation analyses [3], artificial neural networks [4],
rough sets [5], and decision trees [6] are extensively
employed for data mining. If data mining is applied to an
expert system and if the knowledge rules are extracted
automatically from real data, then the intelligence level
and knowledge acquisition ability of the expert system will
be greatly improved.

In recent years, the Support Vector Machine (SVM) [7]
has become an emerging classification technology in data
mining. Owing to its well-developed generalization theory
and strong nonlinear mapping ability, the SVM can
approximate any continuous bounded nonlinear function. The
SVM has several advantages over the neural network, such
as better generalization ability, no local minimum problem,
the ability to automatically construct the learning machine,
freedom from the curse of dimensionality, and the ability
to deal with small samples. These advantages have drawn the
attention of researchers worldwide to data mining
technology based on the SVM. Furthermore, a number of
promising SVM rule extraction algorithms published to date
[8–14] are not only simple but also broadly applicable. Nunez
et al. [9] introduced a rule extraction approach based on
the SVM, in which K-means clustering is used to obtain
cluster centers, which are then combined with support
vectors (SVs) to define ellipsoid rules. Finally, "if-then"
rules are obtained when the ellipsoid rules are mapped
to the input space. However, the generated ellipsoid rules
overlap seriously. In addition, the solution quality of
K-means depends strongly on the initial values of the
centers, and it is difficult to control the quantity and quality
0263-2241/$ - see front matter © 2013 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.measurement.2013.08.028

⇑ Corresponding author. Tel./fax: +86 025 84891850.
E-mail address: cgzyx@263.net (G. Chen).

Measurement 47 (2014) 715–723

of the obtained rules. In a similar study, Zhang et al. [10]
introduced the hyper-rectangle rule extraction (HRE)
algorithm to extract rules from the trained SVM. The authors
used the Support Vector Clustering (SVC) algorithm to find
prototype vectors for each class, and then used those
vectors together with the SVs to generate hyper-rectangles.
A nested generalized exemplar algorithm is used to first
construct small hyper-rectangles around the prototypes,
which are then grown incrementally until the stopping
criteria, based on a user-defined minimum confidence
threshold (MCT) or minimum support threshold (MST), are met.
If-then rules are then generated by projecting these
hyper-rectangles onto the coordinate axes. The published results
for this method show that the rules provide good accuracy.
However, all the features are present as antecedents of
these rules. This limits their explanation capability, since
no indication is given about the most important features
for the classification.
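
To illustrate the kind of quantities such stopping criteria refer to, the following minimal Python sketch computes the support and confidence of a single hyper-rectangle rule. The definitions used here (support as the fraction of target-class samples covered by the box, confidence as the fraction of covered samples that belong to the target class) are common conventions assumed for illustration; the exact formulas used in [10] are not given in this text, and all function and variable names are hypothetical.

```python
from typing import Sequence, Tuple

Interval = Tuple[float, float]  # closed interval [low, high] on one feature


def covers(box: Sequence[Interval], x: Sequence[float]) -> bool:
    """Return True if sample x lies inside the hyper-rectangle on every axis."""
    return all(low <= value <= high for (low, high), value in zip(box, x))


def support_and_confidence(
    box: Sequence[Interval],
    samples: Sequence[Sequence[float]],
    labels: Sequence[int],
    target: int,
) -> Tuple[float, float]:
    """Assumed definitions (not taken from the paper):
    support    = covered target-class samples / all target-class samples
    confidence = covered target-class samples / all covered samples
    """
    covered = [y for x, y in zip(samples, labels) if covers(box, x)]
    hits = sum(1 for y in covered if y == target)
    n_target = sum(1 for y in labels if y == target)
    support = hits / n_target if n_target else 0.0
    confidence = hits / len(covered) if covered else 0.0
    return support, confidence


# Toy data: two features, classes 0 and 1.
samples = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (4.0, 4.0), (1.8, 1.2)]
labels = [0, 0, 0, 1, 1]
box = [(0.5, 2.5), (0.5, 2.5)]  # candidate rule for class 0

support, confidence = support_and_confidence(box, samples, labels, target=0)
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

In an incremental growing scheme, values like these would be compared against the user-defined MCT or MST after each enlargement of the box.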

To address the aforementioned limitations, a new
method is proposed here to extract knowledge rules from SVC.
The first step in this method is to select the features of
the sample dataset using a Genetic Algorithm (GA) in order
to improve the comprehensibility of the knowledge rules.
The next step is to map the selected features of the training
samples into a high-dimensional feature space to obtain the
optimal separating hyper-planes and SVs. Finally, the
hyper-rectangles are constructed using the Clustering
Distribution Matrix of the data obtained by the SVC, and the
if-then rules are generated by projecting these hyper-rectangles
onto the coordinate axes. To make the rules more concise
and easier to explain, the hyper-rectangle rules are further
simplified using rule combination, dimension reduction,
and interval extension. In addition, the SMOTE (Synthetic
Minority Over-sampling Technique) algorithm is
adopted to resample fault samples in order to address the
serious imbalance among the sample classes. Experimental
results show that it is easy to control the number and the
support degree of the generated rules, and that feature
selection and rule simplification can greatly improve the
explanation capability of the rules.
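
The projection of a hyper-rectangle onto the coordinate axes can be sketched in a few lines of Python. The example below assumes that the samples have already been grouped into one cluster (in the proposed method this grouping comes from the SVC step, which is not reproduced here), takes the axis-aligned bounding box of that cluster as the hyper-rectangle, and prints the corresponding if-then rule. The feature names, class label, data and helper names are hypothetical.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]


def bounding_box(points: Sequence[Sequence[float]]) -> List[Interval]:
    """Axis-aligned hyper-rectangle: per-feature (min, max) over the cluster."""
    return [(min(col), max(col)) for col in zip(*points)]


def to_rule(box: Sequence[Interval], names: Sequence[str], label: str) -> str:
    """Project the box onto each axis and join the intervals into one rule."""
    conditions = [
        f"{low:g} <= {name} <= {high:g}" for name, (low, high) in zip(names, box)
    ]
    return "IF " + " AND ".join(conditions) + f" THEN class = {label}"


# Hypothetical cluster of samples described by two selected features.
cluster = [(2.0, 10.5), (3.5, 12.0), (2.8, 11.0)]
box = bounding_box(cluster)
rule = to_rule(box, names=["x1", "x2"], label="fault")
print(rule)
```

Each antecedent of the rule is one interval of the box, which is why reducing the number of selected features directly shortens the rule.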

The spectral oil diagnosis expert system represents the
advanced stage of aero-engine wear fault diagnosis. At present,
several oil monitoring expert systems have been developed, such
as the advanced rapid analysis system PFALink developed
by the United States Mobil oil company, and the lubricating
oil analysis expert systems Lube Analyst and Atlas developed
in the United States and Canada. However, this software
provides only a framework and a management system; the
users need to develop its core knowledge base and provide
the threshold values of the monitored wear elements. In
intelligent diagnosis expert systems, problems such as
weak knowledge acquisition, difficult knowledge updating
and poor knowledge adaptability have still not been
effectively overcome. Knowledge acquisition in these expert
systems relies mainly on mechanical learning methods
based on experience. The knowledge is hard to update,
and the rules suffer from serious problems such as
inconsistency, redundancy, and combinatorial explosion.
Therefore, in this paper, the new method is applied to
knowledge acquisition for an aero-engine spectral oil
diagnosis expert system. Experimental results on a real
dataset show the effectiveness and the correctness of the
new method.

2. Knowledge rule extraction method based on GA_SVC

The rule extraction process includes data preprocessing,
SVC, hyper-rectangle rule extraction and rule simplification.
The entire rule extraction procedure is shown in
Fig. 1.

2.1. Data preprocessing

2.1.1. Balancing unbalanced data
In data mining experiments, datasets are usually assumed
to have a balanced distribution, that is, the numbers of
samples of the various classes are almost the same; in
practice, this is rarely the case. In many real datasets, the
numbers of samples carrying different class labels are
unequal. Such datasets are called unbalanced datasets.
Usually, the minority class samples are treated as noise
and discarded, so that no rules about the minority class
can be extracted. Therefore, in order to extract rules for
all classes of samples completely and to improve the
recognition rate of the rules, the first step is to
preprocess the unbalanced data into balanced data before
rule extraction.
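
One way to perform this balancing is over-sampling of the minority class by interpolation, as in SMOTE. The following minimal Python sketch shows the core idea under simplifying assumptions: for a randomly chosen minority sample, one of its k nearest minority-class neighbors is picked and a synthetic sample is placed at a random position on the line segment between the two. It is an illustration only, not the authors' implementation; the function name, the parameters `n_new`, `k` and `seed`, and the toy data are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def smote_like(
    minority: Sequence[Point], n_new: int, k: int = 3, seed: int = 0
) -> List[Point]:
    """Generate n_new synthetic minority samples by linear interpolation
    between a sample and one of its k nearest minority-class neighbors."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic: List[Point] = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest neighbors of base within the minority class (excluding itself)
        others = [p for j, p in enumerate(minority) if j != i]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


# Toy minority class with two features (x1, x2).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.3), (1.1, 1.4)]
new_samples = smote_like(minority, n_new=4, k=2)
print(new_samples)
```

Because every synthetic sample lies on a segment between two existing minority samples, the new points stay inside the region already occupied by the minority class rather than being exact copies of existing samples.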

In this paper, we resample fault samples using the
Synthetic Minority Over-sampling Technique (SMOTE), which
is a typical sampling algorithm. The SMOTE [15] algorithm
is an over-sampling method put forward by Chawla. To
balance the dataset, the main idea of the method is to use
the k-nearest-neighbor method and linear interpolation to
insert new samples, according to certain rules, between
two nearby samples of the minority
class. In Fig. 2, a two-dimensional example {X = (x1, x2)} is
enlarged using the SMOTE over-sampling method. It can
be seen from Fig. 2 that the new re-sampling samples focus
Fig. 1. Rules extraction procedure.
