PII: 0957-4174(95)00013-Y


Pergamon 
Expert Systems With Applications, Vol. 9. No. 3. pp. 407-421, 1995 

Copyright © 1995 Elsevier Science Ltd
Printed in the USA. All rights reserved

0957-4174/95 $9.50 + .00

0957-4174(95)00013-5 

Building a Fuzzy Expert System for Electric Load 
Forecasting Using a Hybrid Neural Network 

P. K. DASH AND A. C. LIEW

National University of Singapore, 10, Kent Ridge Crescent, Singapore 

S. RAHMAN 

Department of Electrical Engineering, Virginia Polytechnic Institute and State University, Blacksburg, VA 

G. RAMAKRISHNA 

Regional Engineering College, Rourkela-769 008, India 

Abstract-This paper presents the development of a hybrid neural network to model a fuzzy expert
system for time series forecasting of electric load. The hybrid neural network is trained to develop fuzzy
logic rules and find optimal input/output membership values of load and weather parameters. A hybrid
learning algorithm consisting of unsupervised and supervised learning phases is used for training the
fuzzified neural network. In the supervised learning phase, both back-propagation and linear Kalman
filter algorithms are used for the adjustment of weights and membership functions. Extensive tests have
been performed on 2-year utility data for the generation of peak and average load profiles in 24 h, 48 h,
and 168 h ahead time frames during summer and winter seasons. From the simulation results, it is
observed that the fuzzy expert system using the Kalman filter-based algorithm gives faster convergence
and more accurate prediction of a load time series.

1. INTRODUCTION 

IT IS WELL KNOWN that fuzzy logic provides an inference 
strategy that enables approximate human reasoning 
capabilities to knowledge-based systems. Also, it pro- 
vides a mathematical morphology to emulate certain 
perceptual and linguistic attributes associated with 
human cognition. Although fuzzy theory provides an 
inference mechanism under cognitive uncertainty, com- 
putational neural networks offer exciting advantages 
such as learning, adaptation, fault tolerance, parallelism, 
and generalization. The computational neural networks, 
comprising processing elements called neurons, are 
capable of coping with computational complexity, non- 
linearity, and uncertainty. 

To enable a system to deal with cognitive uncertainties in a manner more like humans, one may incorporate
the concepts of fuzzy logic into a neural network. Although fuzzy logic is a natural mechanism for
modelling cognitive uncertainty, it may involve an increase in the amount of computation required. This can
be readily offset by using fuzzy neural network approaches having the potential for parallel computations
with high flexibility.

Requests for reprints should be sent to P. K. Dash, Centre for Intelligent
Systems, Electrical Engineering Department of Regional Engineering
College, Rourkela 769008, India.

The application of an artificial neural network (ANN) 
and fuzzy logic-based decision support system to time- 
series forecasting has gained much attention recently. 
ANN-based load forecasts give large errors when the
weather profile changes very fast. Also, extremely slow
training or even training failure occurs in many cases because
of difficulties in selecting a proper structure for the neural
network paradigm being used, and because of errors in
associated parameters such as learning rates, activation
functions, etc., which are fundamental to any back-propagation
neural network. On the other hand, the
development of a fuzzy decision system (fuzzy expert
system) for load forecasting requires detailed analysis of
data, and the fuzzy rule base has to be developed
heuristically for each season. The rules fixed in this way
may not always yield the best forecast. Thus, a hybrid 
neural network model is used that combines the idea of 
fuzzy logic-based decision system and neural network 
structure and learning abilities into an integrated frame- 
work. Such a hybrid model provides human 
understandable meaning to the normal feedforward 
neural network in which the internal units are always 
opaque to the users. This structure also avoids the rule 
matching time of the inference engine in the traditional 
fuzzy logic system and results in enhanced learning 
speed and prediction accuracy. 

The present work is aimed at achieving a robust load forecast with much improved
accuracy using a fuzzy expert system modelled by the
hybrid ANN architecture. The two types of fuzzy expert
system models, abbreviated FES1 and FES2, are based on
the argument that any fuzzy expert system employing
one block of rules may be approximated by a (feedforward,
multilayered) neural network. The input vector to
FES1 and FES2 consists of differences in weather
parameters between the reference and the forecasted day
or hour. The output of FES1 and FES2 gives the load
correction that, when added to the past load, gives the
forecasted load. The learning algorithm for FES1 and
FES2 combines unsupervised and supervised learning
procedures to build the rule nodes and train the
membership functions. The supervised learning procedure
for FES1 uses a gradient back-propagation
algorithm for finding the optimum weights and membership
functions. For FES2, in contrast, the supervised
learning procedure uses a linear Kalman filter-based
learning algorithm suggested by Scalero and Tepedelenlioglu
(1992), which is similar to the least square
adaptive learning techniques. The least square adaptive
filtering techniques are known to converge more rapidly
than the back-propagation algorithm.

A few examples of peak load forecasting and daily 
average load forecasting of a typical utility with 24 h and 
168 h lead time are shown in this paper. The approach 
presented in this paper is also applicable to load forecasts 
with longer lead times. 

2. FUZZY EXPERT SYSTEM 

2.1. Fuzzy Statements 

This section portrays inexact information presented by 
fuzzy statements, and explains both fuzzy conditional 
statements and inference mechanisms. Typically, an 
engineering model requires the use of exact or mathe- 
matical statements. These statements correspond to 
precise information such as x = 3.0, 2<8<0, or y = 3t + 24.
The value x = 3.0 has a grade of membership of
100% (= 1); for all other values (2.8, 2.9, 3.1, 3.2), the
grade of membership in the solution is zero. In the case
of real-world values, however, this crisp grade of membership
does not hold because of the imprecision of tools, the influence
of the observer, etc. Additionally, human reasoning is
often imprecise, producing inexact statements such as “the
value of x is not big” or “the temperature is about 4°.”
Therefore, a theory to correctly express the grade of 
membership is desirable. 

The Fuzzy Set Theory allows manipulation of both 
exact and inexact (fuzzy) statements. This is very 
important for load forecasting because there are so many 
fuzzy factors that are difficult to characterize by a 
number. An instance of this could be weather conditions 
such as temperature, humidity, cloud cover, etc. For 
example, “the temperature this morning will be about
11°.” Temperature is an important factor in load
forecasting, but is not easy to characterise by an exact
numerical quantity. In addition, the linguistic hedges
(such as small, medium, large) can be modified by other
linguistic hedges (such as not, very, very very) (Zadeh,
1965). For example, in “value of x is not big,” big is
a linguistic hedge and not is a modifier.
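
As an illustration of how a modifier acts on a linguistic term, the sketch below applies two commonly used operators: the complement for "not" and squaring (concentration) for "very". These operator definitions, the ramp-shaped membership function `big`, and its range are assumptions chosen for illustration; they are not specified in this paper.

```python
def big(x, lo=0.0, hi=10.0):
    """Hypothetical membership function for 'big': a linear ramp from lo to hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def not_(mu):
    """Modifier 'not': standard fuzzy complement of a membership grade."""
    return 1.0 - mu


def very(mu):
    """Modifier 'very': concentration, taken here as squaring (assumed convention)."""
    return mu * mu


mu = big(7.0)  # grade of "x is big" for x = 7
print("big:", mu)
print("not big:", not_(mu))
print("very big:", very(mu))
print("very very big:", very(very(mu)))
```

Stacking modifiers, as in "very very", simply composes the operators.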

2.2. Fuzzy Rule Base 

A fuzzy expert system consists of a collection of fuzzy
IF-THEN rules. For example, the rule base R is written
as

$$R = [R^1, R^2, \ldots, R^M] \qquad (1)$$

with

$$R^l:\ \text{IF } (x_1 \text{ is } F_1^l \text{ and } \ldots \text{ and } x_p \text{ is } F_p^l)\ \text{THEN } (y_1 \text{ is } G_1^l \text{ and } \ldots \text{ and } y_q \text{ is } G_q^l) \qquad (2)$$

where $x = (x_1, \ldots, x_n)^T$ and $y = (y_1, \ldots, y_m)^T$ are the input
and output vectors to the fuzzy system, respectively. $F_i^l$
and $G_j^l$ are labels of fuzzy sets in $U_i$ and $V_j$, respectively, and
$1 \le p \le n$, $1 \le q \le m$, and $l = 1, 2, \ldots, M$.

2.3. Fuzzy Inference Engine 

A fuzzy inference engine uses the rules in the fuzzy rule
base to determine a mapping from fuzzy sets in U to
fuzzy sets in V based on fuzzy logic operations. In the
fuzzy inference engine, the IF part of $R^l$ defines a
Cartesian product of $F_1^l, \ldots, F_p^l$, and $R^l$ itself is viewed
as a fuzzy implication $F_1^l \times \ldots \times F_p^l \rightarrow G_j^l$. If A is an
arbitrary fuzzy set in U, then each $R^l$ of eqn. (2)
determines a fuzzy set $A \circ R^l$ in V based on the sup-star
composition

$$\mu_{A \circ R^l}(y_j) = \sup_{x \in U}\left[\mu_A(x) * \mu_{F_1^l \times \ldots \times F_p^l \rightarrow G_j^l}(x, y_j)\right] = \sup_{x \in U}\left[\mu_A(x) * \mu_{F_1^l}(x_1) * \ldots * \mu_{F_p^l}(x_p) * \mu_{G_j^l}(y_j)\right] \qquad (3)$$

where $y_j \in V_j$.



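The final fuzzy set $A \circ R_j$ ($R_j = [R_j^1, \ldots, R_j^M]$) in $V_j$
determined by the fuzzy inference engine is obtained by
combining eqn. (2) for $l = 1, 2, \ldots, M$ using the triangular
co-norm, written here as $\dotplus$:

$$\mu_{A \circ R_j}(y_j) = \mu_{A \circ R_j^1}(y_j) \dotplus \ldots \dotplus \mu_{A \circ R_j^M}(y_j). \qquad (4)$$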

2.3.1. Fuzzification and Defuzzification. In many cases
it is convenient to express the membership function of a
fuzzy subset in terms of a standard nonlinear function.
The following Gaussian membership function suggested
by Lin and Lee (1991) is used for the fuzzification of
input and output linguistic parameters of the fuzzy expert
system:

$$\mu_A(x) = \exp\left[-\frac{(x - a)^2}{b^2}\right] \qquad (5)$$

where a and b are the center and width of the Gaussian
membership function, respectively.
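
A minimal sketch of the fuzzification step is given below. It assumes the Gaussian form exp(-(x - a)^2 / b^2) with center a and width b; the function name and the sample values are illustrative only.

```python
import math


def gaussian_membership(x, a, b):
    """Grade of membership of x in a fuzzy set with center a and width b.

    Assumes the form exp(-(x - a)^2 / b^2); b must be nonzero.
    """
    return math.exp(-((x - a) ** 2) / (b ** 2))


# Hypothetical term "about 11 degrees" with an assumed width of 2 degrees.
for temperature in (9.0, 10.0, 11.0, 12.0, 13.0):
    print(temperature, gaussian_membership(temperature, a=11.0, b=2.0))
```

The grade is 1 at the center and falls to exp(-1) one width away from it, so the width directly controls how much neighbouring term sets overlap.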

The defuzzifier is needed to give a crisp output for
any practical application like forecasting. The output of
the fuzzy inference engine is a fuzzy set $A \circ R$ in V;
therefore, the defuzzifier maps $A \circ R$ into a crisp point $y \in V$.
The most commonly used centroid defuzzifier is
defined as

(6) 

3. FUZZY EXPERT SYSTEM MODEL USING 
NEURAL NETWORK APPROACH 

An alternative to the neural network-based load forecast 
is the expert system approach. A fuzzy expert system 
approach for load forecast consists of rules similar to the 
rules R’, . . ., RM given in Section 2.2. One of the 
difficulties with the fuzzy expert system is the rule 
matching and composition time, apart from the time- 
consuming process of adapting the rules. The neural 
network eliminates the rule matching process and stores 
the knowledge in the link weights. The decision signals 
can be pumped out immediately after the input data are 
fed in. Figure 1 shows the proposed hybrid neural
network to model the fuzzy expert system using the ANN
architecture. The fuzzy expert system clusters the
differential temperatures and humidities of the ith and
(i + n)th days into fuzzy term sets. Here n is the lead time
for the forecast (i.e., n = 24 for 24 h ahead forecast,
n = 48 for 48 h ahead forecast, and so on). The output of
the expert system is the final load correction ($\hat{e}_{LC}$).

Hence, the forecasted load on the (i + n)th day, $P_f(i + n)$, is
given by:

$$P_f(i + n) = P(i) + \hat{e}_{LC}(i). \qquad (7)$$

The fuzzy expert system modelled as a hybrid neural
network has a total of five layers. Nodes at layer 1 are the
input linguistic nodes. Layer 5 is the output layer and
consists of two nodes [one is for the actual load
correction ($\hat{e}_{LC}$) and the other is for the desired load
correction ($e_{LC}$)]. Nodes at layers 2 and 4 are term nodes
that act as membership functions to represent the term
sets of the respective linguistic variable. Each node at
layer 3 represents the preconditions of the rule nodes,
and layer 4 links define the consequence of the rules. The
functions of each layer are described as follows:

a) Layer 1: The nodes in this layer just transmit the
input feature $x_i$, i = 1, 2, to the next layer.

b) Layer 2: Each input feature $x_i$, i = 1, 2, is expressed in
terms of membership values $\mu_{ij}(a_{ij}, b_{ij})$, where i
corresponds to the input feature and j corresponds to
the number of term sets for the linguistic variable $x_i$.
The membership function $\mu_{ij}$ uses the Gaussian
membership function (5).

c) Layer 3: The links in this layer are used to perform
precondition matching of fuzzy logic rules. Hence,
the rule nodes perform the product operation (or AND
operation):

$$\mu_{R_p} = \prod_i \mu_{ij}, \qquad (8)$$

where $R_p = 1, 2, \ldots, n$. $R_p$ corresponds to the rule
node and n is the maximum number of rule nodes.
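However, if the fuzzy AND operation is used, the rule nodes take the minimum of the incoming membership values:

$$\mu_{R_p} = \min_i\, \mu_{ij}. \qquad (9)$$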

d) Layer 4: The nodes in this layer have two operations
(i.e., forward and backward transmission). In forward
transmission mode, the nodes perform the fuzzy OR
operation to integrate the fired rules that have the
same consequence:

$$\mu^{4} = \sum_{i=1}^{p} \mu_i^{3}, \qquad (10)$$

where the sum runs over the p links terminating at the
node. In the backward transmission mode, the links
function exactly the same as the layer 2 nodes.

e) Layer 5: There are two nodes in this layer (i.e., for
obtaining the actual and desired output load correction,
respectively). The desired output load correction
($e_{LC}$) is fed into the hybrid neural network during
learning, whereas the actual load correction ($\hat{e}_{LC}$) is
obtained by using the centroid defuzzification method
(6).
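
The five layers can be traced in a short forward-pass sketch. It is only an illustration under stated assumptions: Gaussian term nodes of the form exp(-(x - a)^2 / b^2), product rule nodes, summation at layer 4, and a centroid of the form sum(a b mu) / sum(b mu) at layer 5. The term centers, widths, rule list, and past load below are invented numbers, not values from this study.

```python
import math


def gauss(x, a, b):
    """Gaussian term node with center a and width b (assumed form)."""
    return math.exp(-((x - a) ** 2) / b ** 2)


def forward(x, in_terms, rules, out_terms):
    """Forward pass through the five layers.

    x:         (differential temperature, differential humidity)
    in_terms:  in_terms[i] is a list of (center, width) pairs for input i
    rules:     list of (term index of x1, term index of x2, output term index)
    out_terms: list of (center, width) pairs for the load correction
    """
    # Layers 1-2: pass the inputs on and compute every membership grade.
    mu2 = [[gauss(xi, a, b) for (a, b) in terms] for xi, terms in zip(x, in_terms)]
    # Layer 3: product of the preconditions of each rule.
    mu3 = [mu2[0][j1] * mu2[1][j2] for (j1, j2, _) in rules]
    # Layer 4: sum the firing strengths of rules sharing a consequence.
    mu4 = [0.0] * len(out_terms)
    for strength, (_, _, k) in zip(mu3, rules):
        mu4[k] += strength
    # Layer 5: centroid defuzzification to a crisp load correction.
    num = sum(a * b * m for (a, b), m in zip(out_terms, mu4))
    den = sum(b * m for (a, b), m in zip(out_terms, mu4))
    return num / den


# Hypothetical term sets and rules (three terms per variable).
in_terms = [[(-4.0, 4.0), (0.0, 4.0), (4.0, 4.0)],
            [(-20.0, 20.0), (0.0, 20.0), (20.0, 20.0)]]
out_terms = [(-100.0, 50.0), (0.0, 50.0), (100.0, 50.0)]
rules = [(0, 0, 0), (1, 1, 1), (2, 2, 2)]

correction = forward((4.0, 20.0), in_terms, rules, out_terms)
past_load = 2500.0                       # hypothetical P(i) in MW
print("load correction:", correction)
print("forecast P(i + n):", past_load + correction)
```

Because the rule nodes only index into precomputed membership grades, no rule matching is needed at run time, which is the advantage of the network form noted above.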



4. HYBRID LEARNING ALGORITHM FOR THE 
FUZZY EXPERT SYSTEM (FES) 

The hybrid learning scheme consists of two phases. In 
phase I, the initial membership functions of the input and 
output linguistic variables are fixed and an initial form of 
the network is constructed. Then, during the learning 
process, some nodes and links of this initial network are 
deleted or combined to form the final structure of the 
network. In phase II learning, the input and output 
membership functions are optimally adjusted to obtain 
the desired outputs. Phase I represents the unsupervised 
learning phase. Phase II represents the supervised 
learning phase. 

4.1. Phase I: Unsupervised Learning Phase

Given the training input data $x_i(t)$, i = 1, 2, the desired
output load correction ($e_{LC}(t)$), and the fuzzy partitions
$|T(x_i)|$, we want to locate the membership functions (i.e.,
$a_{ij}$ and $b_{ij}$) and find the fuzzy logic rules.

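Kohonen's feature maps algorithm (Lin & Lee,
1991) is used to find the values for $a_{ij}$ and $b_{ij}$:

$$\|x_i(t) - a_{i,\text{closest}}(t)\| = \min_{1 \le j \le k} \|x_i(t) - a_{ij}(t)\| \qquad (11)$$

$$a_{i,\text{closest}}(t+1) = a_{i,\text{closest}}(t) + \eta(t)\left[x_i(t) - a_{i,\text{closest}}(t)\right] \qquad (12)$$

$$a_{ij}(t+1) = a_{ij}(t) \quad \text{for } a_{ij} \ne a_{i,\text{closest}} \qquad (13)$$

where $\eta(t)$ is the monotonically decreasing learning rate
and $k = |T(x_i)|$ (i.e., the number of term sets for the
linguistic variable $x_i$).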

This adaptive formulation runs independently for
each input linguistic variable $x_i$.

The width, $b_{ij}$, is determined heuristically at this stage
(Lin & Lee, 1991) as follows:

$$b_{ij} = \frac{|a_{ij} - a_{i,\text{closest}}|}{r} \qquad (14)$$

where r is an overlap parameter. After the parameters of
the membership functions have been found, the weights
in layer 4 are obtained by using the competitive learning
algorithm (Rumelhart & Zipser, 1985) as follows:

$$\dot{w}_{ij} = \mu_j^{4}\left(\mu_i^{3} - w_{ij}\right), \qquad (15)$$

where $\mu_i^{3}$ serves as the win-loss index of the ith rule node
at layer 3 and $\mu_j^{4}$ serves as the win-loss index of the jth
term node at layer 4.
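
The center and width computations of this phase can be sketched as follows. This is a simplified illustration: the learning-rate schedule `eta0 / (1 + t)`, the initial centers, and the sample values are assumptions, and "closest" in the width rule is read as the nearest neighbouring center.

```python
def learn_centers(samples, centers, eta0=0.5):
    """One pass of winner-take-all updates for a single input variable.

    Only the center closest to each sample moves toward it; all other
    centers are left unchanged.
    """
    centers = list(centers)
    for t, x in enumerate(samples):
        eta = eta0 / (1 + t)  # assumed monotonically decreasing schedule
        j = min(range(len(centers)), key=lambda k: abs(x - centers[k]))
        centers[j] += eta * (x - centers[j])
    return centers


def widths(centers, r=2.0):
    """Width of each term: distance to the nearest other center divided by r."""
    result = []
    for j, a in enumerate(centers):
        nearest = min(abs(a - c) for k, c in enumerate(centers) if k != j)
        result.append(nearest / r)
    return result


# Hypothetical differential temperatures and two initial term centers.
centers = learn_centers([1.0, 9.0], [0.0, 10.0])
print(centers)
print(widths([0.0, 4.0, 10.0], r=2.0))
```

A larger overlap parameter r gives narrower membership functions and therefore less overlap between adjacent term sets.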

After the competitive learning through the whole 
training data set, the link weights at layer 4 represent the
strength of the existence of the corresponding rule
consequence. If a link weight between rule node and the 
term node of the output linguistic node is very small, 
then all the corresponding links are deleted, meaning that 
this rule node has little or no relation to the output. 

After the consequences of rule nodes are determined, 
the rule combination is performed to reduce the number 
of rules in the following manner. The criteria for the 
choice of rule nodes are: 

1. they have the same consequences,
2. some preconditions are common to all the rule nodes
in this set,
3. the union of other preconditions of these rule nodes
composes the whole term set of some input linguistic
variables.

The rule nodes that satisfy these criteria are replaced 
by a new rule node with common preconditions. 
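
The three criteria can be illustrated with a simplified sketch for two input variables. It only merges over the second input (rules that share the first precondition and the consequence, and whose second preconditions cover the whole term set); the tuple encoding of a rule and the use of `None` for a dropped precondition are assumptions made for this example.

```python
from collections import defaultdict


def combine_rules(rules, n_terms_x2):
    """Merge rule nodes that differ only in a fully covered second precondition.

    rules: iterable of (x1 term, x2 term, consequence term) index triples.
    Returns a list in which each merged group is replaced by a single rule
    (x1 term, None, consequence), None meaning "any term of x2".
    """
    groups = defaultdict(set)
    for t1, t2, c in rules:
        groups[(t1, c)].add(t2)          # same precondition on x1, same consequence
    merged = []
    for (t1, c), t2_terms in sorted(groups.items()):
        if len(t2_terms) == n_terms_x2:  # union covers the whole term set of x2
            merged.append((t1, None, c))
        else:
            merged.extend((t1, t2, c) for t2 in sorted(t2_terms))
    return merged


# Hypothetical rule nodes with three term sets for the second input.
rules = [(0, 0, 1), (0, 1, 1), (0, 2, 1), (1, 0, 2), (1, 1, 3)]
print(combine_rules(rules, n_terms_x2=3))
```

In this example the three rules with first precondition 0 and consequence 1 collapse into one rule, so five rule nodes are reduced to three.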

4.2. Phase II: Supervised Learning Phase 

Once the fuzzy logic rules have been found, the phase II
learning is used to find the optimum weights and the
input and output membership functions by using the
gradient descent back-propagation algorithm (Scalero &
Tepedelenlioglu, 1992).

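a) The weight update equation for layer 4 is:

$$w_{ij}(t+1) = w_{ij}(t) + \eta\left[-\frac{\partial E}{\partial w_{ij}}\right]. \qquad (16)$$

The error function E is given by

$$E = \frac{1}{2}\left[e_{LC}(t) - \hat{e}_{LC}(t)\right]^2. \qquad (17)$$

Because

$$\hat{e}_{LC} = \frac{\sum (a_{ij}\, b_{ij}\, \mu_{ij})}{\sum (b_{ij}\, \mu_{ij})} \qquad (18)$$

using the centroid defuzzification method (6), and

$$\mu^{4} = \sum_{i=1}^{p} \mu_i^{3}\, w_{ij}, \qquad (19)$$

where $w_{ij} = 1$ for t = 1.
Now,

$$\frac{\partial E}{\partial w_{ij}} = \frac{\partial E}{\partial \hat{e}_{LC}}\, \frac{\partial \hat{e}_{LC}}{\partial \mu^{4}}\, \frac{\partial \mu^{4}}{\partial w_{ij}}. \qquad (20)$$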


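Therefore,

$$\frac{\partial E}{\partial w_{ij}} = -\left[e_{LC}(t) - \hat{e}_{LC}(t)\right] \frac{\partial \hat{e}_{LC}}{\partial \mu^{4}}\, \frac{\partial \mu^{4}}{\partial w_{ij}}. \qquad (21)$$

b) Training the output membership functions at layer 5:

$$a_{ij}(t+1) = a_{ij}(t) + \eta_1\left[-\frac{\partial E}{\partial a_{ij}}\right], \qquad (22)$$

where $(a_{ij}, b_{ij})$ correspond to the output term set and $\eta_1$ is the learning rate.
Now,

$$\frac{\partial E}{\partial a_{ij}} = \frac{\partial E}{\partial \hat{e}_{LC}}\, \frac{\partial \hat{e}_{LC}}{\partial a_{ij}}. \qquad (23)$$

Therefore, using eqns. (17) and (18) we get

$$a_{ij}(t+1) = a_{ij}(t) + \eta_1\left(e_{LC}(t) - \hat{e}_{LC}(t)\right) \frac{b_{ij}\,\mu_{ij}}{\sum (b_{ij}\,\mu_{ij})}. \qquad (24)$$

Similarly,

$$b_{ij}(t+1) = b_{ij}(t) + \eta_2\left[-\frac{\partial E}{\partial b_{ij}}\right], \qquad (25)$$

where $\eta_2$ is the learning rate.
Now,

$$\frac{\partial E}{\partial b_{ij}} = \frac{\partial E}{\partial \hat{e}_{LC}}\, \frac{\partial \hat{e}_{LC}}{\partial b_{ij}}. \qquad (26)$$

Therefore, using eqns. (17) and (18) we get

$$b_{ij}(t+1) = b_{ij}(t) + \eta_2\left(e_{LC}(t) - \hat{e}_{LC}(t)\right) \frac{a_{ij}\,\mu_{ij} \sum (b_{ij}\,\mu_{ij}) - \mu_{ij} \sum (a_{ij}\, b_{ij}\,\mu_{ij})}{\left[\sum (b_{ij}\,\mu_{ij})\right]^2}. \qquad (27)$$

c) Tuning the input membership functions at layer 2:
The error signal $\delta^{4}$ at layer 4 is given by

(28)

The error signal at layer 3 is found by performing the
summation over the consequences of a rule node (i.e.,
layer 4). Therefore,

$$\delta^{3} = \sum \delta^{4}. \qquad (29)$$

The adaptive rule for $a_{ij}$ (layer 2) is derived as:

$$a_{ij}(t+1) = a_{ij}(t) + \eta_3\left[-\frac{\partial E}{\partial a_{ij}}\right], \qquad (30)$$

where $\eta_3$ is the learning rate.
Now,

(31)

Also,

$$\delta^{2} = \sum \delta^{3}, \qquad (32)$$

in a similar way as eqn. (29).
Therefore,

(33)

Similarly, the adaptive rule for $b_{ij}$ (layer 2) is derived as

$$b_{ij}(t+1) = b_{ij}(t) - \eta_4\, \delta^{2} \exp\left[-\frac{(x_i - a_{ij})^2}{b_{ij}^2}\right] \frac{2 (x_i - a_{ij})^2}{b_{ij}^3}. \qquad (34)$$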

The convergence speed of the phase II supervisory
learning scheme for fuzzy expert systems is found to be
superior to that of the supervisory learning scheme for the ordinary
fuzzy neural network approach (Dash et al., 1993)
because the phase I unsupervised learning process has
done much of the learning in advance. The convergence
speed of the supervised learning process can be
improved further by solving the weight update equations
at layer 3 and the input and output membership function
at layer 1 and layer 2 by linear Kalman filter equations
(Singhal & Wu, 1989).
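
As a worked illustration of the supervised phase, the sketch below performs one gradient step on the output term centers and widths. It assumes a squared error E = (e - ê)^2 / 2 and a centroid output ê = sum(a b mu) / sum(b mu), and differentiates that centroid directly; the learning rates and numerical values are invented for the example.

```python
def centroid(a, b, mu):
    """Crisp output sum(a*b*mu) / sum(b*mu) for centers a, widths b, grades mu."""
    num = sum(ai * bi * mi for ai, bi, mi in zip(a, b, mu))
    den = sum(bi * mi for bi, mi in zip(b, mu))
    return num / den


def update_output_terms(a, b, mu, target, eta_a=0.1, eta_b=0.1):
    """One gradient-descent step on the output membership parameters."""
    num = sum(ai * bi * mi for ai, bi, mi in zip(a, b, mu))
    den = sum(bi * mi for bi, mi in zip(b, mu))
    err = target - num / den                     # desired minus actual correction
    # d(centroid)/d(a_k) = b_k * mu_k / den
    new_a = [ai + eta_a * err * bi * mi / den for ai, bi, mi in zip(a, b, mu)]
    # d(centroid)/d(b_k) = mu_k * (a_k * den - num) / den**2
    new_b = [bi + eta_b * err * mi * (ai * den - num) / den ** 2
             for ai, bi, mi in zip(a, b, mu)]
    return new_a, new_b


a = [-100.0, 0.0, 100.0]      # hypothetical output term centers (MW)
b = [50.0, 50.0, 50.0]        # hypothetical output term widths
mu = [0.1, 0.6, 0.3]          # hypothetical layer 4 outputs
target = 40.0                 # hypothetical desired load correction

before = centroid(a, b, mu)
a, b = update_output_terms(a, b, mu, target)
after = centroid(a, b, mu)
print(before, after)
```

With these numbers the actual correction moves from 20 MW toward the 40 MW target after a single step; repeated steps over the training set would continue to reduce the error.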

5. FUZZY EXPERT SYSTEM USING LINEAR
KALMAN FILTER (FES2)

Referring to Figure 1, the tuning of the Gaussian membership
functions at layers 2 and 4 ($a_{ij}$, $b_{ij}$) is similar to the
weight update equations at layer 3. In Scalero and
Tepedelenlioglu (1992) it has been shown that the
presence of a nonlinear function (in this study the
Gaussian membership function) between the weights and
the error makes the back-propagation algorithm different
from standard least square adaptive filtering techniques,
which are known to have rapid convergence (Haykin,
1986).

[Figure 1 diagram: Layer 1 (input linguistic nodes), Layer 2 (input term nodes), Layer 3 (rule nodes), Layer 4 (output term nodes), Layer 5 (output linguistic nodes). Legend: Δθ, differential temperature; ΔH, differential humidity; ê_LC, actual load correction; e_LC, desired load correction; t, iteration number; w_ij, weight.]

FIGURE 1. Hybrid neural network modelled as a fuzzy expert system.

The FES2 model uses the linear Kalman filter
equations for updating the weights and the membership
functions. Unlike the back-propagation technique, this
algorithm assumes that the estimated weight matrix is
nonstationary and hence allows the tracking of
time-varying data such as those encountered in load forecasting.

This algorithm defines locally at each node a gradient 
based on present and past data, and updates the weights 
of each node using the linear Kalman filter equations to 
bring this gradient identically to zero whenever an 
update is made. Performing the update and defining the 
gradient in this manner ensures that maximum use is 
made of available information. 

The gradient for the linear combiner at each node is defined
as

$$G = R\,W - C. \qquad (35)$$

The weight vector that makes $G = R\,W - C$ zero is the
solution to the normal equations.

Here R is the autocorrelation matrix for each layer
and is calculated as

$$R = \sum_{np=1}^{NP} f^{\,NP-np}\, x_{np}\, x_{np}^{T}, \qquad (36)$$

and C is the cross-correlation matrix and is given by

$$C = \sum_{np=1}^{NP} f^{\,NP-np}\, d_{np}\, x_{np}^{T}, \qquad (37)$$

where NP denotes the total number of patterns. $d_{np}$ and
$x_{np}$ are the summation output and the output of the
nonlinearity (Gaussian membership function), respectively,
for the layer 2 and layer 5 nodes. As
the layer 4 nodes contain no nonlinearity term, $d_{np} = x_{np}$.
The weights given by eqn. (35) are updated by using a
Kalman filter at each layer with a variable forgetting
factor (f), as shown in eqns. (36) and (37). A variable
forgetting factor is used to favor new estimates
by giving less weight to old estimates.
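
The role of the forgetting factor in the correlation sums, and the meaning of a zero gradient, can be seen in a small sketch for a single node with two inputs. The data, the fixed forgetting factor f = 0.9, and the 2-by-2 solver are assumptions made for illustration only.

```python
def correlations(xs, ds, f):
    """Exponentially weighted R = sum f^(NP-np) x x^T and C = sum f^(NP-np) d x."""
    n, total = len(xs[0]), len(xs)
    R = [[0.0] * n for _ in range(n)]
    C = [0.0] * n
    for p, (x, d) in enumerate(zip(xs, ds), start=1):
        weight = f ** (total - p)        # older patterns receive smaller weight
        for i in range(n):
            C[i] += weight * d * x[i]
            for j in range(n):
                R[i][j] += weight * x[i] * x[j]
    return R, C


def solve2(R, C):
    """Solve the 2-by-2 normal equations R W = C by Cramer's rule."""
    det = R[0][0] * R[1][1] - R[0][1] * R[1][0]
    return [(C[0] * R[1][1] - R[0][1] * C[1]) / det,
            (R[0][0] * C[1] - C[0] * R[1][0]) / det]


# Hypothetical patterns generated by d = 2*x0 - x1.
xs = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
ds = [2.0, -1.0, 1.0, 3.0]

R, C = correlations(xs, ds, f=0.9)
W = solve2(R, C)
G = [sum(R[i][j] * W[j] for j in range(2)) - C[i] for i in range(2)]
print("weights:", W)
print("gradient:", G)
```

For consistent data the solution does not depend on f; the forgetting factor matters when the underlying relation drifts, because recent patterns then dominate R and C.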

The Kalman gain vector $K_j(t)$ at each layer j is given
by

$$K_j(t) = \frac{R_j^{-1}(t)\, x_j(t)}{f_j(t) + x_j^{T}(t)\, R_j^{-1}(t)\, x_j(t)}, \qquad (38)$$

where $x_j(t)$ corresponds to the previous layer, $\alpha_j$ is the
learning rate, and t denotes the iteration number. The
error, E, at each iteration is obtained from eqn. (17). The
first term in eqn. (38) expresses the Kalman gain, $K_j$, in
terms of the inverse autocorrelation matrix $R_j$ and the
input to the previous layer $x_j$. The second term is used to
track the error along a gradient descent surface to attain
the final convergence (similar to tuning of membership
functions). This helps in a faster rate of convergence.

TABLE 1
The Learned Fuzzy Logic Rules for 24 Hour Ahead Peak Load Forecasting Using FES in Winter

        Term sets
        Preconditions                        Consequence
Rule    Δθ_max(i, i+1)    ΔH_max(i, i+1)     ê_LC(i)
0       0                 3                  7
1       0                 4                  7
2       1                 0                  8
3       1                 1                  7
4       1                 2                  7
5       1                 3                  6
6       1                 4                  6
7       2                 0                  8
8       2                 1                  7
9       2                 2                  7
10      2                 3                  6
11      2                 4                  7
12      3                 1                  2
13      3                 2                  4
14      3                 3                  6
15      4                 2                  5
16      4                 3                  6
17      4                 4                  1
18      5                 0                  3
19      5                 1                  2
20      5                 2                  1
21      5                 3                  1
22      5                 4                  0
23      6                 0                  1
24      6                 1                  1
25      6                 2                  0

During training the network is randomly initialized. 
The original training pairs become obsolete as the 
adjacent layers adapt their weights according to their 
perceived observations. Hence, the R,(t) matrix for a 
particular layer, containing the sum total of all prior 
observations, becomes outdated. Because the Kalman 
filter minimizes the prediction effort of all prior observa- 
tions, parameter estimates become biased towards initial 
conditions and the solution to this problem is the 
modification of the covariance matrix and the forgetting 
factor in the following manner: 

$$f_j(t+1) = 1 - \gamma_j\, \bar{e}^{2}(t), \qquad (39)$$

$$R_j^{-1}(t+1) = \left[R_j^{-1}(t) - K_j(t)\, x_j^{T}(t)\, R_j^{-1}(t)\right] / f_j, \qquad (40)$$

where

$$\bar{e}^{2}(t) = \frac{e^{2}(t)}{1 + x_j^{T}(t)\, R_j^{-1}(t)\, x_j(t)} \qquad (41)$$

and $\gamma_j$ is a constant that is inversely proportional to the
prediction error, e(t), at that layer. As a result, $f_j$ remains
close to 1 when $R_j$ is already large and the weight
estimates are sensitive to parameter variations. A lower
bound $f_0$ on $f_j$ is used to prevent the forgetting factor
from becoming excessively small, which would result in large
estimate fluctuations in spite of small prediction errors.
The strategy for employing this forgetting factor is to
initially allow it to assume small values during learning
such that the initial conditions are quickly forgotten.
However, as the average prediction error decreases, it is
important to keep $f_j$ near one to avoid the
winding up of the weight estimates. This can be simply
implemented by setting $\gamma_j$ smaller as the mean squared
error decreases.

The unsupervised learning phase of FES2 is the same as
for FES1. The supervised learning phase of FES2 is
modified using the linear Kalman filter equations instead
of the back-propagation method used for FES1. Therefore,
the update equations are modified as follows:

a) The weight update equation for layer 4 is:

$$w_{ij}(t+1) = w_{ij}(t) + \eta\, K_j(t)\left[-\frac{\partial E}{\partial w_{ij}}\right], \qquad (42)$$

where $\partial E / \partial w_{ij}$ is given by eqn. (21) and $K_j(t)$ is given
by eqn. (38).

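b) The update equations for $a_{ij}$ and $b_{ij}$ at layer 5 are:

$$a_{ij}(t+1) = a_{ij}(t) + \eta_1\, K_j(t)\left[-\frac{\partial E}{\partial a_{ij}}\right], \qquad (43)$$

where $\partial E / \partial a_{ij}$ is given by eqn. (23) and $K_j(t)$ is given by eqn. (38), and

$$b_{ij}(t+1) = b_{ij}(t) + \eta_2\, K_j(t)\left[-\frac{\partial E}{\partial b_{ij}}\right], \qquad (44)$$

where $\partial E / \partial b_{ij}$ is given by eqn. (26) and $K_j(t)$ is given
by eqn. (38).

c) The update equations for $a_{ij}$ and $b_{ij}$ at layer 2 are:

$$a_{ij}(t+1) = a_{ij}(t) + \eta_3\, K_j(t)\left[-\frac{\partial E}{\partial a_{ij}}\right], \qquad (45)$$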
[Figure 2 panels: learned membership functions (vertical axis from 0 to 1.0) of the maximum temperature difference (horizontal axis from -12 to 12), the maximum humidity difference (%), and the peak load difference (MW, horizontal axis from -300 to 300), each shown after unsupervised learning and after supervised learning.]

FIGURE 2. Learned membership functions for 24 h ahead peak load forecasting, in January (winter) using FES,. 


[Figure 3 plot: mean absolute percentage error versus number of iterations (horizontal axis from 400 to 2400) for the ANN, FES1, and FES2 models.]

FIGURE 3. Comparison of the mean absolute percentage
errors versus the iteration number for 24 h ahead forecast in
January (winter).

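where $\partial E / \partial a_{ij}$ is given by eqn. (31) and $K_j(t)$ is given
by eqn. (38).

Similarly,

$$b_{ij}(t+1) = b_{ij}(t) - \eta_4\, K_j(t)\, \delta^{2} \exp\left[-\frac{(x_i - a_{ij})^2}{b_{ij}^2}\right] \frac{2 (x_i - a_{ij})^2}{b_{ij}^3}, \qquad (46)$$

where $\delta^{2}$ is given by eqn. (32) and $K_j(t)$ is given by eqn.
(38).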

The convergence properties of the supervisory learning
scheme for FES2 are found to be superior to those of FES1
because of the Kalman filter equations used in the weight and
membership update equations, as shown in Sections 6.3
and 6.4. The accuracy of the forecast also increases for
a similar reason.
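
The recursions behind the Kalman-filter-based update can be sketched with the standard recursive least-squares form for a single linear combiner. This is an assumed textbook formulation, not the exact update used for FES2: the gain is K = P x / (f + x^T P x) with P standing for the inverse correlation matrix, the weights are corrected by K times the a-priori error, and f is held constant rather than adapted. The data and initial values are invented.

```python
def rls_step(w, P, x, d, f=0.98):
    """One recursive least-squares step with forgetting factor f.

    w: weight list, P: current estimate of the inverse correlation matrix,
    x: input vector, d: desired output of the linear combiner.
    """
    n = len(x)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = f + sum(x[i] * Px[i] for i in range(n))
    K = [v / denom for v in Px]                       # gain vector
    err = d - sum(w[i] * x[i] for i in range(n))      # a-priori error
    w = [w[i] + K[i] * err for i in range(n)]
    xP = [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]
    # Inverse correlation update: P <- (P - K x^T P) / f
    P = [[(P[i][j] - K[i] * xP[j]) / f for j in range(n)] for i in range(n)]
    return w, P


# Hypothetical patterns generated by d = 2*x0 - x1.
samples = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0),
           ((1.0, 1.0), 1.0), ((2.0, 1.0), 3.0)]
w = [0.0, 0.0]
P = [[1000.0, 0.0], [0.0, 1000.0]]    # large initial value: weak prior on w
for _ in range(50):
    for x, d in samples:
        w, P = rls_step(w, P, x, d)
print(w)
```

The weights approach the generating values within a few presentations, which illustrates why a least-squares recursion can need far fewer iterations than plain gradient descent.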

R,-‘(t+ 1) =$(t)R,-‘(t) +$(t)x,(t) 

(46)

and the forgetting factor $f_j$ is modified as

$$f_j(t+1) = 1 - \gamma\, \bar{e}^{2}(t+1). \qquad (47)$$

TABLE 2
Peak Load Forecasting in June (Summer) Using 24 Hour Ahead Forecast

       Actual      ANN                   FES1                  FES2
       peak load   Forecast   Error      Forecast   Error      Forecast   Error
Day    (MW)        (MW)       (%)        (MW)       (%)        (MW)       (%)
1      1848        1872.8     -1.34      1823.3      1.34      1825.7      1.21
2      1850        1883.7     -1.82      1844.7      0.29      1831.8      0.99
3      1852        1827.8      1.31      1833.6      0.99      1868.0     -0.86
4      1687        1627.6      3.52      1669.7      1.03      1688.9     -0.11
5      1573        1499.6      4.67      1539.4      2.14      1544.1      1.84
6      1752        1752.4     -0.02      1747.6      0.25      1749.1      0.17
7      1679        1661.4      1.05      1671.1      0.47      1668.5      0.62
8      1674        1726.6     -3.14      1652.6      1.28      1698.5     -1.46
9      1712        1750.3     -2.24      1686.0      1.52      1715.3     -0.19
10     1778        1858.3     -4.52      1752.7      1.42      1761.8      0.91
11     1614        1660.4     -2.87      1620.6     -0.41      1602.7      0.70
12     1520        1456.7      4.16      1474.5      3.00      1559.6     -2.60
13     1730        1705.0      1.45      1747.4     -1.01      1719.5      0.61
14     1663        1709.5     -2.80      1674.5     -0.69      1657.8      0.31
15     1722        1755.8     -1.96      1755.7     -1.96      1741.7     -1.14
16     1710        1745.5     -2.08      1681.4      1.67      1693.5      0.97
17     1806        1838.3     -1.79      1805.1      0.05      1803.3      0.15
18     1730        1788.2     -3.37      1757.7     -1.60      1718.3      0.68
19     1622        1584.9      2.29      1649.1     -1.67      1644.7     -1.40
20     1752        1775.7     -1.35      1772.9     -1.20      1738.9      0.75
21     1658        1648.6      0.57      1671.4     -0.81      1662.6     -0.28
22     1672        1684.1     -0.72      1685.3     -0.80      1680.3     -0.50
23     1701        1643.7      3.37      1718.8     -1.05      1727.7     -1.57
24     1736        1669.5      3.83      1702.9      1.91      1714.4      1.24
25     1588        1556.1      2.01      1621.0     -2.08      1619.2     -1.96
26     1482        1413.4      4.63      1463.9      1.22      1462.9      1.29
27     1720        1672.0      2.79      1695.7      1.41      1705.4      0.85
28     1699        1750.4     -3.02      1708.2     -0.53      1665.9      1.95
29     1684        1666.6      1.03      1705.2     -1.26      1667.5      0.98
30     1693        1757.5     -3.81      1676.9      0.95      1721.2     -1.66

MAPE                           2.45                  1.20                  1.00
Iterations required            970                   520                   440


TABLE 3
Peak Load Forecasting in January (Winter) Using 48 Hour Ahead Forecast

       Actual      ANN                   FES1                  FES2
       peak load   Forecast   Error      Forecast   Error      Forecast   Error
Day    (MW)        (MW)       (%)        (MW)       (%)        (MW)       (%)
1      2690        2553.1      5.09      2663.6      0.98      2673.3      0.62
2      2628        2554.4      2.80      2577.5      1.92      2663.2     -1.34
3      2703        2657.0      1.70      2672.7      1.12      2728.7     -0.95
4      2592        2631.9     -1.54      2560.6      1.21      2561.4      1.18
5      2530        2583.9     -2.13      2509.5      0.81      2544.9     -0.59
6      2574        2590.2     -0.63      2591.2     -0.67      2555.5      0.72
7      2389        2383.7      0.22      2400.0     -0.46      2367.0      0.92
8      2513        2482.6      1.21      2493.4      0.78      2497.7      0.61
9      2500        2372.0      5.12      2595.8     -3.83      2568.0     -2.72
10     2450        2366.7      3.40      2490.9     -1.67      2478.9     -1.18
11     2551        2504.3      1.83      2569.9     -0.74      2557.4     -0.25
12     2763        2889.0     -4.56      2849.2     -3.12      2811.6     -1.76
13     2603        2563.4      1.52      2623.3     -0.78      2620.2     -0.66
14     2914        2781.7      4.54      2875.8      1.31      2895.9      0.62
15     2761        2865.1     -3.77      2780.9     -0.72      2783.4     -0.81
16     2514        2425.0      3.54      2567.3     -2.12      2552.0     -1.51
17     2543        2647.8     -4.12      2511.5      1.24      2516.3      1.05
18     2435        2500.0     -2.67      2486.6     -2.12      2414.8      0.83
19     2496        2468.8      1.09      2487.0      0.36      2488.3      0.31
20     2551        2484.2      2.62      2505.6      1.78      2522.9      1.10
21     2813        2882.8     -2.48      2773.3      1.41      2794.2      0.67
22     2537        2427.4      4.32      2591.3     -2.14      2574.8     -1.49
23     2381        2282.2      4.15      2323.1      2.43      2333.1      2.01
24     2459        2378.3      3.28      2432.7      1.07      2433.2      1.05
25     2505        2487.0      0.72      2521.3     -0.65      2517.3     -0.49
26     2429        2500.2     -2.93      2451.3     -0.92      2450.6     -0.89
27     2438        2320.7      4.81      2379.0      2.42      2403.1      1.43
28     2748        2771.4     -0.85      2711.7      1.32      2702.9      1.64
29     2388        2503.3     -4.83      2347.2      1.71      2350.5      1.57
30     2175        2132.4      1.96      2208.7     -1.55      2142.2      1.51
31     2539        2466.1      2.87      2520.0      0.75      2521.0      0.71

MAPE                           2.82                  1.42                  1.07
Iterations required            1560                  880                   700

$$\bar{e}^{2}(t+1) = \frac{[e(t)]^{2}}{1 + x^{T}(t)\, R_j(t-1)\, x(t)} \qquad (48)$$

where $\gamma$ is a constant inversely proportional to the
prediction error e(t).

6. IMPLEMENTATION RESULTS 

In order to evaluate the performance of the fuzzy expert
system, load forecasting is performed on typical utility
data. The models ANN, FES1, and FES2 are tested on
2-year utility data for generating peak and average load
profiles and some of the results are given in the
subsequent subsections. In Bunn and Farmer (1989) and
Brace (1991) it has been shown that ANN gives the best
prediction and accuracy compared to conventional
approaches. So in this case the results of FES1 and FES2
are compared to that of the ANN approach.
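
The error measures used in the comparison can be sketched as follows. The percentage error is taken here as (actual - forecast) / actual x 100 and the MAPE as the mean of its absolute value; this definition is an inference from the tabulated values rather than a formula stated in the text. The three data points are the first three days of Table 2 (actual peak load and the second forecast column, read here as FES1).

```python
def percentage_error(actual, forecast):
    """Signed percentage error of a forecast relative to the actual load."""
    return 100.0 * (actual - forecast) / actual


def mape(actuals, forecasts):
    """Mean absolute percentage error over paired actual and forecast loads."""
    errors = [abs(percentage_error(a, f)) for a, f in zip(actuals, forecasts)]
    return sum(errors) / len(errors)


actual = [1848.0, 1850.0, 1852.0]        # MW, days 1 to 3 of Table 2
forecast = [1823.3, 1844.7, 1833.6]      # MW, days 1 to 3 of Table 2

for a, f in zip(actual, forecast):
    print(round(percentage_error(a, f), 2))
print("MAPE over these three days:", round(mape(actual, forecast), 2))
```

A positive error means the forecast was below the actual load, which matches the sign pattern in the tables.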

The training sets are formed separately for each of the 

seven day types (i.e., Tuesdays through Thursdays, 
Mondays, Fridays, Saturdays, Sundays, Holidays). The 
selection of training patterns and the selection of variable 
ranges are given in Sections 6.1 and 6.2. 

6.1. Optimum Selection of Training Patterns 

The utility data studied here are susceptible to large and 
sudden changes in weather and load, so selection of 
appropriate training cases plays a vital role in training the 
network. Several techniques for the selection of training 
patterns have been suggested in Peng, Hubele, and 
Karady (1992). 

The following load model is used for peak load
forecasting:

$$P_{max}(i) = f\big(P_{max}(i-n), P_{max}(i-n-1), \ldots, P_{max}(i-n-n_1), \theta_{max}(i-n), \theta_{max}(i), \theta_{max}(i-1), \ldots, \theta_{max}(i-n_2), H_{max}(i-n), H_{max}(i), H_{max}(i-1), \ldots, H_{max}(i-n_3)\big). \qquad (49)$$

TABLE 4
Peak Load Forecasting in January (Winter) Using 168 Hour Ahead Forecast

       Actual      ANN                   FES1                  FES2
       peak load   Forecast   Error      Forecast   Error      Forecast   Error
Day    (MW)        (MW)       (%)        (MW)       (%)        (MW)       (%)
1      2690        2534.5      5.78      2630.0      2.23      2742.2     -1.94
2      2628        2556.0      2.74      2564.7      2.41      2571.0      2.17
3      2703        2610.8      3.41      2649.5      1.98      2666.5      1.35
4      2592        2651.4     -2.29      2636.6     -1.72      2551.0      1.58
5      2530        2616.5     -3.42      2580.1     -1.98      2480.7      1.95
6      2574        2503.5      2.74      2519.9      2.10      2628.8     -2.13
7      2389        2433.2     -1.85      2351.5      1.57      2354.8      1.43
8      2513        2433.1      3.18      2444.9      2.71      2449.7      2.52
9      2500        2353.8      5.85      2370.8      5.17      2378.5      4.86
10     2450        2324.3      5.13      2340.5      4.47      2360.1      3.67
11     2551        2620.1     -2.71      2510.7      1.58      2519.6      1.23
12     2763        2953.9     -6.91      2886.0     -4.45      2874.1     -4.02
13     2603        2531.9      2.73      2555.9      1.81      2649.6     -1.79
14     2914        2761.6      5.23      2853.1      2.09      2972.3     -2.00
15     2761        2894.6     -4.84      2658.3      3.72      2660.2      3.65
16     2514        2389.6      4.95      2408.4      4.20      2415.2      3.93
17     2543        2697.4     -6.07      2653.9     -4.36      2651.8     -4.28
18     2435        2494.7     -2.45      2454.7     -0.81      2424.8      0.83
19     2496        2463.8      1.29      2509.2     -0.53      2481.5      0.58
20     2551        2456.6      3.70      2493.9      2.24      2514.5      1.43
21     2813        2886.4     -2.61      2864.5     -1.83      2762.9      1.78
22     2537        2382.0      6.11      2441.9      3.75      2462.2      2.95
23     2381        2279.8      4.25      2285.0      4.03      2285.0      4.03
24     2459        2540.6     -3.32      2518.5     -2.42      2507.4     -1.97
25     2505        2575.4     -2.81      2538.8     -1.35      2537.8     -1.31
26     2429        2500.7     -2.95      2480.5     -2.12      2462.3     -1.37
27     2438        2281.2      6.43      2351.0      3.57      2370.2      2.78
28     2748        2839.0     -3.31      2808.2     -2.19      2784.3     -1.32
29     2388        2513.8     -5.27      2481.8     -3.93      2474.2     -3.61
30     2175        2125.4      2.28      2198.9     -1.10      2159.3      0.72
31     2539        2456.2      3.26      2476.3      2.47      2489.2      1.96

MAPE                           3.87                  2.61                  2.29
Iterations required            2050                  1170                  890

For the average load forecast, a model similar to eqn. (49) is
chosen, given by:

$$
\begin{aligned}
P_{av}(i) = f\big(&P_{av}(i-n),\ P_{av}(i-1),\ \ldots,\ P_{av}(i-n_1),\\
&\Theta_{\max}(i-n),\ \Theta_{\max}(i),\ \Theta_{\max}(i-1),\ \ldots,\ \Theta_{\max}(i-n_2),\\
&\Theta_{\min}(i-n),\ \Theta_{\min}(i),\ \Theta_{\min}(i-1),\ \ldots,\ \Theta_{\min}(i-n_2),\\
&H_{\max}(i-n),\ H_{\max}(i),\ H_{\max}(i-1),\ \ldots,\ H_{\max}(i-n_3),\\
&H_{\min}(i-n),\ H_{\min}(i),\ H_{\min}(i-1),\ \ldots,\ H_{\min}(i-n_3)\big).
\end{aligned}
\tag{50}
$$

Here, n indicates the lead time of the forecast (n * n1, n2, n3).
P, $\Theta$, and H stand for load, temperature, and humidity,
respectively. Different values of n1, n2, and n3 were tested to
find the effect of the past data on the load forecast. With
n1, n2, n3 > 0 there was no marked improvement in the results.
Therefore, we chose n1 = n2 = n3 = 0 in this study. However, the
choice of n1, n2, n3 depends entirely on the utility data
concerned.
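
As an illustration of how the inputs of the peak load model in eqn. (49) line up, the following minimal sketch assembles one training pattern from daily series. It assumes the series are plain Python lists indexed by day and that the lead time n is expressed in days; the function and argument names are hypothetical and do not come from the paper.

```python
def peak_load_pattern(p_max, t_max, h_max, i, n, n1=0, n2=0, n3=0):
    """Build (inputs, target) for day i following the layout of eqn. (49).

    p_max, t_max, h_max: daily peak load, max temperature, max humidity.
    n: lead time in days (assumption); n1, n2, n3: extra lags.
    """
    if i - max(n + n1, n2, n3) < 0 or i >= len(p_max):
        raise ValueError("not enough history for day i")
    # Load lags: P(i-n), P(i-n-1), ..., P(i-n-n1)
    loads = [p_max[i - n - k] for k in range(n1 + 1)]
    # Temperature: value at i-n, then i, i-1, ..., i-n2
    temps = [t_max[i - n]] + [t_max[i - k] for k in range(n2 + 1)]
    # Humidity: same layout as temperature, with n3 lags
    hums = [h_max[i - n]] + [h_max[i - k] for k in range(n3 + 1)]
    return loads + temps + hums, p_max[i]
```

With n1 = n2 = n3 = 0, as chosen in the study, each pattern has five inputs: the load, temperature, and humidity of the day the forecast is issued, plus the temperature and humidity of the target day.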

6.2. Scaling of the Variable Range 

Because the input variables and the values estimated by the
hybrid neural network vary widely in magnitude, they can cause
convergence problems and make the system behave erratically. To
circumvent this problem, the variables are scaled between 0.1
and 0.9 (Rahman, Drezga, & Rajagopalan, 1993), so as to maximise
accuracy and minimise training time. Let $X_{i,\max}$ and
$X_{i,\min}$ denote the upper and lower bounds of the observed
range of feature $X_i$ over all the patterns in the historical
database, considering numerical values only. Similarly,
$O_{i,\max}$ and $O_{i,\min}$ denote the upper and lower bounds
of the observed range of outputs. Then $X_{ij}$ is normalised as

$$
X'_{ij} = 0.1 + \left[\frac{0.8\,(X_{ij} - X_{i,\min})}{X_{i,\max} - X_{i,\min}}\right]
\tag{51}
$$

where (i, j) corresponds to the jth pattern of the ith training
set.


TABLE 5
Average Load Forecasting in January (Winter) Using 24 Hour Ahead Forecast

Columns: actual load, then the forecast and percentage error (PE)
for the ANN, FES1, and FES2 models.

Day  Actual      ANN      PE     FES1      PE     FES2      PE
       (MW)     (MW)     (%)     (MW)     (%)     (MW)     (%)

  1    2166   2223.4   -2.65   2161.2    0.22   2163.5    0.12
  2    2153   2209.8   -2.64   2192.4   -1.83   2120.8    1.49
  3    2233   2218.1    0.67   2242.3   -0.42   2216.6    0.73
  4    2093   2092.1    0.04   2092.6    0.02   2089.0    0.19
  5    2050   2041.0    0.44   2050.1   -0.00   2050.0   -0.00
  6    2109   2109.3   -0.01   2107.9    0.05   2108.2    0.04
  7    1937   1933.4    0.18   1934.2    0.14   1936.9    0.00
  8    1983   1974.9    0.41   1969.4    0.68   1976.2    0.34
  9    2045   1991.6    2.61   2008.7    1.78   2080.1   -1.72
 10    2003   2036.6   -1.68   2007.1   -0.21   1998.2    0.24
 11    1982   1954.9    1.37   1980.2    0.09   1980.5    0.07
 12    2117   2116.2    0.04   2116.8    0.01   2116.6    0.02
 13    2064   2035.6    1.37   2042.8    1.03   2049.9    0.70
 14    2237   2262.3   -1.13   2241.5   -0.20   2240.6   -0.17
 15    2198   2239.1   -1.87   2171.0    1.23   2221.3   -1.06
 16    2068   2065.3    0.13   2065.2    0.14   2067.1    0.04
 17    2021   2057.0   -1.78   2047.9   -1.33   1994.9    1.29
 18    1976   1953.7    1.13   1970.8    0.26   1982.1   -0.31
 19    2019   2019.9   -0.04   2030.1   -0.55   2007.3    0.58
 20    2073   2090.0   -0.82   2053.3    0.95   2087.2   -0.68
 21    2195   2224.6   -1.35   2175.8    0.87   2203.5   -0.39
 22    2015   1982.8    1.60   2041.4   -1.31   2003.3    0.58
 23    1917   1866.8    2.62   1935.1   -0.94   1936.8   -1.03
 24    1977   1943.5    1.70   1961.3    0.80   1961.0    0.81
 25    2021   2017.5    0.17   2017.1    0.20   2024.1   -0.15
 26    1977   1961.6    0.78   1965.6    0.58   1970.9    0.31
 27    1964   2010.8   -2.38   1921.8    2.15   1939.0    1.27
 28    2086   2124.2   -1.83   2078.7    0.35   2077.9    0.39
 29    1876   1889.6   -0.72   1889.4   -0.72   1883.0   -0.37
 30    1772   1751.3    1.17   1757.2    0.83   1765.5    0.37
 31    1960   2000.1   -2.05   1928.1    1.63   1924.6    1.81

MAPE (%):             ANN 1.21, FES1 0.69, FES2 0.56
Iterations required:  ANN 1700, FES1 1020, FES2 680

Similarly, the output $O_{ij}$ can be expressed as

$$
O'_{ij} = 0.1 + \left[\frac{0.8\,(O_{ij} - O_{i,\min})}{O_{i,\max} - O_{i,\min}}\right]
\tag{52}
$$

where $X'$ and $O'$ denote the normalised input and output
vectors, respectively.

The normalised predicted values can be converted 
back to the actual values using the above expressions. 
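
The scaling to the range 0.1 to 0.9 and its inverse can be sketched as below. The function names are illustrative assumptions; the bounds in the usage note are simply the smallest and largest actual peak loads listed in Table 4, used here only as example numbers.

```python
def scale(x, lo, hi):
    """Map x from the observed range [lo, hi] to [0.1, 0.9]."""
    return 0.1 + 0.8 * (x - lo) / (hi - lo)


def unscale(x_scaled, lo, hi):
    """Invert scale(): recover the original value from a scaled one."""
    return lo + (x_scaled - 0.1) * (hi - lo) / 0.8
```

For example, with lo = 2175 MW and hi = 2914 MW, a load of 2175 MW maps to 0.1 and a load of 2914 MW maps to 0.9; applying unscale to a network output in that range returns a value in megawatts.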

The ideas discussed in Sections 6.1 and 6.2 are used for
obtaining the peak and average load forecasts.

6.3. Peak Load Forecasting 

For n h ahead peak load forecasting, the following training data
are used for the ANN:

Input pattern: $P_{\max}(i)$, $\Theta_{\max}(i)$, $H_{\max}(i)$,
$\Theta^{f}_{\max}(i+n)$, $H^{f}_{\max}(i+n)$

Output pattern: $P_{\max}(i+n)$ for the ANN.

Superscript f denotes the forecasted values for $\Theta$ and H.
The forecasted values for the (i + n)th day are used to get a
more realistic forecast.

For FES1 and FES2 the training patterns used are:

Input pattern: $\Delta\Theta_{\max}(i, i+n)$ and $\Delta H_{\max}(i, i+n)$

Output pattern: erc(i), the desired load correction.

The $P_{\max}(i+n)$ for FES1 and FES2 is obtained using eqn. (7).

Table 1 gives the learned membership functions for 24 h ahead
peak load forecasting in winter using the FES. For example,
rule 0 is interpreted as:

R0: IF $\Delta\Theta_{\max}$ is term 0 and $\Delta H_{\max}$ is term 3
THEN erc is term 7.
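
To make the reading of a rule such as R0 concrete, the sketch below computes how strongly a rule fires for a given pair of inputs. It is a generic illustration only: the Gaussian membership shape, the use of the minimum for the fuzzy AND, and the centre and width values in the usage note are assumptions made for the example and are not the learned values of Table 1.

```python
import math


def gaussian_membership(x, centre, width):
    """Degree (0..1) to which x belongs to a term with the given centre/width."""
    return math.exp(-((x - centre) / width) ** 2)


def firing_strength(inputs, antecedent_terms):
    """Fuzzy AND of the antecedents, taken here as the minimum membership.

    inputs: crisp input values, e.g. scaled temperature and humidity changes.
    antecedent_terms: one (centre, width) pair per input (hypothetical values).
    """
    return min(
        gaussian_membership(x, centre, width)
        for x, (centre, width) in zip(inputs, antecedent_terms)
    )
```

For instance, firing_strength([0.2, 0.5], [(0.2, 0.1), (0.6, 0.1)]) evaluates a rule whose first antecedent is fully satisfied and whose second is only partly satisfied, so the rule fires with the smaller of the two degrees.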

TABLE 6
Average Load Forecast in June (Summer) Using 48 Hour Ahead Forecast

Columns: actual load, then the forecast and percentage error (PE)
for the ANN, FES1, and FES2 models.

Day  Actual      ANN      PE     FES1      PE     FES2      PE
       (MW)     (MW)     (%)     (MW)     (%)     (MW)     (%)

  1    1521   1541.1   -1.32   1509.7    0.74   1516.7    0.28
  2    1531   1562.5   -2.06   1512.3    1.22   1549.5   -1.21
  3    1473   1455.2    1.21   1453.1    1.35   1456.5    1.12
  4    1336   1380.9   -3.36   1370.3   -2.57   1305.9    2.25
  5    1285   1254.0    2.41   1260.8    1.88   1272.9    0.94
  6    1433   1446.2   -0.92   1443.5   -0.73   1423.5    0.66
  7    1406   1375.5    2.17   1386.6    1.38   1418.5   -0.89
  8    1403   1442.3   -2.80   1437.0   -2.42   1418.0   -1.07
  9    1428   1461.0   -2.31   1450.7   -1.59   1448.8   -1.46
 10    1431   1467.3   -2.54   1459.8   -2.01   1411.3    1.38
 11    1301   1272.8    2.17   1286.0    1.15   1322.7   -1.67
 12    1235   1206.7    2.29   1217.8    1.39   1219.9    1.22
 13    1418   1394.7    1.64   1419.3   -0.09   1419.8   -0.13
 14    1408   1395.5    0.89   1401.9    0.43   1399.8    0.58
 15    1421   1451.3   -2.13   1443.0   -1.55   1439.6   -1.31
 16    1447   1480.1   -2.29   1468.4   -1.48   1429.9    1.18
 17    1440   1420.7    1.34   1425.3    1.02   1428.3    0.81
 18    1352   1313.9    2.82   1324.8    2.01   1330.5    1.59
 19    1305   1279.8    1.93   1282.6    1.72   1290.3    1.13
 20    1430   1407.5    1.57   1416.6    0.94   1440.3   -0.72
 21    1383   1387.8   -0.35   1395.3   -0.89   1389.2   -0.45
 22    1414   1434.1   -1.42   1425.0   -0.78   1401.1    0.91
 23    1423   1386.9    2.54   1404.1    1.33   1410.9    0.85
 24    1406   1361.6    3.16   1370.4    2.53   1373.1    2.34
 25    1267   1230.6    2.87   1250.3    1.32   1248.6    1.45
 26    1221   1254.3   -2.73   1207.9    1.07   1234.4   -1.10
 27    1397   1371.7    1.81   1401.7   -0.34   1395.9    0.08
 28    1401   1374.1    1.92   1384.3    1.19   1384.9    1.15
 29    1403   1415.2   -0.87   1416.5   -0.96   1389.3    0.98
 30    1411   1376.1    2.47   1393.6    1.23   1390.1    1.48

MAPE (%):             ANN 2.01, FES1 1.31, FES2 1.08
Iterations required:  ANN 1630, FES1 1520, FES2 1070

Figure 2 gives the learned membership functions after the first
phase (unsupervised learning phase) and the second phase
(supervised learning phase). Figure 3 gives the mean absolute
percentage errors (MAPEs) versus the number of iterations. The
results in Figures 2 and 3 are obtained for 24 h ahead load
forecasting in January (winter). From Figure 3 we find that the
Kalman filter-based FES gives the fastest convergence, followed
by the back-propagation-based FES and the ANN. Its convergence
speed was found to be superior because of the linear Kalman
filter equations used for the weight update; the error-dependent
forgetting factor was responsible for driving the MAPE low
during the first few hundred iterations, until the bias
introduced by the initial conditions was eliminated.
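
The role of a forgetting factor in a Kalman-type weight update can be illustrated with a standard recursive least-squares step for a single linear output. This is a generic sketch and not the update equations of the paper: it uses a constant forgetting factor `lam` in place of the error-dependent one, and the names `rls_update`, `fit`, and `p0` are hypothetical.

```python
def rls_update(w, P, x, target, lam=0.98):
    """One recursive least-squares step with constant forgetting factor lam.

    w: weight list, P: symmetric covariance matrix (list of lists),
    x: input list, target: desired output.
    """
    n = len(w)
    Px = [sum(P[r][c] * x[c] for c in range(n)) for r in range(n)]
    denom = lam + sum(x[r] * Px[r] for r in range(n))
    gain = [v / denom for v in Px]                      # Kalman-type gain
    err = target - sum(wi * xi for wi, xi in zip(w, x))  # a priori error
    w_new = [wi + g * err for wi, g in zip(w, gain)]
    # Covariance update; dividing by lam discounts older samples.
    P_new = [[(P[r][c] - gain[r] * Px[c]) / lam for c in range(n)]
             for r in range(n)]
    return w_new, P_new, err


def fit(samples, n_inputs, lam=0.98, p0=1000.0):
    """Run rls_update over (x, target) pairs, starting from zero weights."""
    w = [0.0] * n_inputs
    P = [[p0 if r == c else 0.0 for c in range(n_inputs)]
         for r in range(n_inputs)]
    for x, y in samples:
        w, P, _ = rls_update(w, P, x, y, lam)
    return w
```

A large initial `p0` lets the first few samples move the weights quickly, and a value of `lam` below 1 progressively discounts the influence of the initial conditions, which is the qualitative effect described in the paragraph above.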

Table 2 gives the peak load forecasting results in terms of mean
absolute percentage errors (MAPEs) for the ANN, FES1, and FES2
in the month of June (summer) using 24 h ahead forecast.
Tables 3 and 4 give the results for the month of January
(winter) using 48 h and 168 h ahead forecast, respectively.
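
The error measures reported in the tables can be reproduced with the short sketch below. The sign convention (actual minus forecast, divided by actual) is inferred from the tabulated values rather than stated in this section, and the function names are illustrative.

```python
def percentage_error(actual, forecast):
    """Signed percentage error (PE) of one forecast, in percent."""
    return 100.0 * (actual - forecast) / actual


def mape(actuals, forecasts):
    """Mean absolute percentage error over paired actual/forecast values."""
    errors = [abs(percentage_error(a, f)) for a, f in zip(actuals, forecasts)]
    return sum(errors) / len(errors)
```

Applied to the first row of Table 4 (actual 2690 MW, first forecast 2534.5 MW), percentage_error gives about 5.78, matching the tabulated PE.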

From Tables 2, 3, and 4 we see that FES1 and FES2 give better
prediction accuracy than the ANN. We also find that the results
for the 48 h and 168 h predictions are comparable with those of
the 24 h ahead predictions. This is because the load forecasting
was performed as a one-step process (i.e., looking 24 h ahead,
48 h ahead, and so on). However, as the lead time increased to
168 h, some percentage errors (PEs) were found to be greater
than 4% even with the fuzzy expert systems. As the primary aim
of this paper was to make a comparative assessment of the ANN,
FES1, and FES2, no attempt was made to reduce the forecast
errors further.

6.4. Daily Average Load Forecast 

For n h ahead average load forecasting, the following training
data are used for the ANN:

Input pattern: $P_{av}(i)$, $\Theta_{\max}(i)$, $\Theta_{\min}(i)$,
$H_{\max}(i)$, $H_{\min}(i)$, $\Theta^{f}_{\max}(i+n)$,
$\Theta^{f}_{\min}(i+n)$, $H^{f}_{\max}(i+n)$, $H^{f}_{\min}(i+n)$

Output pattern: $P_{av}(i+n)$ for the ANN.

For FES1 and FES2, the training patterns used are:

Input pattern: $\Delta\Theta_{\max}(i, i+n)$, $\Delta\Theta_{\min}(i, i+n)$,
$\Delta H_{\max}(i, i+n)$, $\Delta H_{\min}(i, i+n)$

Output pattern: erc(i), the desired load correction.

Again, forecasted temperature and humidity values are used for
the day of the forecast.

The $P_{av}(i+n)$ for FES1 and FES2 is obtained using eqn. (7).

Table 5 gives the average load forecasting results, the number
of iterations for convergence, the PEs, and the MAPEs for the
ANN, FES1, and FES2 models in the month of January (winter)
using 24 h ahead forecast. Tables 6 and 7 give the results for
48 h and 168 h ahead predictions in the month of June (summer).

Again, from Tables 5, 6, and 7 we find that the Kalman
filter-based FES converges faster and gives better overall
accuracy than the ANN and the back-propagation-based FES.

TABLE 7
Average Load Forecasting in June (Summer) Using 168 Hour Ahead Forecast

Columns: actual load, then the forecast and percentage error (PE)
for the ANN, FES1, and FES2 models.

Day  Actual      ANN      PE     FES1      PE     FES2      PE
       (MW)     (MW)     (%)     (MW)     (%)     (MW)     (%)

  1    1521   1558.6   -2.47   1542.9   -1.44   1538.2   -1.13
  2    1531   1558.9   -1.82   1551.8   -1.36   1550.6   -1.28
  3    1473   1436.0    2.51   1452.7    1.38   1456.5    1.12
  4    1336   1409.3   -5.49   1388.2   -3.91   1364.9   -2.16
  5    1285   1244.1    3.18   1250.8    2.66   1261.6    1.82
  6    1433   1409.4    1.65   1428.4    0.32   1434.6   -0.11
  7    1406   1377.5    2.03   1385.3    1.47   1385.9    1.43
  8    1403   1441.2   -2.72   1435.3   -2.30   1431.1   -2.00
  9    1428   1484.1   -3.93   1471.8   -3.07   1448.7   -1.45
 10    1431   1469.6   -2.70   1447.0   -1.12   1416.5    1.01
 11    1301   1358.0   -4.38   1345.8   -3.44   1330.0   -2.23
 12    1235   1202.5    2.63   1212.5    1.82   1221.7    1.08
 13    1418   1390.3    1.95   1403.0    1.06   1432.0   -0.99
 14    1408   1394.9    0.93   1399.7    0.59   1411.0   -0.21
 15    1421   1484.4   -4.46   1453.7   -2.30   1451.8   -2.17
 16    1447   1482.3   -2.44   1473.2   -1.81   1463.8   -1.16
 17    1440   1402.7    2.59   1412.1    1.94   1415.5    1.70
 18    1352   1308.5    3.22   1357.7   -0.42   1358.1   -0.45
 19    1305   1269.2    2.74   1290.0    1.15   1295.0    0.77
 20    1430   1404.0    1.82   1406.3    1.66   1441.9   -0.83
 21    1383   1350.4    2.36   1363.2    1.43   1399.7   -1.21
 22    1414   1463.1   -3.47   1448.6   -2.45   1396.6    1.23
 23    1423   1371.5    3.62   1393.0    2.11   1404.8    1.28
 24    1406   1359.5    3.31   1372.1    2.41   1389.7    1.16
 25    1267   1219.4    3.76   1237.6    2.32   1254.2    1.01
 26    1221   1256.7   -2.92   1241.6   -1.69   1236.1   -1.24
 27    1397   1372.7    1.74   1384.6    0.89   1387.9    0.65
 28    1401   1373.1    1.99   1379.7    1.52   1380.5    1.46
 29    1403   1414.4   -0.81   1405.5   -0.18   1399.5    0.25
 30    1411   1359.6    3.64   1374.9    2.56   1378.1    2.33

MAPE (%):             ANN 2.78, FES1 1.76, FES2 1.23
Iterations required:  ANN 2650, FES1 2480, FES2 1310

7. DISCUSSION 

The fuzzy expert system presented in this paper is constructed
from training examples by machine learning techniques, and the
neural network model is trained to develop fuzzy logic rules and
find input/output membership functions. By combining
unsupervised and supervised learning schemes, learning for time
series forecasting problems converges much faster than with the
original back-propagation learning algorithm. Further, using a
linear Kalman filter in the supervised learning phase shortens
the learning time, with rapid convergence owing to the adaptive
nature of the learning algorithm. The fuzzy expert system using
a linear Kalman filter produces a more accurate load forecast
for lead times varying from 24 h to 168 h than the one using the
gradient descent back-propagation algorithm. The forecasting
results presented in Section 6 are based only upon true
temperature information because all the data for this utility
are historical. In reality, however, the temperature information
will be a forecasted value, and this will add to the forecast
error. The developed expert systems are used to forecast the
load during weekends, but no attempt is made to forecast for
days with unusual events. Such events will require extensive
data analysis to track the events on that day and to select
training cases accordingly from previous days with similar
events. Such activities are continuing and will be reported in a
future paper.

As a final note, a shorter period is used for training because
load patterns change very fast. Training and prediction can,
however, be done over longer periods with less chaotic data that
correlate strongly with a particular weather or environmental
variable. This flexibility permits the adaptation of the
proposed fuzzy expert systems to produce accurate load forecasts
for different geographic areas.

8. CONCLUSIONS 

This paper presents the development of a fuzzy expert system
using a hybrid neural network approach to predict peak and daily
average load profiles in an energy management system. The fuzzy
expert system is modelled as a hybrid neural network and uses
fuzzy membership values of load and weather parameters. A hybrid
learning scheme consisting of both self-organized and supervised
learning phases is used for training the hybrid neural network.
Further, the paper presents simulation results using both
back-propagation and Kalman filter-based algorithms, the latter
producing faster convergence during training and more accurate
predictions.

Acknowledgement: The authors gratefully acknowledge the funds
from the National Science Foundation (NSF Grant No. INT-9209103
and INT-9117624), USA, for undertaking this research.

REFERENCES 

Box, G. E. P., & Jenkins, G. M. (1976). Time series analysis:
forecasting and control, Oakland, CA: Holden-Day.

Brace, M. C. (1991). A comparison of the forecasting accuracy of
neural networks with other established techniques, First
International Forum on Applications of Neural Networks to Power
Systems, Seattle, WA, July, 1991, pp. 23-26.

Buckley, J. J., Hayashi, Y., & Czogala, E. (1993). On the
equivalence of neural nets and fuzzy expert systems, Fuzzy Sets
and Systems, 53, 129-134.

Bunn, D. W., & Farmer, E. D. (1989). Comparative models for
electric load forecasting, New York: John Wiley and Sons.

Dash, P. K., Dash, S., Rama Krishna, G., & Rahman, S. (1993).
Forecasting of a load time series using a fuzzy expert system
and fuzzy neural networks, Engineering International Systems,
1(2), 103-117.

Dash, P. K., Satpathy, J. K., Rama Krishna, G., & Rahman, S.
(1993). An improved Kalman filter based neural network approach
for short-term load forecasting, Proceedings of the Third
International Symp. on Electricity Distribution and Energy
Management, Vol. 1, pp. 3-8.

El-Sharkawi, M. A., Oh, S., Marks, R. J., Dambourg, M. J., &
Brace, C. M. (1991). Short term electric load forecasting using
an adaptively trained layered perceptron, First International
Forum on Applications of Neural Networks to Power Systems,
Seattle, WA, July 23-26, 1991, pp. 3-6.

Hayashi, Y., Buckley, J. J., & Czogala, E. (1992). Fuzzy expert
systems versus neural networks, Proceedings of International
Joint Conference on Neural Networks, Baltimore, Vol. II,
June 7-12, pp. 720-726.

Haykin, S. S. (1986). Adaptive filter theory (pp. 312-314).
Englewood Cliffs, NJ: Prentice-Hall.

Ho, K. L., Hsu, Y. Y., & Yang, C. C. (1992). Short term load
forecasting using a multilayer neural network with an adaptive
learning algorithm, IEEE Transactions on Power Systems, 7(1),
141-150.

Lee, C. C. (1990). Fuzzy logic in control systems: Fuzzy logic
controller, part 2. IEEE Transactions on Systems, Man &
Cybernetics, 20(3), 266-275.

Lee, K. Y., Cha, Y. T., & Park, J. H. (1992). Short term load
forecasting using an artificial neural network, IEEE
Transactions on Power Systems, 7(1), 124-133.

Lin, C. T., & Lee, G. C. S. (1991). Neural network-based fuzzy
logic control and decision system, IEEE Transactions on
Computers, 40(12), 1320-1336.

Moghram, I., & Rahman, S. (1989). Analysis and evaluation of
five short-term load forecasting techniques, IEEE Transactions
on Power Systems, 4(4), 1484-1490.

Pal, S. K., & Mitra, S. (1992). Multilayer perceptrons, fuzzy
sets and classifications, IEEE Transactions on Neural Networks,
3(5), 683-698.

Park, D. C., El-Sharkawi, M. A., & Marks, R. J. (1991).
Adaptively trained neural networks, IEEE Transactions on Neural
Networks, 224-245.

Park, J. H., Park, Y. M., & Lee, K. Y. (1991). Composite models
for adaptive short term load forecasting, IEEE Transactions on
Power Systems, 1(2), 450-457.

Peng, T. M., Hubele, N. F., & Karady, G. G. (1992). Advancement
in the application of neural networks for short term load
forecasting, IEEE Transactions on Power Systems, 7(1), 250-258.

Rahman, S., Drezga, I., & Rajagopalan, J. (1993). Knowledge
enhanced connectionist models for short-term electric load
forecasting, International Conference on ANN applications to
Power Systems, Yokohama, Japan, April.

Rumelhart, D. E., & Zipser, D. (1985). Feature discovery by
competitive learning. Cognitive Science, 9, 75-112.

Scalero, R. S., & Tepedelenlioglu, N. (1992). A fast algorithm
for training feedforward neural networks, IEEE Transactions on
Signal Processing, 40(1), 202-210.

Singhal, S., & Wu, L. (1989). Training feed-forward network with
the extended Kalman algorithm, Proceedings of ICASSP,
pp. 1187-1190.

Wang, L.-X., & Mendel, J. M. (1992). Generating fuzzy rules by
learning from examples, IEEE Transactions on Systems, Man &
Cybernetics, 22(6), 1414-1427.

Zadeh, L. A. (1965). Fuzzy sets, Information and Control, 8,
338-358.