PII: 0957-4174(95)00013-Y
Pergamon
Expert Systems With Applications, Vol. 9, No. 3, pp. 407-421, 1995
Copyright © 1995 Elsevier Science Ltd. Printed in the USA. All rights reserved.
0957-4174/95 $9.50 + .00
0957-4174(95)00013-5

Building a Fuzzy Expert System for Electric Load Forecasting Using a Hybrid Neural Network

P. K. DASH AND A. C. LIEW
National University of Singapore, 10, Kent Ridge Crescent, Singapore

S. RAHMAN
Department of Electrical Engineering, Virginia Polytechnic Institute and State University, Blacksburg, VA

G. RAMAKRISHNA
Regional Engineering College, Rourkela-769 008, India

Abstract-This paper presents the development of a hybrid neural network to model a fuzzy expert system for time series forecasting of electric load. The hybrid neural network is trained to develop fuzzy logic rules and find optimal input/output membership values of load and weather parameters. A hybrid learning algorithm consisting of unsupervised and supervised learning phases is used for training the fuzzified neural network. In the supervised learning phase, both back-propagation and linear Kalman filter algorithms are used for the adjustment of weights and membership functions. Extensive tests have been performed on 2 years of utility data for the generation of peak and average load profiles in 24 h, 48 h, and 168 h ahead time frames during summer and winter seasons. From the simulation results, it is observed that the fuzzy expert system using the Kalman filter-based algorithm gives faster convergence and more accurate prediction of a load time series.

1. INTRODUCTION

IT IS WELL KNOWN that fuzzy logic provides an inference strategy that brings approximate human reasoning capabilities to knowledge-based systems. Also, it provides a mathematical morphology to emulate certain perceptual and linguistic attributes associated with human cognition.
Although fuzzy theory provides an inference mechanism under cognitive uncertainty, computational neural networks offer exciting advantages such as learning, adaptation, fault tolerance, parallelism, and generalization. The computational neural networks, comprising processing elements called neurons, are capable of coping with computational complexity, nonlinearity, and uncertainty.

To enable a system to deal with cognitive uncertainties in a manner more like humans, one may incorporate the concepts of fuzzy logic into neural networks. Although fuzzy logic is a natural mechanism for modelling cognitive uncertainty, it may involve an increase in the amount of computation required. This can be readily offset by using fuzzy neural network approaches having the potential for parallel computations with high flexibility.

Requests for reprints should be sent to P. K. Dash, Centre for Intelligent Systems, Electrical Engineering Department of Regional Engineering College, Rourkela 769008, India.

The application of an artificial neural network (ANN) and fuzzy logic-based decision support system to time-series forecasting has gained much attention recently. ANN-based load forecasts give large errors when the weather profile changes very fast. Also, extremely slow training or even training failure occurs in many cases because of difficulties in selecting a proper structure for the neural network paradigm being used, and because of errors in associated parameters such as learning rates, activation functions, etc., which are fundamental to any back-propagation neural network. On the other hand, the development of a fuzzy decision system (fuzzy expert system) for load forecasting requires detailed analysis of data, and the fuzzy rule base has to be developed heuristically for each season. The rules fixed in this way may not always yield the best forecast.
Thus, a hybrid neural network model is used that combines the idea of a fuzzy logic-based decision system with the structure and learning abilities of a neural network in an integrated framework. Such a hybrid model gives human-understandable meaning to the normal feedforward neural network, in which the internal units are always opaque to the users. This structure also avoids the rule matching time of the inference engine in the traditional fuzzy logic system and results in enhanced learning speed and prediction accuracy.

The present work is aimed at achieving a robust load forecast with much improved accuracy using a fuzzy expert system modelled by the hybrid ANN architecture. The two types of fuzzy expert system models, abbreviated FES1 and FES2, are based on the argument that any fuzzy expert system employing one block of rules may be approximated by a (feedforward, multilayered) neural network. The input vector to FES1 and FES2 consists of differences in weather parameters between the reference and the forecasted day or hour. The output of FES1 and FES2 gives the load correction that, when added to the past load, gives the forecasted load. The learning algorithm for FES1 and FES2 combines unsupervised and supervised learning procedures to build the rule nodes and train the membership functions. The supervised learning procedure for FES1 uses a gradient back-propagation algorithm for finding the optimum weights and membership functions. For FES2, the supervised learning procedure uses a linear Kalman filter-based learning algorithm suggested by Scalero and Tepedelenlioglu (1992), which is similar to the least squares adaptive learning techniques. The least squares adaptive filtering techniques are known to converge more rapidly than the back-propagation algorithm.

A few examples of peak load forecasting and daily average load forecasting of a typical utility with 24 h and 168 h lead times are shown in this paper.
The approach presented in this paper is also applicable to load forecasts with longer lead times.

2. FUZZY EXPERT SYSTEM

2.1. Fuzzy Statements

This section portrays inexact information presented by fuzzy statements, and explains both fuzzy conditional statements and inference mechanisms. Typically, an engineering model requires the use of exact or mathematical statements. These statements correspond to precise information such as $x = 3.0$ or $y = 3t + 24$. The value $x = 3.0$ has a grade of membership of 100% (= 1); for all other values (2.8, 2.9, 3.1, 3.2), the grade of membership in the solution is zero. In the case of real-world values, however, this grade of membership does not hold because of the imprecision of tools, the influence of the observer, etc. Additionally, human reasoning is often imprecise and relies on inexact statements such as "the value of x is not big" or "the temperature is about 4°." Therefore, a theory to correctly express the grade of membership is desirable.

The Fuzzy Set Theory allows manipulation of both exact and inexact (fuzzy) statements. This is very important for load forecasting because there are so many fuzzy factors that are difficult to characterize by a number. Instances are weather conditions such as temperature, humidity, cloud cover, etc. For example, "the temperature this morning will be about 11°." Temperature is an important factor in load forecasting, but is not easy to characterise by an exact numerical quantity. In addition, the linguistic hedges (such as small, medium, large) can be modified by other linguistic hedges (such as not, very, very very) (Zadeh, 1965). For example, in "the value of x is not big," "big" is a linguistic hedge and "not" is a modifier.

2.2. Fuzzy Rule Base

A fuzzy expert system consists of a collection of fuzzy IF-THEN rules. For example, the rule base $R$ is written as

$R = [R^{1}, R^{2}, \ldots, R^{M}]$ (1)

with

$R^{l}$: IF ($x_1$ is $F_1^{l}$ and ... and $x_p$ is $F_p^{l}$) THEN ($y_1$ is $G_1^{l}$ and ... and $y_q$ is $G_q^{l}$) (2)

where $x = (x_1, \ldots, x_p)^{T}$ and $y = (y_1, \ldots, y_q)^{T}$ are the input and output vectors to the fuzzy system, respectively. $F_i^{l}$ and $G_j^{l}$ are labels of fuzzy sets in $U_i$ and $V_j$, respectively, and

[...] (3)

where $y_j \in V_j$.

The final fuzzy set $A \circ R_j$ ($R_j = [R_j^{1}, \ldots, R_j^{M}]$) in $V_j$ determined by the fuzzy inference engine is obtained by combining eqn. (2) for $l = 1, 2, \ldots, M$ using the triangular co-norm

$\mu_{A \circ R_j}(y_j) = \mu_{A \circ R_j^{1}}(y_j) + \cdots + \mu_{A \circ R_j^{M}}(y_j)$. (4)

2.3.1. Fuzzification and Defuzzification. In many cases it is convenient to express the membership function of a fuzzy subset in terms of a standard nonlinear function. The following Gaussian membership function suggested by Lin and Lee (1991) is used for the fuzzification of input and output linguistic parameters of the fuzzy expert system:

$\mu_{A}(x) = \exp\left[-\left(\dfrac{x - a}{b}\right)^{2}\right]$ (5)

where $a$ and $b$ are the center and width of the Gaussian membership function, respectively.

The defuzzifier is needed to give a crisp output for any practical application like forecasting. The output of the fuzzy inference engine is a fuzzy set $A \circ R$ in $V$; therefore, the defuzzifier maps $A \circ R$ into a crisp point $y \in V$. The most commonly used centroid defuzzifier is defined as

[...] (6)

3. FUZZY EXPERT SYSTEM MODEL USING NEURAL NETWORK APPROACH

An alternative to the neural network-based load forecast is the expert system approach. A fuzzy expert system approach for load forecasting consists of rules similar to the rules $R^{1}, \ldots, R^{M}$ given in Section 2.2. One of the difficulties with the fuzzy expert system is the rule matching and composition time, apart from the time-consuming process of adapting the rules. The neural network eliminates the rule matching process and stores the knowledge in the link weights. The decision signals are available immediately after the input data are fed in. Figure 1 shows the proposed hybrid neural network that models the fuzzy expert system using the ANN architecture.
The fuzzy expert system clusters the differential temperatures and humidities of the $i$th and $(i+n)$th days into fuzzy term sets. Here $n$ is the lead time for the forecast (i.e., $n = 24$ for a 24 h ahead forecast, $n = 48$ for a 48 h ahead forecast, and so on). The output of the expert system is the final load correction ($\hat{e}_{LC}$). Hence, the forecasted load on the $(i+n)$th day, $P_f(i+n)$, is given by:

$P_f(i+n) = P(i) + \hat{e}_{LC}(i)$. (7)

The fuzzy expert system modelled as a hybrid neural network has a total of five layers. Nodes at layer 1 are the input linguistic nodes. Layer 5 is the output layer and consists of two nodes [one is for the actual load correction ($\hat{e}_{LC}$) and the other is for the desired load correction ($e_{LC}$)]. Nodes at layers 2 and 4 are term nodes that act as membership functions to represent the term sets of the respective linguistic variable. Each node at layer 3 represents the preconditions of the rule nodes, and layer 4 links define the consequence of the rules. The function of each layer is described as follows:

a) Layer 1: The nodes in this layer just transmit the input features $x_i$, $i = 1, 2$, to the next layer.

b) Layer 2: Each input feature $x_i$, $i = 1, 2$, is expressed in terms of membership values $\mu_{ij}(a_{ij}, b_{ij})$, where $i$ corresponds to the input feature and $j$ corresponds to the number of term sets for the linguistic variable $x_i$. The membership function $\mu_{ij}$ uses the Gaussian membership function (5).

c) Layer 3: The links in this layer are used to perform precondition matching of fuzzy logic rules. Hence, the rule nodes perform the product operation (or AND operation):

$\mu_{R_p} = \prod_{i} \mu_{ij}$ (8)

where $R_p = 1, 2, \ldots, n$. $R_p$ corresponds to the rule node and $n$ is the maximum number of rule nodes. However, if the fuzzy AND operation is used,

[...] (9)

d) Layer 4: The nodes in this layer have two operations (i.e., forward and backward transmission). In forward transmission mode, the nodes perform the fuzzy OR operation to integrate the fired rules that have the same consequence:

$\mu^{4} = \sum_{i=1}^{p} \mu^{4}_{i}$ (10)

where $p$ corresponds to the links terminating at the node. In the backward transmission mode, the links function exactly the same as the layer 2 nodes.

e) Layer 5: There are two nodes in this layer (i.e., for obtaining the actual and desired output load correction, respectively). The desired output load correction ($e_{LC}$) is fed into the hybrid neural network during learning, whereas the actual load correction ($\hat{e}_{LC}$) is obtained by using the centroid defuzzification method (6).

4. HYBRID LEARNING ALGORITHM FOR THE FUZZY EXPERT SYSTEM (FES)

The hybrid learning scheme consists of two phases. In phase I, the initial membership functions of the input and output linguistic variables are fixed and an initial form of the network is constructed. Then, during the learning process, some nodes and links of this initial network are deleted or combined to form the final structure of the network. In phase II learning, the input and output membership functions are optimally adjusted to obtain the desired outputs. Phase I represents the unsupervised learning phase. Phase II represents the supervised learning phase.

4.1. Phase-I: Unsupervised Learning Phase

Given the training input data $x_i(t)$, $i = 1, 2$, the desired output load correction $e_{LC}(t)$, and the fuzzy partitions $|T(x_i)|$, we want to locate the membership functions (i.e., $a_{ij}$ and $b_{ij}$) and find the fuzzy logic rules. Kohonen's feature maps algorithm (Lin & Lee, 1991) is used to find the values for $a_{ij}$ and $b_{ij}$:

$\| x_i(t) - a_{i,closest}(t) \| = \min_{1 \le j \le k} \| x_i(t) - a_{ij}(t) \|$ (11)

$a_{i,closest}(t+1) = a_{i,closest}(t) + \eta(t)\,[x_i(t) - a_{i,closest}(t)]$ (12)

$a_{ij}(t+1) = a_{ij}(t)$ for $a_{ij} \ne a_{i,closest}$ (13)

where $\eta(t)$ is the monotonically decreasing learning rate and $k = |T(x_i)|$ (i.e., the number of term sets for the linguistic variable $x_i$).
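The center update of the unsupervised phase can be sketched in a few lines of Python: only the center nearest to the current sample moves toward it, and all other centers stay where they are. The decay schedule for the learning rate, the function names, and the sample values are assumptions made for illustration; the text only requires the rate to decrease monotonically.

```python
def kohonen_step(centers, x, rate):
    """Move only the center closest to x toward x; leave the others unchanged."""
    closest = min(range(len(centers)), key=lambda j: abs(x - centers[j]))
    updated = list(centers)
    updated[closest] += rate * (x - updated[closest])
    return updated


def fit_centers(samples, centers, rate0=0.5, epochs=20):
    """Repeat the winner-only update with a monotonically decreasing rate."""
    step = 0
    for _ in range(epochs):
        for x in samples:
            rate = rate0 / (1.0 + 0.1 * step)  # assumed decay schedule
            centers = kohonen_step(centers, x, rate)
            step += 1
    return centers


# One update: the center at 4.0 is nearest to the sample 3.0 and moves halfway.
one_step = kohonen_step([-4.0, 0.0, 4.0], 3.0, 0.5)

# Hypothetical temperature differences clustered near -5 and +5.
fitted = fit_centers([-5.2, -4.8, 5.1, 4.9], [-1.0, 1.0])
```

Running the update independently per linguistic variable, as the text describes, simply means calling `fit_centers` once for each input feature with its own list of centers.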
This adaptive formulation runs independently for each input linguistic variable $x_i$. The width, $b_{ij}$, is determined heuristically at this stage (Lin & Lee, 1991) as follows:

$b_{ij} = \dfrac{|a_{ij} - a_{i,closest}|}{r}$ (14)

where $r$ is an overlap parameter. After the parameters of the membership functions have been found, the weights in layer 4 are obtained by using the competitive learning algorithm (Rumelhart & Zipser, 1985) as follows:

[...]

$R_j^{-1}(t+1) = \left[R_j^{-1}(t) - K_j(t)\,x_j^{T}(t)\,R_j^{-1}(t)\right] / f_j$ (39)

where

[...] (40)

and $\gamma_j$ is a constant that is inversely proportional to the prediction error, $e(t)$, at that layer. As a result, $f_j$ remains close to 1 when $R_j$ is already large and the weight estimates are sensitive to parameter variations. A lower bound $f_0$ on $f_j$ is used to prevent the forgetting factor from becoming excessively small, which would result in large estimate fluctuations in spite of small prediction errors. The strategy for employing this forgetting factor is to initially allow it to assume small values during learning so that the initial conditions are quickly forgotten. However, as the average prediction error decreases, it is important to keep $f_j$ near one to avoid the winding up of the weight estimates. This can be simply implemented by setting $\gamma_j$ smaller as the mean squared error decreases.

The unsupervised learning phase of FES2 is the same as for FES1. The supervised learning phase of FES2 is modified to use the linear Kalman filter equations instead of the back-propagation method used for FES1. Therefore, the update equations are modified as follows:

a) The weight update equation for layer 4 is:

[...] (41)

[...] (42)

where $\partial E / \partial w_j$ is given by eqn. (21) and $K_j(t)$ is given by eqn. (38).

b) The update equations for $a_{ij}$ and $b_{ij}$ at layer 5 are:

$a_{ij}(t+1) = a_{ij}(t) + \eta_1 K_j(t) \left[-\dfrac{\partial E}{\partial a_{ij}}\right]$ (43)

where $\partial E / \partial a_{ij}$ is given by eqn. (23) and $K_j(t)$ is given by eqn. (38), and

$b_{ij}(t+1) = b_{ij}(t) + \eta_2 K_j(t) \left[-\dfrac{\partial E}{\partial b_{ij}}\right]$ (44)

where $\partial E / \partial b_{ij}$ is given by eqn. (26) and $K_j(t)$ is given by eqn. (38).

c) The update equations for $a_{ij}$ and $b_{ij}$ at layer 2 are:

$a_{ij}(t+1) = a_{ij}(t) + \eta_3 K_j(t) \left[-\dfrac{\partial E}{\partial a_{ij}}\right]$ (45)

where $\partial E / \partial a_{ij}$ is given by eqn. (31) and $K_j(t)$ is given by eqn. (38). Similarly,

$b_{ij}(t+1) = b_{ij}(t) - \eta_4 K_j(t) \dfrac{\partial E}{\partial b_{ij}}$ (46)

where $\partial E / \partial b_{ij}$ is given by eqn. (32) and $K_j(t)$ is given by eqn. (38).

[FIGURE 2. Learned membership functions for 24 h ahead peak load forecasting in January (winter) using the FES. Panels: maximum temperature difference, maximum humidity difference (%), and peak load difference (MW), each shown after unsupervised learning and after supervised learning.]

[FIGURE 3. Comparison of the mean absolute percentage errors versus the iteration number for 24 h ahead forecast in January (winter). Curves: ANN, FES1, FES2. Horizontal axis: number of iterations.]

The convergence properties of the supervised learning scheme for FES2, which uses the Kalman filter equations in the weight and membership update equations, are found to be superior to those of FES1, as shown in Sections 6.3 and 6.4. The accuracy of the forecast also increases for a similar reason.
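For orientation, the gain and inverse-correlation recursions used in the Kalman filter-based phase have the same shape as recursive least squares (RLS) with a forgetting factor. The Python sketch below is a generic single-output RLS update, not the paper's layer-wise algorithm. The error-dependent rule for the forgetting factor (one minus a scaled, normalised squared error, floored at a lower bound) is an assumed reading of the scheme described in the text, and `gamma`, `f_min`, and the sample data are hypothetical.

```python
def rls_step(weights, p_matrix, x, target, gamma=1.0, f_min=0.9):
    """One recursive least-squares update with an error-dependent forgetting factor.

    p_matrix plays the role of the inverse correlation matrix. The
    forgetting-factor rule is an assumption made for this illustration.
    """
    n = len(x)
    error = target - sum(w * xi for w, xi in zip(weights, x))
    px = [sum(p_matrix[r][c] * x[c] for c in range(n)) for r in range(n)]  # P x
    xpx = sum(xi * pxi for xi, pxi in zip(x, px))                          # x^T P x
    forget = max(f_min, 1.0 - gamma * error * error / (1.0 + xpx))
    gain = [pxi / (forget + xpx) for pxi in px]                            # K = P x / (f + x^T P x)
    new_weights = [w + k * error for w, k in zip(weights, gain)]
    # P <- (P - K x^T P) / f, using x^T P = (P x)^T because P is symmetric.
    new_p = [[(p_matrix[r][c] - gain[r] * px[c]) / forget for c in range(n)]
             for r in range(n)]
    return new_weights, new_p, forget


# Hypothetical noise-free data from y = 2 + 3 * u, with inputs x = [1, u].
weights = [0.0, 0.0]
p_matrix = [[1000.0, 0.0], [0.0, 1000.0]]
for t in range(40):
    u = float(t % 7)
    weights, p_matrix, forget = rls_step(weights, p_matrix, [1.0, u], 2.0 + 3.0 * u)
```

In this toy run the forgetting factor dips below one only while the prediction error is large and returns toward one as the weights settle, which is the qualitative behaviour the text attributes to the error-dependent factor.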
[...] (46)

and the forgetting factor $f_j$ is modified as

$f_j(t+1) = 1 - \gamma\,\varepsilon(t+1)$ (47)

TABLE 2
Peak Load Forecasting in June (Summer) Using 24 Hour Ahead Forecast

Day   Actual   ANN       ANN     FES1      FES1    FES2      FES2
      Load     Forecast  PE      Forecast  PE      Forecast  PE
      (MW)     (MW)      (%)     (MW)      (%)     (MW)      (%)
  1   1848     1872.8    -1.34   1823.3     1.34   1825.7     1.21
  2   1850     1883.7    -1.82   1844.7     0.29   1831.8     0.99
  3   1852     1827.8     1.31   1833.6     0.99   1868.0    -0.86
  4   1687     1627.6     3.52   1669.7     1.03   1688.9    -0.11
  5   1573     1499.6     4.67   1539.4     2.14   1544.1     1.84
  6   1752     1752.4    -0.02   1747.6     0.25   1749.1     0.17
  7   1679     1661.4     1.05   1671.1     0.47   1668.5     0.62
  8   1674     1726.6    -3.14   1652.6     1.28   1698.5    -1.46
  9   1712     1750.3    -2.24   1686.0     1.52   1715.3    -0.19
 10   1778     1858.3    -4.52   1752.7     1.42   1761.8     0.91
 11   1614     1660.4    -2.87   1620.6    -0.41   1602.7     0.70
 12   1520     1456.7     4.16   1474.5     3.00   1559.6    -2.60
 13   1730     1705.0     1.45   1747.4    -1.01   1719.5     0.61
 14   1663     1709.5    -2.80   1674.5    -0.69   1657.8     0.31
 15   1722     1755.8    -1.96   1755.7    -1.96   1741.7    -1.14
 16   1710     1745.5    -2.08   1681.4     1.67   1693.5     0.97
 17   1806     1838.3    -1.79   1805.1     0.05   1803.3     0.15
 18   1730     1788.2    -3.37   1757.7    -1.60   1718.3     0.68
 19   1622     1584.9     2.29   1649.1    -1.67   1644.7    -1.40
 20   1752     1775.7    -1.35   1772.9    -1.20   1738.9     0.75
 21   1658     1648.6     0.57   1671.4    -0.81   1662.6    -0.28
 22   1672     1684.1    -0.72   1685.3    -0.80   1680.3    -0.50
 23   1701     1643.7     3.37   1718.8    -1.05   1727.7    -1.57
 24   1736     1669.5     3.83   1702.9     1.91   1714.4     1.24
 25   1588     1556.1     2.01   1621.0    -2.08   1619.2    -1.96
 26   1482     1413.4     4.63   1463.9     1.22   1462.9     1.29
 27   1720     1672.0     2.79   1695.7     1.41   1705.4     0.85
 28   1699     1750.4    -3.02   1708.2    -0.53   1665.9     1.95
 29   1684     1666.6     1.03   1705.2    -1.26   1667.5     0.98
 30   1693     1757.5    -3.81   1676.9     0.95   1721.2    -1.66

MAPE (%): ANN 2.45, FES1 1.20, FES2 1.00
Iterations required: ANN 970, FES1 520, FES2 440
TABLE 3
Peak Load Forecasting in January (Winter) Using 48 Hour Ahead Forecast

Day   Actual   ANN       ANN     FES1      FES1    FES2      FES2
      Load     Forecast  PE      Forecast  PE      Forecast  PE
      (MW)     (MW)      (%)     (MW)      (%)     (MW)      (%)
  1   2690     2553.1     5.09   2663.6     0.98   2673.3     0.62
  2   2628     2554.4     2.80   2577.5     1.92   2663.2    -1.34
  3   2703     2657.0     1.70   2672.7     1.12   2728.7    -0.95
  4   2592     2631.9    -1.54   2560.6     1.21   2561.4     1.18
  5   2530     2583.9    -2.13   2509.5     0.81   2544.9    -0.59
  6   2574     2590.2    -0.63   2591.2    -0.67   2555.5     0.72
  7   2389     2383.7     0.22   2400.0    -0.46   2367.0     0.92
  8   2513     2482.6     1.21   2493.4     0.78   2497.7     0.61
  9   2500     2372.0     5.12   2595.8    -3.83   2568.0    -2.72
 10   2450     2366.7     3.40   2490.9    -1.67   2478.9    -1.18
 11   2551     2504.3     1.83   2569.9    -0.74   2557.4    -0.25
 12   2763     2889.0    -4.56   2849.2    -3.12   2811.6    -1.76
 13   2603     2563.4     1.52   2623.3    -0.78   2620.2    -0.66
 14   2914     2781.7     4.54   2875.8     1.31   2895.9     0.62
 15   2761     2865.1    -3.77   2780.9    -0.72   2783.4    -0.81
 16   2514     2425.0     3.54   2567.3    -2.12   2552.0    -1.51
 17   2543     2647.8    -4.12   2511.5     1.24   2516.3     1.05
 18   2435     2500.0    -2.67   2486.6    -2.12   2414.8     0.83
 19   2496     2468.8     1.09   2487.0     0.36   2488.3     0.31
 20   2551     2484.2     2.62   2505.6     1.78   2522.9     1.10
 21   2813     2882.8    -2.48   2773.3     1.41   2794.2     0.67
 22   2537     2427.4     4.32   2591.3    -2.14   2574.8    -1.49
 23   2381     2282.2     4.15   2323.1     2.43   2333.1     2.01
 24   2459     2378.3     3.28   2432.7     1.07   2433.2     1.05
 25   2505     2487.0     0.72   2521.3    -0.65   2517.3    -0.49
 26   2429     2500.2    -2.93   2451.3    -0.92   2450.6    -0.89
 27   2438     2320.7     4.81   2379.0     2.42   2403.1     1.43
 28   2748     2771.4    -0.85   2711.7     1.32   2702.9     1.64
 29   2388     2503.3    -4.83   2347.2     1.71   2350.5     1.57
 30   2175     2132.4     1.96   2208.7    -1.55   2142.2     1.51
 31   2539     2466.1     2.87   2520.0     0.75   2521.0     0.71

MAPE (%): ANN 2.82, FES1 1.42, FES2 1.07
Iterations required: ANN 1560, FES1 880, FES2 700

$\varepsilon(t+1) = \dfrac{[e(t)]^{2}}{1 + x^{T}(t)\,R_j(t-1)\,x(t)}$ (48)

where $\gamma$ is a constant inversely proportional to the prediction error $e(t)$.

6. IMPLEMENTATION RESULTS

In order to evaluate the performance of the fuzzy expert system, load forecasting is performed on typical utility data.
The models ANN, FES1, and FES2 are tested on 2 years of utility data for generating peak and average load profiles, and some of the results are given in the subsequent subsections. Bunn and Farmer (1989) and Brace (1991) have shown that the ANN gives the best prediction accuracy compared to conventional approaches, so here the results of FES1 and FES2 are compared to those of the ANN approach. The training sets are formed separately for each of the seven day types (i.e., Tuesdays through Thursdays, Mondays, Fridays, Saturdays, Sundays, Holidays). The selection of training patterns and the selection of variable ranges are given in Sections 6.1 and 6.2.

6.1. Optimum Selection of Training Patterns

The utility data studied here are susceptible to large and sudden changes in weather and load, so selection of appropriate training cases plays a vital role in training the network. Several techniques for the selection of training patterns have been suggested in Peng, Hubele, and Karady (1992). The following load model is used for peak load forecasting:

$P_{max}(i) = f(P_{max}(i-n), P_{max}(i-n-1), \ldots, P_{max}(i-n-n_1),$
$\quad \Theta_{max}(i-n), \Theta_{max}(i), \Theta_{max}(i-1), \ldots, \Theta_{max}(i-n_2),$
$\quad H_{max}(i-n), H_{max}(i), H_{max}(i-1), \ldots, H_{max}(i-n_3))$. (49)

For average load forecasting, a model similar to eqn. (49) is chosen, given by:

$P_{av}(i) = f(P_{av}(i-n), P_{av}(i-n-1), \ldots, P_{av}(i-n-n_1),$
$\quad \Theta_{max}(i-n), \Theta_{max}(i), \Theta_{max}(i-1), \ldots, \Theta_{max}(i-n_2),$
$\quad \Theta_{min}(i-n), \Theta_{min}(i), \Theta_{min}(i-1), \ldots, \Theta_{min}(i-n_2),$
$\quad H_{max}(i-n), H_{max}(i), H_{max}(i-1), \ldots, H_{max}(i-n_3),$
$\quad H_{min}(i-n), H_{min}(i), H_{min}(i-1), \ldots, H_{min}(i-n_3))$. (50)

Here, $n$ indicates the lead time of the forecast ($n \ne n_1, n_2, n_3$). $P$, $\Theta$, and $H$ stand for load, temperature, and humidity, respectively. Different values of $n_1$, $n_2$, and $n_3$ were also tested to find the effect of the past data on the load forecast. With $n_1, n_2, n_3 > 0$ there was no marked improvement in the results. Therefore, we chose $n_1 = n_2 = n_3 = 0$ in this study. However, the choice of $n_1$, $n_2$, $n_3$ depends entirely on the utility data concerned.

TABLE 4
Peak Load Forecasting in January (Winter) Using 168 Hour Ahead Forecast

Day   Actual   ANN       ANN     FES1      FES1    FES2      FES2
      Load     Forecast  PE      Forecast  PE      Forecast  PE
      (MW)     (MW)      (%)     (MW)      (%)     (MW)      (%)
  1   2690     2534.5     5.78   2630.0     2.23   2742.2    -1.94
  2   2628     2556.0     2.74   2564.7     2.41   2571.0     2.17
  3   2703     2610.8     3.41   2649.5     1.98   2666.5     1.35
  4   2592     2651.4    -2.29   2636.6    -1.72   2551.0     1.58
  5   2530     2616.5    -3.42   2580.1    -1.98   2480.7     1.95
  6   2574     2503.5     2.74   2519.9     2.10   2628.8    -2.13
  7   2389     2433.2    -1.85   2351.5     1.57   2354.8     1.43
  8   2513     2433.1     3.18   2444.9     2.71   2449.7     2.52
  9   2500     2353.8     5.85   2370.8     5.17   2378.5     4.86
 10   2450     2324.3     5.13   2340.5     4.47   2360.1     3.67
 11   2551     2620.1    -2.71   2510.7     1.58   2519.6     1.23
 12   2763     2953.9    -6.91   2886.0    -4.45   2874.1    -4.02
 13   2603     2531.9     2.73   2555.9     1.81   2649.6    -1.79
 14   2914     2761.6     5.23   2853.1     2.09   2972.3    -2.00
 15   2761     2894.6    -4.84   2658.3     3.72   2660.2     3.65
 16   2514     2389.6     4.95   2408.4     4.20   2415.2     3.93
 17   2543     2697.4    -6.07   2653.9    -4.36   2651.8    -4.28
 18   2435     2494.7    -2.45   2454.7    -0.81   2424.8     0.83
 19   2496     2463.8     1.29   2509.2    -0.53   2481.5     0.58
 20   2551     2456.6     3.70   2493.9     2.24   2514.5     1.43
 21   2813     2886.4    -2.61   2864.5    -1.83   2762.9     1.78
 22   2537     2382.0     6.11   2441.9     3.75   2462.2     2.95
 23   2381     2279.8     4.25   2285.0     4.03   2285.0     4.03
 24   2459     2540.6    -3.32   2518.5    -2.42   2507.4    -1.97
 25   2505     2575.4    -2.81   2538.8    -1.35   2537.8    -1.31
 26   2429     2500.7    -2.95   2480.5    -2.12   2462.3    -1.37
 27   2438     2281.2     6.43   2351.0     3.57   2370.2     2.78
 28   2748     2839.0    -3.31   2808.2    -2.19   2784.3    -1.32
 29   2388     2513.8    -5.27   2481.8    -3.93   2474.2    -3.61
 30   2175     2125.4     2.28   2198.9    -1.10   2159.3     0.72
 31   2539     2456.2     3.26   2476.3     2.47   2489.2     1.96

MAPE (%): ANN 3.87, FES1 2.61, FES2 2.29
Iterations required: ANN 2050, FES1 1170, FES2 890

6.2. Scaling of the Variable Range

Because the input variables and the estimated ones from the hybrid neural network vary widely in magnitude, they cause convergence problems and make the system behave erratically. To circumvent this problem, the variables are scaled between 0.1 and 0.9 (Rahman, Drezga, & Rajagopalan, 1993). This is performed so as to maximise accuracy and minimise training time. The following scheme is adopted to scale the variables between 0.1 and 0.9.

Let $X_{i,max}$ and $X_{i,min}$ denote the upper and lower bounds of the observed range of feature $X_i$ in all the patterns in the historical data base, considering numerical values only. Similarly, $O_{i,max}$ and $O_{i,min}$ denote the upper and lower bounds of the observed range of outputs. Then $X_{ij}$ is normalised as

$X'_{ij} = 0.1 + \dfrac{0.8\,(X_{ij} - X_{i,min})}{X_{i,max} - X_{i,min}}$ (51)

where $(i, j)$ corresponds to the $j$th pattern of the $i$th training set.
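The scaling of eqn. (51) and its inverse are short enough to state as a Python sketch. The function names are hypothetical, and the bounds in the usage lines are simply the smallest and largest actual peak loads that appear in Table 2, used as an illustrative observed range.

```python
def scale(value, lower, upper):
    """Map value from [lower, upper] onto [0.1, 0.9], as in eqn. (51)."""
    return 0.1 + 0.8 * (value - lower) / (upper - lower)


def unscale(scaled, lower, upper):
    """Invert the scaling to recover a value in the original units."""
    return lower + (scaled - 0.1) * (upper - lower) / 0.8


# Illustrative observed range of peak load (MW).
load_min, load_max = 1482.0, 1852.0
scaled_low = scale(load_min, load_min, load_max)    # lower bound of the range
scaled_high = scale(load_max, load_min, load_max)   # upper bound of the range
round_trip = unscale(scale(1700.0, load_min, load_max), load_min, load_max)
```

Keeping the targets away from 0 and 1 leaves headroom at both ends of the range, and the inverse map is what converts a normalised prediction back to MW.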
TABLE 5
Average Load Forecasting in January (Winter) Using 24 Hour Ahead Forecast

Day   Actual   ANN       ANN     FES1      FES1    FES2      FES2
      Load     Forecast  PE      Forecast  PE      Forecast  PE
      (MW)     (MW)      (%)     (MW)      (%)     (MW)      (%)
  1   2166     2223.4    -2.65   2161.2     0.22   2163.5     0.12
  2   2153     2209.8    -2.64   2192.4    -1.83   2120.8     1.49
  3   2233     2218.1     0.67   2242.3    -0.42   2216.6     0.73
  4   2093     2092.1     0.04   2092.6     0.02   2089.0     0.19
  5   2050     2041.0     0.44   2050.1    -0.00   2050.0    -0.00
  6   2109     2109.3    -0.01   2107.9     0.05   2108.2     0.04
  7   1937     1933.4     0.18   1934.2     0.14   1936.9     0.00
  8   1983     1974.9     0.41   1969.4     0.68   1976.2     0.34
  9   2045     1991.6     2.61   2008.7     1.78   2080.1    -1.72
 10   2003     2036.6    -1.68   2007.1    -0.21   1998.2     0.24
 11   1982     1954.9     1.37   1980.2     0.09   1980.5     0.07
 12   2117     2116.2     0.04   2116.8     0.01   2116.6     0.02
 13   2064     2035.6     1.37   2042.8     1.03   2049.9     0.70
 14   2237     2262.3    -1.13   2241.5    -0.20   2240.6    -0.17
 15   2198     2239.1    -1.87   2171.0     1.23   2221.3    -1.06
 16   2068     2065.3     0.13   2065.2     0.14   2067.1     0.04
 17   2021     2057.0    -1.78   2047.9    -1.33   1994.9     1.29
 18   1976     1953.7     1.13   1970.8     0.26   1982.1    -0.31
 19   2019     2019.9    -0.04   2030.1    -0.55   2007.3     0.58
 20   2073     2090.0    -0.82   2053.3     0.95   2087.2    -0.68
 21   2195     2224.6    -1.35   2175.8     0.87   2203.5    -0.39
 22   2015     1982.8     1.60   2041.4    -1.31   2003.3     0.58
 23   1917     1866.8     2.62   1935.1    -0.94   1936.8    -1.03
 24   1977     1943.5     1.70   1961.3     0.80   1961.0     0.81
 25   2021     2017.5     0.17   2017.1     0.20   2024.1    -0.15
 26   1977     1961.6     0.78   1965.6     0.58   1970.9     0.31
 27   1964     2010.8    -2.38   1921.8     2.15   1939.0     1.27
 28   2086     2124.2    -1.83   2078.7     0.35   2077.9     0.39
 29   1876     1889.6    -0.72   1889.4    -0.72   1883.0    -0.37
 30   1772     1751.3     1.17   1757.2     0.83   1765.5     0.37
 31   1960     2000.1    -2.05   1928.1     1.63   1924.6     1.81

MAPE (%): ANN 1.21, FES1 0.69, FES2 0.56
Iterations required: ANN 1700, FES1 1020, FES2 680

Similarly, the output $O_{ij}$ can be expressed as

$O'_{ij} = 0.1 + \left[0.8\,(O_{ij} - O_{i,min}) / (O_{i,max} - O_{i,min})\right]$ (52)

where $X'_{ij}$ and $O'_{ij}$ denote the normalised input and output vectors, respectively. The normalised predicted values can be converted back to the actual values using the above expressions. The ideas discussed in Sections 6.1 and 6.2 are used for obtaining the peak and average load forecasts.

6.3. Peak Load Forecasting

For $n$ h ahead peak load forecasting, the following training data are used for the ANN:

Input pattern: $P_{max}(i)$, $\Theta_{max}(i)$, $H_{max}(i)$, $\Theta^{f}_{max}(i+n)$, $H^{f}_{max}(i+n)$
Output pattern: $P_{max}(i+n)$

The superscript $f$ denotes the forecasted values for $\Theta$ and $H$. The forecasted values for the $(i+n)$th day are used to get a more realistic forecast. For FES1 and FES2 the training patterns used are:

Input pattern: $\Delta\Theta_{max}(i, i+n)$ and $\Delta H_{max}(i, i+n)$
Output pattern: $e_{LC}(i)$, the desired load correction.

The $P_{max}(i+n)$ for FES1 and FES2 is obtained using eqn. (7). Table 1 gives the learned membership functions for 24 h ahead peak load forecasting in winter using the FES; for example, rule 0 is interpreted as:

R0: IF $\Delta\Theta_{max}$ is term 0 and $\Delta H_{max}$ is term 3 THEN $\hat{e}_{LC}$ is term 7.

TABLE 6
Average Load Forecast in June (Summer) Using 48 Hour Ahead Forecast

Day   Actual   ANN       ANN     FES1      FES1    FES2      FES2
      Load     Forecast  PE      Forecast  PE      Forecast  PE
      (MW)     (MW)      (%)     (MW)      (%)     (MW)      (%)
  1   1521     1541.1    -1.32   1509.7     0.74   1516.7     0.28
  2   1531     1562.5    -2.06   1512.3     1.22   1549.5    -1.21
  3   1473     1455.2     1.21   1453.1     1.35   1456.5     1.12
  4   1336     1380.9    -3.36   1370.3    -2.57   1305.9     2.25
  5   1285     1254.0     2.41   1260.8     1.88   1272.9     0.94
  6   1433     1446.2    -0.92   1443.5    -0.73   1423.5     0.66
  7   1406     1375.5     2.17   1386.6     1.38   1418.5    -0.89
  8   1403     1442.3    -2.80   1437.0    -2.42   1418.0    -1.07
  9   1428     1461.0    -2.31   1450.7    -1.59   1448.8    -1.46
 10   1431     1467.3    -2.54   1459.8    -2.01   1411.3     1.38
 11   1301     1272.8     2.17   1286.0     1.15   1322.7    -1.67
 12   1235     1206.7     2.29   1217.8     1.39   1219.9     1.22
 13   1418     1394.7     1.64   1419.3    -0.09   1419.8    -0.13
 14   1408     1395.5     0.89   1401.9     0.43   1399.8     0.58
 15   1421     1451.3    -2.13   1443.0    -1.55   1439.6    -1.31
 16   1447     1480.1    -2.29   1468.4    -1.48   1429.9     1.18
 17   1440     1420.7     1.34   1425.3     1.02   1428.3     0.81
 18   1352     1313.9     2.82   1324.8     2.01   1330.5     1.59
 19   1305     1279.8     1.93   1282.6     1.72   1290.3     1.13
 20   1430     1407.5     1.57   1416.6     0.94   1440.3    -0.72
 21   1383     1387.8    -0.35   1395.3    -0.89   1389.2    -0.45
 22   1414     1434.1    -1.42   1425.0    -0.78   1401.1     0.91
 23   1423     1386.9     2.54   1404.1     1.33   1410.9     0.85
 24   1406     1361.6     3.16   1370.4     2.53   1373.1     2.34
 25   1267     1230.6     2.87   1250.3     1.32   1248.6     1.45
 26   1221     1254.3    -2.73   1207.9     1.07   1234.4    -1.10
 27   1397     1371.7     1.81   1401.7    -0.34   1395.9     0.08
 28   1401     1374.1     1.92   1384.3     1.19   1384.9     1.15
 29   1403     1415.2    -0.87   1416.5    -0.96   1389.3     0.98
 30   1411     1376.1     2.47   1393.6     1.23   1390.1     1.48

MAPE (%): ANN 2.01, FES1 1.31, FES2 1.08
Iterations required: ANN 1630, FES1 1520, FES2 1070

Figure 2 gives the learned membership functions after the first phase (unsupervised learning phase) and the second phase (supervised learning phase). Figure 3 gives the mean absolute percentage errors (MAPEs) versus the number of iterations. The results in Figures 2 and 3 are obtained for 24 h ahead load forecasting in January (winter). From Figure 3 we find that FES2 gives the fastest convergence, followed by FES1 and the ANN. The convergence speed of FES2 was found to be superior because of the linear Kalman filter equations used for the weight update; the error-dependent forgetting factor was responsible for driving the MAPE low during the first few hundred iterations, until the bias introduced by the initial conditions was eliminated.

Table 2 gives the peak load forecasting results in terms of percentage errors (PEs) and mean absolute percentage errors (MAPEs) for ANN, FES1, and FES2 in the month of June (summer) using 24 h ahead forecast. Tables 3 and 4 give the results for the month of January (winter) using 48 h and 168 h ahead forecasts, respectively. From Tables 2, 3, and 4 we see that FES1 and FES2 give better prediction accuracy than the ANN. We also find that the results for 48 h and 168 h predictions are comparable with those of the 24 h ahead predictions. This is because the load forecasting was performed as a one-step process (i.e., looking 24 h ahead, 48 h ahead, and so on). However, as the lead time increased to 168 h, some PEs were found to be greater than 4% even with FES2.
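The error measures used in the tables, and the forecast of eqn. (7), can be written as a short Python sketch. The sign convention for the percentage error, (actual - forecast) / actual, is inferred from the tabulated values rather than stated in the text, and the function names are hypothetical. The sample numbers are the first three days of the ANN columns in Table 2.

```python
def percentage_error(actual, forecast):
    """Signed percentage error; sign convention inferred from the tables."""
    return 100.0 * (actual - forecast) / actual


def mape(actuals, forecasts):
    """Mean absolute percentage error over paired actual and forecast loads."""
    errors = [abs(percentage_error(a, f)) for a, f in zip(actuals, forecasts)]
    return sum(errors) / len(errors)


def forecast_load(past_load, load_correction):
    """Eqn. (7): forecasted load = past load + load correction."""
    return past_load + load_correction


# First three days of Table 2 (actual load and ANN forecast, MW).
actual = [1848.0, 1850.0, 1852.0]
ann = [1872.8, 1883.7, 1827.8]
pes = [round(percentage_error(a, f), 2) for a, f in zip(actual, ann)]
three_day_mape = mape(actual, ann)
```

The rounded values in `pes` reproduce the first three ANN entries of the PE column, which supports the inferred sign convention.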
As the primary aim of this paper was to make a comparative assessment between the ANN and the two FES models, no attempt was made to improve the forecast errors further.

6.4. Daily Average Load Forecast

For n h ahead average load forecasting, the following training data are used for the ANN.

Input pattern: $P_{av}(i)$, $\Theta_{max}(i)$, $\Theta_{min}(i)$, $H_{max}(i)$, $H_{min}(i)$, $\Theta^{f}_{max}(i+n)$, $\Theta^{f}_{min}(i+n)$, $H^{f}_{max}(i+n)$, $H^{f}_{min}(i+n)$

Output pattern: $P_{av}(i+n)$

For the two FES models, the training patterns used are:

Input pattern: $\Delta\Theta_{max}(i, i+n)$, $\Delta\Theta_{min}(i, i+n)$, $\Delta H_{max}(i, i+n)$, $\Delta H_{min}(i, i+n)$

Output pattern: $e_{av}(i)$, the desired load correction.

Again, forecasted temperature and humidity values are used for the day of the forecast. The $P_{av}(i+n)$ for the two FES models is obtained using eqn. (7).

Table 5 gives the average load forecasting results, the number of iterations for convergence, the PEs, and the MAPEs for the ANN and the two FES models in the month of January (winter) using 24 h ahead forecast. Tables 6 and 7 give the results for 48 h and 168 h ahead predictions in the month of June (summer). Again, from Tables 5, 6, and 7 we find that the Kalman filter-based FES performs better than the ANN and the back-propagation-based FES in terms of faster convergence and improved overall accuracy.

TABLE 7
Average Load Forecasting in June (Summer) Using 168 Hour Ahead Forecast

                     ANN                     FES (back-propagation)  FES (Kalman filter)
Day  Actual (MW)     Forecast (MW)  PE (%)   Forecast (MW)  PE (%)   Forecast (MW)  PE (%)
  1  1521            1558.6          -2.47   1542.9          -1.44   1538.2          -1.13
  2  1531            1558.9          -1.82   1551.8          -1.36   1550.6          -1.28
  3  1473            1436.0           2.51   1452.7           1.38   1456.5           1.12
  4  1336            1409.3          -5.49   1388.2          -3.91   1364.9          -2.16
  5  1285            1244.1           3.18   1250.8           2.66   1261.6           1.82
  6  1433            1409.4           1.65   1428.4           0.32   1434.6          -0.11
  7  1406            1377.5           2.03   1385.3           1.47   1385.9           1.43
  8  1403            1441.2          -2.72   1435.3          -2.30   1431.1          -2.00
  9  1428            1484.1          -3.93   1471.8          -3.07   1448.7          -1.45
 10  1431            1469.6          -2.70   1447.0          -1.12   1416.5           1.01
 11  1301            1358.0          -4.38   1345.8          -3.44   1330.0          -2.23
 12  1235            1202.5           2.63   1212.5           1.82   1221.7           1.08
 13  1418            1390.3           1.95   1403.0           1.06   1432.0          -0.99
 14  1408            1394.9           0.93   1399.7           0.59   1411.0          -0.21
 15  1421            1484.4          -4.46   1453.7          -2.30   1451.8          -2.17
 16  1447            1482.3          -2.44   1473.2          -1.81   1463.8          -1.16
 17  1440            1402.7           2.59   1412.1           1.94   1415.5           1.70
 18  1352            1308.5           3.22   1357.7          -0.42   1358.1          -0.45
 19  1305            1269.2           2.74   1290.0           1.15   1295.0           0.77
 20  1430            1404.0           1.82   1406.3           1.66   1441.9          -0.83
 21  1383            1350.4           2.36   1363.2           1.43   1399.7          -1.21
 22  1414            1463.1          -3.47   1448.6          -2.45   1396.6           1.23
 23  1423            1371.5           3.62   1393.0           2.11   1404.8           1.28
 24  1406            1359.5           3.31   1372.1           2.41   1389.7           1.16
 25  1267            1219.4           3.76   1237.6           2.32   1254.2           1.01
 26  1221            1256.7          -2.92   1241.6          -1.69   1236.1          -1.24
 27  1397            1372.7           1.74   1384.6           0.89   1387.9           0.65
 28  1401            1373.1           1.99   1379.7           1.52   1380.5           1.46
 29  1403            1414.4          -0.81   1405.5          -0.18   1399.5           0.25
 30  1411            1359.6           3.64   1374.9           2.56   1378.1           2.33
MAPE                                  2.78                    1.76                    1.23
Iterations required                   2650                    2480                    1310

7. DISCUSSION

The fuzzy expert system presented in this paper is constructed from training examples by machine learning techniques, and the neural network model is trained to develop fuzzy logic rules and find input/output membership functions. By combining unsupervised and supervised learning schemes, learning for time series forecasting problems converges much faster than with the original back-propagation learning algorithm. Further, by using a linear Kalman filter for the supervised learning phase, the learning time is shortened, with rapid convergence due to the adaptive nature of the learning algorithm. The fuzzy expert system using a linear Kalman filter produces a more accurate load forecast for lead times varying from 24 h to 168 h than the one using the gradient descent back-propagation algorithm. The forecasting results presented in Section 6 are based only upon true temperature information because all the data for this utility are historical. However, in reality the temperature information will be a forecasted value, and this will add to the forecast error.
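As a rough illustration of why a Kalman-type update can converge quickly for weights that enter the output linearly, the sketch below implements a generic recursive least-squares step with a constant forgetting factor. It is not the filter used in this paper: the paper's version uses an error-dependent forgetting factor that is not modelled here, and the names `rls_update`, `lam`, and `delta`, as well as the synthetic linear target, are assumptions made purely for illustration.

```python
import random
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def rls_update(w: Vector, P: Matrix, x: Vector, d: float,
               lam: float = 0.99) -> Tuple[Vector, Matrix, float]:
    """One recursive least-squares step with a constant forgetting factor lam.

    w: current weights, P: symmetric inverse-correlation matrix,
    x: input vector, d: desired output. Returns (new w, new P, a priori error).
    """
    n = len(w)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(x[i] * Px[i] for i in range(n))
    gain = [Px[i] / denom for i in range(n)]
    err = d - sum(w[i] * x[i] for i in range(n))
    w_new = [w[i] + gain[i] * err for i in range(n)]
    # P <- (P - gain * x^T P) / lam; x^T P equals (P x)^T because P is symmetric.
    P_new = [[(P[i][j] - gain[i] * Px[j]) / lam for j in range(n)]
             for i in range(n)]
    return w_new, P_new, err


# Synthetic, noiseless linear target (hypothetical; not data from the paper).
random.seed(0)
w_true = [2.0, -1.0, 0.5]
n = len(w_true)
delta = 1000.0  # large initial P reduces the bias from the initial weights
w = [0.0] * n
P = [[delta if i == j else 0.0 for j in range(n)] for i in range(n)]

for _ in range(200):
    x = [random.uniform(-1.0, 1.0) for _ in range(n)]
    d = sum(wt * xi for wt, xi in zip(w_true, x))
    w, P, err = rls_update(w, P, x, d)

print([round(wi, 4) for wi in w])
```

Because each step solves the weighted least-squares problem exactly rather than taking a small gradient step, the weights settle within a few hundred samples in this toy setting; a forgetting factor below 1 also discounts the influence of the initial conditions over time.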
The developed expert systems are used to forecast the load during weekends, but no attempt is made to forecast for days with unusual events. Such events will require extensive data analysis to track the events on that day and to select training cases accordingly from previous days with similar events. Such activities are continuing and will be reported in a future paper.

As a final note, a shorter period is used for training because load patterns change very fast. However, training and prediction can be done over longer periods with less chaotic data that have a strong correlation to a particular weather or environmental variable. This flexibility permits the adaptation of the proposed fuzzy expert systems to produce accurate load forecasts for different geographic areas.

8. CONCLUSIONS

This paper presents the development of a fuzzy expert system using a hybrid neural network approach to predict peak and daily average load profiles in an energy management system. The fuzzy expert system is modelled as a hybrid neural network and uses fuzzy membership values of load and weather parameters. A hybrid learning scheme consisting of both self-organized and supervised learning phases is used for training the hybrid neural network. Further, the paper presents simulation results using both back-propagation and Kalman filter-based algorithms, the latter producing faster convergence during training and more accurate predictions.

Acknowledgement: The authors gratefully acknowledge the funds from the National Science Foundation, USA (NSF Grant No. INT-9209103 and INT-9117624), for undertaking this research.