Verifying regional climate model results with web-based expert-system

Procedia Technology 1 (2012) 24 – 30. 2212-0173 © 2012 Published by Elsevier Ltd. doi: 10.1016/j.protcy.2012.02.007. Available online at www.sciencedirect.com. Open access under CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/3.0/).

INSODE 2011

Emrullah Sonuc a,*, Baha Sen a, Burak Sen b

a Computer Engineering Department, Faculty of Engineering, Karabuk University, Karabuk, Turkey
b General Directory of State Meteorological Service, Ankara, Turkey

* Emrullah Sonuc. Tel.: +90 370 433 2021. E-mail address: esonuc@karabuk.edu.tr.

Abstract

A verification system monitors forecast quality over time. Verification helps to improve forecast quality by revealing the strengths and weaknesses of an existing forecasting system and by allowing the quality of different forecasting methodologies to be compared. For this purpose, a web-based verification system has been developed to verify the forecast results produced for Turkey by the International Centre for Theoretical Physics Regional Climate Model v4. Forecasters and analysts can analyze the data in real time with this web-based system. In this study, model values were obtained from runs on the system provided by the ULAKBIM High Performance and Grid Computing Center. To verify the model results against observations, model and station values were compared with each other. Model grid values were therefore transferred to the station locations using bi-linear and nearest neighbor (k-NN) (proximal) interpolation methods. In the meteorological literature this process is called the grid-to-point technique. Verification methods for forecasts of continuous variables are used to compare forecast values with observed values: Mean Error, Mean Absolute Error and Root Mean Square Error are calculated for validation. Verification results are presented as tables and graphics in the web-based system, which was developed in PHP (PHP: Hypertext Preprocessor).

Keywords: expert verification system; climate model; grid to point verification; regcm; grads

1. Introduction

Numerical weather prediction (NWP) uses atmospheric state variables (temperature, wind, humidity and pressure), physical equations (motion, thermodynamics, continuity and the hydrostatic equation) and mathematical models to produce weather forecasts. The first attempt at numerical weather prediction, made by Lewis Fry Richardson in the 1920s, ended in failure. Two groups of models are used for NWP: global models (GCMs) and limited area models. Limited area models are nested within global models, and some of them do not assume hydrostatic equilibrium [1]. In a typical global climate model, the horizontal (spatial) resolution of the atmospheric component is 250 km and that of the ocean component is between 125 and 250 km. This resolution is too coarse to capture many details of regional climate characteristics.

The Turkish State Meteorological Service (TSMS) studies climate change and climate trends for Turkey in order to predict the future climate. RegCM and PRECIS were chosen as regional climate models because of their ability to represent Turkey [1].

In this study, observation data were provided by TSMS and forecast data were calculated with the regional climate model RegCM V4.0 [2]. The model was run on the system provided by the ULAKBIM High Performance and Grid Computing Center. Forecast data were decoded with the Grid Analysis and Display System (GrADS), an interactive desktop tool used for easy access, manipulation and visualization of earth science data. Forecast and observation data are stored in a database by the web-based system, which is coded in PHP.
Before verification, the model results are transferred from grid to point using bi-linear interpolation and nearest neighbor interpolation.

2. Materials and Evaluation Methods

2.1 Regional Climate Model (RegCM)

Each region has its own characteristic features. For this reason, GCMs are not sufficient for making predictions over a limited area, and regional climate models have been developed. The Regional Climate Model System (RegCM) was developed by the Abdus Salam International Centre for Theoretical Physics (ICTP). RegCM is a hydrostatic, compressible, limited-area atmospheric model that uses sigma-pressure levels. The first version of the model, RegCM1, was developed in 1989, and the latest version, RegCM4, was released in 2010 [2]. Working with RegCM involves four main steps: 1) the model equations, 2) parameterization, 3) the projection and grid structure of the model, and 4) running the model [3]. The topography map of the RegCM model used in this study is shown in Figure 1. The model was run on the system provided by the ULAKBIM High Performance and Grid Computing Center.

Figure 1 - Topography map of RegCM4 model.

2.2 Observation and Forecast Data

The observation data were obtained from TSMS as a regular file. This file includes the station number, year, month, day, air temperature, wind speed and other fields. A second file contains station information such as the station name, latitude, longitude and address. The air temperature at 2 m (t2m) was used for verification. The t2m values were read from the observation file and recorded, station by station, in the database by the web-based application coded in PHP. The observation data of the stations thus became available for verification.

The forecast data were created by RegCM. The model was run on the supercomputers provided by the ULAKBIM High Performance and Grid Computing Center, on which the 4th version of RegCM was installed.
The model was run to create monthly surface output files for the period from 1989 to 2007. The ERA Interim data sets prepared by the European Centre for Medium-Range Weather Forecasts (ECMWF) were used to create the initial and boundary data of the model. At the end of the process, the surface output files were downloaded from ULAKBIM to the computer that hosts the web-based system for verification analysis. A surface output file is a binary data file accompanied by a data descriptor file that describes the binary data. The general contents of a gridded data descriptor file are as follows [4]:

- Filename for the binary data
- Missing or undefined data value
- Mapping between grid coordinates and world coordinates
- Description of variables in the binary data set

The web-based application decodes the data in these files using GrADS, according to parameters supplied by the user, and records the forecast data in the database.

2.3 Interpolation

Interpolation is the process of using known data values to estimate unknown data values [5]. In this study, bi-linear interpolation and nearest neighbor interpolation are applied to obtain forecast values at the station locations from the output files before verification.

Bi-linear interpolation is an extension of linear interpolation for interpolating functions of two variables on a regular grid. In Figure 2, the four red dots show the data points and the green dot is the point at which we want to interpolate [6].

Figure 2 – Interpolation to one point.

In Figure 2, P is the desired point, the Q are the known points (four grid values), and the R are points on the lines joining the known points.

Suppose that we want to find the value of the unknown function f at the point P = (x, y). It is assumed that we know the value of f at the four points Q11 = (x1, y1), Q12 = (x1, y2), Q21 = (x2, y1), and Q22 = (x2, y2) [6]. We first do linear interpolation in the x-direction. This yields

$$f(R_1) \approx \frac{x_2 - x}{x_2 - x_1} f(Q_{11}) + \frac{x - x_1}{x_2 - x_1} f(Q_{21}), \quad \text{where } R_1 = (x, y_1), \tag{1}$$

$$f(R_2) \approx \frac{x_2 - x}{x_2 - x_1} f(Q_{12}) + \frac{x - x_1}{x_2 - x_1} f(Q_{22}), \quad \text{where } R_2 = (x, y_2). \tag{2}$$

We proceed by interpolating in the y-direction.

$$f(P) \approx \frac{y_2 - y}{y_2 - y_1} f(R_1) + \frac{y - y_1}{y_2 - y_1} f(R_2) \tag{3}$$

This gives us the desired estimate of f(x, y).

$$f(x, y) \approx \frac{f(Q_{11})}{(x_2 - x_1)(y_2 - y_1)} (x_2 - x)(y_2 - y) + \frac{f(Q_{21})}{(x_2 - x_1)(y_2 - y_1)} (x - x_1)(y_2 - y) + \frac{f(Q_{12})}{(x_2 - x_1)(y_2 - y_1)} (x_2 - x)(y - y_1) + \frac{f(Q_{22})}{(x_2 - x_1)(y_2 - y_1)} (x - x_1)(y - y_1) \tag{4}$$

Nearest neighbor interpolation is a simple method of reconstructing a function: each position takes the value of the nearest sampling point. The algorithm selects the value of the nearest point and does not consider the values of the other neighboring points at all, yielding a piecewise-constant interpolation.

2.4 Verification

Initially, standard statistical methods and graphical techniques were used for verification; this approach is called subjective verification. Objective verification, on the other hand, expresses the verification result numerically by comparing forecast and observation values. The differences at all points on the map can of course be computed. However, the differences over a single field may not be meaningful on their own. Following the trend of the differences for the same parameter at a fixed point over time is therefore an effective way of determining forecast accuracy in objective verification. For this reason, the variation over time at a point is verified rather than a broad area [1]. In this study, the following verification statistics were computed for the compared values: Mean Error (ME), Mean Absolute Error (MAE) and Root Mean Square Error (RMSE). With F_i the forecast value, O_i the corresponding observed value and N the number of forecast-observation pairs, the mean error is

$$ME = \frac{1}{N} \sum_{i=1}^{N} (F_i - O_i) \tag{5}$$

The ME is the simplest and most familiar of the scores and can provide very useful information on the local behaviour of a given weather parameter (e.g. maximum temperature close to the coastline or minimum temperature over snow-covered ground). The ME ranges from minus infinity to infinity, and a perfect score is 0. However, a perfect score can be reached for a dataset with large errors if errors of opposite sign compensate each other. The ME is not an accuracy measure, as it provides no information on the magnitude of forecast errors.
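The compensating-error caveat for the ME can be seen in a small numerical sketch. The Python code below is illustrative only and is not the authors' PHP implementation: the function name `verification_scores` is hypothetical, the temperature values are invented, and the error is assumed to be forecast minus observation.

```python
import math


def verification_scores(forecasts, observations):
    """Return (ME, MAE, RMSE) for paired forecast and observation values.

    Assumption: the error of each pair is forecast minus observation.
    """
    if len(forecasts) != len(observations) or len(forecasts) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [f - o for f, o in zip(forecasts, observations)]
    n = len(errors)
    me = sum(errors) / n                              # signed average error
    mae = sum(abs(e) for e in errors) / n             # average error magnitude
    rmse = math.sqrt(sum(e * e for e in errors) / n)  # penalizes large errors
    return me, mae, rmse


# Hypothetical t2m values in degrees Celsius (not taken from the study).
observed = [10.0, 12.0, 8.0, 6.0]
forecast = [14.0, 8.0, 12.0, 2.0]  # errors: +4, -4, +4, -4

me, mae, rmse = verification_scores(forecast, observed)
print(f"ME={me:.2f} MAE={mae:.2f} RMSE={rmse:.2f}")
# prints: ME=0.00 MAE=4.00 RMSE=4.00
```

Here the errors alternate between +4 and -4, so the ME is 0 even though every forecast is off by 4 degrees, while the MAE and the RMSE both report 4.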
The MAE ranges from zero to infinity and, as with the ME, a perfect score is 0. The MAE measures the average magnitude of forecast errors in a given dataset and is therefore a scalar measure of forecast accuracy. It is advisable always to view the ME and the MAE together. For small or limited data sets the use of the MAE is preferred. The mean absolute error is given by

$$MAE = \frac{1}{N} \sum_{i=1}^{N} \left| F_i - O_i \right| \tag{6}$$

The most common accuracy measure is the RMSE [15].

$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (F_i - O_i)^2} \tag{7}$$

Here F_i is the forecast value, O_i the corresponding observed value and N the number of forecast-observation pairs. The RMSE is sensitive to interpolation, phase error and the general variability or anomaly [7].

3. Results

Eighteen stations were selected for verification. These stations make regular synoptic observations at airports. The data cover one year, 1996 (365 observations), at 00 GMT (Greenwich Mean Time). These data were compared with the model predictions using the two interpolation methods, and three separate statistical scores were analyzed. The verification scores for each station, the Turkey average and the location type of each station (continental or maritime) are given in Table 1.

Table 1. Verification scores of the year 1996

| No. | Station Number & Name | L_ME | L_MAE | L_RMSE | N_ME | N_MAE | N_RMSE | M-C |
|---|---|---|---|---|---|---|---|---|
| 1 | 17038-Trabzon | -9.66 | 9.67 | 10.42 | -9.33 | 9.33 | 10.11 | M |
| 2 | 17060-Istanbul Ataturk | -4.24 | 4.94 | 5.95 | -4.02 | 4.89 | 5.92 | M |
| 3 | 17082-Tokat | -3.45 | 4.34 | 5.18 | -3.45 | 4.34 | 5.19 | C |
| 4 | 17090-Sivas | -4.66 | 5.07 | 5.79 | -4.61 | 5.03 | 5.76 | C |
| 5 | 17096-Erzurum | -1.10 | 4.11 | 5.30 | -1.01 | 4.07 | 5.27 | C |
| 6 | 17112-Canakkale | -4.11 | 4.39 | 5.21 | -4.05 | 4.35 | 5.17 | M |
| 7 | 17115-Balikesir | -4.03 | 4.64 | 5.75 | -4.04 | 4.65 | 5.77 | M |
| 8 | 17124-Eskisehir | -3.89 | 4.42 | 5.22 | -3.85 | 4.42 | 5.23 | C |
| 9 | 17128-Ankara Esenboga | -2.97 | 3.84 | 4.65 | -2.89 | 3.78 | 4.59 | C |
| 10 | 17170-Van | -6.04 | 6.18 | 7.09 | -5.54 | 5.73 | 6.69 | C |
| 11 | 17195-Kayseri | -3.14 | 3.85 | 4.72 | -3.15 | 3.87 | 4.74 | C |
| 12 | 17200-Malatya Erhac | -5.16 | 5.60 | 6.73 | -4.66 | 5.20 | 6.32 | C |
| 13 | 17202-Elazig | -3.70 | 4.19 | 5.21 | -3.71 | 4.19 | 5.22 | C |
| 14 | 17219-Izmir Adnan Menderes | -1.66 | 3.26 | 4.01 | -1.80 | 3.24 | 3.97 | M |
| 15 | 17244-Konya | -4.43 | 4.68 | 5.45 | -4.41 | 4.65 | 5.41 | C |
| 16 | 17260-Gaziantep | -4.73 | 5.13 | 6.15 | -4.69 | 5.08 | 6.09 | C |
| 17 | 17280-Diyarbakir | -2.88 | 3.63 | 4.65 | -2.79 | 3.58 | 4.60 | C |
| 18 | 17300-Antalya | -6.12 | 6.15 | 6.80 | -6.44 | 6.46 | 7.08 | M |
| | Turkey_average | -4.22 | 4.89 | 5.79 | -4.14 | 4.83 | 5.73 | |

C = Continental, M = Maritime. L_ME, L_MAE and L_RMSE are the mean error, mean absolute error and root mean square error with bi-linear interpolation; N_ME, N_MAE and N_RMSE are the same scores with nearest neighbor interpolation.

According to the results of the analysis, there is no significant difference between bi-linear interpolation and nearest neighbor interpolation. Of the 18 stations, bi-linear interpolation gives better results at 5 stations, nearest neighbor interpolation gives better results at 12 stations, and both methods give the same results at 1 station. The differences between the verification scores of the two methods exceed ±0.1 at 4 stations and are below ±0.1 at 14 stations. Given this assessment, it can be said that, for the selected stations and the evaluated data, there is no significant difference between the two methods. The bias, calculated for each day, is shown graphically in Figure 3.
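The two grid-to-point methods compared above can be sketched for a single grid cell. The Python code below is a minimal illustration and not the authors' PHP implementation: the function names, grid coordinates and temperatures are hypothetical, and the nearest neighbor is chosen by plain Euclidean distance in degrees, which is a simplifying assumption.

```python
def bilinear(x, y, x1, x2, y1, y2, q11, q21, q12, q22):
    """Bilinear estimate at (x, y) from the four corner values of a grid cell.

    q11 = f(x1, y1), q21 = f(x2, y1), q12 = f(x1, y2), q22 = f(x2, y2).
    """
    # Linear interpolation along x on the two grid rows y1 and y2.
    r1 = ((x2 - x) * q11 + (x - x1) * q21) / (x2 - x1)
    r2 = ((x2 - x) * q12 + (x - x1) * q22) / (x2 - x1)
    # Linear interpolation of the two intermediate values along y.
    return ((y2 - y) * r1 + (y - y1) * r2) / (y2 - y1)


def nearest_neighbor(x, y, points):
    """Value of the grid point closest to (x, y).

    points is a list of (px, py, value) tuples; distance is Euclidean
    in coordinate units (an assumption made for this sketch).
    """
    closest = min(points, key=lambda p: (p[0] - x) ** 2 + (p[1] - y) ** 2)
    return closest[2]


# Hypothetical grid cell (longitude, latitude) and t2m values at its corners.
x1, x2, y1, y2 = 32.0, 32.5, 39.0, 39.5
q11, q21, q12, q22 = 10.0, 12.0, 14.0, 16.0
corners = [(x1, y1, q11), (x2, y1, q21), (x1, y2, q12), (x2, y2, q22)]

# Hypothetical station location inside the cell.
sx, sy = 32.1, 39.1

b = bilinear(sx, sy, x1, x2, y1, y2, q11, q21, q12, q22)
n = nearest_neighbor(sx, sy, corners)
print(f"bilinear={b:.2f} nearest={n:.2f}")
# prints: bilinear=11.20 nearest=10.00
```

For a station close to one corner of the cell, the bi-linear estimate blends all four corner values while the nearest neighbor estimate simply copies the closest one, so the two differ most where the field changes quickly across the cell.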
Figure 3 – BIAS line graph for Istanbul Ataturk Airport station (sample of 15 days).

The magnitudes of the ME and MAE results are close to each other, which shows that the model errs consistently in one direction (over- or under-estimation). Where both over-estimates and under-estimates occur (as at Erzurum and Izmir), the magnitude of the ME is significantly smaller than the MAE. There is no significant difference between the RMSE and MAE scores.

With nearest neighbor interpolation, across the 18 stations the RMSE values lie between 3.97 and 10.11, the MAE values between 3.24 and 9.33, and the ME values between -9.33 and -1.01. The averages for Turkey (over the 18 stations) are 5.73, 4.83 and -4.14, respectively. With bi-linear interpolation, the RMSE values lie between 4.01 and 10.42, the MAE values between 3.26 and 9.67, and the ME values between -9.66 and -1.10. The averages for Turkey (over the 18 stations) are 5.79, 4.89 and -4.22, respectively.

With both interpolation methods, the climate model under-estimates the temperature at 00 GMT. A likely reason is that the model represents radiation-induced cooling at night more strongly than is observed. To establish this conclusively, the verification should be repeated with observations at 06, 12 and 18 GMT and with daily average temperatures, and the results should be analyzed. The aim of this study is to describe the software and to show a sample application; these evaluations are therefore left to future studies.

4. Conclusion

Around the world, many meteorological services develop forecast verification systems to verify forecast data. The accuracy of model predictions is increasingly important under changing climate conditions. Verification data sets are therefore created for use in forecast verification.
Demirtas, Nance, Bernardet, Lin, Chuang, Loughe, Mahoney, Gall and Koch (2005) [8] developed a verification system that includes the National Centers for Environmental Prediction (NCEP) surface and upper-air verification package, which employs a grid-to-point (station-based) approach. Mahoney, Henderson, Brown, Hart, Loughe, Fischer and Sigren (2002) [9] developed the real-time verification system (RTVS) to provide a statistical baseline for weather forecasts and model-based guidance products, and to support real-time forecast operations, model-based algorithm development and case study assessments. Şen and Aydın [10] developed a real-time verification system in which the inverse distance weighting interpolation method is applied to the forecast data before verification.

The web-based application presented in this study is an alternative system with which meteorologists and other interested users can verify t2m values. The forecast values are produced by RegCM running on the system provided by ULAKBIM. The application supports both objective and subjective verification: subjective verification is provided through graphics and charts, and objective verification through tables of forecast parameter values. In addition, the mean error (ME), mean absolute error (MAE) and root mean square error (RMSE) are calculated as part of objective verification.

As a result of this study, a web-based application has been developed for verifying surface output files containing t2m values. The application could be extended with new parameters such as precipitation, and radiation output files could be added to the system to verify further parameters.

References

1. Sayısal Hava Tahmini Şube Müdürlüğü, Sayısal Hava Tahmini. Retrieved May 19, 2011 from the World Wide Web: http://www.dmi.gov.tr/FILES/genel/sss/sayisalnedir.pdf
2. ICTP - Regional Model: REGCM4. Retrieved May 20, 2011 from the World Wide Web: http://www.ictp.it/research/esp/models/regcm4.aspx
3. B. Şen, Bölgesel İklim Modelleri Kullanılarak Çukurova Yöresi’nde İklim Değişikliğinin 1. ve 2. Ürün Mısır Verimine Olası Etkilerinin Belirlenmesi, Ph.D. Thesis, Çukurova University, Adana, 2007.
4. Components of a GrADS Data Descriptor File. Retrieved May 20, 2011 from the World Wide Web: http://www.iges.org/grads/gadoc/descriptorfile.html
5. Interpolation Techniques. Retrieved May 21, 2011 from the World Wide Web: http://iridl.ldeo.columbia.edu/dochelp/StatTutorial/Interpolation/
6. Bilinear interpolation. Retrieved May 21, 2011 from the World Wide Web: http://en.wikipedia.org/wiki/Bilinear_interpolation
7. The Root Mean Square Error (RMSE). Retrieved May 21, 2011 from the World Wide Web: http://www.ecmwf.int/products/forecasts/guide/The_RMSE.html
8. M. Demirtaş, L. Nance, L. Bernardet, Y. Lin, H.Y. Chuang, A. Loughe, J. Mahoney, R. Gall and S. Koch, The Developmental Testbed Center Verification System. WRF/MM5 User’s Workshop, 27-30 June, 2005.
9. J.L. Mahoney, J.K. Henderson, B.G. Brown, J.E. Hart, A. Loughe, C. Fischer and B. Sigren, The real-time verification system (RTVS) and its application to aviation weather forecasting. 10th Conference on Aviation, Range, and Aerospace Meteorology, 2002.
10. B. Şen and N. Aydın, A Real-Time Verification System Development Using Regional Climate Model Results. The 1st International Symposium on Computing in Science & Engineering ISCSE, 2010.