key: cord-0047212-e9yovgk8 authors: Legdou, Anass; Chafik, Hassan; Amine, Aouatif; Lahssini, Said; Berrada, Mohamed title: A Random Forest-Cellular Automata Modeling Approach to Predict Future Forest Cover Change in Middle Atlas Morocco, Under Anthropic, Biotic and Abiotic Parameters date: 2020-06-05 journal: Image and Signal Processing DOI: 10.1007/978-3-030-51935-3_10 sha: 285b994c0a15237cdc0dfe6a0a6dd37f69608a53 doc_id: 47212 cord_uid: e9yovgk8

This study aims to predict forest species cover changes in the Sidi M’Guild Forest (Middle Atlas, Morocco). The approach combines remote sensing and GIS and is based on training a Cellular Automata and Random Forest (RF) regression model to predict species cover transitions. Five covariates that influence such transitions were selected according to Pearson’s test. The model was trained on forest cover stratum transition probabilities between 1990 and 2004 and then validated against the 2018 forest species cover map. Validation of the predicted map against that of 2018 shows an overall agreement between the two maps of 72% for each number of RF trees used. The 2032 projected forest species cover map indicates a strong regression of Atlas cedar and thuriferous juniper cover; a medium regression of the holm oak and thuriferous juniper mixture, the Atlas cedar and thuriferous juniper mixture, and sylvatic and asylvatic vacuums; a very strong progression of holm oak and of the Atlas cedar, holm oak and thuriferous juniper mixture; and a medium progression of the Atlas cedar and holm oak mixture. These findings provide important insights for planners, natural resource managers and policy-makers to reconsider their strategies in order to meet sustainability goals.

provided services without compromising their abilities to fulfil these services in the future [2]. Nowadays, due to social and human needs, forest resources are under high pressure.
In many regions, the pressures are far beyond forests' productive capacities. Moreover, climate change increases the fragility of threatened natural ecosystems. As a consequence, a diffuse and still progressing process of land use and forest cover change is widely documented, mainly in developing countries [4]. Given the need to maintain the contribution of forests to global cycles and to protect this natural capital for the next generations, it is necessary to understand their dynamics and to predict trends in order to prioritize urgent actions and formulate appropriate policies to improve land use planning [5]. Such understanding relies on land use/land cover (LULC) change modeling, which tries to explain the human-environment dynamics producing the changes [6]. LULC modeling needs multi-temporal land/forest cover maps as well as the driving forces leading to those changes [7] [8] [9]. In addition, machine learning algorithms have been used extensively to explain LULC changes. Several studies have combined Cellular Automata (CA) with a variety of modeling frameworks such as Markov chains [10], neural networks [11], support vector machines [12] and kernel-based methods [13], among others. More recently, CA have been successfully combined with Random Forest [14, 15].

Moroccan forests hold a major part of the country's biological diversity. They cover 5.8 million hectares, including 132,000 ha of cedar, 1.36 million ha of holm oak, 830,000 ha of argan, 350,000 ha of cork oak, 600,000 ha of thuja and 1 million ha of Saharan acacia (HCEFLCD, 1992). The Atlas cedar occupies a prominent place among these species. Moroccan cedar forests, especially those in the Middle Atlas, show regressive trends. Cedar has become limited to mountain tops.
The factors generally blamed for the degradation include: lack of regeneration due to high grazing pressure [16]; high human pressure (overgrazing, cultivation, illegal cutting, fire, etc.); the dieback phenomenon, which raises increasing concern about the future of cedar stands; and damage caused by cedar's natural enemies (defoliating insects, wood-boring insects, fungi and the magot monkey, i.e. the Barbary macaque), which weakens stands already stressed by climatic hazards. Given the status of Moroccan cedar forests as threatened natural capital, and in order to understand the driving forces behind their regressive tendencies and to predict their future cover, this work focuses on cedar cover change modeling. The work concerns the Sidi M'Guild Forest, which belongs to the Middle Atlas National Park and holds a representative part of the cedar ecosystem. Land use change has been modeled using a machine learning algorithm that predicts future forest cover change as a result of anthropic, biotic and abiotic driving factors. The paper is organized as follows: the methods describe the data and the algorithms used, the results present the main findings, and these are followed by the discussion and the conclusion.

The study concerned the Sidi M'Guild forest, which is located in the Moroccan Middle Atlas. It covers about 29,000 hectares. Cedar covers about 51%, holm oak 34%, and thuriferous juniper 3.6% of the forest area. The overall methodology is shown in Fig. 1. It consists of acquiring the covariates, then processing and modeling. Landsat satellite images covering the study area were downloaded from the USGS EarthExplorer platform. Landsat 4 TM (Thematic Mapper) images with a 30 m spatial resolution and 7 spectral bands were used for 1990 and 2004. A Landsat 8 OLI (Operational Land Imager) image, which contains 11 spectral bands at 30 m spatial resolution, was used for 2018.
Images were chosen based on availability and absence of cloud cover, and were captured during August to reduce atmospheric disturbance and confusion with the spectral emission of herbaceous layers. The predictive variables consist of bioclimatic variables, a digital elevation model (DEM) and human characteristics. Bioclimatic data were downloaded from the WorldClim platform (see Table 1). Altitude maps were extracted from the Shuttle Radar Topography Mission (SRTM) DEM. The location of human settlements was linked to the data of Morocco's 2014 general census. Maps of distance from human settlements and distance from the forest edge were generated using basic GIS functions. Furthermore, Pearson's r correlation coefficient was calculated in order to identify the factors that are highly correlated.

Satellite data were preprocessed in QGIS and then classified using the Maximum Likelihood algorithm. Images were standardized and brought to the same extent and spatial resolution. Image classification was based on training the classifier using, as ground truth, forest stand type maps released in 1990, 2004 and 2018. Dominant forest cover strata (classes) were chosen in line with the National Forest Inventory standards. The resulting eight classes are: Atlas cedar (Ca), holm oak (Qr), thuriferous juniper (Jt), Atlas cedar and thuriferous juniper mixture (CaJt), Atlas cedar and holm oak mixture (CaQr), mixture of Atlas cedar, holm oak and thuriferous juniper (CaQrJt), holm oak and thuriferous juniper mixture (QrJt) and sylvatic and asylvatic vacuums (V). In addition to Atlas cedar, we classified other species close to or mixed with Atlas cedar, since the cedar ecosystem comprises all the species that are closely linked to it. The accuracy assessment based on cross-validation of the classified images showed an overall accuracy of 88.86%, 90.21% and 92.33% for the years 1990, 2004 and 2018, respectively.
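The Pearson screening of covariates mentioned above can be illustrated with a minimal sketch. It computes pairwise Pearson's r between candidate covariates and greedily keeps those that are not strongly correlated with any covariate already kept. The covariate names, the synthetic sample values and the 0.7 cutoff are assumptions made for illustration only; the study does not report the threshold or the selection procedure it used.

```python
import numpy as np

def select_uncorrelated(samples, names, cutoff=0.7):
    """Greedily keep covariates whose |Pearson r| with every kept one is below cutoff.

    samples: array of shape (n_pixels, n_covariates), one column per covariate.
    cutoff:  hypothetical threshold, not a value taken from the paper.
    """
    r = np.corrcoef(samples, rowvar=False)  # (n_covariates, n_covariates) matrix
    kept = []
    for j in range(len(names)):
        if all(abs(r[j, k]) < cutoff for k in kept):
            kept.append(j)
    return [names[j] for j in kept], r

# Synthetic pixel samples standing in for values read from the covariate rasters.
rng = np.random.default_rng(0)
n = 500
bio4 = rng.normal(size=n)
bio17 = rng.normal(size=n)
altitude = 0.95 * bio4 + 0.05 * rng.normal(size=n)  # deliberately redundant with bio4
dist_settlement = rng.normal(size=n)

samples = np.column_stack([bio4, bio17, altitude, dist_settlement])
names = ["bio4", "bio17", "altitude", "dist_settlement"]

selected, r_matrix = select_uncorrelated(samples, names)
print(selected)
```

In this toy data set the redundant altitude column is discarded because it is almost a linear copy of bio4. The greedy order matters: the covariate listed first is the one that survives a correlated pair.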
Random Forest is a flexible and easy-to-use machine learning algorithm that performs both regression and classification tasks [17]. It combines multiple decision trees to determine the final output. Training relies on the bagging technique [18], in which each decision tree is trained on a random sample drawn with replacement from the training data set, combined with random split selection [19]. Random Forest regression, whose performance has been demonstrated in several studies [20, 21], was used to predict forest cover class transition probabilities. Transition maps for the period 1990-2004 were established by observing the behavior of the forest species. For each transition from one class to another, two values are possible: 1 denotes a change from the class to the other class, and 0 denotes no change. Binary transition maps (0 for stability, 1 for change occurrence) were then produced and used for training 16 models. Following the approach adopted by Gounaridis et al. [15], the transition probability surfaces were generated by training the Random Forest algorithm [18] in two configurations: using all variables, and using only the most independent covariates identified through the Pearson's correlation coefficient calculation (5 factors). The RF regression models were implemented in Python. The output of each model is a transition probability map. The model was run for each of the sixteen transitions defined above (Fig. 2) and for a set of tree counts (10, 20, 30, 40, 50, 100). Cellular automata [22] were used to predict the future state of forest species distribution. Probability maps were produced for each number of trees, and each time Cohen's Kappa coefficient was calculated by comparing predicted and observed results for the same year [23].

Accuracy assessment based on cross-validation showed an overall accuracy of 88.86%, 90.21% and 92.33% for the years 1990, 2004 and 2018, respectively.
These accuracies indicate that the derived classified maps are suitable for effective and reliable forest species cover change analysis and modeling. Post-classification analysis of the spatial metrics and their variations (Table 2) showed that the area occupied by the Ca, CaQr, CaQrJt and Jt classes decreased drastically between 1990 and 2018 in the study area. On the other hand, the area occupied by Qr, CaJt, QrJt and V increased substantially during the same period. These results seem consistent with the literature, which describes cedar as a vulnerable species and holm oak as a "green cement" with high adaptive capacity. Using all factors as covariates, or retaining only the five factors that are not correlated according to Pearson's r correlation coefficient (temperature seasonality: Bio4, precipitation of driest quarter: Bio17, distance from human settlement, settlement density and distance from forest edge), gave the same results (Table 3). This could be explained by the robustness of RF to correlated data. In general, the highest scores were recorded for 50 and 100 trees. In addition, we conclude that the five non-correlated parameters considerably affect the evolution of forest species cover between 1990 and 2004. As explained in the methods, the predictive model was used to predict the 2018 forest cover state. The model returns simulated maps of forest species cover distribution for the year 2018, one for each number of trees used in the RF model. The predicted maps were compared to the existing 2018 forest cover map, and the validation was based on the Kappa coefficient. Validation results are given in Table 4. There was no evident difference between the Kappa coefficient values obtained. Hence, we conclude that the number of trees used in the Random Forest model does not influence the results of the simulation. In contrast, the thresholds tested for each rule did influence the model.
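The tree-count experiment described above can be sketched as follows: a binary transition map is derived from two class maps, a Random Forest regressor is fitted for each number of trees, and the training score is recorded. This is only an illustrative sketch under stated assumptions. The paper says the models were implemented in Python but does not name a library, so the use of scikit-learn's `RandomForestRegressor` is an assumption, as are the synthetic maps, the class codes, the restriction of training to pixels of the source class, and the rule that generates the change.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)

# Hypothetical stand-ins for co-registered rasters (integer class code per pixel).
map_1990 = rng.integers(0, 3, size=(40, 40))
covariates = rng.normal(size=(40, 40, 5))  # five covariate layers per pixel

# Synthetic change rule: class-0 pixels with a low first covariate become class 1.
map_2004 = map_1990.copy()
map_2004[(map_1990 == 0) & (covariates[..., 0] < 0)] = 1

def binary_transition(before, after, src, dst):
    """Return 1 where a pixel moved from class src to class dst, else 0."""
    return ((before == src) & (after == dst)).astype(np.uint8)

target = binary_transition(map_1990, map_2004, src=0, dst=1)

# Assumption: train only on pixels that belonged to the source class in 1990.
mask = map_1990 == 0
X = covariates[mask]              # shape (n_source_pixels, 5)
y = target[mask].astype(float)    # 0 = stable, 1 = changed

scores = {}
for n_trees in (10, 20, 30, 40, 50, 100):
    model = RandomForestRegressor(n_estimators=n_trees, random_state=0)
    model.fit(X, y)
    scores[n_trees] = model.score(X, y)  # coefficient of determination on training data

# Transition probability surface from the last fitted model (100 trees).
probability = np.zeros(map_1990.shape)
probability[mask] = model.predict(X)

for n_trees, score in scores.items():
    print(n_trees, round(score, 3))
```

Because the regression target is binary, each tree's leaf mean lies between 0 and 1, so the averaged forest output can be read as a transition probability. One such model would be fitted per transition, sixteen in the study.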
Table 3. Transition probability of forest species cover areas for each number of trees. Columns: area size; training score for n = 10, n = 20, n = 30, n = 40, n = 50 and n = 100.

Comparing the areas occupied by each class in the simulated and observed maps reveals some differences (Table 5). The model closely estimated the area of CaQr, CaQrJt, Jt, QrJt and CaJt. On the other hand, it largely overestimated the area of Qr (about −18%) and misestimated the area occupied by Ca, CaQr and V. In addition, the model predicted a regression of Ca, CaQr, CaQrJt, QrJt and V, and the areas affected by the change were localized. It predicted a progression of Qr, Jt and CaJt. The model overestimated some species classes and underestimated others because we did not take into account the interventions carried out by the administration: silvicultural interventions on holm oak and cedar stands, reforestation actions (including in silvicultural vacuums), awareness campaigns conducted for the populations bordering the forest and, finally, forest police actions to preserve and conserve the forest heritage. These interventions must be analyzed in depth in order to assess their role in explaining the observed forest cover changes. With regard to the predictions, the model returned the forest species cover state for 2032 (Fig. 3 and Table 6). The model predicted a strong regression for Ca and Jt cover and a medium regression for QrJt, CaJt and V. It also predicted a very strong progression for Qr and CaQrJt and a medium progression for CaQr. Most forest species cover changes depend on position relative to settlements and to the forest boundaries. Strata that show regressive trends are located near human settlements and not far from the forest edge. On the other hand, strata with progressive trends are located far from human settlements and from the forest boundaries.
This finding seems consistent with the literature and with forest managers' statements.

The current study was based on an integrated approach that combines remote sensing and GIS to simulate and predict plausible forest species cover changes for the Sidi M'Guild Forest for the year 2032 using a Cellular Automata (CA)-Random Forest (RF) regression model. The initial forest species cover map (1990), the transition potential maps (1990-2004) and the 1990-2004 transition probabilities were used to train the RF model. The model was validated by comparing the actual and predicted 2018 forest species cover. The overall agreement between the two maps was 72% for each number of RF trees used. The 2032 projections indicate a strong regression of Atlas cedar (−91.49%) and thuriferous juniper cover (−74.39%); a medium regression of the holm oak and thuriferous juniper mixture (−49.92%), the Atlas cedar and thuriferous juniper mixture (−45.05%) and sylvatic and asylvatic vacuums (21.65%); a very strong progression of holm oak (+1294.02%) and of the Atlas cedar, holm oak and thuriferous juniper mixture (+124.68%); and a medium progression of the Atlas cedar and holm oak mixture (+28.11%). Most forest species cover changes depend on location relative to settlements and to the forest boundaries. Regressing strata are located near human settlements and forest boundaries. As cedar is considered national heritage, these findings could be useful for decision makers and managers reviewing their strategies to ensure the sustainability of cedar as natural capital.
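The two kinds of figures reported above, map agreement (overall agreement and Cohen's Kappa) and per-class percentage change in area, can be computed from class rasters as in the following minimal sketch. The tiny arrays and class codes are made up for illustration; they are not the study's maps, and the function names are hypothetical.

```python
import numpy as np

def agreement_and_kappa(observed, predicted, n_classes):
    """Overall agreement and Cohen's Kappa between two class maps of equal shape."""
    obs = np.asarray(observed).ravel()
    pred = np.asarray(predicted).ravel()
    confusion = np.zeros((n_classes, n_classes), dtype=np.int64)
    np.add.at(confusion, (obs, pred), 1)      # rows: observed, columns: predicted
    total = confusion.sum()
    p_o = np.trace(confusion) / total         # observed agreement
    p_e = (confusion.sum(axis=1) * confusion.sum(axis=0)).sum() / total**2  # chance agreement
    kappa = (p_o - p_e) / (1.0 - p_e)
    return p_o, kappa

def percent_area_change(before, after, n_classes):
    """Per-class change in pixel count, in percent of the earlier area.

    Assumes every class is present in `before` (otherwise the ratio is undefined).
    """
    a0 = np.bincount(np.asarray(before).ravel(), minlength=n_classes).astype(float)
    a1 = np.bincount(np.asarray(after).ravel(), minlength=n_classes).astype(float)
    return 100.0 * (a1 - a0) / a0

# Toy 2 x 3 maps with three hypothetical class codes (0, 1, 2).
observed = np.array([[0, 0, 1], [1, 2, 2]])
predicted = np.array([[0, 1, 1], [1, 2, 0]])

p_o, kappa = agreement_and_kappa(observed, predicted, n_classes=3)
change = percent_area_change(observed, predicted, n_classes=3)
print(p_o, kappa, change)
```

Overall agreement counts only matching pixels, whereas Kappa discounts the agreement expected by chance from the class proportions, which is why the two values differ for the same pair of maps.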
Quality of life: an approach integrating opportunities, human needs, and subjective well-being
Forest Management and Planning
The key literature of, and trends in, forest-level management planning in North America
Next generation of global land cover characterization, mapping and monitoring
Impact of land use/land cover change on hydrological components in Chongwe River catchment
Modelling land cover dynamics: integration of fine-scale land cover data with landscape attributes
Vegetation-ice-bare land cover conversion in the oceanic glacial region of Tibet based on multiple machine learning classifications
Land cover and land use classification performance of machine learning algorithms in a boreal landscape using Sentinel-2 data
Unravelling the frontiers of urban growth: spatiotemporal dynamics of land-use change and urban expansion in Greater Accra Metropolitan Area
Deriving suitability factors for CA-Markov land use simulation model based on local historical data
Modeling land use change using Cellular Automata and Artificial Neural Network: the case of Chunati Wildlife Sanctuary
Comparing support vector machines with logistic regression for calibrating cellular automata land use change models
A future land use simulation model (FLUS) for simulating multiple land use scenarios by coupling human and natural effects
A Random Forest-Cellular Automata modelling approach to explore future land use/cover change in Attica (Greece), under different socio-economic realities and scales
Exploring prospective urban growth trends under different economic outlooks and land-use planning scenarios: the case of Athens
Classification and Regression Trees
Bagging predictors
An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting and randomization
Comparison of machine learning algorithms for forest parameter estimations and application for forest quality assessments
Estimating leaf area index and light extinction coefficient using Random Forest regression algorithm in a tropical moist deciduous forest
Simulating urban growth using a Random Forest-Cellular Automata (RF-CA) model
Measuring inter-rater reliability for nominal data - which coefficients and confidence intervals are appropriate?