1 Introduction

Recently, due to climate change, wildfire outbreaks in different parts of the globe have taken on alarming proportions, harming both vegetation and urban areas. Higher temperatures dry out the landscape and help create the perfect environment for more extensive and frequent forest fires. The year 2021 was one of the worst in terms of fires, with a loss of 9.3 million hectares worldwide [14]. Wildfires lead to a vicious cycle: as carbon emissions increase, the climate becomes warmer and drier, which favors the incidence of fires in increasingly larger areas and for prolonged periods, generating more carbon emissions. This cycle keeps repeating, and each year the damage caused by fire propagation is greater, in addition to factors related to human intervention that further aggravate the situation. Among the difficulties in fighting fires is the lack of information about the affected areas and how fires spread across different landscapes [6]. Therefore, appropriate modeling and simulation techniques are necessary to expand the understanding of the effect of factors such as vegetation, climate, topography and human influence on flame dynamics, enabling efficient planning of wildfire prevention and firefighting measures.

Cellular Automata (CA) have proven to be an extremely useful tool for representing natural phenomena [22]. Notably, fire propagation is one of the natural phenomena most investigated using CA models [5, 8, 17, 20]. A CA can be defined as a discrete dynamic system composed of a lattice of agents called cells. It is characterized by decentralized computation, where each cell interacts with its neighborhood to define its state at each time step. The evolution of the cell states is determined by simple transition rules which, combined with local interaction between the cells, result in complex global dynamics of the lattice (emergent computation) [22]. However, for a CA-based model to adequately represent a phenomenon, its parameters must be correctly adjusted. Estimating the values of these parameters is not a trivial task: as the number of parameters increases, so does the complexity, due to the number of possible combinations [9]. In the case of fires, these factors can include climate, terrain, amount of combustible material, wind intensity, and many others [15]. Approaches based on evolutionary algorithms, such as genetic algorithms, are widely used in the literature to adjust the parameters of complex models [2, 21].

A Genetic Algorithm (GA) is a search meta-heuristic inspired by Darwin’s theory of evolution [11]. This type of algorithm deals with a population of individuals, which represent potential solutions, and employs selection methods and genetic operators that mimic the process of natural selection, that is, the idea that the fittest individuals (solutions) tend to survive and have a greater chance of passing on their genetic material to future generations. GAs are commonly used in optimization problems since their genetic operators direct the exploration towards promising regions of the search space to find the best solution. The tuning of CA models with evolutionary approaches has been previously investigated in the literature [8, 9, 16].

This work aims to develop an evolutionary approach using genetic algorithms to automatically adjust the parameters of a CA-based fire propagation model. It relies strongly on the approach proposed in [8] to reproduce fire propagation dynamics in homogeneous vegetation according to patterns observed in synthetically generated datasets. However, that approach did not take the data sampling rate into account. Preliminary experiments have shown that this information is important for adjusting the time steps in the evolution of a CA-based model so that it adequately captures the fire dynamics present in the reference dataset. This became evident when the dataset was generated by a model whose dynamic behavior differs from that of the model being adjusted by the GA. From a broader perspective, in future more realistic applications, when the reference data are images sampled from the evolution of a real fire, this adjustment aims to make the number of steps in the temporal evolution of the CA model compatible with the sampling rate of the images. Experiments were carried out in which the fire spread obtained with our approach was compared with the reference data. Different datasets were used, each constructed with a distinct interval (in CA steps) between samples. We also evaluated the method’s ability to automatically set the values of the model’s parameters in order to reproduce the behavior observed in synthetic datasets created using both our model and another CA-based fire propagation model from the literature [20]. Results showed that the models adjusted by the GA produced wildfire spreading similar to the dynamics represented in each reference dataset, demonstrating the robustness and adaptability of the method.

2 Related Works

A brief literature review is presented that covers important works on both CA-based models for simulating wildfire propagation and genetic algorithms employed to optimize the parameters of such models.

The approach proposed in [7] is a reference for fire spread models using cellular automata. It is a two-dimensional CA lattice composed of three-state cells and stochastic transition rules that uses the Moore neighborhood and simulates the recovery of burnt cells and spontaneous combustion. A CA-based model with hexagonal cells, integrated with a Rothermel model, was proposed in [25] to simulate the temporal and spatial spread of forest fires. Real data from a Chinese province and artificial data were used in experiments that showed its ability to reproduce fires in heterogeneous landscapes. In [24], CA was integrated with GIS (Geographic Information Systems) to provide fire spread simulations where topographic elements, forest fuels and climate variables are considered. The proposal was evaluated on real data from a Canadian city and the authors concluded that it can be applied to reproduce realistic fires and other spatio-temporal applications. Using a case study related to a forest fire on Spetses Island in 1990, a CA-based fire spread model was developed in [1], where a black-box nonlinear optimization approach is used to adjust the parameters from real GIS data. The results showed that the event was well represented and that the approach can be used to manage risks in future fires. A CA-based fire propagation model that applies a numerical optimization approach to find values that correlate the model parameters was presented in [10]. According to the authors, the results found were close to those of classical approaches. In [23], a CA-based model was used to evaluate the influence of a set of factors, such as combustible materials, wind, temperature and terrain, on flame dynamics. The authors concluded that the proposed approach was able to satisfactorily simulate the trends in flame behavior, helping to understand the key factors of wildfire propagation under different conditions.
A hybrid approach was presented in [19], where CA is combined with other techniques that calculate a propagation index from remote sensing meteorological data in order to adjust the automaton’s evolution step to the wildfire spread. According to the results, the model was able to predict the flame dynamics, ensuring accuracy and adaptability in the simulations. In [8], a 2D probabilistic CA-based model is tuned in order to improve the quality of wildfire spread simulations. The CA transition rule considers characteristics such as wind intensity, burning time and vegetation recovery, where the probability of a cell in the tree state starting to burn is calculated based on the burning neighbors and the direction and intensity of the wind. A wildfire propagation model based on a 2D probabilistic CA is proposed in [20]. The model adopts different fire states and a non-linear vegetation recovery function, and it models important characteristics that affect flame dynamics, such as the burning time, the fire intensity, the wind intensity and direction, and the presence of obstacles. The last two models were used as case studies in this paper. The approach presented in [8] was adopted as the standard model to be configured by the GA, while the model proposed in [20] was used to generate reference datasets for the experiments. Finally, a 2D CA-based model was used in [4] to reproduce wildfires in a real Brazilian Cerrado environment. Different vegetation scenarios (homogeneous, heterogeneous) were simulated to evaluate the influence of model parameters, such as combustion intensity, wind and rivers, on the accurate reproduction of fire behavior.

Evolutionary approaches have been applied to adjust a variety of applications, such as parameter definition of neural networks [21], scheduling problems [18], the spread of diseases [9, 16], and hydrological-hydrodynamic and water quality models [12], among others. Several studies employ genetic algorithms in predicting and adjusting wildfire propagation models. The authors of [3] emphasize that fire modeling requires many parameters that are not always known, and a genetic algorithm is used to find these parameters from historical data and calibrate the model at the end of the execution. The GA’s population is evaluated using a metric called FFDI (Forest Fire Danger Index), which is calculated from weather-related factors such as temperature, relative humidity, wind speed and drought factor. The approach described in [5] uses a two-step strategy to calibrate the model, where the quality of the solution is evaluated by comparing the simulation data with the real fire data (burned in the real fire, burned in the simulated fire, or not burned). Then, a \(2 \times 2\) table is constructed with four possibilities (Hits, False Alarms, Misses, Correct Negatives). Finally, the individual’s fitness is calculated. The authors concluded that there was a significant improvement in performance without compromising the execution time. In [8], a GA is used to adjust the parameters of a fire propagation model that considers elements such as wind intensity and direction, and homogeneous and heterogeneous vegetation scenarios based on historical data in a given area, achieving good results in different experiments. Finally, a parallel GA with variable neighborhood and multiple subpopulations is proposed in [13], aiming to adjust the model parameters to optimize firefighting routes and minimize damage resulting from simulated fire propagation.

3 Wildfire Spreading Model Based on Cellular Automata

In the present paper, we reproduce the fire propagation model based on cellular automata proposed in [8]. It uses a two-dimensional probabilistic cellular automaton composed of a lattice of cells, where each cell can assume one of two states (tree or fire), and a stochastic transition rule that controls the evolution of the CA, that is, it defines the state of the central cell at the next time step based on its current state and that of its neighbors. A matrix W (\(3 \times 3\)) is used to quantify the probability of a given cell igniting through the propagation of fire from its closest fire cells in a Moore neighborhood with radius 1. This matrix represents the wind influence: each of its 8 outer positions corresponds to a wind direction (northwest, north, northeast, east, southeast, south, southwest and west) and indicates the impact of that neighbor cell on the state of the central cell. Each matrix position holds a real value in [0,1], and their combination determines the direction, intensity and speed of the flames in the simulation. The CA rule also represents characteristics of the vegetation, namely burning time (LQ) and recovery time (LR), which are the time that a cell remains in the fire state and the time required for a burned cell to become susceptible to burning again, respectively. Burning cells go through the burning stages \(S_f \in \left\{ 1, \ldots, LQ \right\} \), while burned cells go through the recovery stages \(S_t \in \left\{ 1, \ldots, LR \right\} \) before becoming susceptible to new fires.
This evolution follows the CA transition rules: (i) tree cells start with the maximum value of LR; (ii) a fire cell evolves incrementally through different stages until it reaches the maximum value LQ; (iii) upon reaching the value LQ, a fire cell transforms into a tree cell in stage 1; (iv) a tree cell in stage 1 goes through LR stages of recovery; and (v) only cells with maximum LR have a probability \(P = \frac{Age}{LR} \times \frac{\overline{IW_{fire}}}{LQ}\) to burn again, where \({\overline{IW_{fire}}}\) is the average of the product between the burning stage and the wind influence of each neighbor in the fire state.
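As a worked illustration of rule (v), the following minimal Python sketch evaluates the ignition probability of a single tree cell. It is not the implementation from [8]: the function name `ignition_probability`, the row-major layout of W (row 0 is north, column 2 is east), the reading of Age as the current recovery stage of the cell, and the choice of averaging only over the burning neighbors are all assumptions made for this example.

```python
def ignition_probability(neighbor_stages, W, LQ, LR, age):
    """Probability that a tree cell ignites in the next time step.

    neighbor_stages: 3x3 list; entry [r][c] is the burning stage (1..LQ) of the
    neighbor at that Moore position, or 0 if that neighbor is not burning.
    W: 3x3 wind matrix with values in [0, 1]; the center entry is ignored.
    age: recovery stage of the cell (assumed meaning of "Age" in the formula).
    """
    if age < LR:
        return 0.0  # rule (v): only cells at the maximum stage LR can ignite
    products = [
        neighbor_stages[r][c] * W[r][c]
        for r in range(3)
        for c in range(3)
        if (r, c) != (1, 1) and neighbor_stages[r][c] > 0
    ]
    if not products:
        return 0.0  # no burning neighbor, so there is nothing to propagate
    # Assumption: the average is taken over the burning neighbors only.
    mean_iw_fire = sum(products) / len(products)
    return (age / LR) * (mean_iw_fire / LQ)


# Wind matrix used in the reproduction experiments (wind blowing to the west).
W = [[0.14, 0.50, 0.85],
     [0.00, 0.00, 1.00],
     [0.14, 0.50, 0.85]]

# A single burning neighbor to the east, in its last burning stage (LQ = 3).
east_only = [[0, 0, 0], [0, 0, 3], [0, 0, 0]]
p_east = ignition_probability(east_only, W, LQ=3, LR=15, age=15)

# A single burning neighbor to the west: W gives it zero influence.
west_only = [[0, 0, 0], [3, 0, 0], [0, 0, 0]]
p_west = ignition_probability(west_only, W, LQ=3, LR=15, age=15)
```

Under these assumptions, a fully developed fire to the east ignites the cell with probability 1, while the same fire to the west cannot ignite it at all, which is the directional bias the wind matrix is meant to encode.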

The implementation to reproduce the reference model [8] was carried out considering an initial \(100 \times 100\) lattice, where the central cell is chosen as the initial fire spot and all the other cells start in the tree state. The wind matrix used is \(W_{3\,\times \,3}\) = {{0.14, 0.50, 0.85}, {0.00, 0.00, 1.00}, {0.14, 0.50, 0.85}}, which characterizes a wind blowing to the west, with the force vector shown in Fig. 1a. Model parameters LQ and LR were adjusted in preliminary experiments to approximate the behavior presented in the original model. The values employed are \(LQ=3\) and \(LR=15\). The temporal evolution of the CA lattice using the reproduced model is presented in Fig. 1b over 80 time steps. Four snapshots of the simulation are presented in the figure: t = {10, 30, 50, 80}.
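The setup above can be sketched as a small pure-Python simulation. This is an illustrative reading of the transition rules in this section, not the authors' code: the names `step` and `simulate`, the two-grid state encoding (`fire` holds the burning stage, `age` holds the recovery stage), the fixed boundary (neighbors outside the lattice are ignored) and the averaging over burning neighbors only are assumptions.

```python
import random


def step(fire, age, W, LQ, LR, rng):
    """Apply one synchronous CA update and return the new (fire, age) grids."""
    n, m = len(fire), len(fire[0])
    new_fire = [row[:] for row in fire]
    new_age = [row[:] for row in age]
    for i in range(n):
        for j in range(m):
            if fire[i][j] > 0:
                if fire[i][j] >= LQ:
                    # Rule (iii): burnt out, becomes a tree cell in stage 1.
                    new_fire[i][j] = 0
                    new_age[i][j] = 1
                else:
                    new_fire[i][j] = fire[i][j] + 1  # rule (ii)
            elif age[i][j] < LR:
                new_age[i][j] = age[i][j] + 1  # rule (iv): recovering
            else:
                # Rule (v): fully recovered tree cell may ignite.
                products = []
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        a, b = i + di, j + dj
                        if (di or dj) and 0 <= a < n and 0 <= b < m and fire[a][b] > 0:
                            products.append(fire[a][b] * W[di + 1][dj + 1])
                if products:
                    # Age / LR equals 1 here because age == LR.
                    p = (sum(products) / len(products)) / LQ
                    if rng.random() < p:
                        new_fire[i][j] = 1
    return new_fire, new_age


def simulate(size, W, LQ, LR, steps, seed=0):
    """Ignite the central cell and return the fire grid after each time step."""
    rng = random.Random(seed)
    fire = [[0] * size for _ in range(size)]
    age = [[LR] * size for _ in range(size)]  # rule (i): trees start at LR
    fire[size // 2][size // 2] = 1
    history = []
    for _ in range(steps):
        fire, age = step(fire, age, W, LQ, LR, rng)
        history.append([row[:] for row in fire])
    return history


W = [[0.14, 0.50, 0.85],
     [0.00, 0.00, 1.00],
     [0.14, 0.50, 0.85]]

# Small demonstration run; the setting in the text is size=100, steps=80.
history = simulate(size=21, W=W, LQ=3, LR=15, steps=10)
```

Calling `simulate(size=100, W=W, LQ=3, LR=15, steps=80)` would correspond to the configuration described above, although the resulting pattern is only expected to match Fig. 1b to the extent that these assumptions agree with the original model.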

Fig. 1. Simulation using our reproduction of the model in [8]

4 Proposed Evolutionary Approach

We investigate an evolutionary approach to tuning the parameters for the CA-based fire spreading model previously described to produce a simulation that closely matches the one observed in the reference dataset related to the propagation of flames from a forest fire in a given area. We used artificial data generated by fire simulations as reference dataset. The initial experiments use the same model to generate these artificial datasets. In the subsequent experiment, a different CA-based spreading model [20] was used to generate the reference data. For both cases, we evaluate the ability of the GA to specify an adequate set of parameters to reproduce the general behavior observed in the images captured of the area during the evolution of a fire.

The proposed genetic algorithm operates over 100 generations, with a uniformly generated population of 100 individuals, where each individual is a possible configuration for the CA-based model composed of 11 genes, as shown in Fig. 2a. The first 8 genes have real values in \(\left[ 0, 1 \right] \) and represent the outer cells of the wind matrix W, which make up the Moore neighborhood (blue). The central cell (gray) is always considered zero, so it is not included in the chromosome. The next two genes are integers: the burning time LQ, in \(\left[ 0, 10 \right] \), and the recovery time LR, in \(\left[ 0, 100 \right] \). The last gene, SR, is also an integer in \(\left[ 0, 10 \right] \) and represents the sampling rate. Consider a reference dataset composed of N fire spreading snapshots. These samples can be generated by other fire spreading models or even extracted from a sequence of images of a real-world fire. The number of time steps T that the simulation model needs to reproduce the behavior present in the reference dataset is given by \(SR \times N\). A set of T lattices is generated from the temporal evolution of the CA model using the other parameters encoded in the individual. The SR parameter is the interval, in time steps, at which this set of lattices is sampled to evaluate the similarity between the simulation and the fire propagation snapshots in the reference dataset. For example, if SR is 5 and the reference dataset has N = 10 snapshots of a wildfire propagation, the fitness set used to evaluate the individual is composed of the CA lattices at time steps {5, 10, 15, 20, ..., 50} (T = 50). The quality of a GA individual is measured by the difference between the snapshots of the reference dataset and the lattices in the fitness set.
The SR parameter is the main subject of investigation of the present paper. It was proposed in this work to make the GA more general, able to adapt to reference data generated by other fire spreading models or taken from a sequence of real-world images.
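The chromosome layout and the role of SR can be made concrete with a short Python sketch. The helper names `decode` and `sampled_steps` are hypothetical, and the gene order of the wind genes (clockwise from northwest to west, following the order in which the directions are listed in Sect. 3) is an assumption, since the exact layout is only given in Fig. 2a.

```python
def decode(chromosome):
    """Split an 11-gene chromosome into (W, LQ, LR, SR).

    Assumed gene order for the wind genes: NW, N, NE, E, SE, S, SW, W.
    """
    if len(chromosome) != 11:
        raise ValueError("expected 11 genes")
    nw, n, ne, e, se, s, sw, w = chromosome[:8]
    W = [[nw, n, ne],
         [w, 0.0, e],   # the central cell is always zero
         [sw, s, se]]
    LQ, LR, SR = (int(v) for v in chromosome[8:])
    return W, LQ, LR, SR


def sampled_steps(SR, N):
    """Time steps of the CA run that are compared with the N snapshots."""
    return [SR * k for k in range(1, N + 1)]


# Chromosome encoding the reference wind matrix, LQ = 3, LR = 15 and SR = 5.
chromosome = [0.14, 0.50, 0.85, 1.00, 0.85, 0.50, 0.14, 0.00, 3, 15, 5]
W, LQ, LR, SR = decode(chromosome)

steps = sampled_steps(SR, N=10)   # [5, 10, ..., 50]
T = steps[-1]                     # total number of CA steps to simulate (SR x N)
```

With SR = 5 and N = 10 this reproduces the example in the text: the CA is run for T = 50 steps and only every fifth lattice enters the fitness set.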

Parent selection for the next generation is done by simple tournament selection with \(Tour=3\), and offspring are generated by two-point crossover with a 90% rate. In mutation, each gene of the chromosome has a 20% probability of being altered. Real-valued genes are incremented or decremented by a value in \(\left[ -0.2, 0.2\right] \), while integer genes are incremented or decremented by 1. Elitist reinsertion was applied with a 25% rate: the top 25 parents are retained, and the rest of the population is filled with the top 75 offspring.
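The operators described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the implementation used in the experiments: all function names are hypothetical, fitness is minimized, mutated genes are clamped to their bounds, and one offspring is generated per parent before reinsertion. None of these details is specified in the text.

```python
import random

# Gene bounds: 8 wind genes, then LQ, LR and SR (ranges as stated in the text).
BOUNDS = [(0.0, 1.0)] * 8 + [(0, 10), (0, 100), (0, 10)]


def random_individual(rng):
    """Draw each gene uniformly inside its bounds."""
    return [rng.uniform(lo, hi) if k < 8 else rng.randint(lo, hi)
            for k, (lo, hi) in enumerate(BOUNDS)]


def tournament(population, fitness, rng, tour=3):
    """Return the lowest-fitness individual among `tour` random contenders."""
    contenders = rng.sample(range(len(population)), tour)
    return population[min(contenders, key=lambda k: fitness[k])]


def two_point_crossover(a, b, rng, rate=0.9):
    """Swap the segment between two cut points with probability `rate`."""
    if rng.random() >= rate:
        return a[:], b[:]
    i, j = sorted(rng.sample(range(1, len(a)), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]


def mutate(individual, rng, rate=0.2):
    """Perturb each gene with probability `rate` (clamping is an assumption)."""
    out = individual[:]
    for k, (lo, hi) in enumerate(BOUNDS):
        if rng.random() < rate:
            delta = rng.uniform(-0.2, 0.2) if k < 8 else rng.choice((-1, 1))
            out[k] = min(hi, max(lo, out[k] + delta))
    return out


def next_generation(population, evaluate, rng, elite_rate=0.25):
    """Keep the best parents and fill the rest with the best offspring."""
    size = len(population)
    fitness = [evaluate(ind) for ind in population]
    offspring = []
    while len(offspring) < size:  # assumed: as many offspring as parents
        p1 = tournament(population, fitness, rng)
        p2 = tournament(population, fitness, rng)
        c1, c2 = two_point_crossover(p1, p2, rng)
        offspring += [mutate(c1, rng), mutate(c2, rng)]
    n_elite = int(size * elite_rate)
    order = sorted(range(size), key=lambda k: fitness[k])
    best_parents = [population[k] for k in order[:n_elite]]
    best_offspring = sorted(offspring, key=evaluate)[:size - n_elite]
    return best_parents + best_offspring
```

With a population of 100 and `elite_rate=0.25`, `next_generation` keeps 25 parents and 75 offspring, as in the text. The `evaluate` argument stands in for the fitness function of the approach; any callable that maps a chromosome to a number to be minimized can be used to try the operators.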

For the fitness calculation \(Fit9P = \frac{\sum _{q=0}^{8}\sum _{i=1}^{N}\left| Ind_{iq}-Ref_{iq} \right| }{9}\), the lattice with a dimension of \(L = 100 \times 100\) is divided into 9 partitions q, with q ranging over \(\left[ 0, 8\right] \), as shown in Fig. 2b. \(Ind_{iq}\) refers to the value of the simulation using the parameters of the individual under analysis in partition q at the i-th sampled time step, and \(Ref_{iq}\) is the corresponding reference value. For each of the four initial fire foci, (24, 24), (24, 74), (74, 24) and (74, 74), five simulations of the model are performed over T time steps using the parameters defined in the genes of the individual, and the average value of the executions in each partition is used in the fitness function. These values were chosen based on preliminary experiments to reduce stochastic error without rendering the experiments unfeasible due to high simulation time.
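The partition-based fitness can be illustrated with a short Python sketch. The text does not state what the per-partition "value" is, so this example assumes it is the number of burning cells in the partition. The names `partition_counts` and `fit9p`, the 3 by 3 block layout and the way a side of 100 is split (33, 33 and 34 cells) are also assumptions.

```python
def partition_counts(lattice, parts=3):
    """Count burning cells (value > 0) in each block of a parts x parts grid.

    Partitions are returned row by row, so index q runs from 0 to 8 for parts=3.
    """
    n = len(lattice)
    edges = [k * n // parts for k in range(parts + 1)]  # e.g. [0, 33, 66, 100]
    counts = []
    for r in range(parts):
        for c in range(parts):
            counts.append(sum(
                1
                for i in range(edges[r], edges[r + 1])
                for j in range(edges[c], edges[c + 1])
                if lattice[i][j] > 0
            ))
    return counts


def fit9p(ind, ref):
    """Fit9P: summed absolute differences over snapshots and partitions, over 9.

    ind, ref: lists with one entry per snapshot i, each a list of 9 values.
    """
    total = sum(abs(x - y)
                for ind_i, ref_i in zip(ind, ref)
                for x, y in zip(ind_i, ref_i))
    return total / 9


# Toy 6x6 lattice with fire in the top-left and bottom-right partitions.
lattice = [[0] * 6 for _ in range(6)]
lattice[0][0] = 1
lattice[5][5] = 2
counts = partition_counts(lattice)

# One-snapshot comparison against an empty reference.
score = fit9p([counts], [[0] * 9])
```

A lower score means a closer match, and a simulation identical to the reference in every partition and snapshot yields 0. In the procedure described above, the `ind` values would additionally be averaged over the repeated runs for each initial focus before the comparison.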

Fig. 2. Chromosome and lattice division used in the GA fitness computation.

5 Experiments and Results

Experiments were carried out to evaluate the effectiveness of the proposed approach in adjusting the parameters of the fire propagation model and its ability to find an adequate sampling rate in order to reproduce the general behavior of the flame propagation present in different reference datasets (DS).

Three criteria were considered for analyzing the results produced by the GA with respect to the ability to: (i) configure a wind intensity and direction similar to that observed in the dataset; (ii) reproduce the fire spreading of the reference data; and (iii) find an adequate sampling rate to simulate the flames in the DS.

For each experiment, 30 runs of the genetic algorithm were performed. The results were ranked by fitness, and the best solution found was used to plot the temporal evolution. Then, a heat map was generated to verify the frequency with which each area ignites, allowing the observation of the average behavior of the fire spreading by counting the occurrence of fire in each cell over 100 simulations.

5.1 Sensitivity Analysis

To evaluate the algorithm’s ability to calibrate the parameters and its adaptability to a given input, sensitivity tests were conducted varying the size of the dataset. The objective was to determine the minimum size necessary for the algorithm’s convergence without losing the quality of the adjustment. This makes the algorithm more adaptable to different information sources and reduces computational time, since the model can be calibrated with fewer records.

Initially, the cellular automaton described in Sect. 3 was reproduced. It was executed 100 times over 50 time steps using the matrix W corresponding to wind blowing to the west, \(LQ=3\) and \(LR=15\). The average value for each time step was computed over the 100 CA simulations, and reference datasets of different sizes were created. The first dataset stores the average values for all 50 time steps. The second dataset is composed of only 25 average values, those of the even time steps {2, 4, 6, ..., 50}. Similarly, four other datasets were built to store 16, 10, 7 and 5 average values. Thus, six reference datasets (DS) were built using distinct sampling periods \(p = \left\{ 1, 2, 3, 5, 7, 10 \right\} \), resulting in different dataset sizes \(S = \left\{ 50, 25, 16, 10, 7, 5 \right\} \), respectively. These reference datasets were called Dataset i or DSi, where i indicates the ith element of p and S. Since all of them were built from the same group of 100 CA model simulations, these datasets characterize the same wildfire scenario, but they represent the data of that phenomenon with different sampling frequencies. This mirrors real-world wildfires, whose propagation images may be stored with different capture frequencies.
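The relation between the sampling periods p and the dataset sizes S can be checked with a few lines of Python. The helper name `build_dataset` is hypothetical, and the records are represented here simply by their time-step index. The sketch assumes that a dataset with period p keeps the records at time steps p, 2p, 3p and so on, as in the even-step example above.

```python
def build_dataset(records, period):
    """Keep every `period`-th record, starting at time step `period`."""
    return records[period - 1::period]


# Stand-in for the 50 averaged records: one entry per time step 1..50.
time_steps = list(range(1, 51))

periods = (1, 2, 3, 5, 7, 10)
datasets = {p: build_dataset(time_steps, p) for p in periods}
sizes = [len(datasets[p]) for p in periods]
```

Here `sizes` evaluates to 50, 25, 16, 10, 7 and 5, matching S. For the periods that do not divide 50 (p = 3 and p = 7), the last retained records are time steps 48 and 49, respectively.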

The wind matrix (W) in Fig. 2a produces the resultant vector shown in Fig. 1a, which represents the wind intensity (\(I=2\)) and west direction (\(\theta =180^{\circ }\)). It is obtained through vector decomposition, where the central cell \(W_{2,2}\) represents the origin of the Cartesian plane and each matrix cell \(W_{ij}\) is decomposed into vertical and horizontal components, which are used to generate the resultant vector of the matrix. More details on this process are provided in [8]. Even if the exact values of the matrix parameters are not found, a good reproduction of the wind behavior can be achieved by a configuration capable of producing a similar resultant vector, that is, one with similar wind intensity and direction.
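One plausible reading of this vector decomposition is sketched below in Python. The exact procedure is the one given in [8]. This example assumes that each outer cell contributes its weight along the unit vector pointing from that neighbor toward the center (the direction in which it pushes the fire), with east as the positive x axis and north as the positive y axis. The function name `resultant_vector` is hypothetical.

```python
import math


def resultant_vector(W):
    """Return (intensity, angle in degrees) of the resultant wind vector.

    Assumptions: row 0 of W is north and column 2 is east; each outer cell
    acts along the unit vector from that neighbor toward the central cell.
    """
    x = y = 0.0
    for r in range(3):
        for c in range(3):
            if r == 1 and c == 1:
                continue  # the central cell is the origin
            dx, dy = c - 1, 1 - r          # position of the neighbor
            norm = math.hypot(dx, dy)      # 1 for sides, sqrt(2) for diagonals
            x -= W[r][c] * dx / norm       # negated: points toward the center
            y -= W[r][c] * dy / norm
    intensity = math.hypot(x, y)
    angle = math.degrees(math.atan2(y, x)) % 360.0
    return intensity, angle


W = [[0.14, 0.50, 0.85],
     [0.00, 0.00, 1.00],
     [0.14, 0.50, 0.85]]

intensity, angle = resultant_vector(W)
```

Under these assumptions the reference matrix gives an intensity of \(1 + 0.71\sqrt{2} \approx 2.00\) and an angle of \(180^{\circ }\), which agrees with the values \(I=2\) and \(\theta =180^{\circ }\) quoted above. The same function can be applied to a matrix evolved by the GA to compare its resultant vector with the reference.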

Fig. 3. Wind Preference Matrices

The genetic algorithm was applied to optimize the CA model parameters for each of the reference datasets (DS1, DS2, DS3, DS4, DS5 and DS6). Thirty GA runs were carried out for each dataset. In Table 1, the reference parameters used in the CA-based model simulations to generate the original dataset with 50 lattices are presented in the column named “Reference”. The best solutions found by the GA for each dataset are presented in the other columns. The genes \(G_{Nw}\) to \( G_{W}\) that compose the matrix W for each experiment are shown. These values are used to produce the resultant vector with its intensity and inclination (Fig. 3). The values of the burning time (LQ), recovery time (LR) and sampling rate (SR) parameters are also presented, as well as the fitness achieved by the best GA individual for each dataset.

Table 1. Reference and best parameter sets found by GA for each dataset.

As can be observed, the GA found the exact values of the LR, LQ and SR parameters for all datasets. It is important to note that the fitness values achieved by the best individual in each experiment are not comparable among the experiments DS1 to DS6, since their computation depends on the number of records in each dataset. In the DS1 experiment, the largest dataset with 50 records was used, so the GA has access to all the spreading information (50 lattices), making it the easiest case of adjustment. The value of SR equal to 1 was found, and the matrix W has values close to the original ones, as shown in Fig. 3a, with a resultant vector of intensity 1.87 and an inclination of 179.04\(^{\circ }\), which is close to the resultant vector of the original matrix W (Fig. 1a). The samples of the temporal evolution chosen for graphical plotting were \(t = \left\{ 7, 20, 35, 50 \right\} \). Figure 4a presents the temporal evolution of both (i) the reference DS1 (the top plot); and (ii) the simulation based on the parameters of the best individual evolved by the GA (the bottom plot). For all the other temporal evolutions presented in Fig. 4, the top plot is based on the dataset used as reference, and the bottom plot was generated from the simulation using the best GA individual evolved for this dataset. The behavior observed in Fig. 4a was appropriate, with burn areas very similar for both the main and secondary waves. This observation also holds across several simulations, as shown in the heat map in Fig. 5a, with a slight tendency to evolve northwest due to the small difference in the angle of the vector, but nothing very significant.

Fig. 4. Cellular Automata Temporal Evolution

In the DS2 experiment, half of the records (25) were used, so the GA has less information about the spread of the fire, since the phenomenon is sampled every two time steps, which makes it harder for the GA to find a good parameter adjustment. Despite this, it was able to determine that the most suitable SR is 2. The GA also achieved good results regarding the matrix W, as shown in Fig. 3b, with a resultant vector of intensity 2.01 and an inclination of 180.8\(^{\circ }\), the closest to the reference so far. The samples chosen for graphical plotting were \(t = \left\{ 4, 10, 18, 25 \right\} \). The simulation using the best individual presented an appropriate temporal evolution in Fig. 4b when compared to the behavior of the data, especially in the main wave, while the second wave exists but is slightly smaller than in the reference. Considering the heat map in Fig. 5b, the average behavior was quite similar for both the outer and inner waves, although in the latter the trace is slightly weaker than in the reference, while its size is very close.

Considering the remaining experiments (DS3, DS4, DS5 and DS6), the best individual found for each reference dataset has the most suitable SR: 3, 5, 7 and 10, respectively. Regarding the matrix W, the values are appropriate, producing resultant vectors similar to the reference as shown in Fig. 3, although the experiments DS3, DS4 and DS5 showed a more pronounced inclination to the southwest (6.44\(^{\circ }\), 7.27\(^{\circ }\) and 4.01\(^{\circ }\), respectively). This difference led to a slight fire propagation in this direction, as shown in Fig. 4 for these experiments. Overall, despite the inclination, the behavior observed in experiments DS3 and DS4 in Fig. 4 was appropriate. Considering the DS5 experiment, whose reference dataset has 7 records, the temporal evolution in Fig. 4e shows that the simulation with the best evolved individual was able to reproduce the main propagation wave, but the second wave is non-existent. The average behavior observed in Fig. 5 for experiments DS3, DS4 and DS5 was adequate. DS6 was the experiment with the smallest dataset, containing only 5 records. Although a good configuration was found for the matrix W, in the temporal evolution presented in Fig. 4f it can be noted that the internal waves were quite apparent, even more intense than in the reference. In the average behavior shown by the heat map in Fig. 5f, the external wave behaves similarly to the reference, but the internal wave is more subdued than in the previous experiments, showing that the temporal evolution did not reflect the average behavior adequately.

Fig. 5. Temporal Evolution Heat Map

In general, the experiments showed that the GA was able to find good solutions and the most appropriate SR, although the datasets with the fewest records (DS5 and DS6) did not yield results as good as the others, with internal waves that were either absent or too intense.

5.2 Robustness Analysis

In order to validate the robustness and adaptability of the proposed approach, and due to the lack of real-world wildfire images, we employed a second CA-based fire propagation model, proposed in [20], to generate artificial datasets. This model has parameters that the model discussed in Sect. 3 does not, such as humidity and several burning stages, and its CA time step represents a different span of fire evolution time than that of the CA-based propagation discussed previously. Because of the distinct characteristics of the two models, the best values for the model’s parameters are not known, which approximates the problem of reproducing real-world wildfire dynamics with a simulation model. The performance of our approach is evaluated by comparing the fire dynamics resulting from the model configured by the GA with the records of the datasets.

Three datasets with 20 records each were generated by running the reference CA-based model with different sampling rates \(SR~\in ~\left\{ 5, 7, 10 \right\} \), being named Dataset 7 (DS7, SR=5), Dataset 8 (DS8, SR=7) and Dataset 9 (DS9, SR=10). These sampling rates were chosen because they are more challenging for the genetic algorithm search, as observed in the experiments of the previous section. To assess the accuracy gained with the proposed approach, the artificial datasets were used to adjust the parameters of the CA-based model with two different versions of the genetic algorithm. The first one does not use the parameter SR in the individual coding, while the second GA version includes the proposed parameter in the genetic code, as presented in Sect. 4. The results of each GA approach are presented and analyzed to highlight the effect of including the SR parameter. For each GA version and each dataset, 30 runs were carried out, and they were analyzed in the same way as in the previous experiments DS1 to DS6. The new experiments with Dataset 7 were named GA-DS7, where the GA without the parameter SR was used, and GA-SR-DS7, where the proposed parameter SR is adjusted jointly with the other parameters in the individual. The same was done for Dataset 8 and Dataset 9, resulting in experiments named GA-DS8, GA-DS9, GA-SR-DS8 and GA-SR-DS9. For the plots, the same time steps (\(t = \left\{ 5, 10, 15, 20 \right\} \)) are presented.

Table 2. Robustness Test

Figure 6 presents the dynamics of the fires obtained from GA and GA-SR for different sampling rates, showing the temporal evolution of burned (red), recovering (gray) and recovered (green) areas. Considering the experiment GA-DS7, Table 2 shows that the best individual evolved by the simplest GA has \(LQ=10\) (maximum possible value) and \(LR=91\), producing a temporal evolution with a smaller burning area, as shown in Fig. 6a. On the other hand, in the experiment GA-SR-DS7 the best evolved individual has \(LQ=10\) and \(LR=33\), which results in a bigger burning area that is closer to the reference, especially at \(t=20\). It is important to note that there is a recovery process in the model [20] that was not captured by the GA, but the GA-SR was able to find a similar recovery using \(SR=3\). Considering the average behavior shown in Fig. 6d, GA-SR-DS7 presented dynamics closer to the reference. Regarding the experiment GA-DS8, Table 2 shows that the best individual evolved by the simplest GA has \(LQ=10\) and \(LR=57\). Because the sampling period is longer, the temporal evolution of GA-DS8 in Fig. 6b was not adequate: the burning area is significantly smaller than in the reference. On the other hand, GA-SR-DS8 was able to find a better adaptation by increasing the value of SR to 4, with \(LQ=10\) and \(LR=23\). Thus the temporal evolution was similar to the reference in both the burning and recovery areas. In the average behavior shown in Fig. 6e, this difference between GA-DS8 and GA-SR-DS8 is highlighted. We believe that the difference is very significant because, with a higher sampling rate, the GA managed to get closer to the solution. Finally, in the experiment GA-DS9, Table 2 shows that the best individual evolved by the simplest GA has \(LQ=9\) and \(LR=28\), producing a very different temporal evolution in Fig. 6c when compared to the reference.
On the other hand, GA-SR-DS9 defined \(LQ=10\), \(LR=37\) and \(SR=5\) in the best individual, reproducing the dynamics adequately, with burning (red), recovering (gray) and recovered (green) areas very similar to the reference. The average behavior in Fig. 6f makes clear the much better adaptability of GA-SR over a GA approach without the sampling rate.

Fig. 6. GA and GA-SR Cellular Automata Temporal Evolution. (Color figure online)

Experiments have shown that the proposed approach can replicate the behavior of other fire propagation models with different parameters, as well as adapt satisfactorily to the data dynamics generated from different sampling rates.

6 Conclusions

Our main goal was to propose an evolutionary approach capable of automatically adjusting the parameters of a CA-based model so that it closely reproduces the dynamics observed in wildfire spreading. The phenomenon to be represented can be either a sequence of lattices generated by another fire propagation model or a sequence of digitized images of a real fire (reference dataset). A key requirement in this investigation is that the evolutionary adjustment of parameters adequately reproduces the dynamics regardless of the sampling rate used to generate the reference dataset. To this end, a new parameter (sampling rate) is adjusted by the GA, which determines which lattices of a CA simulation are used to assess the behavior similarity in relation to the reference data. Experimental results show that our approach was able to find adequate values for this sampling rate and the other model parameters, resulting in fire dynamics close to those observed in the datasets. In comparison with the approach without sampling rate, the presence of this parameter provides more versatility and adaptability to the GA, especially when reproducing sets with sparser data.

We also analyzed the influence of the dataset size on the quality of the simulations generated using GA, where datasets with different sizes and periodicity between data were tested. It was possible to observe a reduction in GA efficiency for small datasets (size < 10), with 20 records being sufficient for the reproduction of the fire behavior even for sparser datasets.

As future work, we intend to evaluate new fitness functions that consider more CA states and investigate the adjustment of more complex fire models, which have more parameters and are consequently more challenging for the GA. Additionally, we also aim to evaluate the effectiveness of our approach in reproducing real wildfire propagation dynamics.