key: cord-027134-1k6oegu4
authors: Turky, Ayad; Rahaman, Mohammad Saiedur; Shao, Wei; Salim, Flora D.; Bradbrook, Doug; Song, Andy
title: Deep Learning Assisted Memetic Algorithm for Shortest Route Problems
date: 2020-05-25
journal: Computational Science - ICCS 2020
DOI: 10.1007/978-3-030-50426-7_9
sha:
doc_id: 27134
cord_uid: 1k6oegu4

Finding the shortest route between an origin and a destination is a crucial and challenging task in intelligent transportation systems. Current methods assume a fixed travel time between any pair of locations, which limits their efficiency because travel time in reality changes dynamically due to factors such as the weather conditions, the traffic conditions, the time of the day and the day of the week. To address this dynamic situation, we propose a novel two-stage approach to find the shortest route. First, deep learning is utilised to predict the travel time between an origin and a destination. Weather conditions are added to the input data to increase the accuracy of the travel time prediction. Second, a customised memetic algorithm is developed to find the shortest route using the predicted travel time. The proposed memetic algorithm uses a genetic algorithm for exploration and local search for exploiting the current search space around a given solution. The effectiveness of the proposed two-stage method is evaluated on the New York City taxi benchmark dataset. The obtained results demonstrate that the proposed method is highly effective compared with state-of-the-art methods.

Finding shortest routes is crucial in intelligent transportation systems. Shortest route information can be utilised to enable route planners to compute and provide effective routing decisions [8, 11, 14, 16, 24]. However, shortest route computation is a challenging task, partially due to dynamic environments [3].
For instance, the shortest path is impacted by various spatio-temporal factors, which are dynamic in nature, including the weather, the time of the day and the day of the week. This makes current shortest route computation techniques ineffective [3, 7]. Moreover, incorporating these dynamic factors into shortest route computation is itself a challenging problem.

In recent years, the proliferation of pervasive technologies has enabled the real-time collection of spatio-temporal big data associated with user mobility and travel routes [15]. Modern cars are equipped with telematics devices, including in-car GPS (Global Positioning System) devices, which can be used as a source of valuable information in traffic modelling [23]. The traces generated by GPS devices have been leveraged in many scenarios, such as spatio-temporal context recognition, taxi-passenger queue time prediction, the study of city dynamics and transport demand estimation [3, 12, 13, 17, 23].

One important aspect of finding shortest routes in realistic environments, which are inherently dynamic, is travel time prediction [8, 22]. Due to the dynamic nature of travel routes, traditional machine learning methods cannot be applied directly to travel time prediction. One of the key challenges for traditional machine learning models is the unavailability of hand-crafted features, which require substantial involvement of domain experts. One relevant approach is the recent use of evolutionary algorithms in other domains to work along with deep learning models for effective feature extraction and selection [18, 19, 20, 21]. In this study, we aim to identify relevant features for shortest route finding between an origin and a destination, leveraging the automatic feature generation capability of deep learning. We therefore propose a novel two-stage architecture for the travel time prediction and route finding task.
In particular, we design a customised memetic algorithm to find the shortest route based on the predicted travel time from the earlier stage. The contributions of this research are summarised as follows:
- A novel two-stage architecture for shortest route finding under dynamic environments.
- Development of a deep learning method to predict the travel time between an origin-destination pair.
- A customised memetic algorithm to find the shortest route using the predicted travel time.

The rest of the paper is organised as follows. In Sect. 2, we present our proposed methodology. Section 3 describes the experimental settings, followed by the discussion of experimental results in Sect. 4. Finally, we conclude the paper in Sect. 5.

In this paper, we propose a deep learning assisted memetic algorithm to solve shortest route problems. The proposed method has two stages: (1) a prediction stage and (2) an optimisation stage. The prediction stage is responsible for predicting the travel times between an origin and a destination along the given route by using deep learning. The optimisation stage uses a memetic algorithm to find the shortest path that visits all locations along the given route. In the following subsections, we discuss the main steps of the proposed method and the components of each stage in detail. Figure 1 shows our proposed approach.

Conventional route finding methods assume a fixed cost or travel time between any pair of points. That is rarely the case in reality. One approach to the dynamic travel time issue is prediction. In this work, we incorporate weather data along with the spatio-temporal data to develop a deep learning predictive approach. The goal of the proposed predictive approach is to predict the future travel time between any points in the problem based on historical observations and weather conditions.
Specifically, given a group of historical travel time data, weather data and road network data, the aim is to predict the travel time between a source $s_i$ and a destination $d_i$, where $s_i, d_i \in R$, $i \in [1, 2, \ldots, n]$ and $n$ is the number of locations in the road network. Our predictive approach tries to predict the travel time at $t+1$ based on the given data at $t$. The proposed predictive approach has three parts: input data, data cleaning and aggregation, and the prediction approach. Figure 2 shows the deep learning approach.

Input Data. In this work, we use data from three different sources: travel time data, weather data and road network data. The data involves around 1.5 million trip records.
- Travel time data. The travel times between different locations were collected from the 2016 NYC Yellow Cab trip record data.
- Weather data. We use the 2016 weather data for New York City. The data involves: date, maximum temperature, minimum temperature, average temperature, precipitation, snow fall and snow depth.
- Road network data. The road network data involves temporal and spatial information as follows:
  • Id - a trip identifier.
  • Vendor id - a code indicating the provider associated with the trip record.
  • Pickup date-time - date and time when the meter was started.
  • Drop-off date-time - date and time when the meter was disengaged.
  • Passenger count - the total number of riders in the vehicle.
  • Pickup longitude - the longitude of the pick-up location.
  • Pickup latitude - the latitude of the pick-up location.
  • Dropoff longitude - the longitude of the drop-off location.
  • Dropoff latitude - the latitude of the drop-off location.
  • Store flag - indicates whether the trip record was held in vehicle memory before being sent to the vendor (Y = store and forward; N = not a store and forward trip).
  • Trip duration - duration of the trip in seconds.

Data Cleaning and Aggregation. This process involves the removal of all error values and outliers, the imputation of missing values and data aggregation.
To facilitate the prediction, we bound the data range between average − 2 × standard deviation and average + 2 × standard deviation. Values outside this range are considered outliers and are removed. Missing values are imputed with the average values. Any overlapping pick-up and drop-off locations are also removed. In the aggregation step, we combine the travel time data, weather data and road network data at each time step so that they can be fed into our deep network.

Prediction Approach. The main goal of this step is to provide high-accuracy prediction of the travel times between different locations in the road network. The processed and aggregated data is provided as input to the prediction approach. Once the prediction model is trained and retrieved, it is ready to predict the travel times between given locations. In this work, we propose a deep learning technique based on a feedforward neural network to build our prediction approach. The deep neural network consists of one input layer, multiple hidden layers and one output layer. Each layer (input, hidden and output) involves a set of neurons. The total number of neurons in the input layer is the same as the number of input variables in our input data. The output layer has a single neuron, which represents the predicted value. The deep neural network has $m$ hidden layers, each with $k$ neurons. The input layer takes the input data and feeds it into the hidden layers. The outputs of the hidden layers are used as the input to the output layer. Given the input data $X = (x_1, \ldots, x_n)$ and the output value $Y$, the prediction approach aims to find the estimated value $Y_{est}$, which in its simplest form is:

$$Y_{est} = \sum_{i=1}^{n} w_i x_i + b$$

where $w$ is the weight and $b$ is the bias. Using a four-layer (one input, two hidden and one output) neural network as an example, $Y_{est}$ can be calculated as follows:

$$h^{(1)} = f\left(W^{(1)} X + b^{(1)}\right), \quad h^{(2)} = f\left(W^{(2)} h^{(1)} + b^{(2)}\right), \quad Y_{est} = W^{(3)} h^{(2)} + b^{(3)}$$

where $h^{(1)}$ and $h^{(2)}$ are the outputs of the two hidden layers, $W^{(l)}$ and $b^{(l)}$ are the weights and biases of layer $l$, $Y_{est}$ is the output of the network and $f$ is the activation function.
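As a rough illustration of the prediction stage described above, the following sketch applies the two-standard-deviation outlier rule with mean imputation to a single column and then evaluates a small feedforward network. It is not the Keras model used in this work: the function names (`clean_column`, `dense`, `predict`), the use of the population standard deviation, imputing with the mean of the retained values, the ReLU activation, the linear output layer and the toy weights are all assumptions made for illustration.

```python
import statistics

def clean_column(values):
    """Drop outliers beyond mean +/- 2 std and impute missing entries (None).

    Assumptions: population standard deviation; missing entries are filled
    with the mean of the values that survive the outlier filter.
    """
    observed = [v for v in values if v is not None]
    mu = statistics.mean(observed)
    sigma = statistics.pstdev(observed)
    low, high = mu - 2 * sigma, mu + 2 * sigma
    kept_mean = statistics.mean([v for v in observed if low <= v <= high])
    cleaned = []
    for v in values:
        if v is None:
            cleaned.append(kept_mean)  # impute a missing value
        elif low <= v <= high:
            cleaned.append(v)          # keep an in-range value
        # out-of-range values are dropped as outliers
    return cleaned

def relu(vector):
    """Element-wise ReLU, used here as the activation function f (assumed)."""
    return [max(0.0, x) for x in vector]

def dense(x, weights, bias):
    """One layer: weights is a list of rows, one row per neuron."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def predict(x, params, f=relu):
    """Forward pass: f on every hidden layer, linear single-neuron output."""
    h = x
    for weights, bias in params[:-1]:
        h = f(dense(h, weights, bias))
    out_weights, out_bias = params[-1]
    return dense(h, out_weights, out_bias)[0]

# Toy data: eight plausible durations, one extreme value, one missing value.
durations = [10, 12, 11, 13, 12, 11, 10, 12, 100, None]
print(clean_column(durations))

# Toy four-layer network: 2 inputs, two hidden layers of 2 neurons, 1 output.
toy_params = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    ([[1.0, 1.0], [1.0, -1.0]], [0.0, 0.0]),
    ([[0.5, 0.5]], [1.0]),
]
print(predict([2.0, 3.0], toy_params))
```

In practice the weights would be learned from the aggregated data rather than set by hand; the sketch only shows how a cleaned input vector flows through the layers.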
In this work, Keras [1] based on TensorFlow [2] is used to develop our prediction model.

This subsection presents the proposed memetic algorithm (MA) for shortest route problems. MA is a population-based metaheuristic that combines the strengths of a local search algorithm with those of a population-based metaheuristic to improve the convergence process [9, 10]. In this paper, we use a genetic algorithm (GA) and a local search (LS) algorithm to form our proposed MA. GA is responsible for exploring new areas in the search space of solutions. LS is used to accelerate the search convergence. The pseudocode of the proposed MA is shown in (1). An overview of the process is given below, followed by a detailed description of each step.

Our proposed algorithm starts by setting parameters, creating a population of solutions, calculating the quality of each solution and identifying the best solution in the current population. Next, the main steps of MA iterate over a number of generations until the stopping criterion is met. At each generation, good solutions are selected from the population by the selection procedure. Then the crossover operator is applied to the selected solutions to generate new solutions. After that, the mutation operator is applied to the new solutions by randomly changing them. A repair procedure then checks the feasibility of the generated solutions and fixes any that are no longer feasible. Afterwards, a local search algorithm is invoked to iteratively improve the current solutions. If one of the stopping criteria is satisfied, the whole MA procedure stops and the current best solution is returned as the output. Otherwise, the fitness of the current pool of solutions is calculated and the population is updated, since new solutions have been generated by crossover, mutation, the repair procedure and local search. A new iteration then starts from the selection procedure again.
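The generational loop outlined above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the pseudocode of the proposed MA: a route is assumed to be a permutation of location indices with a fixed origin (index 0) and destination (last index), the matrix `T` stands in for the predicted travel times, the repair step simply replaces repeated stops with missing ones, the local search uses a pairwise-swap neighbourhood, and all function names and default parameter values are hypothetical.

```python
import random

def route_time(route, T):
    """Fitness: total predicted travel time along the route (lower is better)."""
    return sum(T[a][b] for a, b in zip(route, route[1:]))

def tournament(pop, T, rng, size=2):
    """Tournament selection: best of `size` randomly drawn solutions."""
    return min(rng.sample(pop, size), key=lambda r: route_time(r, T))

def crossover(p1, p2, rng):
    """Single-point crossover: swap everything after one random cut point."""
    cut = rng.randrange(1, len(p1) - 1)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def mutate(route, rng):
    """One-point mutation: reshuffle the stops after a random point."""
    cut = rng.randrange(1, len(route) - 1)
    tail = route[cut:-1]
    rng.shuffle(tail)
    return route[:cut] + tail + route[-1:]

def repair(route, locations):
    """Replace repeated stops with missing ones so each location appears once."""
    present = set(route)
    missing = [x for x in locations if x not in present]
    seen, fixed = set(), []
    for x in route:
        fixed.append(missing.pop() if x in seen else x)
        seen.add(fixed[-1])
    return fixed

def local_search(route, T):
    """Steepest descent over pairwise swaps of intermediate stops."""
    current, cost = route[:], route_time(route, T)
    while True:
        best_nb, best_cost = None, cost
        for i in range(1, len(current) - 1):
            for j in range(i + 1, len(current) - 1):
                nb = current[:]
                nb[i], nb[j] = nb[j], nb[i]
                c = route_time(nb, T)
                if c < best_cost:
                    best_nb, best_cost = nb, c
        if best_nb is None:  # no improving neighbour: local optimum reached
            return current
        current, cost = best_nb, best_cost

def memetic(T, pop_size=20, generations=30, cx_rate=0.9, mut_rate=0.1, seed=0):
    """Selection, crossover, mutation, repair and local search per generation."""
    rng = random.Random(seed)
    n = len(T)
    locations = list(range(n))
    pop = []
    for _ in range(pop_size):  # random initial population
        middle = locations[1:-1]
        rng.shuffle(middle)
        pop.append([0] + middle + [n - 1])
    best = min(pop, key=lambda r: route_time(r, T))
    for _ in range(generations):  # stop after a fixed number of generations
        children = []
        while len(children) < pop_size:
            p1, p2 = tournament(pop, T, rng), tournament(pop, T, rng)
            if rng.random() < cx_rate:
                pair = crossover(p1, p2, rng)
            else:
                pair = (p1[:], p2[:])
            for child in pair:
                if rng.random() < mut_rate:
                    child = mutate(child, rng)
                children.append(local_search(repair(child, locations), T))
        pop = children[:pop_size]
        best = min(pop + [best], key=lambda r: route_time(r, T))
    return best, route_time(best, T)

# Toy 8-location travel-time matrix standing in for predicted travel times.
demo_rng = random.Random(1)
T_demo = [[0.0 if i == j else demo_rng.uniform(1, 10) for j in range(8)]
          for i in range(8)]
best_route, best_cost = memetic(T_demo)
print(best_route, round(best_cost, 2))
```

Passing the operators a shared `random.Random` instance keeps a run reproducible for a given seed, which is convenient when comparing variants over repeated independent runs.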
The main parameters of the proposed MA are initialised in this step. These are: the population size, the number of generations, the crossover rate, the mutation rate and the number of non-improvement iterations for the local search.

Initial Population. The initial population is randomly generated. Each solution is represented as one chromosome, e.g. a one-dimensional array. Each cell of the array contains an integer number which represents a location.

Fitness Function. In this step, the fitness value of each solution is calculated based on the objective function. The better the fitness value, the higher the chance that the solution will be selected to reproduce the next generation of solutions. For shortest route problems, the fitness is the total travel time between the origin and destination locations. Therefore, the solution with the shortest travel time is the best.

Selection Procedure. This step is responsible for selecting two solutions for producing the next generation. In this paper, we adopt the traditional tournament selection mechanism [4, 5, 6]. The tournament size is set to 2, indicating that each tournament has two solutions competing with each other. At each call, two solutions are randomly selected from the current population and the one with the better fitness value, i.e. the shorter travel time, is added to the reproduction pool.

Crossover. This step is responsible for generating new solutions by taking the selected solutions and mixing their genetic material to produce new offspring. In this paper, the single-point crossover method is used, which swaps genetic material at only one point [5, 6]. It first finds a common point between the source node and the destination node, and then all points after the common point are exchanged between the two solutions, resulting in two offspring.

Mutation. The mutation operator helps explore a large search space by producing random changes in various solutions. In this paper, we use a one-point mutation operator [5].
A mutation point is randomly selected and then all points after the selected mutation point are replaced with a random sequence.

Repair Procedure. The aim of this step is to turn infeasible solutions into feasible ones. After the crossover and mutation operations, the resulting solutions may become infeasible [5, 6]. The MA in our experiments therefore has a repair procedure that ensures all infeasible solutions are repaired.

Local Search Algorithm. The main role of this step is to improve the convergence of the search process in order to attain higher quality solutions [9, 10]. In this paper, the utilised local search algorithm is the steepest descent algorithm. The steepest descent algorithm is a simple variation of the gradient descent algorithm. It starts with a given solution as input and uses a neighbourhood structure to move the search process to other, possibly better, solutions. It uses an improving-only acceptance criterion, whereby only a better solution is used as a new starting point. Given $s_i$, it applies a neighbourhood structure to create $s_n$ and replaces $s_i$ with $s_n$ if $s_n$ is better. The pseudocode of the steepest descent algorithm is shown in (2).

Stopping Condition. If the stopping condition is met, the search process terminates and returns the best solution found. Our proposed memetic algorithm stops when the maximum number of generations is reached. Otherwise, go to step 24.

In this section, the parameter settings of the deep learning model and the proposed algorithm are provided. The parameter values were selected empirically based on our preliminary experiments, in which we tested the deep learning model and the proposed algorithm with different parameter combinations, using different values for each parameter. The values of these parameters are determined one by one by manually changing the value of one parameter while fixing the others. Then, the best values for all parameters are recorded.
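The one-parameter-at-a-time tuning procedure described above can be sketched as follows. This is a hypothetical illustration rather than the script used in the preliminary experiments: the `evaluate` callback, the parameter names and the candidate values are invented for the example, and it is assumed that once the best value of a parameter has been found it is kept fixed while the remaining parameters are tuned.

```python
def tune_one_at_a_time(evaluate, defaults, candidates):
    """Vary one parameter at a time while the others stay fixed.

    evaluate(params) is a hypothetical callback returning a score to
    minimise, e.g. the best travel time found or a validation RMSE.
    """
    best = dict(defaults)
    best_score = evaluate(best)
    for name, values in candidates.items():
        for value in values:
            trial = dict(best)   # all other parameters keep their current value
            trial[name] = value
            score = evaluate(trial)
            if score < best_score:
                best, best_score = trial, score
    return best, best_score

def toy_score(params):
    """Stand-in objective with its minimum at population 50, mutation 0.1."""
    return ((params["population_size"] - 50) ** 2
            + (params["mutation_rate"] - 0.1) ** 2)

best_params, best_score = tune_one_at_a_time(
    toy_score,
    defaults={"population_size": 10, "mutation_rate": 0.5},
    candidates={
        "population_size": [10, 50, 100],
        "mutation_rate": [0.01, 0.1, 0.5],
    },
)
print(best_params, best_score)
```

This kind of search needs far fewer evaluations than a full grid over all combinations, at the cost of possibly missing interactions between parameters.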
The final parameter values of the deep learning model and the proposed algorithm are presented in Tables 1 and 2.

This section is divided into two subsections. The first compares the performance of the deep learning approach with that of other machine learning models (Sect. 4.1). The second assesses the benefit of incorporating the proposed components on search performance (Sect. 4.2).

In this paper, we have implemented a number of machine learning models and compared their results with those of the deep learning model proposed in this work. We have tested the following methods: XGBoost, random forest, artificial neural network and multivariate regression. The root-mean-squared error (RMSE) is used as the evaluation metric. Table 3 shows the results in terms of RMSE on the NYC Taxi dataset. In the table, the best result is highlighted in bold. From Table 3, it can be seen that our deep prediction model is superior to the other machine learning models in terms of RMSE. The lowest RMSE of 11.01 is achieved by our approach, followed by 21.34 for random forest, 24.06 for XGBoost, 27.19 for multivariate regression and 70.21 for the artificial neural network. This good result can be attributed to the fact that deep learning considers all input features and then utilises the best ones through its internal learning process. In contrast, the other machine learning methods require a feature engineering step to identify the best subset of features, which is very time-consuming and needs a human expert.

This section evaluates the effectiveness of the machine learning models and the proposed memetic algorithm. To this end, the genetic algorithm (GA) and the memetic algorithm (MA) are tested with different machine learning models and compared against each other.
These are: GA with XGBoost, GA with random forest, GA with artificial neural network, GA with multivariate regression, GA with deep prediction model, MA with XGBoost, MA with random forest, MA with artificial neural network, MA with multivariate regression and MA with deep prediction model. The main aim is to evaluate the benefit of using our deep prediction model and the local search algorithm within MA. To ensure a fair comparison between the compared algorithms, the initial solution, number of runs, stopping condition and computer resources are the same for all instances. All algorithms were executed for 30 independent runs on every instance. We used 4 instances with different numbers of locations, ranging between 500 and 2000, which can be seen as small, medium, large and very large.

The computational comparisons of the above algorithms are presented in Tables 4 and 5. The comparison is in terms of the best cost (travel time) and the standard deviation (std) for each number of locations, where lower is better. The best results are highlighted in bold. A close scrutiny of Tables 4 and 5 reveals that the proposed MA with the deep learning approach outperforms the other algorithms on all instances. From Tables 4 and 5, we can make the following observations:
- GA with the deep prediction model obtained better results than GA with all other prediction models across all instances.

This justifies the benefit of using the deep learning approach to predict the travel time and the proposed memetic algorithm to exploit the current search space around the given solution.

In this study, we proposed a novel two-stage approach for finding the shortest route under a dynamic environment where travel time changes. First, we developed a deep learning method to predict the travel time between the origin and destination.
We also added the weather conditions to the input so that our approach can predict the travel time more accurately. Second, a customised memetic algorithm was developed to find the shortest route using the predicted travel time. The effectiveness of the proposed method has been evaluated on the New York City taxi dataset. The obtained results lead to our conclusion that the proposed two-stage shortest route method is effective compared with conventional methods, and that both the proposed deep prediction model and the memetic algorithm are beneficial.

References.
2. TensorFlow: large-scale machine learning on heterogeneous systems
3. DeepIST: deep image-based spatio-temporal network for travel time estimation
4. A comparative analysis of selection schemes used in genetic algorithms
5. Genetic algorithms and machine learning
6. Genetic algorithms
7. Travel time estimation without road networks: an urban morphological layout representation approach
8. Multi-task representation learning for travel time estimation
9. On evolution, search, optimization, genetic algorithms and martial arts: towards memetic algorithms. Caltech Concurrent Comput. Program, C3P Rep
10. Memetic algorithms and memetic computing optimization: a literature review
11. Solving multiple travelling officers problem with population-based optimization algorithms
12. Predicting imbalanced taxi and passenger queue contexts in airport
13. Queue context prediction using taxi driver knowledge
14. Coact: a framework for context-aware trip planning using active transport
15. Using big spatial data for planning user mobility
16. CAPRA: a contour-based accessible path routing algorithm
17. Wait time prediction for airport taxis using weighted nearest neighbor regression
18. Optimising deep belief networks by hyper-heuristic approach
19. An evolutionary hyper-heuristic to optimise deep belief networks for image reconstruction
20. Evolutionary model construction for electricity consumption prediction
21. Multi-resolution selective ensemble extreme learning machine for electricity consumption prediction
22. When will you arrive? Estimating travel time based on deep neural networks. In: Thirty-Second AAAI Conference on Artificial Intelligence
23. Ridesourcing systems: a framework and review
24. Learning to estimate the travel time

Acknowledgements. This work is supported by the Smarter Cities and Suburbs Grant from the Australian Government and the Mornington Peninsula Shire Council.