key: cord-1020700-5artdajk authors: Ghaithan, Ahmed M.; Alarfaj, Ibrahim; Mohammed, Awsan; Qasim, Osaid title: A neural network-based model for estimating the delivery time of oxygen gas cylinders during COVID-19 pandemic date: 2022-03-14 journal: Neural Comput Appl DOI: 10.1007/s00521-022-07037-3 sha: e92a07286945831b3ccbe2457730a8cdce8fe1a2 doc_id: 1020700 cord_uid: 5artdajk

Since the World Health Organization declared COVID-19 a pandemic in March 2020, 169,682,828 cases have been reported worldwide, with 151,416,570 recoveries and 3,526,647 deaths as of May 28, 2021. Global demand for oxygen gas cylinders is booming because they are needed for the intensive care of COVID-19 patients. It is therefore critical for hospitals to know exactly when oxygen gas cylinders will arrive, since this helps minimize the fatality rate. In this regard, this paper proposes a Multilayer Perceptron Neural Network-based model to predict the delivery time of oxygen gas cylinders, using real-life logistics data from a company that delivers oxygen gas cylinders to all cities in Saudi Arabia. The Multilayer Perceptron Neural Network is also benchmarked against a support vector machine and multiple linear regression. Although all the considered models can provide accurate predictions, the findings indicate that the support vector machine and the proposed Multilayer Perceptron Neural Network model provide better prediction results. The analysis followed a methodology to identify the factors with the highest impact and build a neural network model. The model was further optimized to identify the best order and select the best subset of input variables. The analysis showed that the neural network model can be used effectively to estimate the delivery time of oxygen gas cylinders. Comparing the predicted values with the actual values showed that the model predicts with high accuracy.
The COVID-19 pandemic underscored the importance of managing the health supply chain effectively, especially medical logistics distribution. It has exposed gaps in this field that need to be tackled by researchers. The logistics of medical products are characterized by unique challenges because their delivery and storage require special regulations and licenses. For instance, medical gases are commonly transported in gas cylinders within a closed-loop logistics distribution, since the cylinders are reused and refilled [1]. Moreover, gas cylinders must be regularly inspected before delivery and handled carefully because of their size, weight, and associated hazards [2]. Furthermore, the condition and location of storage facilities must be well designed to ensure safety [3]. These characteristics make the delivery of medical products a difficult task, especially during the COVID-19 pandemic. Consequently, there is a need for studies that focus on the logistics of medical gases stored in cylinders. One of the main gases transported in cylinders is oxygen, for which demand is continuously increasing. The COVID-19 pandemic has accelerated the growth of global demand for Oxygen Gas Cylinders (OGCs) for intensive care. Accurate prediction of medical oxygen gas cylinder delivery is critical for hospitals. Hospitals must keep an adequate supply of oxygen on hand, with the quantity determined by delivery estimates. In general, a fourteen-day supply is the minimum quantity to be held, and this quantity should be buffered further in case of logistics problems and delivery time inaccuracies [2]. Due to the increase in oxygen demand during the COVID-19 pandemic, the World Health Organization (WHO) released a special guide to address oxygen distribution. According to the guide, proper management of oxygen logistics is the foundation for managing this crisis and similar crises [4].
To address the surge in OGC demand and overcome logistics challenges, a robust design of the OGC supply chain is important. This supply chain should consist of specialized production facilities, filling centers, and distribution centers to deliver the cylinders to end users. Moreover, several unique factors should be analyzed to improve the performance of oxygen gas cylinder supply chains, including gas molecules, filling tools, and plant composition [5]. Therefore, a sophisticated data-driven model is required to identify the impact of these factors and hence recommend ways to manage them and improve overall performance. The literature review reveals a lack of data-driven models for predicting OGC delivery time, both internationally and locally. Moreover, there is a need to identify the factors influencing the delivery time of OGCs, rank the most influential factors, and develop a forecasting model that considers the identified factors. Therefore, to fill this gap and contribute to containing and controlling the COVID-19 pandemic, the main contributions of this paper are summarized as follows:
• Designing and implementing a neural network-based model that provides state-of-the-art results in OGC logistics. The proposed model extends the applicability and suitability of neural networks to predicting the delivery time of such an important medical product, especially during the COVID-19 pandemic.
• Using two other state-of-the-art algorithms, namely multiple linear regression and support vector machine, for performance comparison.
• Identifying the factors influencing the logistics distribution of OGCs. The data are taken from a leading distribution company whose sales network is spread all over the country.
• Structuring and designing the data set required to train, validate, and evaluate the proposed model to ensure its accuracy and effectiveness.
Furthermore, a sensitivity analysis is conducted to investigate the impact of changing the input variables on the behavior of the model. The developed model offers a simple and practical tool for logistics service providers and companies that supply medical resources. It is also useful for hospitals, assisting them in planning their held inventory of OGCs. In the next section, the most relevant published papers are reviewed.

Several stochastic variables influence logistics, transportation, and supply chain management problems. The literature shows that cycle time prediction in logistics can help identify the factors causing delays, and hence mitigate them. Accordingly, many studies have explored methodologies to analyze those variables and predict various outputs, such as delivery time and cost. Several studies used classical methods such as heuristics [6], partial least squares regression [7], linear regression [8], support vector regression [9], and logistic regression [10]. More recently, however, machine learning has been used extensively in the literature for prediction problems. In the medical field, accurate prediction of delivery times is critical since it enables hospitals to effectively plan the availability of vital materials, which impacts people's lives. The delivery of OGCs is especially important, since they are in high demand and require special handling. Several studies attempted to optimize medical logistics and the logistics of oxygen cylinders. These include studies to predict medical demand and general medical deliveries. Moreover, oxygen cylinder delivery was studied using mathematical programming and simulation [11, 12]. Machine learning algorithms are among the most common techniques used in supply chain prediction problems.
In a full review of how Artificial Intelligence (AI) is used in logistics, [13] discussed how prediction in machine learning can enable a smart logistics system by predicting failures and maintenance requirements. Another comprehensive review was conducted by [14], which explored the use of several machine learning techniques such as neural networks, decision trees, random forests, and support vector machines in supply chain optimization. Carbonneau et al. [15] also conducted a comparison study of neural networks, support vector machines, and traditional prediction methods to predict demand, focusing on the bullwhip effect. Knoll et al. [16] proposed a model to predict planning tasks in inbound logistics using machine learning. Machine learning prediction can also be used to study stochastic aspects of logistics, as examined by [17]. Some research articles used machine learning to approach more specific logistics problems. For instance, [8] focused on predicting semiconductor manufacturing cycle times. Moreover, [18] tried different machine learning techniques to predict the lead time for Just-In-Time operations, using restaurants as a case study. Likewise, [19] used machine learning to predict profits in a vendor-managed inventory system and demonstrated that a combination of machine learning and genetic algorithms assists in optimizing inventory replenishment. In a flowshop environment, machine learning can predict manufacturing lead-time, as explored in [20]. In addition, machine learning can be combined with simulation and optimization to assist in supply chain decisions such as routing and inventory levels [21]. Neural networks are among the most widely used machine learning algorithms in the literature for supply chain prediction. Noorul Haq and Kannan [22] developed a neural network model to forecast demand, then studied the impact of the method used on the overall cost of the distribution inventory.
A similar result was achieved by [23] , which used deep neural networks to forecast inventory needs. Neural networks were also shown to provide insight on evaluating different locations of distribution centers [24, 25] . Chiu and Lin [26] used neural networks for supply chain collaborative planning and order fulfillment. Asadzadeh et al. [27] combined multi-perceptron neural networks with fuzzy and linear regressions to develop a new prediction algorithm for manufacturing lead-time. This fuzzy-neural approach was also used by [28] to predict the lead-time of semiconductor manufacturing and resulted in high accuracy. Wang and Jiang [29] used Radio Frequency Identification (RFID) data to feed neural networks and predict order time completion. Neural networks are often paired with an optimization algorithm to identify the best parameters [22, 24, 25] . Delivery time was the focus of many prediction studies. For instance, [30] used machine learning for predicting delivery times, taking the postal service as a case study. The authors compared several boosting algorithms and showed that the developed algorithms provide accurate predictions with short running times. Liu et al. [31] used random forests and quantile regression forests to predict delivery. Comparably, [32] used quantile regression forest and regression tree to determine arrival times of delivery. Another study considered street blockage in their input to train the models predicting delivery times. The authors experimented on neural networks and support vector machines, and both provided accurate results comparing with actual delivery times [33] . Liao and Wang [34] used neural networks to predict delivery time of an automatic material handling system. They built a simulation model that provided the inputs for the neural network model, which improved prediction results. Another input that can impact delivery is marketing decisions, as illustrated by [35] . 
The proposed neural network model revealed that delivery is most affected by seller characteristics. An important benefit of predicting delivery is the ability to assign due dates, as demonstrated by [36]. Similarly, [37] studied the package delivery system to assign the Estimated Time of Arrival (ETA). The authors used a ''spatial-temporal sequential neural network'' considering numerous input factors, namely last route sequence, delivery pattern consistency, and the sequence of delivery. Liu et al. [38] studied the food industry to analyze how arrival time prediction can help optimize delivery routes. Also for the food industry, [39] estimated delivery times of meals using gradient boosting decision trees and showed how accurate arrival promises improve customer experience. Furthermore, several studies focused on the use of machine learning in medical predictions, since healthcare is a critical field. Buntak et al. [40] highlighted the importance of logistics management in healthcare in general and reviewed the state of research in the medical field. Their review demonstrated that there is a need to develop mathematical models to analyze and optimize medical logistics. Ngiam and Khor [41] explored the benefits and difficulties of utilizing such models, specifically machine learning, to analyze big data of medical deliveries. Accurate prediction of medical deliveries can help in hospital planning and scheduling. However, since human lives are at stake, data need to be carefully preprocessed and models need to be iteratively refined to avoid prediction errors. Moreover, using machine learning may raise concerns about liability for any potential errors [41]. Machine learning is nevertheless a common approach to medical prediction and forecasting. Merkuryeva et al. [42] discussed the use of regression algorithms to forecast pharmaceutical demand.
Xu and Tan [43] demonstrated in their study how machine learning models can improve demand forecasting in the medical field and shed light on the importance of data preparation. For delivery prediction, [44] used several neural network techniques to predict drug delivery. The multilayer perceptron (MLP) outperformed the radial basis function network (RBFN) and the generalized regression neural network (GRNN) in drug delivery prediction. Similarly, [5] developed a comprehensive neural network model that predicts design, discovery, delivery, and disposition in the drug industry. The COVID-19 pandemic increased the importance of effective medical supply chain management. Sharma et al. [45] highlighted the challenges caused by the COVID-19 pandemic in medical supply chains. Demand for medical materials has significantly increased, while distribution channels became difficult to reach due to lockdowns. The authors also explored the dependence of Indian supply chains on China, which amplified the challenges related to logistics and distribution. The study proposed supply chain diversification, inventory buffers, localization, and a focus on risk management [45]. Ivanov [46] developed a simulation model to predict the long-term impact of the pandemic on supply chains. The model's results showed that the closure of facilities due to the pandemic can lead to significant delays and losses. It also predicts that the use of machine learning can help businesses reduce the impact of such pandemics by providing more planning and prediction insight [46]. Bhaskar et al. [47] proposed an integrated framework to build a robust supply chain, in which machine learning is used to predict customer demand. Several studies tackled logistics problems specifically for gas cylinders, either medical or for other uses. Singh et al. [12] designed the logistics network for gas cylinders using mixed-integer programming, considering overall supply chain transportation costs.
Carrasco-Gallego et al. [1] also designed the gas cylinder closed-loop supply chain, but with consideration of the reuse of the cylinders. Costantino et al. [11] developed a simulation model to analyze the impact of opening and closing plants on overall performance while experimenting with oxygen cylinder logistics. The simulation model enabled optimizing inventory levels in each scenario of which plants are open. Pathak et al. [48] obtained a US patent for their design of a gas cylinder supply chain distribution network. The design considered the location of distribution hubs, filling plants, and production facilities in relation to customer sites. The invention includes a computer system that recommends a design for the supply chain distribution network for gas cylinders, based on a list of input data. The network is optimized using a two-step model. However, the patent did not consider delivery time prediction. Despite the importance of oxygen, particularly now because of the COVID-19 pandemic, the above literature reveals a lack of data-driven models for predicting oxygen cylinder delivery time. Therefore, this critical field needs more attention and the development of simple and effective data-driven models for forecasting logistics distribution that can be easily standardized and employed. In this regard, this paper proposes a data-driven neural network model to estimate oxygen gas cylinder delivery time. It explores the literature and uses a real-life application to identify and select applicable features associated with the commodity. In this study, real delivery time data are used to develop a data-driven model based on a multilayer perceptron (MLP) neural network for predicting the delivery time of a vital medical item during COVID-19, namely OGCs. In addition, the proposed neural network model is compared with two other methods, namely support vector machine and multiple linear regression.
This work is useful for logistics service providers in general, and more so for companies dealing with OGCs and closed-loop supply chains. It is also useful for hospitals, assisting them in their planning, especially during health crises such as COVID-19 that require oxygen gas.

Different input parameters are identified and described to estimate the delivery time of OGCs. The correlation between the input parameters and the response variable is also examined. The development of the artificial neural network model involved five main phases, as shown in Fig. 1. These phases are divided into the following steps:
1. Conduct a comprehensive literature review to identify factors that affect OGC logistics. Additional general logistics factors suggested by experts were also considered. This list of factors is used as a starting point, to be filtered in step 2.
2. Test the initial factors against real-life practice. To do so, a logistics company that delivers OGCs was identified. Meetings were held to draw on employee expertise and select the factors that have the most impact in practice from those identified through the literature review in step 1.
3. Collect practical data for the selected factors from the identified logistics company. The data were collected from the operational history of the year 2019. These data are used to build and test the model.
4. Analyze and process the data to eliminate outliers and ensure that the model uses reliable data. This analysis provided insight that allowed effective model building.
5. Identify the factors that have the most impact on delivery time, as found from the data analysis in step 4. Based on this finding, improvements were recommended to reduce OGC delivery time.
6. Build a neural network model that consists of a scaling layer, a feedforward multilayer perceptron (MLP) neural network, and an unscaling layer. The scaling layer normalizes the data so that the input factors have the same range. The perceptron layers use backpropagation to enable the model to learn. The activation function of the perceptron layers was set as linear for the first and last layers and hyperbolic tangent for the layers in the middle. The unscaling layer resets the ranges to the actual scale.
7. Identify the model training strategy, which covers the training-testing split, the error measurement (loss index), and optimal model selection. First, the data were split so that 60% is used for model training, 20% for model selection, and 20% for model testing. Second, the loss index was measured using the Mean Squared Error (MSE). For model optimization, the quasi-Newton method was used.
8. Select the optimal structure of the neural network that minimizes the loss. This included identifying the optimal order and selecting the optimal input variables.

This methodology enabled the achievement of the study objectives, as will be explained in the model development section. In this paper, Neural Designer software is used to develop the neural network model [48]. Several neural network architectures were trained and examined to find the model that achieves the best results. For the oxygen cylinder delivery problem, the ''output variable'' is the total hours spent in delivery, and the factors impacting the delivery time are defined as ''input variables''.

The Multilayer Perceptron is a type of feedforward Artificial Neural Network that is widely used in prediction problems. It can learn complicated nonlinear problems and generate accurate outputs for new, previously unseen inputs [49]. A Multilayer Perceptron neural network consists of multiple interconnected neurons that are activated by activation functions. The most suitable activation function should be selected to ensure a highly accurate model. Selecting the appropriate activation function requires understanding its mechanism and its implications.
The most common activation functions include the linear, sigmoid, hyperbolic tangent, and Rectified Linear Unit functions. With a linear activation function, the activation is proportional to the input, which effectively turns the neural network into a single layer. The sigmoid function has an S-shaped curve ranging from 0 to 1. With these characteristics, sigmoid activation functions are widely used when the predicted output ranges from 0 to 1 [50]. The sigmoid activation function is given by:

$$f(z) = \frac{1}{1 + e^{-z}}, \qquad z = \sum_{n} W_n X_n + b_n \tag{1}$$

where $X_n$ represents multiple inputs, each with its own weight $W_n$, and $b_n$ represents the bias included to allow shifting the activation function. The hyperbolic tangent function, like the sigmoid function, has an S-shaped curve but a wider range, from -1 to 1, which gives it more capabilities [50]. The hyperbolic tangent activation function is shown below:

$$\tanh(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}} \tag{2}$$

Figure 2 shows the common structure of the Multilayer Perceptron and the flow of data. A Multilayer Perceptron has three main kinds of layers: the input layer, the hidden layers, and the output layer. The basic function of the input layer is to feed the inputs forward through the network in the direction of the output. The output of the input layer feeds the hidden layers, which transform the data according to the activation function to generate an output that feeds the output layer. Finally, the output layer is also activated to generate the desired output. The weight of each input to each neuron and layer is calculated during the training phase [50]. During training, the weights are continuously adjusted until the Multilayer Perceptron reaches the optimum mapping between inputs and output. The learning is supervised and is carried out through the backpropagation algorithm, which minimizes the mean squared error [50].
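As an illustration of the two activation functions described above, the following minimal Python sketch evaluates a single neuron as a weighted sum of inputs plus a bias, passed through either the sigmoid or the hyperbolic tangent. The function names and the numeric inputs, weights, and bias are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic sigmoid, mapping any real z into the interval (0, 1)."""
    # Branch on the sign of z so that math.exp never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def tanh(z: float) -> float:
    """Hyperbolic tangent, mapping any real z into the interval (-1, 1)."""
    return math.tanh(z)

def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

# Hypothetical inputs, weights, and bias (illustration only).
x = [0.5, -1.0, 2.0]
w = [0.4, 0.3, -0.2]
b = 0.1

print("sigmoid output:", neuron(x, w, b, sigmoid))
print("tanh output:   ", neuron(x, w, b, tanh))
```

The same weighted sum gives an output in (0, 1) under the sigmoid and in (-1, 1) under the hyperbolic tangent, which is the range difference noted in the text.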
On the other hand, the multiple linear regression algorithm uses more than one predictor variable to predict the response variable, y. It models the linear relationship between a single dependent variable and multiple independent variables (i.e., inputs). It can be represented by the following standard form:

$$y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n \tag{3}$$

where $b_0, b_1, b_2, \dots, b_n$ are the model coefficients; each coefficient represents the change in the dependent variable (y) per unit change in the corresponding independent variable x when the other variables are kept constant. The model coefficients are estimated from the input data by the least squares method, which minimizes the difference between the observed and predicted data sets [51]. After the coefficients are estimated, the fitted multiple linear regression equation is used to forecast the response (dependent) variable for any future set of the independent variables $x_i$ (i.e., inputs). Furthermore, the support vector machine (SVM) is a machine learning algorithm developed by [52] for pattern recognition problems. Recently, SVM has been used extensively to solve regression and time series prediction problems [53, 54]. SVM approximates the function in the following form:

$$f(x) = \sum_{i} w_i \varphi_i(x) + b \tag{4}$$

where $\varphi_i(x)$ are the input features, and $w_i$ and $b$ are the support vector machine coefficients, which are estimated according to the structural risk principle, that is, by minimizing an upper bound of the generalization error. In this paper, a set of performance metrics is used to measure the accuracy of the prediction algorithms, namely R-squared, Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and Mean Absolute Error (MAE) [55]. MSE is used as the loss measurement, as shown in the following equation:

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - t_i)^2 \tag{5}$$

where n is the number of data points, $y_i$ is the predicted delivery duration at point i, and $t_i$ is the actual delivery duration at point i.
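The four metrics named above can be computed directly from paired predicted and actual values. The sketch below is a plain-Python illustration, not the implementation used in the study: the function names and the sample delivery durations are hypothetical, and R-squared is computed with the standard coefficient-of-determination formula, which is an assumption since the text does not spell it out.

```python
import math

def mse(predicted, actual):
    """Mean squared error between predictions y_i and targets t_i."""
    return sum((y - t) ** 2 for y, t in zip(predicted, actual)) / len(actual)

def rmse(predicted, actual):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(predicted, actual))

def mae(predicted, actual):
    """Mean absolute error between predictions and targets."""
    return sum(abs(y - t) for y, t in zip(predicted, actual)) / len(actual)

def r_squared(predicted, actual):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_t = sum(actual) / len(actual)
    ss_res = sum((t - y) ** 2 for y, t in zip(predicted, actual))
    ss_tot = sum((t - mean_t) ** 2 for t in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical delivery durations in hours (illustration only).
t_actual = [10.0, 12.0, 8.0, 15.0]
y_pred = [11.0, 12.0, 7.0, 13.0]

print("MSE :", mse(y_pred, t_actual))
print("RMSE:", rmse(y_pred, t_actual))
print("MAE :", mae(y_pred, t_actual))
print("R2  :", r_squared(y_pred, t_actual))
```

Because RMSE and MAE are expressed in the same unit as the target (hours here), they are usually easier to interpret than MSE when reporting delivery-time errors.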
RMSE is the square root of the average of the squared errors in a set of predicted values, without considering their direction. It measures how widely the residuals are spread out. RMSE is expressed in Eq. 6:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - t_i)^2} \tag{6}$$

MAE is a measure of the absolute errors between the actual and predicted values. It can be expressed by Eq. 7:

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - t_i \right| \tag{7}$$

In the following sub-sections, data collection, analysis, and model development are described in detail.

The first step is to identify the factors affecting the delivery time of OGCs in order to develop a prediction model of delivery time. A review of previous research in this field, together with input from knowledgeable practitioners and experts, is used to define the most relevant parameters. First, the literature was reviewed to identify factors used in cylinder delivery and in closed-loop supply chains. Table 1 summarizes the initial set of factors collected from the literature. These factors were then validated with input from experts to find the factors that are used in practice. Additional factors were included, such as the number of trips between cities, the number of hospitals in the trip, total distance, and fuel consumption. To ensure the practicality of the model, actual data were collected from a logistics company that delivers oxygen cylinders to hospitals located in all cities of Saudi Arabia. The data were obtained from historical records of company operations. The company has a central warehouse located in Riyadh, Saudi Arabia. It delivers OGCs from the central warehouse to several hospitals in dispersed cities within Saudi Arabia. Before the COVID-19 pandemic, the average delivery time was within 12 h, which was achievable and acceptable to the company. After the onset of the pandemic, demand for OGCs increased sharply, leaving the company unable to cover all demand and causing logistics problems and delays in delivery time.
The management had to act effectively to resolve this problem and minimize the delivery time as much as possible in order to save lives and minimize the fatality rate. The company has to cope with a new delivery time target that is controlled by the government and governed by contracts, under which any delay is subject to a large penalty. In coordination with the company management, a team was formed to study the problem thoroughly and collect the factors that may lead to a long delivery time. Several potential factors were initially identified. Based on analysis and the judgment of subject matter experts, the factors were filtered to include location, cylinder quantity, truck number, driver name, number of trips between cities, number of hospitals in the trip, total distance, and fuel consumption, which are all indicators of the total delivery time of oxygen cylinders. To perform a deep analysis of the identified problem, the artificial neural network method was proposed to develop a data-driven model that predicts the delivery time for OGC orders. The input variables are listed and described in Table 2. The output variable reflects the ultimate objective of the logistics company, which is to minimize the total hours spent on a trip to deliver OGCs to multiple destinations. A sample of the data set of 420 records is presented in Table 3. The data were split randomly so that 60% was used for training the model, 20% for selecting the best model architecture, and 20% for testing. Moreover, additional new data points were collected to validate the model. To achieve meaningful results, the quality of the data is assessed by analyzing the correlations among the input and output variables. The maximum, minimum, mean, and standard deviation values used to scale the inputs are shown in Table 4. The data appear to be distributed uniformly, which supports a high level of accuracy for the developed model.
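The random 60/20/20 split described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `split_indices`, its parameters, and the fixed seed are assumptions introduced for reproducibility of the example, and the study itself performed the split within its modeling software.

```python
import random

def split_indices(n_samples, train_frac=0.6, selection_frac=0.2, seed=0):
    """Randomly partition sample indices into training, selection, and testing sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded shuffle for repeatability
    n_train = round(train_frac * n_samples)
    n_selection = round(selection_frac * n_samples)
    train = indices[:n_train]
    selection = indices[n_train:n_train + n_selection]
    test = indices[n_train + n_selection:]  # whatever remains
    return train, selection, test

# 420 records, as in the data set described in the text.
train_idx, selection_idx, test_idx = split_indices(420)
print(len(train_idx), len(selection_idx), len(test_idx))  # 252 84 84
```

Shuffling the indices rather than the records keeps the original data untouched and makes it easy to reuse the same partition across the compared models.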
Table 1 Initial set of factors collected from the literature (factors, followed by the source reference)
Location [39]
Transport requirements and material handling resources [34]
Delivery sequence [37]
Quality, promotion, and price [35]
Quantity [16]
Day of week, customer name/number, and quantity [32]
New/Used cylinder [1]
Inventory levels [56]
Customer class, location, and price [43]
Location, contract versus spot, number of drops, weather, tenure, origin facility, and driver [10]
Number of cities [30]
Priority, status of operation, equipment [20]

The factors that influence the delivery time of OGCs are defined as inputs, whereas the delivery time is identified as the output. A pairwise comparison was performed between input variables to identify correlations between them. A full positive relationship is given a value of one, no relationship has a value of zero, and a full negative relationship has a value of negative one. This means that the closer the value is to zero, the weaker the relationship. As shown in Table 5, some input variables show significant correlations while others show no correlation. The highest correlation of 0.981 is between distance traveled and fuel consumption, while the lowest is 0, which occurs between several input variables. These correlations present several interesting observations. First, location has no correlation with the number of drops or the distance traveled, which indicates that serving distant customers is not a concern, but the company needs to optimize its routes if it plans to reduce total distance. Second, truck number does not correlate with the number of trips and has a low correlation with the number of drops and distance traveled, which leads us to infer that the company did not consider truck condition in its logistics planning. Third, drivers appear to use the same truck most of the time, since the correlation between truck number and driver name has a high value of 0.972.
Also, these drivers and trucks mostly go to the same locations, as inferred by the high correlations between location and truck number and between location and driver name. Finally, cylinder quantity has a high negative correlation with the number of drops. This is counterintuitive, but it may indicate that only a few customers required a high number of cylinders, while others order only a few cylinders. Customers that order a few cylinders add to the delivery time while adding little value to overall sales. This finding can help logistics companies to focus on customers that provide the highest sales value. Based on the correlations analysis, the variables ''Fuel consumption'' and ''Driver name'' were removed from the model, since they are represented by ''Distance traveled'' and ''Truck number'', respectively. The output versus input correlations analysis helps to investigate the correlation between the target, which is delivery time in hours, and the six remaining input variables. An absolute value that is close to one indicates a strong correlation between the input and the target, and an absolute value near zero indicates a weak correlation. As shown in Fig. 3 , the highest contributor to high delivery hours is the distance traveled. On the other hand, location has no impact on delivery time. This low correlation between location and delivery time is interesting. It can be explained by the input relationships since the location has no correlation with distance. This is probably because the final destination before returning to Riyadh city (central warehouse) may not be an indicator of total distance, which is more related to the number of trips and the number of drops, meaning that a single drop to a distant location can take much less time than several drops to a nearby final destination. The truck number, cylinder quantity, and the number of drops have medium correlations to delivery time. 
Finally, the number of trips and location have the lowest correlations. These correlations are visualized in Fig. 3 in line with the Pareto principle, whereby focusing on 20% of the causes can address 80% of the improvement opportunities. In this case, the chart shows that reducing the distance traveled can significantly decrease delivery time, which emphasizes the need for route optimization algorithms. As a matter of fact, the two most correlated variables represent 64% of the correlations, while the other input variables combined account for only 36% of the correlations. Moreover, the relationships between delivery time and the numerical input variables are represented using the scatter plots shown in Fig. 4. Scatter plots are a simple and effective tool to show the strength of the relationship between variables and whether the relationship is linear or non-linear. As can be seen in Fig. 4, distance traveled clearly shows a strong positive linear relationship with total delivery time in hours. Also, the number of drops shows a weak positive correlation. Finally, cylinder quantity and the number of trips show almost no relationship with delivery time. The results of the scatter plots align with those of the Pareto chart in Fig. 3. From Fig. 4, the type of relationship between the output and distance traveled and number of drops is linear; between the output and truck number and location it is logistic; between the output and cylinder quantity it is power; and between the output and number of trips it is exponential.

In this section, the MLP model is developed consisting of three functions: a scaling function layer, multi-perceptron layers, and an unscaling function layer. The scaling function layer normalizes the inputs to a consistent range using the Minimum-Maximum method. The unscaling function layer returns the data to its actual range.
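The scaling and unscaling layers described above can be illustrated with a minimal sketch of the Minimum-Maximum method. The target range of -1 to 1, the function names, and the sample distances are assumptions made for illustration only; the paper does not state the range or the implementation that was used.

```python
from typing import List, Sequence, Tuple


def fit_min_max(values: Sequence[float]) -> Tuple[float, float]:
    """Return the (minimum, maximum) observed in the training data."""
    return min(values), max(values)


def scale(values: Sequence[float], bounds: Tuple[float, float],
          new_range: Tuple[float, float] = (-1.0, 1.0)) -> List[float]:
    """Map values linearly from [min, max] to new_range (assumed -1..1)."""
    lo, hi = bounds
    a, b = new_range
    if hi == lo:
        # A constant column carries no information; map it to the lower bound.
        return [a for _ in values]
    return [a + (v - lo) * (b - a) / (hi - lo) for v in values]


def unscale(scaled: Sequence[float], bounds: Tuple[float, float],
            new_range: Tuple[float, float] = (-1.0, 1.0)) -> List[float]:
    """Invert scale(): map values from new_range back to [min, max]."""
    lo, hi = bounds
    a, b = new_range
    return [lo + (s - a) * (hi - lo) / (b - a) for s in scaled]


# Hypothetical distances in km, not taken from the paper's data set.
distance_km = [120.0, 450.0, 800.0, 1500.0]
bounds = fit_min_max(distance_km)
scaled = scale(distance_km, bounds)
restored = unscale(scaled, bounds)
```

The bounds are fitted once on the training data and reused for unscaling, so that predictions are reported in the original units (hours, in the case of delivery time).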
The multi-perceptron layers are used to enable learning. The initial model contains four input multi-perceptron layers and one output layer. The first and fourth perceptron layers use the linear activation function, and the second and third layers use the hyperbolic tangent (tanh) activation function, a sigmoid-shaped function whose output ranges from -1 to 1. The initial neural network structure is portrayed in Fig. 5. There are six input variables represented as black circles and the scaling layer is denoted as Table 6. Since the location and truck number are categorical, each value was set as a binary variable, which led to 48 input variables.

After constructing the initial neural network, the network is trained to learn the input weights and minimize the loss or error. In this paper, the training process was performed in two stages. First, the initial model was used to get the initial performance. Second, training was performed on the optimized model to ensure that it significantly improves the performance. To train the initial model, 60% of the data set was randomly selected from the overall data. The training set was used to produce weights for each input and to measure prediction performance. Moreover, several parameters were experimented with to find the set of parameters that provides the minimum loss. The parameter settings of the model are explained in detail in the next subsection.

The aim of training the neural network is to identify the weights and biases that minimize the loss. Thus, the quasi-Newton method was used to optimize the model and minimize the loss. Rather than computing the Hessian (the matrix of second derivatives of the loss function) exactly, it simplifies the full Newton method by approximating the inverse Hessian at each iteration of the optimization algorithm. The loss minimization at each epoch is depicted in Fig. 6. The blue line represents the training error, and the orange line represents the selection error.
A clear drop in error can be attributed to applying the quasi-Newton method for optimization. After 298 epochs, the training error improved significantly from 0.252439 to 0.0137193 (95% improvement), and the selection error improved from 0.410997 to 0.016316 (96% improvement). The final results of the model after optimization are presented in Table 7.

At this stage of model development, the best structure of the neural network is considered. The goal is to find the structure that avoids overfitting and underfitting and hence minimizes the error in predicting new data. For this purpose, 20% of the data were randomly selected for model selection. The selected model balances model complexity against the quality of the data. The properties considered in the model selection are the order of the model and the input variables used. Optimizing the order of the model means optimizing the number of hidden layers. On the other hand, input variable selection means finding the subset of inputs that provides the best prediction results. To find the order with the minimum loss, an incremental order algorithm was used. This algorithm starts with the minimum number of neurons and keeps adding to the complexity until the order that provides the minimum loss is achieved. For the purpose of this study, the minimum number of perceptron layers evaluated is 1 and the maximum number is 10, with 1 hidden perceptron layer added in each iteration. The optimal number of neurons was 10. The growing inputs algorithm was used to find the best combination of input variables. This algorithm starts with the input that is most correlated with the output, calculates the error associated with that input variable alone, and keeps adding the next most correlated input variable until the error increases. The optimal number of input variables is 5 out of the 6 variables identified previously, with distance traveled not being used.
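The growing inputs procedure described above can be sketched as a greedy loop. This is an illustrative sketch, not the implementation used in the study: the function names, the toy data, and the `toy_selection_error` callback are hypothetical. In practice the callback would train the network on the given inputs and return its selection error.

```python
from math import sqrt
from typing import Callable, Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def growing_inputs(inputs: Dict[str, Sequence[float]],
                   target: Sequence[float],
                   selection_error: Callable[[List[str]], float]) -> List[str]:
    """Add inputs in order of |correlation| with the target; stop as soon as
    adding the next input no longer lowers the selection error."""
    ranked = sorted(inputs,
                    key=lambda name: abs(pearson(inputs[name], target)),
                    reverse=True)
    selected: List[str] = []
    best_error = float("inf")
    for name in ranked:
        trial = selected + [name]
        error = selection_error(trial)
        if error >= best_error:
            break  # error stopped improving: keep the previous subset
        selected, best_error = trial, error
    return selected


# Toy data and a made-up error table, for illustration only.
toy_target = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_inputs = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],  # perfectly correlated with the target
    "b": [2.0, 1.0, 4.0, 3.0, 5.0],  # partially correlated
    "c": [5.0, 5.0, 5.0, 5.0, 5.0],  # constant, uncorrelated
}


def toy_selection_error(names: List[str]) -> float:
    # Hypothetical stand-in for "train the model and measure selection error".
    return {1: 0.5, 2: 0.3, 3: 0.4}[len(names)]


chosen = growing_inputs(toy_inputs, toy_target, toy_selection_error)
```

Because the loop is greedy, it evaluates at most one model per input variable, which keeps the search cheap at the cost of possibly missing a better non-nested subset.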
The selected model is then used for model testing and validation, as discussed in the next section.

After finding the best structure that avoids overfitting and underfitting in the selection phase, the selected model is used for testing and validation. In the testing phase, the remaining 20% of the data was introduced, and the actual output of that data set was compared to the predicted outputs produced based on the weights identified in the training phase. The coefficient of determination for the MLP neural network is calculated by fitting a linear regression between the predicted output and the actual delivery time, as shown in Fig. 7. The results are close to the best-fit line; therefore, it is concluded that the model performed well on the testing data set, providing satisfactory results with an R² of 92.44%. As the testing phase provided satisfactory results, the proposed neural network model moves to the so-called deployment phase. The concept of deployment or validation refers to the use of the neural network model to predict new and unseen data. In the validation phase, the model is tested on an additional data set of 30 records to ensure that it provides accurate results. This data set is completely unknown to the model. Figure 8 compares the predicted output from the neural network model with the actual delivery time for validation. The MLP neural network algorithm shows high accuracy in the validation phase with a coefficient of determination of 94.48%. In the next section, the accuracy of the MLP model is compared with the support vector machine and multiple linear regression algorithms for the training and testing data sets.

In this section, the performance of the developed MLP neural network model is benchmarked against two other state-of-the-art prediction algorithms, namely support vector machine (SVM) and multiple linear regression (MLR).
The accuracy of the three algorithms is compared on the basis of four goodness-of-fit measures: mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), and coefficient of determination (R²). To compare the performance of the three algorithms, six data sets are selected randomly from the original data. Thus, the experimental procedure is repeated six times, and the mean values of the four performance measures are computed for the training and testing data sets as illustrated in Table 8. In general, the three algorithms show excellent accuracy for predicting OGC delivery time. However, to support the obtained results, a statistical hypothesis test and the relative percentage deviation index (RPD) are used. The results acquired by the three models are converted into a relative percentage deviation index as follows [57]:

$$\mathrm{RPD} = \frac{\left|\mathrm{Model}_{sol} - \mathrm{Best}_{sol}\right|}{\mathrm{Best}_{sol}} \times 100$$

where $\mathrm{Model}_{sol}$ is the metric value obtained by a given algorithm, and $\mathrm{Best}_{sol}$ is the best solution obtained for the model. A lower value of RPD is preferred. When the confidence intervals overlap, the RPD index indicates that there is no statistically significant difference between the means of the measures. As a result of this analysis, the means and the confidence intervals for the RPD of the three models for the training phase are shown in Fig. 9. With respect to MSE and R², the confidence intervals clearly overlap among the three methods. This indication is also supported by statistical hypothesis tests that report p-values greater than 0.05. Therefore, it is concluded that there is no significant difference between the three methods in the training phase. From Table 8, the averages of R² for the three algorithms are also almost identical. Similarly, low variation in the averages of the mean squared errors was noticed for the three algorithms, with 0.5121, 0.4622 and 0.5687 for the MLP, SVM and MLR, respectively.
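As a hedged illustration of the four goodness-of-fit measures and the RPD index discussed above, the sketch below computes them in plain Python. Two points are assumptions rather than details confirmed by the text: R² is computed as 1 - SS_res/SS_tot, and RPD is taken in the common form |Model_sol - Best_sol| / Best_sol × 100. The function names are hypothetical; the three MSE values are the training averages quoted above.

```python
from math import sqrt
from typing import Dict, Sequence


def regression_metrics(actual: Sequence[float],
                       predicted: Sequence[float]) -> Dict[str, float]:
    """Return MSE, RMSE, MAE and R2 for paired actual/predicted values.
    Assumes the actual values are not all identical (SS_tot > 0)."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    ss_res = sum(e * e for e in errors)
    mse = ss_res / n
    mae = sum(abs(e) for e in errors) / n
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot  # assumed definition of R2
    return {"MSE": mse, "RMSE": sqrt(mse), "MAE": mae, "R2": r2}


def rpd(model_sol: float, best_sol: float) -> float:
    """Relative percentage deviation of a metric from the best value
    (assumed form: absolute difference relative to the best, times 100)."""
    return abs(model_sol - best_sol) / best_sol * 100.0


# Average training MSE values reported in the text for the three models.
training_mse = {"MLP": 0.5121, "SVM": 0.4622, "MLR": 0.5687}
best_mse = min(training_mse.values())  # lower MSE is better
training_rpd = {name: rpd(value, best_mse)
                for name, value in training_mse.items()}
```

For an error measure such as MSE the best solution is the minimum across models, whereas for R² it would be the maximum; the absolute value keeps the index non-negative in both cases, so a lower RPD is always preferred.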
For the testing data set, as shown in Table 8, noticeable differences among the performances of the three methods are observed, with R² of 91.93%, 94.06%, and 90.57% for the MLP, SVM and MLR, respectively. The means and the confidence intervals for the RPD of the three models for the testing phase are shown in Fig. 10. There is only a slight overlap of the confidence intervals among the three methods, and the hypothesis tests report p-values less than 0.05 for both R² and MSE. Therefore, it is concluded that there is a significant difference between the three methods. Specifically, the SVM algorithm achieves the best prediction accuracy, followed by the MLP neural network. The SVM has high generalization capability and avoids local minima due to its strong theoretical background compared with other prediction methods [52]. Moreover, the low values of the mean squared errors for the SVM and MLP algorithms indicate that the two approaches can reveal hidden relationships in the collected data, thereby improving the accuracy of predicting the delivery time of OGCs. Similarly, the low values of the mean absolute errors for the SVM and MLP models indicate that they are able to provide a reliable and consistent level of accuracy. The non-linear relationships between some input variables and the delivery time reduce the prediction accuracy of the MLR model, which records a coefficient of determination of 90.57%. The differences in performance between training and testing for each algorithm depend on the nature, complexity, and ease of the selected data in each phase as well as on the parameter settings of each algorithm [58]. Figures 11, 12, and 13 show the variations of the actual and predicted delivery times in the testing phase for the MLP, SVM, and MLR algorithms, respectively. It can be observed that the predicted values are close to the actual delivery times for the three methods.
However, the SVM and MLP show slightly higher accuracy than MLR. In general, the predicted values for the three methods are very close to the original data, which is in agreement with the coefficients of determination for the three methods in the testing phase.

As a final step in model development, sensitivity analysis was implemented to study the impact of changing one numerical input variable on the prediction results while keeping the other input variables constant. The sensitivity analysis is conducted for the MLP model since it provides a reliable and consistent level of accuracy compared to the other two benchmarked models. Figure 14 shows how changing cylinder quantity, number of trips, number of drops, and distance traveled impacts the total delivery time. The grey point in the graph represents the reference point. Sensitivity analysis provides valuable insight. First, cylinder quantity has a very low impact on delivery time. In fact, the graph shows that delivering up to 200 cylinders can take a similar time to delivering one cylinder, which indicates that adding more cylinders does not significantly increase delivery time. Second, increasing the number of trips between cities leads to an increase in delivery time. However, this increase is not as significant as that caused by increasing the number of drops in the overall trip. This means that regardless of how far the cities are from each other, having more stops has more impact on delivery time. Finally, total distance has the most impact on delivery time, meaning that the total distance of the round trip is the variable that leads to most of the delays. As a final conclusion, sensitivity analysis shows the importance of route optimization in oxygen cylinder logistics.

To summarize, after the MLP neural network model is trained, tested, and validated, it can be used to predict future delivery times for any entity handling the delivery of oxygen cylinders. This can assist both carriers and hospitals in their planning.
Based on the predicted delivery time, carriers can plan their fleets and future deliveries. Hospitals can use the predicted delivery times to plan the quantity of cylinders to store. For this to work, carriers should share the estimate for each delivery to help other entities in the planning of their activities. The proposed neural network model can also be generalized to numerous similar applications. Specifically, logistics networks with similar characteristics, such as the delivery of industrial gases, can use the same model to predict delivery time and analyze the system following the methodology proposed in this study. Moreover, the model can be adjusted for any logistics network based on the features and applicable factors of the studied system.

Demand for OGCs is booming globally due to their requirement for COVID-19 intensive care. This paper developed an effective neural network-based prediction model for the delivery time of OGCs. Additionally, the accuracy of the proposed model was compared to two state-of-the-art prediction algorithms, namely SVM and MLR. The neural network model was constructed and validated based on actual data. A real-life data set from a large company was used to provide results that can be used and replicated in practice. First, the factors affecting oxygen cylinder logistics were identified based on the literature and professional experts. Then, the most accurate neural network model was developed and selected among several neural network models. The correlation between the factors was identified, and sensitivity analysis was performed to study the impact of changing each input variable while the other input variables are held constant. The results showed that although the total distance is the main contributor to delivery delays, location has a low impact on delivery time.
This finding suggests that logistics companies can serve a wide base of customer locations, provided that they optimize their routes to reduce the total distance traveled per trip. The results also revealed the ability of the developed model to predict the delivery time of OGCs. The MLP neural network model achieved a coefficient of determination (R²) of 91.93% when the predicted values were compared to the actual values for the testing data set. The SVM and MLR achieved 94.06% and 90.57%, respectively. The delivery time prediction model developed in this study can help logistics companies to provide more realistic delivery time promises to their clients. The model can also be used for other gas cylinder deliveries by adding factors unique to the studied commodities. The developed model is easy to use and can predict delivery times of oxygen cylinders, which enables logistics companies to give more accurate delivery time estimates and avoid delay penalties. Future studies can experiment on data sources coming from other areas of the world. They can also consider more factors given the different requirements in other environments, such as factors related to geological or regulatory variables. Furthermore, a similar study can be performed on other medical commodities, other gas cylinder logistics, or other means of storing and delivering oxygen. In addition, a study can be performed to optimally design delivery networks for OGCs, with the aim of reducing total delivery time.

A framework for closed-loop supply chains of reusable articles
Medical gases, their storage and delivery
Storage and handling of gas cylinders
World Health Organization (2020) Oxygen sources and distribution for COVID-19 treatment centres: interim guidance
Artificial Neural Network for Drug Design, Delivery and Disposition
A heuristic approach to forecasting the delivery time of major project deliverables
Long leadtime forecasting of U.S. streamflow using partial least squares regression
A data driven cycle time prediction with feature selection in a semiconductor wafer fabrication system
Real-time prediction of order flowtimes using support vector regression
Predicting on-time delivery in the trucking industry
Tronci M. Simulation model of the logistic distribution in a medical oxygen supply chain
Network design for cylinder gas distribution
A review of further directions for artificial intelligence, machine learning, and deep learning in smart logistics
ICRRM 2019 - System reliability, quality control, safety, maintenance and management
Application of machine learning techniques for supply chain demand forecasting
Predicting future inbound logistics processes using machine learning
Machine learning in agent-based stochastic simulation: inferential theory and evaluation in transportation logistics
Predicting order lead time for just in time production system using various machine learning algorithms: a case study
Modeling and optimizing a vendor managed replenishment system using machine learning and genetic algorithms
Lead time prediction using machine learning algorithms: a case study by a semiconductor manufacturer
A novel hybrid artificial intelligence-based decision support framework to predict lead time
Effect of forecasting on the multi-echelon distribution inventory supply chain cost using neural network, genetic algorithm and particle swarm optimisation
Research on commercial logistics inventory forecasting system based on neural network
A model on location decision for distribution centers of emergency food logistics
Logistics distribution center location evaluation based on genetic algorithm and fuzzy neural network
Collaborative supply chain planning using the artificial neural network approach
A neuro-fuzzy-regression algorithm for improved prediction of manufacturing lead time with machine breakdowns
A hybrid fuzzy-neural approach to job completion time prediction in a semiconductor fabrication factory
Deep neural networks based order completion time prediction by using real-time job shop RFID data
Boosting algorithms for delivery time prediction in transportation logistics
Predicting purchase orders delivery times using regression models with dimension reduction
Real-time delivery time forecasting and promising in online retailing: when will your package arrive?
Modeling and prediction of freight delivery for blocked and unblocked street using machine learning techniques
Neural-network-based delivery time estimates for prioritized 300-mm automatic material handling operations
Exploring the relationship between marketing and operations: neural network analysis of marketing decision impacts on delivery performance
Due date assignment using artificial neural networks under different shop floor control strategies
DeepETA: a spatial-temporal sequential neural network model for estimating time of arrival in package delivery system
On-time last-mile delivery: order assignment with travel-time predictors
Supervised learning for arrival time estimations in restaurant meal delivery
Impact of medical logistics on the quality of life of health care users
Big data and machine learning algorithms for health-care delivery
Demand forecasting in pharmaceutical supply chains: a case study
Data-driven inventory management in the healthcare supply chain
Application of artificial neural networks in controlled drug delivery systems
COVID-19: impact on health supply chain and lessons to be learnt
Predicting the impacts of epidemic outbreaks on global supply chains: a simulation-based analysis on the coronavirus outbreak (COVID-19/SARS-CoV-2) case
At the epicenter of COVID-19: the tragic failure of the global supply chain for medical supplies
Application of artificial neural network(s) in predicting formwork labour productivity
Yegnanarayana, Artificial Neural Networks
A Modern Introduction to Probability and Statistics: Understanding Why and How - F.M. Dekking, C. Kraaikamp, H.P
Statistical learning theory
Nonlinear prediction of chaotic time series using support vector machines
The nature of statistical learning theory
Applied statistics and probability for engineers
Predicting solutions of large-scale optimization problems via machine learning: a case study in blood supply chain management
Optimization of multi-product economic production quantity model with partial backordering and physical constraints
Quantitative predictions of gas chromatography retention indexes with support vector machines, radial basis neural networks and multiple linear regression

Acknowledgements The authors would like to acknowledge the help and support provided by the King Fahd University of Petroleum & Minerals (KFUPM).

Conflict of interest The authors declare that there is no conflict of interest.