Data mining with various optimization methods



Vladimir Nedic a, Slobodan Cvetanovic b, Danijela Despotovic c, Milan Despotovic d,⇑, Sasa Babic e
a Faculty of Phil. and Arts, University of Kragujevac, Jovana Cvijica bb, 34000 Kragujevac, Serbia
b Faculty of Economics, University of Nis, Trg kralja Aleksandra Ujedinitelja 11, 18000 Nis, Serbia
c Faculty of Economics, University of Kragujevac, Djure Pucara Starog 3, 34000 Kragujevac, Serbia
d Faculty of Engineering, University of Kragujevac, Sestre Janjic 6, 34000 Kragujevac, Serbia
e College of Applied Mechanical Engineering, Trstenik, Serbia

Article info

Keywords:
Traffic noise
Artificial intelligence
Genetic algorithm
Hooke and Jeeves
Simulated annealing
Particle swarm optimization
Software

Abstract

Road traffic is the main source of noise in urban environments, and this noise has been shown to
significantly affect human mental and physical health and labour productivity. To control noise
levels in urban areas, it is therefore important to develop methods for modelling road traffic noise. As
observed in the literature, the models that deal with this issue are mainly based on regression analysis,
while other approaches are very rare. This paper presents a novel approach to modelling traffic noise
that is based on optimization. Four optimization techniques were used in the simulations:
genetic algorithms, the Hooke and Jeeves algorithm, simulated annealing and particle swarm optimization.
Two scenarios are presented. In the first scenario the optimization methods use
the whole measurement dataset to find the most suitable parameters, whereas in the second scenario
the optimized parameters were found using only part of the measurement data, while the rest of the data
was used to evaluate the predictive capabilities of the model. The goodness of fit of the model is evaluated
by the coefficient of determination and other statistical parameters, and the results show a high degree of
agreement between measured data and calculated values in both scenarios. In addition, the model was
compared with a classical statistical model, and the superior capabilities of the proposed model were
demonstrated. The simulations were carried out using an originally developed, user-friendly software package.

© 2013 Elsevier Ltd. All rights reserved.

1. Introduction

Road traffic noise, along with noise from railways and industry, is a very
important factor in environmental pollution in urban areas. The influence
of traffic noise on human health has been studied on numerous occasions in
recent years (Brink, 2011; Fyhri & Klæboe, 2009; Pirrera, De Valck, &
Cluydts, 2010), with the finding that this kind of annoyance significantly
affects both mental and physical health in many ways, causing anxiety,
stress, hearing impairment, sleep disturbance, cardiovascular problems,
etc. Thus, in order to control noise levels in urban areas, it is very
important to develop methods for predicting traffic noise. Due to the rapid
development of means of transportation and road traffic, the influence of
the traffic flow structure on the level of traffic noise is an important
area of research. By monitoring basic flow parameters and their trends it
is possible to predict and monitor the noise that appears in a given part
of the transport network. In this way, noise reduction can be achieved
through different modes of traffic management, which is particularly
important for human health and environmental improvement.

The first traffic noise prediction (TNP) models date back to the early
1950s. Since then a large number of methods and models for traffic
noise prediction have been developed. Critical reviews of the
most widely used ones are given in Steele (2001) and Quartieri et al.
(2009). Most of the TNP models presented in the literature
are based on linear regression analysis. The main limitation of those
models, as concluded in Quartieri et al. (2009) and Guarnaccia,
Lenza, Mastorakis, and Quartieri (2011), is “that they do not take
into account the intrinsic random nature of traffic flow, in the
sense that they do not take care of how vehicles really run, considering
only how many they are”. More advanced models involve
artificial neural networks (ANN) (Cammarata, Cavalieri, & Fichera,
1995; Givargis & Karimi, 2010) and genetic algorithms (Gündoğdu,
Gökdağ, & Yüksel, 2005; Rahmani, Mousavi, & Kamali, 2011). The ANN
model used in Cammarata et al. (1995) has three inputs: the equivalent
number of vehicles, obtained by adding to the number of cars the number
of motorcycles multiplied by 3 and the number of trucks multiplied by 6;
the average height of the buildings on the sides of the road; and the
width of the road. In order to increase the number of inputs, the authors
decomposed the equivalent number of vehicles into the number of cars, the
number of motorcycles, and the number of trucks, and obtained an ANN model
with 5 inputs. In terms of the parameters involved in the CoRTN
(Calculation of Road Traffic Noise) model (Quartieri et al., 2009),
which was initially developed in 1975 by the Transport and Road
Research Laboratory and the Department of Transport of the United
Kingdom, the ANN model used in Givargis and Karimi (2010) has 5 input
variables: the total hourly traffic flow, the percentage of heavy
vehicles, the hourly mean traffic speed, the gradient of the road, and
the angle of view. The authors tested the developed model on data
collected on Tehran’s roads, and found no significant differences between
the outputs of the developed ANN and the calibrated CoRTN model. In
Gündoğdu et al. (2005) a genetic algorithm was used to model the traffic
noise in relation to traffic composition (vehicles per hour), the road
gradient and the ratio of building height to road width. In Rahmani
et al. (2011) the proposed model is a function of total equivalent
traffic flow and equivalent traffic speed. In both papers the authors
used MATLAB to find the optimized values of the model parameters.

⇑ Corresponding author. Tel.: +381 69 844 9679.
E-mail address: mdespotovic@kg.ac.rs (M. Despotovic).

Expert Systems with Applications 41 (2014) 3993–3999
journal homepage: www.elsevier.com/locate/eswa
0957-4174/$ - see front matter © 2013 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.eswa.2013.12.025

This paper presents an application of four optimization techniques to
the prediction of traffic noise. These techniques are genetic
algorithms, the Hooke and Jeeves algorithm, simulated annealing, and
particle swarm optimization. The proposed model involves five
variables: the number of light motor vehicles, the number of medium
trucks, the number of heavy trucks, the number of buses and the average
traffic flow speed. All optimized models are tested on data measured on
a Serbian road using an originally developed, user-friendly software
package.

2. Problem formulation

The most suitable measure for describing traffic noise emission is the
equivalent sound pressure level (Leq), which is expressed in dB(A) and
corresponds to a fictitious source emitting steady noise that, over a
specific period of time, contains the same acoustic energy as the
observed source with fluctuating noise. For a number of discrete
measurements N, Leq for the time period T is given by the following
equation:

$$L_{eq} = 10 \log_{10}\left( \frac{1}{T} \sum_{i=1}^{N} 10^{L_i/10} \right) \qquad (1)$$

where $L_i$ is the sound pressure level corresponding to the $i$-th measurement.

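As an illustration of Eq. (1), the following minimal Python sketch energy-averages a series of sampled levels. It assumes that all N samples cover time slices of equal length, so that the 1/T weighting reduces to a plain mean over the samples; the function name `equivalent_level` is hypothetical and is not part of the software described in this paper.

```python
import math
from typing import Sequence


def equivalent_level(levels_db: Sequence[float]) -> float:
    """Energy-average a series of sound pressure levels (dB) into one Leq.

    Assumption: every sample covers the same time slice, so the 1/T
    weighting in Eq. (1) becomes a plain mean over the N samples.
    """
    if not levels_db:
        raise ValueError("at least one level is required")
    # Convert each level to relative acoustic energy: 10^(L_i / 10).
    energies = [10.0 ** (level / 10.0) for level in levels_db]
    mean_energy = sum(energies) / len(energies)
    # Convert the mean energy back to decibels.
    return 10.0 * math.log10(mean_energy)
```

For two equally long samples of 60 dB and 70 dB, `equivalent_level([60.0, 70.0])` gives about 67.4 dB rather than the arithmetic mean of 65 dB, because the louder sample dominates the energy average.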
In order to reduce noise it is necessary to know the functional
relationship between the equivalent sound pressure level and the
influential parameters. Leq is correlated with numerous parameters,
such as the numbers and types of vehicles, their velocities, the type of
road surface, the width and slope of the road, the height of buildings
facing the road, etc. As mentioned in the introduction, the following
variables were considered in this paper: the number of light motor
vehicles (LMV), the number of medium trucks (STV), the number
of heavy trucks (TTV), the number of buses (BUS) and the average
traffic flow speed (Vavg). A brief description of how these variables
were measured is given in the following section.

3. Data sampling

Traffic data on the road M5 were recorded with QLTC-10C automatic
traffic counters, and noise was measured with a Brüel & Kjær type 2230
class 1 sound level meter. The equivalent sound pressure levels were
measured over time periods of 15 min. In order to cover a greater number
of scenarios that might occur in urban environments, a total of 124
measurements of equivalent noise levels, each over a 15 min period, were
carried out. The measurements were performed at various times to capture
as much of the diversity of the traffic flow as possible.
Simultaneously, variations in traffic flow, traffic speed and
composition of traffic flow were measured. For that reason each survey
also includes the following parameters for the same time period: the
number of light motor vehicles, the number of medium trucks, the number
of heavy trucks, the number of buses, and the average traffic speed.
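The quantities recorded for each 15 min interval can be pictured as a single record. The sketch below is only illustrative: the class name, the field names and the sample values in the usage note are hypothetical, and the unit of the average speed is left open because the text does not state it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """One 15-minute survey interval (all names here are hypothetical)."""
    lmv: int      # number of light motor vehicles
    stv: int      # number of medium trucks
    ttv: int      # number of heavy trucks
    bus: int      # number of buses
    v_avg: float  # average traffic speed (unit not stated in the text)
    leq: float    # measured equivalent sound pressure level, dB(A)

    def inputs(self) -> tuple:
        """Return the five explanatory variables in a fixed order."""
        return (self.lmv, self.stv, self.ttv, self.bus, self.v_avg)
```

With invented numbers, `Measurement(120, 6, 3, 2, 52.0, 68.5).inputs()` returns the five traffic variables as a tuple, keeping the measured level separate as the value to be predicted.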

Measurements were taken in accordance with recommendations for road
traffic noise measurement: the microphone was mounted away from
reflecting facades, at a height of 1.2 m above ground level and 7.5 m
from the centre line of the road. During the measurements care was taken
that weather conditions were as similar as possible (no wind, no rain)
in order to eliminate their influence.

4. Mathematical model and methods

The equivalent sound pressure level is assumed to be described
by the following equation:

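$$L_{eq} = N_1 \cdot \log_{10}(\mathrm{LMV}) + N_2 \cdot \log_{10}(\mathrm{STV}) + N_3 \cdot \log_{10}(\mathrm{TTV}) + N_4 \cdot \log_{10}(\mathrm{BUS}) + N_5 \cdot V_{avg}^{N_6} + N_7 \cdot \log_{10}(V_{avg}) \qquad (2)$$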

where $N_i$ ($i = 1, \ldots, 7$) are coefficients. The problem thus reduces to
finding the coefficients $N_i$ such that the assumed model best fits the
experimental data. For that purpose genetic algorithms, the Hooke and
Jeeves algorithm, simulated annealing, and particle swarm optimization
are used. These techniques are briefly described in the following
subsections.
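The fitting problem can be sketched in Python as a model function plus an objective that any of the four optimizers could minimize. This is a minimal sketch under stated assumptions: it reads Eq. (2) with N6 as an exponent on Vavg, it takes the sum of squared errors as the objective (the text only says the model should best fit the data), and all function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Row = Tuple[float, float, float, float, float]  # (LMV, STV, TTV, BUS, Vavg)


def predict_leq(n: Sequence[float], row: Row) -> float:
    """Evaluate Eq. (2), reading N6 as an exponent on Vavg (an assumption)."""
    lmv, stv, ttv, bus, v_avg = row
    n1, n2, n3, n4, n5, n6, n7 = n
    # log10 needs strictly positive arguments, so intervals with a zero
    # count must be handled before calling this function.
    return (n1 * math.log10(lmv) + n2 * math.log10(stv)
            + n3 * math.log10(ttv) + n4 * math.log10(bus)
            + n5 * v_avg ** n6 + n7 * math.log10(v_avg))


def sum_squared_error(n: Sequence[float], rows: Sequence[Row],
                      measured: Sequence[float]) -> float:
    """Assumed objective: squared gap between model and measurements."""
    return sum((predict_leq(n, r) - y) ** 2 for r, y in zip(rows, measured))


def r_squared(n: Sequence[float], rows: Sequence[Row],
              measured: Sequence[float]) -> float:
    """Coefficient of determination of the model on the given data."""
    mean_y = sum(measured) / len(measured)
    ss_tot = sum((y - mean_y) ** 2 for y in measured)
    return 1.0 - sum_squared_error(n, rows, measured) / ss_tot
```

An optimizer would search the seven-dimensional coefficient vector `n` for the smallest `sum_squared_error`. In the second scenario, `r_squared` would then be evaluated on the measurements held back from the fit.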

4.1. Genetic algorithms

Genetic algorithms (Rao, 1996) are a class of evolutionary algorithms
that can be used in a large number of different application areas. The
principle of genetic algorithms is based on Darwin’s theory of
evolution, according to which the fittest individuals have the best
chances of survival. Genetic algorithms operate on a set of individuals
(chromosomes) called a population. The information

Fig. 1. Flowchart of the Genetic algorithm workflow.
