Time-stamped Resampling for Robust Evolutionary Portfolio Optimization

Sandra García**, David Quintana, Inés M. Galván*, Pedro Isasi

Computer Science Department, Carlos III University of Madrid, Avda. Universidad 30, 28911 Leganés, Spain
http://www.evannai.inf.uc3m.es

Abstract

Traditional mean-variance financial portfolio optimization is based on two sets of parameters: estimates for the asset returns and the variance-covariance matrix. The allocations resulting from both traditional methods and heuristics are very dependent on these values. Given the unreliability of these forecasts, the realized risk and return of the portfolios in the efficient frontier often differ from the expected ones. In this work we present a resampling method based on time-stamping to mitigate this problem. The approach, which is compatible with different evolutionary multiobjective algorithms, is tested with four different alternatives. We also introduce new metrics to assess the reliability of forecast efficient frontiers.

Keywords: Financial Portfolio Optimization, Robust Portfolio, Multiobjective Evolutionary Algorithms

*Corresponding author
**Principal corresponding author
Email addresses: sgrodrig@inf.uc3m.es (Sandra García), dquintan@inf.uc3m.es (David Quintana), igalvan@inf.uc3m.es (Inés M. Galván), isasi@ia.uc3m.es (Pedro Isasi)

1. Introduction

Asset allocation is one of the core topics in financial management. The search for the optimal choice of financial assets to include in a portfolio has been the subject of research for a long time, and it remains one of the most active lines in finance. The academic literature on this subject is very large and mostly builds on the work of Markowitz [12, 9]. Under this framework, the problem is posed as a multiobjective optimization problem in which the investor tries to determine the weight that each of the investment alternatives should carry in the portfolio. The target of this investor is to minimize risk and maximize return at the same time. The solution to the problem of optimizing these two conflicting objectives defines a set of solutions called the efficient frontier. This Pareto front consists of portfolios that are neither better nor worse than one another: for each level of risk or return, there is no better alternative in terms of the other objective. This makes the selection of one of them a choice to be made by investors according to the way they weight risk and reward.

The basic version of the problem can be solved using Quadratic Programming (QP). Unfortunately, this approach is built on a set of assumptions that are unlikely to hold in the real world. The quest for alternatives has drawn attention to metaheuristics that do not suffer this limitation. This is the reason why the framework of evolutionary computation is gaining traction in this area.

The solutions based on metaheuristics mostly fall into one of two categories: those that transform the multiobjective problem into a single-objective form, and those that deal with it using a multiobjective approach. Among the first group, we would mention the work by Soleimani et al. [21]. These authors use a genetic algorithm to extend the classic mean-variance optimization model with constraints on transaction costs, round lots and cardinality.
They tackle the multiobjective nature of the problem by minimizing the risk objective while setting a range of minimum acceptable returns in the constraints. Chiranjeevi and Sastry [4] use another popular approach to transform the multiobjective problem into a version that can be handled by single-objective algorithms. Instead of keeping one of the objectives in the objective function and moving the other to the constraints, they combine the objectives in the fitness function by weighting them. In this particular case, they manage five objectives that are obtained by breaking down the basic two. Chang et al. [3] suggest a similar solution and optimize a function that weights risk and return using a risk aversion parameter. Zhu et al. [23] target a popular metric, the Sharpe Ratio, that combines both elements into a single expression.

The development of Multiobjective Evolutionary Algorithms (MOEAs) has resulted in a number of researchers exploring their performance in this area. Among these we could mention Skolpadungket et al. [20], who test a set of multiobjective algorithms (VEGA, SPEA2, NSGA-II...) on a constrained version of the two-objective problem. More recently, Anagnostopoulos and Mamanis [1] compare the performance of different multiobjective algorithms, and Deb et al. [6] introduce a customized hybrid version of NSGA-II to tackle the problem. Finally, we will mention the work of Radziukyniene and Xilinskas [16], where the authors compare FastPGA, MOCELL, AbYSS and NSGA-II on both the basic problem and an extended version that considers the dividend yield as a third objective.

Despite the amount of research on portfolio optimization, there are still open issues. A key one is the robustness of the results provided by the algorithms. Among the most important factors that asset managers have to consider when evaluating the results provided by any of the above-mentioned methods is reliability. Very often, the expected efficient frontier lies far from the real one. This is due to the fact that the estimates for the expected risk and returns of the assets in the solution, and of the portfolios derived from them, are very inaccurate. Understandably, this results in mistrust among some practitioners. The search for solutions to this problem has cleared the way for the field of robust portfolio optimization. It is in this area where we focus our contribution. We introduce a time-stamping method to control the population that enhances the reliability of the solutions provided by MOEAs.

The process of optimizing the risk and return of a portfolio relies on two parameters: the estimates for the expected asset returns and the variance-covariance matrix. The values of these parameters are usually based on past data and they might be inaccurate due to, for instance, the presence of outliers. In this context, there are several potential ways to tackle the problem. The two most prevalent alternatives found in the literature rely on either putting an emphasis on having robust estimates for the parameters, or on managing the optimization process itself. The first one usually tries to filter the estimates to control, for instance, the influence of extreme past events on their computation [14]. The authors focusing on the second alternative design approaches that handle uncertainty in the parameters during the optimization process [15, 22]. The alternative suggested in this paper falls in the latter category.
We will enhance the solutions of MOEAs by testing the population under different values of the parameters and selecting the portfolios that consistently offer a good performance. Optimizing for a single scenario, that is, one set of expected asset returns and a single variance-covariance matrix, bears the risk of obtaining solutions that might be extremely sensitive to deviations. This could potentially be a problem, as it is almost certain that the estimates will not be accurate. We have to bear in mind that having perfect estimates for the expected returns, for instance, implies that we can make perfect predictions of future prices, which is highly unlikely. For this reason, we consider that assessing the candidate solutions under different likely scenarios and favoring those that consistently offer a good performance might be a promising approach. The requirement of consistency is key. In order to achieve it, we introduce a time-stamping mechanism that, together with performance in terms of risk and return, will drive the evolution process to find robust and stable solutions.

The approach introduced in this paper is related to alternatives based on resampling [19, 18]. The most comparable among the traditional approaches is the one described by Idzorek [10]. This author suggests combining traditional QP with Monte Carlo simulation to derive a set of fronts that are merged into a single solution at a later stage. This idea is very interesting but, unfortunately, the approach suffers the shortcomings of QP, namely, its limited ability to deal with real-world constraints. This is the reason why we feel that adapting the idea to the framework of MOEAs is very promising, as they do not suffer from this limitation. There is a previous effort based on evolutionary algorithms along the lines of this work, but it relies on a very simple resampling approach that optimizes for a different scenario in each generation [8]. The time-stamping mechanism that we introduce in this work builds on that previous one, extending the problem with a third, implicit objective that favors solutions that are consistently reliable. As we will see in the experimental section, this results in significantly higher robustness.

The proposed approach is compatible with a wide array of MOEAs. Given their different nature and behavior, we will test the approach on a set of popular algorithms. Specifically, the experimental section will consider NSGA-II, SPEA2, SMPSO and GDE3. NSGA-II [5] is one of the most referenced algorithms in the field of multiobjective optimization. This algorithm, together with SPEA2 [24], has been shown [20] to offer good performance in portfolio optimization. Apart from the mentioned two, we consider GDE3 [11], a differential evolution strategy, and SMPSO [13], a multiobjective algorithm based on particle swarm optimization.

In the context of multiobjective problems, an important issue is the metric used to evaluate the solutions. It is generally admitted that there is no ideal single metric that should be used to evaluate different objectives simultaneously in every circumstance. The metrics that are most commonly used in this field, such as Hypervolume, Spread or Set Coverage, are not appropriate indicators of stability in this context. For this reason, we define a set of metrics that capture different aspects of robustness in efficient frontiers.
These metrics, Estimation Error, Stability, Extreme Risk and Unrealized Returns, draw on the basic principle that the expected risk and return of the portfolios in the solution should be close to the ones observed ex-post.

The rest of the paper is organized as follows. First, we make a formal introduction to the financial portfolio optimization problem. Then, we describe in detail the evolutionary approach proposed in this work. This section includes a brief description of the MOEAs, the solution encoding and the fitness mechanism designed to find robust and stable portfolios. Next, the different metrics used in this work to evaluate the robustness of solutions are described. That will be followed by the experimental results and, finally, there will be a section devoted to summary and conclusions.

2. Financial Portfolio Optimization Problem

A financial portfolio can be defined as a collection of investments or assets held by an institution or a private individual. Modern Portfolio Theory originated in the article published by Harry M. Markowitz in 1952 [12]. It explains how to use diversification to optimize a portfolio. In general, the portfolio optimization problem is the choice of an optimum set of assets to include in the portfolio and the distribution of the investor's wealth among them. Markowitz [9] assumed that solving the problem requires the simultaneous maximization of the expected portfolio return E(R_p) and minimization of the portfolio risk (variance) σ_p², that is, solving a multiobjective optimization problem with two objective functions [21, 4, 20, 16]. The portfolio optimization problem can be formally defined as:

• Minimize the risk (variance) of the portfolio:

$\sigma_p^2 = \sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j \sigma_{ij}$   (1)

• Maximize the return of the portfolio:

$E(R_p) = \sum_{i=1}^{n} w_i \mu_i$   (2)

• Subject to:

$\sum_{i=1}^{n} w_i = 1$   (3)

$0 \le w_i \le 1, \quad i = 1, \ldots, n$   (4)

where n is the number of available assets, μ_i the expected return of asset i, σ_ij the covariance between assets i and j, and w_i the decision variables giving the composition of the portfolio. The constraints referenced in equations 3 and 4 require the full investment of funds and prevent the investor from shorting any asset, respectively. In quantitative terms, the risk is also commonly represented by the standard deviation σ_p. The solution to the problem should also consider some real-world constraints [2] such as:

• Cardinality constraint: it is possible to define the maximum C_max and minimum C_min number of assets in which to invest (w_i ≠ 0), where 1(·) denotes the indicator function:

$C_{min} \le \sum_{i=1}^{n} \mathbf{1}(w_i \neq 0) \le C_{max}$   (5)

• Values limit constraint: each weight w_i must have a value in the interval [lim_inf, lim_sup], where:

$0.0 \le lim_{inf} \le w_i \le lim_{sup} \le 1.0$   (6)

Figure 1. Efficient frontier

All of these equations are satisfied by a set of points that constitutes the efficient frontier of the problem. These points define a curve similar to Fig. 1, plotted in the risk-return space of all possible portfolios. Each point of this curve represents a portfolio with the minimum amount of risk for a given expected return (and vice versa).
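As an illustration of the model above, the following Python sketch computes the risk and return of a candidate portfolio and checks the constraints in eqs. 3-6. It is merely illustrative: the function names and the tolerance handling are our own choices and not part of the formal model.

import numpy as np

def portfolio_return(w, mu):
    # Eq. (2): expected portfolio return
    return float(np.dot(w, mu))

def portfolio_variance(w, cov):
    # Eq. (1): portfolio variance w' Sigma w
    return float(w @ cov @ w)

def satisfies_constraints(w, c_min=2, c_max=6, lim_inf=0.1, lim_sup=0.8, tol=1e-9):
    # Eqs. (3)-(6): full investment, no shorting, cardinality and weight limits
    w = np.asarray(w, dtype=float)
    active = w > tol
    full_investment = abs(w.sum() - 1.0) < tol
    cardinality_ok = c_min <= active.sum() <= c_max
    limits_ok = bool(np.all((w[active] >= lim_inf - tol) & (w[active] <= lim_sup + tol)))
    return full_investment and cardinality_ok and limits_ok

# Example: a portfolio equally split over four of the n = 8 assets
# w = np.array([0.25, 0.25, 0.25, 0.25, 0, 0, 0, 0])
# portfolio_return(w, mu), portfolio_variance(w, cov), satisfies_constraints(w)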
3. Evolutionary Approach for Robust Portfolio Optimization

As mentioned in the introduction, in this paper we tackle the problem of achieving robust or stable portfolios using MOEAs. In order to do that, we suggest replacing the traditional fitness function with a new one that extends it with a resampling mechanism and an implicit third objective to control the robustness of the solutions in the front. We will test the effectiveness of the approach on different MOEAs using the same chromosome structure and fitness evaluation procedure. In this section, we briefly introduce the MOEAs tested and provide details regarding the chromosome encoding and the fitness function.

3.1. Tested Evolutionary Multi-objective Algorithms

The following MOEAs will be tested: NSGA-II, GDE3, SMPSO and SPEA2. All the algorithms are implemented in jMetal [7], a Java framework aimed at multiobjective optimization with metaheuristics. By reusing the base classes of jMetal, all the techniques share the same basic core components (solution encodings, operators, etc.). This ensures a fair comparison of the considered algorithms.

NSGA-II, proposed by Deb et al. [5], is one of the most widely used multiobjective metaheuristics. It represents the new version of the NSGA algorithm developed by the same authors. It is a generational genetic algorithm based on an auxiliary population derived from the original one by applying the usual genetic operators (selection, crossover and mutation). Then, the two populations are merged and the individuals are sorted according to their rank. Inside each of these ranks, the crowding distance is used to sort the individuals from less to more crowded. A solution with a smaller value of this distance measure is, in some sense, more crowded by other solutions. Finally, the best solutions are selected to compose the population for the next generation.

GDE3 [11] is the third version of the Generalized Differential Evolution algorithm (GDE), an extension of Differential Evolution (DE) for global optimization with an arbitrary number of objectives and constraints. GDE3 starts with a population of random solutions. At each generation, an offspring population is created using the differential evolution operators; then, the population for the next generation is built using the solutions of both the offspring and the current population. Before creating the next generation, the size of the population is reduced using non-dominated sorting and a pruning technique aimed at diversity preservation, in a similar way to NSGA-II. However, GDE3 modifies the crowding distance of NSGA-II in order to solve some of its drawbacks when dealing with problems having more than two objectives.

SMPSO, introduced by Nebro et al. [13], is a multiobjective particle swarm optimization (PSO) algorithm characterized by the use of a strategy to limit the velocity of the particles. It is based on OMOPSO [17] but includes a velocity constriction procedure. This mechanism is useful when the velocity becomes too large, since constraining it helps produce effective new particle positions. SMPSO also relies on an external archive that stores the non-dominated solutions found during the search process. Polynomial mutation is used in the algorithm as a turbulence factor.

The SPEA2 algorithm [24], developed by Zitzler et al., solves some weaknesses of a previous version by the same authors called SPEA. Among the improvements, we could mention a fitness function that takes into account, for each individual, the number of individuals it dominates and the number of individuals that dominate it.
This version also adds a density estimation for the population. The algorithm uses a core population and an archive. It assigns to each individual a fitness value that is the sum of its raw strength fitness plus a density estimate. In each generation, the non-dominated individuals of both the original population and the archive are used to update the archive; if the number of non-dominated individuals is greater than the population size, a truncation operator based on calculating the distances to the k-th nearest neighbor is used. This whole procedure is known as environmental selection. Then, the algorithm applies the selection, crossover and mutation operators to members of the archive in order to create a new population of offspring, which becomes the population for the next generation.

3.2. Solution encoding

The chosen encoding represents each portfolio as a vector of real numbers. Each of these numbers represents the percentage of investment in one asset (also called weight, w_i, where i ranges from 1 to n and n is the number of investable assets). Each portfolio is therefore represented by a single individual of the population. Every individual must meet the constraints specified by eqs. 3 and 4 explained before: the sum of weights per individual must be 1, that is, full investment is required. Also, the individuals must satisfy the additional real-world constraints shown in eqs. 5 and 6. The satisfaction of these constraints is guaranteed by repair algorithms. The individuals are repaired both after initializing the population (see Alg. 1) and after applying the genetic operators (see Alg. 2). The repair algorithms transform non-complying chromosomes into portfolios that satisfy the constraints. Whenever the number of invested assets does not belong to the interval [C_min, C_max], it is adjusted to ensure compliance with the cardinality constraint. This is done by adding or dropping assets until the requirement is met. In case the sum of weights of an individual is not 1.0, the algorithm fine-tunes the holdings by adding or subtracting random amounts until the required adjustment is reached. These changes are forced to meet the investment limits [lim_inf, lim_sup]. The details of this process are described in Algorithms 1 and 2.

Algorithm 1 Reparation after initialization
  Initialize population P as a set of vectors of real numbers x_i = (x_i1, ..., x_in); x_ij ∈ [lim_inf, lim_sup]
  for each individual x_i of P do
    Assign the value 0 to a random number ∈ [C_min, C_max] of coordinates of vector x_i
    while Σ_{j=1}^{n} x_ij ≠ 1 do
      if the sum of lim_inf over the x_ij ≠ 0 is > 1 then
        select at random one coordinate j of vector x_i such that x_ij ≠ 0 and assign x_ij = 0
      end if
      if the sum of lim_sup over the x_ij ≠ 0 is < 1 then
        select at random one coordinate j of vector x_i such that x_ij = 0 and assign x_ij = lim_sup
      end if
      if Σ_{j=1}^{n} x_ij ≠ 1 then
        select at random one coordinate j of vector x_i such that x_ij ≠ 0
        add/subtract the quantity needed to make Σ_{j=1}^{n} x_ij = 1, respecting the limits [lim_inf, lim_sup]
      end if
    end while
  end for
  Return(P)
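The repair logic of Algorithm 1 (and the analogous Algorithm 2, shown next) can be summarized with the following Python sketch. It is a simplified illustration under the experimental settings used later (C_min = 2, C_max = 6, lim_inf = 0.1, lim_sup = 0.8); the function name, the iteration guard and the rule used to drop assets are our own choices, not the exact operator of the paper.

import numpy as np

rng = np.random.default_rng()

def repair(w, c_min=2, c_max=6, lim_inf=0.1, lim_sup=0.8, tol=1e-9):
    # Simplified repair in the spirit of Algorithms 1-2 (illustrative only).
    w = np.asarray(w, dtype=float).copy()
    # Cardinality: activate or drop assets until the number of holdings lies in [c_min, c_max].
    while np.count_nonzero(w) < c_min:
        w[rng.choice(np.flatnonzero(w == 0))] = lim_inf
    while np.count_nonzero(w) > c_max:
        active = np.flatnonzero(w)
        w[active[np.argmin(w[active])]] = 0.0          # drop the smallest holding
    # Full investment: nudge random holdings until the weights sum to one, respecting the limits.
    for _ in range(1000):                              # guard against cycling in this plain sketch
        active = np.flatnonzero(w)
        total = w.sum()
        if abs(total - 1.0) < tol:
            break
        if lim_inf * len(active) > 1.0:                # even the minimum weights overshoot
            w[rng.choice(active)] = 0.0
        elif lim_sup * len(active) < 1.0:              # even the maximum weights undershoot
            w[rng.choice(np.flatnonzero(w == 0))] = lim_sup
        else:
            j = rng.choice(active)
            w[j] = np.clip(w[j] + (1.0 - total), lim_inf, lim_sup)
    return w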
Algorithm 2 Reparation after genetic operators
  for each individual x_i of P do
    if there are fewer x_ij ≠ 0 than C_min then
      set one random x_ij = 0 to x_ij = lim_inf
    end if
    if there are more x_ij ≠ 0 than C_max then
      set the x_ij ≠ 0 with the smallest value to x_ij = 0
    end if
    while Σ_{j=1}^{n} x_ij ≠ 1 do
      if the sum of lim_inf over the x_ij ≠ 0 is > 1 then
        select at random one coordinate j of vector x_i such that x_ij ≠ 0 and assign x_ij = 0
      end if
      if the sum of lim_sup over the x_ij ≠ 0 is < 1 then
        select at random one coordinate j of vector x_i such that x_ij = 0 and assign x_ij = lim_sup
      end if
      if Σ_{j=1}^{n} x_ij ≠ 1 then
        select one random x_ij ≠ 0
        add/subtract a random quantity so that Σ_{j=1}^{n} x_ij = 1, respecting the limits [lim_inf, lim_sup]
      end if
    end while
  end for
  Return(P)

3.3. Fitness function. Time-stamped resampling

In this paper we introduce a resampling strategy combined with an implicit age-based third objective to identify robust portfolios. In this subsection, the procedure to evaluate the fitness function is described in detail.

The starting point for the evaluation of portfolios is the framework introduced in section 2, where part of the fitness of each individual is determined by evaluating two objective functions: the return E(R_p) (eq. 2), to be maximized, and the risk σ_p² (eq. 1), to be minimized. As is apparent, E(R_p) and σ_p² are very tightly related to the values of the expected asset returns and the variance-covariance matrix. One of the most important challenges that portfolio managers face when they operate within Markowitz's framework is the dependence of the solutions on the estimates for these parameters. Given the challenges inherent to financial forecasting, it is normal that the mentioned parameters are not accurate. This difficulty is likely to result in a set of portfolios that could end up behaving in an unexpected way. Fig. 2 shows a real example of one Pareto front where the solutions are evaluated using the forecast parameters (in red) vs. the real parameters (in green). The set of portfolios that were optimized for the traditionally considered most likely scenario (mean return for each asset over a period of time, and a variance-covariance matrix computed using the same data) defines the upper efficient frontier. However, once we calculate the risk and return for the same portfolios using the real observed parameters (actual asset returns instead of the mentioned averages), we realize that their profiles can be very different. As can be seen, the difference between the two fronts, which represent the same set of solutions, can be quite dramatic.

Figure 2. Solution evaluated with forecast and real parameters

A first step towards robust solutions would be resampling during the evolution process. This approach would require replacing the mentioned parameters in the fitness functions at every generation. By doing this, the evolution process would favor the individuals that show good performance in terms of risk and return over different scenarios, and would discard those overspecialized in specific values of the parameters, including the predicted ones. A key element of this process is the creation of these scenarios. We would need a set of solutions able to handle uncertainty, but there is no need for absolute generalization. The scenarios should be considered according to their likelihood. The approach that we use to generate likely scenarios is based on the nonparametric bootstrap.
As we mentioned before, the usual way to forecast the value of the parameters for the model is averaging the returns and computing the associated variance-covariance matrix over a number of periods. Our algorithm has the same starting point, the definition of a time window. However, instead of using all the data to derive a single estimate for the parameters, the data are resampled. The process selects a random set of time periods that has the same size as the original window (each period might be selected more than once). Then, we average the returns for those time periods and compute the variance-covariance matrix. Every time we do this, we generate new estimates for the parameters that are based on past data. These estimates can subsequently be used to calculate the risk and return of the portfolios. This process is described in Algorithm 3.

Algorithm 3 Resampling method
  S is the original sample set, with size N_s. S′ is the new sample set, with size N′_s. Initially, S′ = ∅ and N′_s = 0.
  while N′_s ≠ N_s do
    Select one instance X_i at random from S. This instance is added to the new set: S′ = S′ + X_i.
  end while
  Return(S′)

Every generation, the algorithm evaluates all the individuals under a new resampled scenario. This is relevant because using a shared set of parameters makes the comparison for dominance purposes meaningful. This process alone would be enough to weed out overspecialized solutions over time. However, as described, it has a relevant flaw: the results would be very dependent on the last iterations of the evolution process. Every generation, we replace some solutions from the previous population with new ones, and chances are that the last scenarios generated might be very extreme. If that were the case, a percentage of the best solutions, which might have fared well over time, might offer a poor performance. This could result in new solutions specialized in the last scenario replacing the old ones. If this happened during the first few generations, the specialized solutions would be replaced over time. However, if we faced this in the final stages, we might end up obtaining poor results.

We prevent this by adding to the described resampling an implicit third objective based on a time-stamping mechanism. This feature extends the resampled mean-variance model with a third objective, the age of the solution, to be maximized. Each individual will have a counter that keeps track of the number of generations it has remained in the population. The maximization of this variable rewards good performance in previous generations and mitigates the risk described before. Once the evolution process is over, the population is evaluated once more using the forecast values of the parameters for the period. The final set of portfolios, provided as the solution, consists of the elements of the population that are not dominated in terms of risk and return. At this stage, all the information concerning the third objective is disregarded.
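The core of this per-generation procedure can be sketched in a few lines of Python. This is an illustrative outline only: the function names are ours, the historical data are assumed to be stored as a (T, n) matrix of past returns, and the surrounding MOEA machinery (selection, dominance sorting, archiving) is omitted.

import numpy as np

rng = np.random.default_rng()

def bootstrap_scenario(window):
    # Nonparametric bootstrap in the spirit of Algorithm 3: resample T periods with replacement
    idx = rng.integers(0, len(window), size=len(window))
    sample = window[idx]
    return sample.mean(axis=0), np.cov(sample, rowvar=False)

def evaluate_generation(population, ages, window):
    # Re-evaluate every individual under one shared resampled scenario and age the survivors
    mu, cov = bootstrap_scenario(window)
    fitness = []
    for k, w in enumerate(population):
        ret = float(w @ mu)        # eq. (2) under the resampled parameters (to be maximized)
        var = float(w @ cov @ w)   # eq. (1) under the resampled parameters (to be minimized)
        ages[k] += 1               # implicit third objective: age, to be maximized
        fitness.append((ret, var, ages[k]))
    return fitness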
Algorithm 4 shows the modifications carried out in the basic MOEAs to implement the fitness function described in this section. We expect that the sum of these changes in the traditional fitness function will discard those portfolios that, under normal circumstances, could potentially result in particularly bad performance, and prioritize those that offer consistently good solutions under likely scenarios.

Algorithm 4 Modifications on the basic MOEAs
  (Specific steps of the MOEA)
  Reparation after initialization (Alg. 1)
  for each generation do
    A new sample set S′ is generated by Alg. 3
    (Specific steps of the MOEA)
    Reparation after genetic operators (Alg. 2)
    for every existing individual p do
      Calculate E(R_p) on S′
      Calculate σ_p on S′
      T_p = T_p + 1
    end for
    (Specific steps of the MOEA)
  end for
  (Specific steps of the MOEA)
  The MOEA returns a non-dominated solution set P
  Compute the forecast parameters (S: original sample set)
  for each individual p in P do
    Discard the time-stamping objective T_p
    Calculate E(R_p) on S
    Calculate σ_p on S
  end for
  Get the non-dominated individuals P_f from P
  Return(P_f)

4. Evaluation Metrics

Solutions resulting from different runs of multiobjective algorithms must be compared using quantitative metrics. There is a wide range of alternatives, some of them dependent on knowing the Pareto-optimal front. It is generally admitted that there is no single perfect metric that can be used to compare solutions in every circumstance. Actually, the choice of the right metric among the wide range available is, itself, a multiobjective problem. The metrics most widely used in a multiobjective context are Hypervolume and Spread [25]. These metrics can be useful when we look for well-distributed and strongly dominant fronts. However, they are not appropriate to capture the effect shown in Fig. 2. In order to measure the robustness of the solutions, we introduce four metrics that evaluate different aspects related to portfolio robustness. They are named Estimation Error, Stability, Extreme Risk and Unrealized Returns. Next, we describe them in detail (an illustrative computation is sketched after the list).

• Estimation Error: It evaluates the average difference between the expected risk and return for every portfolio in the efficient frontier and the actual risk and return observed a posteriori, once the real values of the parameters are known. That is, the mean distance between the estimates for t_n based on data from t_0 to t_{n-1}, and the actual values at t_n. For this purpose, we consider the Mahalanobis distance, d_M(x, y), which is defined as

$d_M(x, y) = \sqrt{(x - y)^T \Sigma^{-1} (x - y)}$   (7)

where x and y are the patterns to be compared, and Σ is the variance-covariance matrix. This metric is calculated by measuring the average distance d_M between x_p and x′_p for all the portfolios in the solution. Here, x_p represents the pair (E(R_np), σ²_np) for portfolio p and period t_n calculated using the forecast parameters, whereas x′_p is defined by the pair (E(R_np)′, σ²_np′), in which both return and risk for the same portfolio and moment in time are computed using the real parameters (the ones observed for t_n). Formally, it can be expressed as follows:

$EE = \frac{\sum_{p=1}^{N}\left[d_M(x_p, x'_p)\right]^2}{N}$   (8)

where N is the number of portfolios in the Pareto front. The smaller the difference between the forecast and reality, the lower the value of this metric and, therefore, the higher the reliability of the original front.

• Stability: The Stability of a portfolio is measured by averaging the Mahalanobis distances between the expected risk/return pairs in the efficient frontier and the expected risk/return pairs in S different scenarios.
Unlike the Estimation Error, which considers the difference between the expected scenario and the real scenario, this metric measures the average difference between the expected scenario and a wide range of feasible alternatives. The concern is the aggregate sensitivity to the distribution of potential scenarios, not to the one that happened to materialize. The metric is given by the equation

$ST = \frac{\sum_{i=1}^{S}\sum_{p=1}^{N}\left[d_M(x_p, x_{pi})\right]^2}{N\,S}$   (9)

where d_M(x_p, x_pi) is the Mahalanobis distance between (E(R_np), σ²_np) and (E(R_npi), σ²_npi), the latter being the return and risk of portfolio p for period t_n calculated using the parameters of scenario i. The set of scenarios is generated using the nonparametric bootstrap technique described in Alg. 3. The relevance of this metric is subject to the size of the data set used for resampling and to the value of S. A larger set of scenarios is likely to result in a more accurate approximation of the potential distribution of the parameters. As mentioned before, the value of the metric is obtained by averaging all the average distances. Therefore, high values of this metric represent higher sensitivity to likely scenarios and lower reliability.

• Extreme Risk: This metric evaluates the sensitivity of the solutions to worst-case scenarios. It is closely related to Stability and follows the same basic definition. The difference is that, instead of considering the average over all S scenarios, we only take into account a small subset. Specifically, the computation includes only the w worst-case ones. For this purpose, we define worst-case scenarios as the parameter combinations that result in the highest average Mahalanobis distance between the expected risk and returns and the risk and returns of the same portfolios under the resampled parameters. The rationale for this indicator is to provide an estimate of the expected average negative outcome of the realization of a worst-case scenario with probability w/S. The higher the metric, the higher the risk.

• Unrealized Returns: This indicator provides information on the average potential return left on the table. Namely, it measures the average squared difference between the realized return and the maximum potential return for that risk level over all the portfolios in the solution. It is defined as

$UR = \frac{\sum_{p=1}^{N}\left[R_{np_e} - R_{np}\right]^2}{N}$   (10)

where N is the number of portfolios in the solution, R_np is the return of portfolio p computed using the ex-post parameters for time t_n, and R_np_e is the return of the portfolio p_e with the most similar risk level in the efficient frontier derived using the ex-post parameters. High values of this metric indicate large unrealized potential returns.
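As announced above, the following Python sketch outlines how the Mahalanobis-based metrics could be computed. It is only illustrative: the paper does not specify which variance-covariance matrix Σ is used inside eq. 7, so a user-supplied inverse matrix sigma_inv is assumed here, and the function names are ours.

import numpy as np

rng = np.random.default_rng()

def sq_mahalanobis(x, y, sigma_inv):
    # Squared version of eq. (7)
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(d @ sigma_inv @ d)

def estimation_error(expected, realized, sigma_inv):
    # Eq. (8): expected and realized are lists of (return, risk) pairs for the N portfolios
    return float(np.mean([sq_mahalanobis(x, y, sigma_inv)
                          for x, y in zip(expected, realized)]))

def stability_and_extreme_risk(expected, weights, window, sigma_inv, S=500, w_worst=5):
    # Eq. (9) and the Extreme Risk metric: average squared distance to S bootstrapped scenarios,
    # and the average over the w_worst scenarios with the largest mean distance.
    per_scenario = []
    for _ in range(S):
        sample = window[rng.integers(0, len(window), size=len(window))]   # Alg. 3
        mu, cov = sample.mean(axis=0), np.cov(sample, rowvar=False)
        d2 = [sq_mahalanobis(x, (wgt @ mu, wgt @ cov @ wgt), sigma_inv)
              for x, wgt in zip(expected, weights)]
        per_scenario.append(float(np.mean(d2)))
    stability = float(np.mean(per_scenario))
    extreme_risk = float(np.mean(sorted(per_scenario)[-w_worst:]))
    return stability, extreme_risk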
5. Experimental Validation

In this section we test the aforementioned approach on a specific asset allocation problem. We try to identify the mixes of broad investment categories that provide the optimal balance between expected risk and return, while increasing robustness. As mentioned in the introduction, four different multiobjective metaheuristics are considered on a constrained version of the problem. For each of them, NSGA-II, SPEA2, SMPSO and GDE3, we compare the performance of the standard version against the robust equivalent. For experimental purposes, the cardinality constraints [C_min, C_max] and the limits to the minimum and maximum weight that each asset can carry in the portfolio, [lim_inf, lim_sup], are set to [2, 6] and [0.1, 0.8] respectively. Subsection 5.1 describes the data sets that we used in this work. Subsection 5.2 gives details about the parameter settings for the experiments, the multiobjective algorithms and the metrics. Finally, subsection 5.3 presents and analyzes the experimental results.

5.1. Data Sets

For the experimental analysis, we use a sample that consists of 240 monthly returns for eight broad financial indexes representing that many asset classes. The series of monthly returns cover the time period from January 1990 to December 2009, and the source of the data is Datastream. The list of indexes is provided in Table 1.

Table 1. Data Sets
Name                                    Code
Frank Russell 2000 Value                FRUS2VA
Frank Russell 2000 Growth               FRUS2GR
Frank Russell 1000 Value                FRUS1VA
Frank Russell 1000 Growth               FRUS1GR
S&P GSCI Commodity Total Return         GSCITOT
MSCI EAFE                               MSEAFEL
BOFA ML CORP MSTR ($)                   MLCORPM
BOFA ML US TRSY/AGCY MSTR AAA ($)       MLUSALM

5.2. Experimental Parameters

The experimentation is based on a sliding window approach. Each window has a size n of 120 sets of returns for the eight indexes, that is, 10 years' worth of data. This means that the algorithm will rely on data from t_0 to t_{n-1} to identify the best possible allocations for the period t_n. The 10-year window will move one month at a time, 120 times. This will cover the date interval from 31/01/1990 to 31/12/2009. The algorithms will be run 20 times per window using the parameters described in Table 2. This means that, for each window, we will obtain 20 solution sets per algorithm.

Table 2. Parameters. L = 8 (individual length). The termination condition is to compute 300 iterations.
SPEA2
  Population size        200 individuals
  Archive size           200 individuals
  Crossover              SBX, pc = 0.9
  Mutation               Polynomial, pm = 1/L
  Selection of parents   Binary tournament
NSGA-II
  Population size        200 individuals
  Crossover              SBX, pc = 0.9
  Mutation               Polynomial, pm = 1/L
  Selection of parents   Binary tournament
SMPSO
  Archive size           200 particles
  Swarm size             200 particles
  Mutation               Polynomial, pm = 1/L
GDE3
  Population size        200 individuals
  Crossover              DE crossover, CR = 0.9
  Mutation               DE mutation, F = 0.5
  Selection of parents   DE selection

As for the parameters required by the metrics, the number of scenarios S needed to compute Stability and Extreme Risk will be set to 500. The number of worst-case scenarios, w, considered for the latter will be 5. This means that the comparison will be based on the 1% worst expected average outcomes.
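The experimental loop described above can be outlined as follows in Python. The helper names (sliding_windows, run_moea, record_metrics) are purely illustrative placeholders for the procedures described in the previous sections, and the data are assumed to be stored as a (240, 8) array of monthly returns.

import numpy as np

def sliding_windows(returns, window=120):
    # Yield each 120-month estimation window together with the index of the month to forecast
    for t in range(window, len(returns)):
        yield returns[t - window:t], t

# Outline of the experiment (placeholders, not runnable as-is):
# for past, t in sliding_windows(monthly_returns):
#     for run in range(20):                          # 20 independent runs per window
#         front = run_moea(past)                     # standard or R+T version of the MOEA
#         record_metrics(front, monthly_returns, t)  # metrics of Section 4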
5.3. Experimental Results

This section shows the results of the experimental process described before. For every multiobjective algorithm tested, we compare the performance of the standard and the robust version over 20 runs using the metrics introduced in Section 4. In the tables, the former is labeled "name-of-algorithm" and the latter "name-of-algorithm R+T" (R+T stands for resampling and third objective). Results are provided in terms of descriptive statistics (average, median and variance) for the metrics described in Section 4. The percentage of improvement of the robust version over the basic one, labeled % Av., is also reported.

Table 3 shows the set of results for the Estimation Error metric (EE). The advantage of using the robust approach over the standard one ranges from 27.46% to 54.93%. The lowest prediction errors are achieved using the robust version of SMPSO; however, the highest improvement takes place for SPEA2, which is the algorithm with the highest initial EE value. We also note that the worse the EE of the basic MOEA, the more it is improved by the robust approach.

The results for the Stability metric are reported in Table 4. The table shows how the proposed approach (R+T) significantly improves the stability of the portfolios achieved by the standard MOEAs. Once again, the improvements are especially remarkable for the algorithms that, in their standard formulation, lead to less stable fronts. We observe that SPEA2 is the algorithm that provided the highest values for the metric in the basic version and the lowest values in its robust alternative. This makes the difference between using the R+T mechanism and the standard algorithm especially large (67.53%). Conversely, GDE3 R+T shows the lowest improvement % Av. (35.61%) and the worst final value for the metric once we use the robust approach.

Table 3. "Estimation Error"
             Average   Median   Variance   % Av.
SPEA2        2.5198    1.7945    5.1421
SPEA2 R+T    1.1356    0.5538    2.4147    -0.5493
NSGAII       2.2823    1.7996    3.9817
NSGAII R+T   1.2401    0.6609    2.3748    -0.4566
SMPSO        1.5204    1.2878    1.7354
SMPSO R+T    1.0144    0.7403    0.9302    -0.3328
GDE3         1.4820    1.2580    1.5347
GDE3 R+T     1.0750    0.6721    1.2762    -0.2746

Table 4. "Stability"
             Average   Median   Variance   % Av.
SPEA2        7.4057    7.0743   20.7329
SPEA2 R+T    2.4044    2.1024    2.2526    -0.6753
NSGAII       6.5508    6.1640   12.9661
NSGAII R+T   2.7433    2.5085    2.1520    -0.5812
SMPSO        5.2373    4.8981    7.2544
SMPSO R+T    3.2510    2.8582    3.6025    -0.3793
GDE3         5.1666    4.8507    6.8238
GDE3 R+T     3.3270    2.9755    3.8045    -0.3561

The results for the Unrealized Returns metric are reported in Table 5. We observe that the solutions obtained by R+T, evaluated using the observed parameters, are closer to the optimal efficient frontier than the solutions derived from the standard algorithms. The best results, once again, were obtained by SMPSO R+T. The results show improvements that range from 19.56% to 32.38%. GDE3 is the algorithm with the lowest % Av., despite not being the algorithm with the lowest value for the metric in the standard version.

Finally, solutions evaluated under the less predictable parameters or scenarios (see Table 6) show the same pattern of behavior already observed for the EE and Stability metrics. It is remarkable that the solutions obtained by the R+T approach and evaluated under these worst-case scenarios are significantly closer to the observed parameters than the solutions of the standard MOEAs evaluated under the forecast parameters. In this case, even though SPEA2 R+T offers the best median result, on average it is beaten by the R+T versions of GDE3 and SMPSO, the latter providing both the smallest average value and the lowest variance for the metric.

Table 5. "Unrealized Returns"
             Average   Median   Variance   % Av.
SPEA2        3.5848    2.7989    8.5604
SPEA2 R+T    2.4240    1.8792    3.9853    -0.3238
NSGAII       3.4871    2.7326    7.8823
NSGAII R+T   2.5670    1.9975    4.1648    -0.2638
SMPSO        2.9805    2.4745    4.9961
SMPSO R+T    2.2514    1.7717    2.9235    -0.2446
GDE3         3.3518    2.9221    5.7049
GDE3 R+T     2.6962    2.1723    4.5393    -0.1956

Table 6. "Extreme Risk"
             Average   Median   Variance   % Av.
SPEA2        3.5308    3.0614    4.9670
SPEA2 R+T    1.6622    1.0862    2.5638    -0.5292
NSGAII       3.2407    2.8451    3.9687
NSGAII R+T   1.8172    1.2318    2.5525    -0.4393
SMPSO        2.2485    2.0297    1.8884
SMPSO R+T    1.4671    1.2126    1.0257    -0.3475
GDE3         2.2091    1.9701    1.7397
GDE3 R+T     1.6425    1.2577    1.5775    -0.2565

The differences between the average values of the metrics for the basic and the suggested R+T versions of the algorithms, % Av., were tested for statistical significance. We used the protocol described in Algorithm 5, and all the differences were significant at 1%.

Algorithm 5 Statistical testing protocol
  if the values follow a Normal distribution (applying the Kolmogorov-Smirnov test) then
    if the variances are homogeneous (the Levene test is used to check this) then
      A t-test is performed.
    else
      A Welch test is executed.
    end if
  else
    A Wilcoxon test is applied to compare the medians of the solutions.
  end if
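For reference, the testing protocol of Algorithm 5 can be written in a few lines with SciPy. This is an illustrative sketch: the paper does not state which variant of the Wilcoxon test was used, so the rank-sum version for independent samples is assumed here, and the significance level is passed as a parameter.

import numpy as np
from scipy import stats

def significant_difference(a, b, alpha=0.01):
    # a, b: a metric's values over the runs of the standard and R+T versions (numpy arrays)
    normal = all(stats.kstest(x, 'norm', args=(x.mean(), x.std(ddof=1))).pvalue > alpha
                 for x in (a, b))
    if normal:
        equal_var = stats.levene(a, b).pvalue > alpha            # homogeneity of variances
        p = stats.ttest_ind(a, b, equal_var=equal_var).pvalue    # t-test or Welch test
    else:
        p = stats.ranksums(a, b).pvalue                          # Wilcoxon rank-sum test
    return p < alpha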
The best performing algorithm in terms of robustness seems to be SMPSO R+T. It offers the lowest values for three out of the four metrics considered (Estimation Error, Unrealized Returns and Extreme Risk). The best performing one in terms of Stability is SPEA2 R+T. Among the standard MOEAs, GDE3 provides the highest robustness, followed by SMPSO. As we have seen, the proposed R+T mechanism resulted in a systematic and significant increase in the reliability of solutions across the algorithms. The introduced R+T mechanism generates more robust solutions than any of the basic versions of the MOEAs. The minimum decrease in average metric value was 19.56%, for Unrealized Returns with GDE3. The largest improvement, 67.53%, was achieved by SPEA2 for Stability. Furthermore, we also observed that our approach was particularly helpful for those algorithms that produced more unstable solutions.

6. Conclusions

Portfolio optimization represents one of the main topics in financial research. The basic framework targets the combination of financial assets to be included in a portfolio in order to optimize the balance of risk and return. This process is usually based on two parameters: the estimates for the expected asset returns and the variance-covariance matrix. Quadratic Programming can be used to solve the basic version of this problem. However, in the real world there are some constraints that cannot be easily tackled by traditional techniques. This is the reason why evolutionary multiobjective algorithms (MOEAs), which do not present this kind of limitation, are gaining traction in the domain.

One of the main problems that portfolio managers face is the uncertainty regarding the expected frontier derived from their forecasts of future returns. Very often, it lies far from the actual one, resulting in inaccurate forecast risk/return profiles for the portfolios. Our work is focused on this point, adding robustness to the results by avoiding optimization for a single expected scenario, which may produce hyper-specialized solutions that are extremely sensitive to likely deviations. We handle the uncertainty in these parameters during the optimization process by testing the population under different values of the parameters and selecting the portfolios that consistently offer a good performance. We also use a time-stamping mechanism that, together with performance in terms of risk and return, drives the evolution process to find robust and stable solutions.

The assessment of the robustness of the efficient frontiers resulting from the process requires metrics that differ from the most traditional ones. For this reason, we introduce a set of four metrics: "Estimation Error", "Stability", "Extreme Risk" and "Unrealized Returns".

The approach was tested through experimentation using four popular MOEAs (NSGA-II, SPEA2, SMPSO and GDE3). The standard and the time-stamped resampled versions of the algorithms were compared. They were tested on a sample of monthly returns for eight indexes representing different broad investment categories, including stocks, bonds, etc.
The results show that the suggested approach significantly enhances the reliability of solutions for all algorithms. The time-stamped resampled versions improved the results across all the considered MOEAs. The improvements reached up to 67.53% in "Stability", 54.93% in "Estimation Error", 52.92% in "Extreme Risk" and 32.38% in "Unrealized Returns". These improvements were higher when the standard MOEA provided worse results in terms of the metrics. We also observed that the R+T version of SMPSO tended to outperform the alternatives.

Even though these results are good and promising in terms of robustness, there are several issues left open that could lead to future extensions of this work. Among them are the analysis of the performance of this robust approach on other MOEAs and the scalability of the results with different numbers of investment alternatives.

7. Acknowledgements

The authors acknowledge financial support granted by the Spanish Ministry of Science under contracts TIN2008-06491-C04-03 (MSTAR) and TIN2011-28336 (MOVES), and by Comunidad de Madrid (CCG10-UC3M/TIC-5029).

References

[1] Anagnostopoulos, K., Mamanis, G., 2011. The mean-variance cardinality constrained portfolio optimization problem: An experimental evaluation of five multiobjective evolutionary algorithms. Expert Systems with Applications 38 (11), 14208–14217.
[2] Barbosa, H. J., Lemonge, A. C., 2005. A genetic algorithm encoding for a class of cardinality constraints. In: Proceedings of the 2005 Conference on Genetic and Evolutionary Computation. GECCO '05. ACM, New York, NY, USA, pp. 1193–1200.
[3] Chang, T.-J., Yang, S.-C., Chang, K.-J., 2009. Portfolio optimization problems in different risk measures using genetic algorithm. Expert Systems with Applications 36 (7), 10529–10537.
[4] Chiranjeevi, C., Sastry, V. N., 2007. Multi objective portfolio optimization models and its solution using genetic algorithms. Computational Intelligence and Multimedia Applications, International Conference on 1, 453–457.
[5] Deb, K., Pratap, A., Agarwal, S., Meyarivan, T., 2002. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation 6 (2), 182–197.
[6] Deb, K., Steuer, R. E., Tewari, R., Tewari, R., 2011. Bi-objective portfolio optimization using a customized hybrid NSGA-II procedure. In: EMO. pp. 358–373.
[7] Durillo, J., Nebro, A., Alba, E., July 2010. The jMetal framework for multi-objective optimization: Design and architecture. In: CEC 2010. Vol. 5467 of Lecture Notes in Computer Science. Springer Berlin / Heidelberg, Barcelona, Spain, pp. 4138–4325.
[8] Garcia, S., Quintana, D., Galvan, I. M., Isasi, P., 2011. Portfolio optimization using SPEA2 with resampling. In: Proceedings of the 12th International Conference on Intelligent Data Engineering and Automated Learning. IDEAL'11. Springer-Verlag, Berlin, Heidelberg, pp. 127–134.
[9] Markowitz, H. M., 1959. Portfolio Selection: Efficient Diversification of Investments. John Wiley & Sons.
[10] Idzorek, T. M., 2006. Developing robust asset allocations. Tech. rep.
[11] Kukkonen, S., Lampinen, J., 2005. GDE3: The third evolution step of generalized differential evolution. In: IEEE Congress on Evolutionary Computation (CEC'2005). pp. 443–450.
[12] Markowitz, H., 1952. Portfolio selection. The Journal of Finance 7 (1), 77–91.
[13] Nebro, A., Durillo, J., García-Nieto, J., Coello Coello, C., Luna, F., Alba, E., 2009. SMPSO: A new PSO-based metaheuristic for multi-objective optimization.
In: 2009 IEEE Symposium on Computational Intelligence in Multicriteria Decision-Making (MCDM 2009). IEEE Press, pp. 66–73.
[14] Perret-Gentil, C., Victoria-Feser, M.-P., Apr. 2005. Robust mean-variance portfolio selection. FAME Research Paper Series, International Center for Financial Asset Management and Engineering.
[15] Pflug, G., Wozabal, D., 2007. Ambiguity in portfolio selection. Quantitative Finance 7 (4), 435–442.
[16] Radziukyniene, I., Xilinskas, A., 2008. Evolutionary methods for multi-objective portfolio optimization. In: Ao, S. I., Gelman, L., Hukins, D. W., Hunter, A., Korsunsky, A. M. (Eds.), Proceedings of the World Congress on Engineering 2008, Vol. II, WCE '08, July 2-4, 2008, London, U.K. Lecture Notes in Engineering and Computer Science. International Association of Engineers, Newswood Limited, pp. 1155–1159.
[17] Reyes, M., Coello, C. C., 2005. Improving PSO-based multi-objective optimization using crowding, mutation and ε-dominance. In: Coello, C., Hernández, A., Zitzler, E. (Eds.), Third International Conference on Evolutionary Multi-Criterion Optimization, EMO 2005. Vol. 3410 of LNCS. Springer, pp. 509–519.
[18] Ruppert, D., June 2006. Statistics and finance: An introduction. Journal of the American Statistical Association 101, 849–850.
[19] Shiraishi, H., 2008. Resampling procedure to construct value at risk efficient portfolios for stationary returns of assets. Tech. rep.
[20] Skolpadungket, P., Dahal, K., Harnpornchai, N., 2007. Portfolio optimization using multi-objective genetic algorithms. In: Proceedings of the 2007 IEEE Congress on Evolutionary Computation. pp. 516–523.
[21] Soleimani, H., Golmakani, H. R., Salimi, M. H., April 2009. Markowitz-based portfolio selection with minimum transaction lots, cardinality constraints and regarding sector capitalization using genetic algorithm. Expert Systems with Applications 36, 5058–5063.
[22] Tütüncü, R. H., Koenig, M., 2004. Robust asset allocation. Annals of Operations Research 132 (1-4), 157–187.
[23] Zhu, H., Wang, Y., Wang, K., Chen, Y., 2011. Particle swarm optimization (PSO) for the constrained portfolio optimization problem. Expert Systems with Applications 38 (8), 10161–10169.
[24] Zitzler, E., Laumanns, M., Thiele, L., 2001. SPEA2: Improving the strength Pareto evolutionary algorithm. Tech. Rep. 103, Computer Engineering and Networks Laboratory (TIK), Swiss Federal Institute of Technology (ETH), Zurich, Switzerland.
[25] Zitzler, E., Thiele, L., 1998. An evolutionary algorithm for multiobjective optimization: The strength Pareto approach. Tech. Rep. 43, Computer Engineering and Networks Laboratory (TIK), Swiss Federal Institute of Technology (ETH), Gloriastrasse 35, CH-8092 Zurich, Switzerland. URL citeseer.ist.psu.edu/article/zitzler99evolutionary.html