Dynamic User Profiles for Web Personalization Ahmad Hawalaha,b, Maria Faslia,∗ aSchool of Computer Science and Electronic Engineering, University of Essex Wivenhoe Park, Colchester CO4 3SQ, UK bCollege of Computer Science and Engineering, University of Taibah, Medina, Saudi Arabia Abstract Web personalization systems are used to enhance the user experience by providing tailor-made services based on the user’s interests and preferences which are typically stored in user profiles. For such systems to remain effective, the profiles need to be able to adapt and reflect the users’ changing behaviour. In this paper, we introduce a set of methods designed to capture and track user interests and maintain dynamic user profiles within a personalization system. User interests are represented as ontological concepts which are constructed by mapping web pages visited by a user to a reference ontology and are subsequently used to learn short-term and long-term interests. A multi-agent system facilitates and coordinates the capture, storage, management and adaptation of user interests. We propose a search system that utilizes our dynamic user profile to provide a personalized search experience. We present a series of experiments that show how our system can effectively model a dynamic user profile and is capable of learning and adapting to different user browsing behaviours. Keywords: dynamic user profile, modelling user behaviour, web personalization, ontology, multi-agent systems 1. Introduction The dramatic growth of information on the WWW has inadvertently led to information overload and hence finding a specific piece of information has become difficult and time consuming (Challam et al., 2007). Web personalization systems have emerged in recent years in order to deal with this problem aiming to provide a personalized experience to users based on their individual preferences, interests and needs. Such systems have been developed for different domains of application (Pignotti et al., 2004; Challam et al., 2007; Sieg et al., 2007; Pan et al., 2007). In e-commerce, such systems have been used to recommend new items and products to users based on their previous purchasing history (Gorgoglione et al., 2006; Huang et al., 2004), while in e-learning, they are used to provide personalized e-learning services (Sun and Xie, 2009; Zhuhadar and Nasraoui, 2008). Personalization systems require and maintain information about users, and this may include demographic data, interests, preferences, and previous history. One of the main challenges in such systems is that user interests, preferences and needs are not fixed, but change over time. If user profiles contain just static information, this eventually leads to constraining the personalization process and recommending irrelevant services and items over time. To overcome this problem, methods are required for learning and understanding different user behaviours, and then adapting the profiles accordingly. ∗Corresponding author Email addresses: ahawalah@taibahu.edu.sa (Ahmad Hawalah), mfasli@essex.ac.uk (Maria Fasli) Preprint submitted to Expert Systems with Applications 3rd December 2014 In this paper, we utilize ontological profiles to capture user interests and provide recommendations based on these. The contribution of our work is threefold. 
Firstly, we introduce two algorithms in order to improve the mapping process between web pages visited by the user that contain implicit information about the user interests and a reference ontology to explicitly represent these interests. Secondly, we introduce novel techniques to construct ontological short-term and long-term profiles that are tailored to the users, and adapt them based on their ongoing behaviour. Thirdly, the methods introduced attempt to recognize and handle potential interest drift and interest shift in the user interests. To demonstrate our work, we introduce a personalization system that consists of three phases. The first phase is the information retrieval phase which involves preparing a reference ontology, collecting user navigation behaviour, and mapping visited web pages to the reference ontology. Indeed, this phase is very important as capturing inaccurate user interests would directly affect the subsequent phases and eventually the personalization performance. In this phase, we utilize two novel algorithms based on our work in (Hawalah and Fasli, 2011a) to improve the mapping process. The second phase is the profile adaptation and learning phase which utilizes previous work in (Hawalah and Fasli, 2011b). This phase plays a major role in our model as it is responsible for learning, adapting and modelling ontological-user profiles. This also includes methods to adapt the ontological profiles to any shift or drift that might occur in the users’ behaviour. This phase makes use of a multi-agent system that coordinates the various processes and ensures that the user profile remains up-to-date. In the last phase, a re-ranking search system is introduced that utilizes the dynamic user profile to provide a personalized search experience. The re-ranking search system takes advantage of the user interests to provide more personalized search results. To evaluate this work, we have conducted experiments with users over a period of time to assess the ability and effectiveness of our methods in tracking and adapting to changes in the user behaviour. The rest of the paper is structured as follows. First we discuss related work. Section 3 presents the main architecture for modelling dynamic user profiles which consists of three phases with each phase being dis- cussed in more detail in a subsequent section. We introduce the evaluation in section 4. Section 5 describes the evaluation of the mapping and profile construction methods, while section 6, details the evaluation of the dynamic user profiling methods in the context of their deployment within a personalized search system. The paper ends with the conclusions and pointers to future work. 2. Related Work As information on the WWW continues to proliferate, users find it increasingly difficult and time- consuming to sift through this information. To aid the user in his/her quest for the right information (be it on items, products, movies or articles), recommender and web personalization systems have emerged. Re- commender systems use two broad categories of techniques: content-based and collaborative-filtering (CF) techniques. The first technique views users as individuals. Such systems track the user interests and prefer- ences and create an explicit profile that characterizes the user and any ensuing recommendations are guided by the profile. The second technique does not make use of complex user profiles, instead information in the form of ratings (for instance on a scale from 1-5) for items, products, etc. 
is collated from each user and then similarities between the users and items are calculated and exploited to make new recommendations. Our work in this paper, is related to the first family of techniques used for personalization systems. Personalization systems may rely on different knowledge bases to learn and model user profiles includ- ing taxonomies (Eirinaki et al., 2006; Mooney et al., 1998), flat databases (Wu et al., 2001) and ontologies (Middleton et al., 2004; Weng and Chang, 2008; Felden and Linden, 2007; Liu et al., 2008). Different techniques have been proposed to discover and recommend new items to users based on the modelled pro- files, such as using content-based models (Liu et al., 2008; Mooney and Roy, 2000; Middleton et al., 2004), 2 spreading activation techniques (Blanco-Fernandez et al., 2011; Liang et al., 2008; Gao et al., 2008; Weng and Chang, 2008) and classification techniques (Xu et al., 2008). Ontologies encapsulate knowledge about a domain of application (Razmerita and Lytras, 2008) and as such they provide a highly expressive medium for describing user interests and preferences and rich inter- relations among them. Unlike simple methods of representing information such as weighted keywords and semantic network profiles, an ontology provides a more powerful, deeper and broader concept hierarchy representation for user profiles (Gauch et al., 2007). Using ontologies to model profiles has already been proposed in various applications in the field of information systems in general and personalization in par- ticular. Trajkova and Gauch (2004) and Zhang et al. (2007) for example, introduced a general mechanism for modelling user profiles by implicitly tracking user visited web pages. These visited web pages are then mapped to different concepts in the Open Directory Project (ODP) domain ontology1. In (Weng and Chang, 2008), user browsing and search activities are tracked and processed in order to build user profiles. The user profiles are learnt based on an ontology that consists of a hierarchical representation of different topics. A spreading activation model is then applied using this ontology to provide adequate recommendations to users. Middleton et al. (2004) described two hybrid recommender systems that employ ontological user profiles to recommend appropriate research papers to academic staff and students. The novelty in this work is in the ability of these profiles to infer more preferences based on the collected user data, while it also offers users a visualization of their profiles so that they can explicitly modify them. Sieg et al. (2007) presented an ontological user profile for tracking user behaviour in a web search environment. A spreading activation mechanism was proposed to learn and maintain user interests. User profiles are then used to re-rank the search results based on the users’ current interests. In (Challam et al., 2007), a web search system based on ontological user profiles is proposed. The authors suggest that using contextual information, i.e. the user’s current task, can provide a more effective personalized experience. However, these studies do not focus on the process of learning user behaviour over time; instead they focus on providing accurate search personalization at a particular time. Anand et al. (2007) proposed using an ontology to represent user interests. In this approach, the content descriptor of the items in an ontology is used with user actual ratings to provide a recommendation. 
Some studies have proposed sophisticated approaches for modelling user profiles in a more dynamic way. Grcar et al. (2005) presented a dynamic user profile that tracks implicitly user navigation and consists of short and long-term models. However, this research assumed that all the visited web pages are of interest to the user no matter how long s/he spends reading through them. As all pages are treated the same, the potential strength of interest in various topics is ignored. Another issue with this study is that the short and long-term folders have been limited to predefined sizes (i.e. 5 and 300 respectively). The short-term folder stores the most recent visited web pages, while the long-term folder stores the last viewed web pages. This technique limits the user profile to a small number of interests, and at the same time can suffer from instability due to highly changeable interests. Along the same lines, Li et al. (2007) proposed a short-term model that uses a page-history buffer (PHB) which emulates the functionality of the cache and database disk buffer. Again the size of the buffer is fixed, which leads to the same problems as in (Grcar et al., 2005). Another limitation relates to the representation of the user interests in the profiles as the interest-topic and associated weight. An interest’s weight is represented as the total number of visits for this interest. The model does not make use of a time discount factor, hence, the weight of the interest would remain the same over time. For example, if a user visited pages associated with the topic IPhone 20 times over a period of time, then this number remains the same even if the user has shown no interest in IPhone in the more recent past. 1http://www.dmoz.org/ 3 Cantador et al. (2008) proposed an adaptation strategy for a dynamic user profile. This work deploys a reference ontology that is used to map user preferences and current context to provide personalization. User preferences can be seen as users’ long-term interests, while current user context represents the users’ interests at the current runtime. All user interests are represented as weighted concepts. However, user preferences are stored in a stack history which is limited to a predefined size and as such it would only be able to store a small proportion of user history but not all user preferences. Furthermore, the mechanism for detecting current user context does not consider the user’s previous sessions. For instance, if a user shows interest in a concept Car in many sessions, and shows interest in a concept Football in the last session, both of these concepts would be treated similarly without taking into account that a user might be more interested in Car as s/he has shown an interest in this concept over many sessions. Although the above studies provide different ways to model user profiles, most of them do not deal with the user’s changing interests over time, while others only attempted to model low-level dynamic user profiles. The distinction made in these works between short and long-term interests, and interests at current runtime is often blurred and most such approaches treat users as having fixed size interests. 3. Capturing and Modelling Dynamic User Profiles In this section, and following on from the identification of a number of drawbacks in existing works in personalization, we propose a dynamic user profiling approach. 
In developing our research, we have taken into account a number of factors and desirable properties for personalization systems: • Users are reluctant to provide information about their interests and preferences explicitly through completing questionnaires, etc. (Montaner, 2001). Hence, a flexible personalization system should try and capture interests through tracking user behaviours implicitly without user intervention, but as accurately as possible. • In order to be able to provide advanced services to users, their interests and preferences should be captured and stored in an appropriate format that would enable further processing to be performed and useful information to be extracted. The use of ontologies has been shown to provide a signific- ant improvement in the performance of personalization systems (Challam et al., 2007; Trajkova and Gauch, 2004; Middleton et al., 2004). To this end, we have decided to deploy ontologies for repres- enting user interests to support more effective, and semantically-driven exploration of the profile and hence richer recommendations. • User interests very rarely remain static, they constantly change and evolve over time. Most existing personalization systems do not deal with this challenge in capturing user behaviour or they do not do so in a way that looks at the user behaviour in a holistic way. Our aim is to develop profile methods that should be able to track the change in the user interests including any interest-shift or drift and adapt accordingly. • As discussed in the previous section, user interests can be distinguished into short-term, which are the user current interests and can be highly changeable, and long-term, which tend to be more stable over time. Such a distinction of user interests is useful to capture and it provides an extra dimension in trying to understand the individual user and his/her needs. However, this distinction is very often blurred in existing approaches or not captured at all. We aim to capture this aspect of user interests in our system in a clear manner. 4 User Browsing Activities Data processing A multi-agent system (Learning and adaptation) Reference Ontology User ontological profile FeedBack Personalized Services Data source Personalized system Mapping process Create an instance IR phase Learning and adaptation phase Personalization system phase Figure 1: The phases and architecture of the personalization system. • Typically work in personalization is developed to address specific domain problems. We aim to develop models and methods that will have wider applicability and can be integrated and deployed to provide a range of personalization services. Building upon these, we propose a novel personalization system that consists of three phases as illus- trated in Figure 1: • The Information Retrieval phase. • The Learning and Adaptation phase. • The Personalization phase. 3.1. Information Retrieval (IR) Phase The aim of this phase is to collect user browsing behaviour to discover their interests. Personalization is aimed at enhancing the process of retrieving information as a user receives tailored contents to his/her interests, but if the system collects the wrong interests, then it would provide inaccurate and ineffective services. The tracking and discovery of user interests is done in three stages. 3.1.1. Stage One: Tracking User Browsing Behaviour Since collecting data explicitly adds more of a burden on users (Montaner, 2001), we aim at collecting such behaviour unobtrusively. 
The data that we need to observe in this system is the visited websites, contents of each web page, timestamp which denotes the date/time of the visit and finally the duration of 5 the visit. The Browsing Activities component in Figure 1 is used to collect and then store this information in a log file. This component records all the visited web pages W = {W1,W2, ...,Wn} during the user browsing sessions and fetches all the textual contents txi for each Wi ∈ W . For each visited web page Wi, the timestamp q is recorded as well as the duration k of the user’s visit to the web page. This raw information is stored in a Raw log file which is then used by the Data-processing component. First, all the noise in txi such as HTML tags is removed2. This component then tokenizes all the txi for each Wi to discover terms ti. Once all the txi are cleaned and tokenized, it is important to reduce the dimensionality of terms by removing all unnecessary ones which have low discriminating values as well as reducing their ambiguity. Hence, we first remove all the common terms such as ‘and’, ‘or’ and ‘the’ (stop words) using a stop list. Subsequently, we apply the Porter stemming algorithm (Porter, 1997) on all the ti in order to return each term to its stem (e.g. computer to compute). The outcome of this phase is a P-log file that can be processed further. 3.1.2. Stage Two: Reference Ontology Preparation Ontologies comprise rich knowledge representation structures and their use has been shown to provide a significant improvement in the performance of personalization systems (Challam et al., 2007; Trajkova and Gauch, 2004; Middleton et al., 2004). In the proposed system, an ontology plays a significant role in modelling dynamic user profiles. A reference (or domain) ontology describes a particular domain of application and is usually modelled in a hierarchical way in which super-classes are linked to sub-classes. Each concept in an ontology is typically associated with a document that explains and represents it. Unlike flat representations, a reference ontology provides a richer representation of information in that semantic and structural relationships are defined explicitly. We use a reference ontology for two purposes: firstly, to identify user interests based on the visited web pages, and secondly, to represent user interests as ontological concepts. Let O be a reference ontology that represents a domain and C a set of concepts that belongs to O. R = {r1, ...rn} is a set of relations that links two concepts. A reference ontology might have different types of relations such as r1: ‘is-a’, r2: ‘has-a’, etc. As we propose a content-based personalization system, each ci ∈ C in O is associated with a document di that represents the ci. We first convert all the terms in all the documents associated with concepts in the reference ontology to vectors using the TF ∗ IDF classifier. That is, each term tj ∈ di where di → ci (associated to ci) in O is given a weight value w that measures the importance of the term tj within a document di. Using the TF ∗ IDF , this weight can be computed by dividing the number of occurrences of term tj by the total number of terms | t | in a document di. Then the inverse document frequency (IDF) which represents the general importance of the term is calculated by dividing the total number of documents D in O (| D |) by the total number of documents di that contain the term tj in O | tj . di ∈ D |. Finally, each term in the ontology is given a weight from 0 to 1. 
w_{t_j} = tf_{j,i} * idf_j,    where   idf_j = log( |D| / |{d_i ∈ D : t_j ∈ d_i}| )     (1)

2 Although works have shown that HTML structure can be used as part of the indexation process, in this phase of our work, we do not make use of such information.

Although the vector space model is one of the simplest classification methods, it has one drawback: it distinguishes between terms or vectors that have the same root. Words such as ‘play’, ‘plays’ and ‘played’ are processed as different words. This makes the classifier less effective as the dimensionality of the terms increases. In order to avoid this, and similarly to the first stage, we remove stop words and apply the Porter stemming algorithm (Porter, 1997) to remove term suffixes and return each term in the ontology to its stem. Once all the terms in a reference ontology are converted to vectors, the ontology will be ready to be used to discover user interests from user visited web pages.

3.1.3. Stage Three: Mapping Documents to a Reference Ontology

Once the term weights are calculated for each term in the ontology, any vector similarity method can be used to map visited web pages to appropriate concepts (or classes) in the reference ontology. In this paper, the well-known cosine similarity algorithm (Baeza-Yates and Ribeiro-Neto, 2011) is applied to classify web pages to the right concepts. Cosine similarity is used to ascertain the similarity between two vectors and measures the cosine of the angle between them – in our case, the vector representing the web page visited against the vector representing a specific concept in the reference ontology.

sim_Cosine(d_1, d_2) = ( Σ_{i=1..n} w_{i1} * w_{i2} ) / ( sqrt(Σ_{i=1..n} w_{i1}^2) * sqrt(Σ_{i=1..n} w_{i2}^2) )     (2)

Several studies have tried to improve the accuracy of the mapping process. One such approach is by using an ontology with a limited number of levels. Liu et al. (2002) for instance, map user interests to a reference ontology that is limited to just two levels. Studies such as (Speretta and Gauch, 2005; Chen et al., 2002; Trajkova and Gauch, 2004) use a three-level ontology to map user interests. When it comes to retrieval precision, using a limited number of levels has been reported to improve the overall accuracy as it reduces the number of concepts in a reference ontology, which in turn minimizes the error of mapping contents to wrong concepts. However, levels two or three of an ontology can be too general and broad to represent actual user interests, and hence this risks no specific interests being recognized. For instance, in the Open Directory Project (ODP) ontology (http://www.dmoz.org/), level two Computers/Programming or level three Computers/Programming/Languages are too general to represent interests such as the Java or C# programming languages. Another method to improve the mapping performance is by adding a pre-defined percentage of each sub-class’s weight to its super-class. The idea is that if a user is interested in a particular class, then s/he is also interested in its super-class. For example, if a user is interested in football, then s/he also has some interest in its super-class which is likely to be Sport. Middleton et al. (2004) and Kim et al. (2007) implemented this idea by adding an extra 50% of each interest to its super-class. The process is then repeated until the root class. Although this method showed an improvement over the original cosine similarity, the accumulation behaviour that it induces puts more emphasis on the top level classes which are too general to represent actual user interests, while middle and low level classes receive less attention.
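Before discussing further refinements to this mapping, the baseline weighting and mapping steps (Equations 1 and 2) can be sketched as follows. This is an illustration only; the function and variable names (tfidf_vectors, map_page, etc.) are our own and not part of the system described in the paper, which additionally applies the GEW and 3C refinements introduced next.

import math
from collections import Counter

def tfidf_vectors(concept_docs):
    # concept_docs: {concept_id: list of stemmed terms of the document attached to c_i}
    N = len(concept_docs)
    df = Counter(t for terms in concept_docs.values() for t in set(terms))
    idf = {t: math.log(N / df[t]) for t in df}                          # idf part of Equation 1
    vectors = {}
    for cid, terms in concept_docs.items():
        tf = Counter(terms)
        vectors[cid] = {t: (tf[t] / len(terms)) * idf[t] for t in tf}   # tf * idf
    return vectors, idf

def cosine(v1, v2):
    # Equation 2: cosine of the angle between two sparse term-weight vectors.
    num = sum(v1[t] * v2[t] for t in set(v1) & set(v2))
    n1 = math.sqrt(sum(w * w for w in v1.values()))
    n2 = math.sqrt(sum(w * w for w in v2.values()))
    return num / (n1 * n2) if n1 and n2 else 0.0

def map_page(page_terms, vectors, idf):
    # Represent a visited page in the same space and rank ontology concepts by similarity.
    tf = Counter(page_terms)
    page_vec = {t: (tf[t] / len(page_terms)) * idf.get(t, 0.0) for t in tf}
    return sorted(((cid, cosine(page_vec, cv)) for cid, cv in vectors.items()),
                  key=lambda x: x[1], reverse=True)

In the full system the ranked list produced by such a mapping step is then adjusted by the GEW and 3C methods described below.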
Daoud et al. (2008) concur that representing interests with level two of an ontology is too general, while a leaf node representation is too detailed. They suggested that the most relevant concept is the one that has the greatest number of dependencies. To this end, they proposed a sub-concept aggregation scheme whose main goal is to represent all user interests with level three of an ontology. The weight of a level-three concept in this system is calculated by adding the weights of all its sub-concepts. However, representing user interests by level three in an ontology is restrictive as ontologies can have different structure and complexity. Some ontologies may have few levels, while others may extend to several; the ODP has seven levels. To address the limitations of the aforementioned methods, we need a method that improves the mapping process and at the same time maintains both general and specific interests. To this end, we introduce two new methods called Gradual Extra Weight (GEW) and Contextual Concept Clustering (3C) (Hawalah and Fasli, 2011a).

The Gradual Extra Weight Algorithm (GEW). The GEW algorithm is based on the idea that if a user is interested in a particular class, then s/he also has some interest in its super-class. However, we make no assumption about the number of levels an ontology has. Moreover, we do not assign a specific percentage of a sub-class to be added to its super-class such as in (Middleton et al., 2004) and (Kim et al., 2007). We propose an auto-tuning mechanism in which the percentage value of each sub-concept that is added to its super-concept is tailored to different levels of the ontology. In this mechanism, we assume that the concepts deeper in the ontology express more specific interests than a general one which is expressed with a concept higher up in the ontology (e.g. Java as opposed to Programming Languages). Therefore, the concepts in the lower levels would add more weight to their super-concepts than those in higher levels. Equation 3 shows how the extra percentage (EP) for each level is calculated in order to transfer weight:

EP = ( c_i.Current_level * α ) / O.Max_levels     (3)

Where c_i.Current_level is the level of a sub-concept, α is a parameter that controls how much weight is transferred from one concept to another (i.e. α = 0 means no weight is transferred, while α = 1 means the maximum weight is transferred) and O.Max_levels is the total number of levels in the ontology O. Using the total number of levels in an ontology in equation 3 allows us to gradually reduce the percentage of weight that is transferred as we move up towards the root and keep a balance between general and specific interests. The default α value is 0.5, as in this case the maximum weight transferred from the concepts is 50% from the leaf concept to its super-concept and this percentage is reduced as we move up towards the root. However, α is not fixed but can be changed based on the ontology and the number of levels at hand. When changing the α value, extra care is required to avoid selecting too high or too low a value, as the former might cause inflation at top levels, while the latter would make the GEW ineffective as the weight which is transferred from a concept to its super-concept would be too small and insignificant. Finally, the GEW algorithm is applied to each visited web page after the cosine similarity is computed between this web page and all concepts in an ontology.
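As an illustration of how the weight transfer in Equation 3 could be realised, the sketch below propagates extra weight from the most similar concepts up to the root. It is a simplified reading of the algorithm in Figure 2: the level and parent dictionaries, and the default values assumed for α and the top-γ cut-off, are ours and not prescribed by the paper.

def gradual_extra_weight(sims, level, parent, max_levels, alpha=0.5, gamma=20):
    # sims:   {concept_id: cosine similarity with the visited page}
    # level:  {concept_id: depth of the concept in the ontology (root = 1)}
    # parent: {concept_id: super-concept id, or None for the root}
    weights = dict(sims)
    top = sorted(sims, key=sims.get, reverse=True)[:gamma]    # restrict GEW to the top-γ concepts
    for cid in top:
        current, w = cid, weights[cid]
        while parent.get(current) is not None:
            ep = (level[current] * alpha) / max_levels        # Equation 3
            sup = parent[current]
            weights[sup] = weights.get(sup, 0.0) + ep * w     # add the extra weight to the super-concept
            current, w = sup, weights[sup]                    # repeat with the updated super-concept
    return weights

With the values used in the worked example of Figure 3 (a concept at level 6 of a 6-level ontology, similarity 0.3 and α = 0.5), EP is 0.5, so 0.15 is added to the super-concept, whose own similarity of 0.12 becomes 0.27.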
However, one problem that might arise at this stage is that the ontology might have a huge number of concepts and hence, it would be too expensive to apply the GEW to all concepts. To reduce the computational load, we only apply the GEW on just the top γ results that have the highest similarity weights. Determining the best value for γ allows us to remove the concepts that do not add any value when applying the GEW algorithm4. The GEW algorithm is explained in more detail in Figure 2. To demonstrate how the GEW algorithm works, consider a reference ontology as in Figure 3 with 6 levels. For each visited web page that is mapped to a reference ontology using the cosine similarity measure, all the concepts in this ontology would be assigned a similarity weight ∈ [0,1]. The GEW algorithm starts from the highest similar concept which in this example is the concept with ID 6 that has a weight of 0.3. The extra weight based on the level of this concept which is level 6 is computed as in the dashed-box on the left. The EP which is 0.5 is then multiplied by the original weight of this concept which is 0.3. The total extra weight (0.15) will be added to the upper-concept. As the similarity of this upper-concept and the visited web page is 0.12, this weight will be modified by adding the extra 0.15 to it, so the new weight is 0.27. This process is then repeated until the root node is reached. 4In the evaluation section, we describe how the value for γ can be identified experimentally. 8 Figure 2: The GEW algorithm. 9 Figure 3: An example of the GEW algorithm. The Contextual Concept Clustering Algorithm (3C). Although the GEW method may improve the pro- cess of mapping web pages to concepts, correct mapping cannot be guaranteed as not all the visited web pages usually have good representative contents or web pages may contain a lot of noise. We aim to further improve the mapping process by taking advantage of having a log file that stores the user’s browsing behaviour. When users browse, they tend to visit several web pages that represent one interest during the same session. All current works in the field of personalization that map each visited web page to a reference ontology, do so in isolation and without any consideration about other web pages that have been visited during the same session. However, a set of visited web pages in one session might hold important hidden information that can be exploited to improve the mapping process. Assume that a user has visited 3 web pages about the XML topic, and some of these pages contain contents that are not very representative. A traditional mapping approach would map each web page to a concept from an ontology separately. As some of these web pages have poor representative contents, such an approach might assign wrong concepts to them. But if we grouped similar web pages together and applied the mapping process on this group, the risk of mapping these web pages to the wrong concept could potentially be reduced. This is because the web pages with good representative contents can “hold up” the pages with poor contents and eventually help the mapping process to map all the pages in this group (and even the ones with noise) to the correct concept in the reference ontology. Another reason why grouping web pages may be effective is that such a grouping may in essence better encapsulate the context of the user behaviour. Consider a user that visits a web page that provides a comparison between the C# and Java languages and hence contains two concepts. 
An assignment to either of these would be correct from the mapping process point of view. However, if this user is mainly interested in the C# language and is just curious to find out why it might be better than Java, s/he would not be interested in receiving recommendations about Java. Although classifying such a web page to either 10 C# or Java would be conceptually correct, it would not be appropriate for this specific user who is mainly interested in C#. Therefore, taking into account the user session (group of pages) would help to classify them to the most appropriate concept; in our example the C# concept. To address these issues, we introduce the Contextual Concept Clustering (3C) algorithm which aims to group related web pages into clusters which are then assigned to a particular concept in a reference ontology. The 3C algorithm works as follows (Figure 4): 1) For each browsing session, the GEW algorithm is applied in order to find the top γ similarities between each visited web page and concepts in the reference ontology. 2) After applying the GEW, the highest β similar concepts from the ontology to each web page are used as possible candidates to represent it. We need to consider all the top β results because in some cases the concept with the highest similarity value does not give a correct representation of a web page. This could be due to poor or irrelevant information in a web page, or it could be simply due to a high level of noise. 3) The “context” is then exploited by selecting the common concept that is associated with different web pages. This common concept eventually is selected to represent a web page. If there is no common concept, the concept with the highest similarity weight is selected. The β value can be identified by conducting an experiment to find the most appropriate value (see evaluation section). To illustrate the 3C algorithm, consider a user who has visited six web pages as in Figure 5. First the GEW algorithm is applied to all these web pages and then the top β similar concepts are selected as possible candidates to represent this concept. In this example, each web page is linked to 5 concepts (concepts are represented as circles and unique id numbers from the ontology). The 3C mechanism clusters these top similar concepts into groups where each group contains a set of web pages that have the same common candidate concept mapped to them. Hence there are three groups. Group 1 consists of four web pages that share concept 25 as common. The weight of this group can then be computed as the total of all the semantic similarity weights between the common concept 25, and all other web pages that are members of group 1 namely web pages 1, 2, 3 and 4. However, web pages can also be mapped at the same time to other candidate concepts that might be clustered in different groups. For example, web pages 1 and 2 can be assigned to either group 1 or 2, while web page 3 can be assigned to either group 1 or 3. If there are more than one groups, the page is assigned to the group with the highest total weight. So, in this example web pages 1, 2, 3 and 4 would be represented by concept 25, while web pages 5 and 6 would be represented by concept 21. 3.2. Learning and Adaptation Phase Typically, user interests change over time as they may lose/acquire interest in an object. If a user profile is not able to detect and adapt to such changes, this will inevitably affect the personalization performance. 
According to (Cantador et al., 2008; Picault and Ribiere, 2008), a number of heuristics can be used when modelling a user profile to recognize patterns in browsing behaviour: • A concept may occur in a short period with a very low occurrence. This should not be considered to be a concept of interest to a user. For example, a user might open a web page that does not meet his needs so he closes it immediately. • A concept may occur in a short period and its occurrence is very high. This concept should be considered as a short-term interest, and it should be removed once the user shows no more interest in it. For example, a user may be planning a day out in London which would involve looking for train tickets, museum and theater tickets. The London interest will exist for a while, but once the trip is taken, the user would have no further interest in London. 11 Figure 4: The 3C algorithm. 12 Figure 5: An example of the 3C algorithm. • A concept may occur over a long period and its occurrence is high. This concept should be considered as a short-term interest for a start, and once it is confirmed with time, it should become a long-term interest. For instance, a new Computer Science student will become interested in topics related to his subject. These topics should be treated as long-term interests. • A concept may occur in a short or long period, and its occurrence is very high. Then, this concept disappears, but after a while the user shows interest in this concept again. Such behaviour is difficult to recognize as a user might shift or drift his interests to new or urgent ones, and once he is done with these new interests, he returns to the previous ones. An example would be when a user is interested in reading topics related to his work, but if he decided to take a holiday and travel abroad his interests would probably shift to new topics related to the visiting country. However, once back from the holiday, the user would likely return to his earlier work-related topics. We make use of these heuristics in our learning and adaptation mechanism and to this end we introduce a multi-agent system which deals with the complex processes of recognizing, adding, updating and deleting user interests. We make use of a multi-agent system architecture for two main reasons. Firstly, the use of agents with specific responsibilities within the learning and adaptation process enables us to delineate func- tionalities and interactions between them clearly; the flow and processing of information are clearly defined. This also facilitates the use and processing of the same piece of information from multiple perspectives as different agents in the system may have access to the same information, but they may be viewing it from a different perspective and dealing with it in a different way. The second main reason is that of flexibility and extensibility. The various agents within the system can be upgraded and modified on their own without affecting the rest of the agents. Hence, we can easily make changes to the behaviour of the agents and their underlying algorithms in the future separately. This also enables us to test the system in a more systematic way and reconfigure algorithms in isolation and in conjunction with each other. The multi-agent system handles the users’ browsing behaviour in general, and the change in their in- terests in particular and it consists of three layers to provide dynamic learning and adaptation: the session based, short-term and long-term layer (Figure 6). 
Each layer consists of one or more agents that are re- sponsible for a (set of) task(s). 13 Figure 6: A multi-agent system for modelling dynamic user profiles. 3.2.1. Session-Based Layer In order to learn user profiles, we need to observe, process and then learn from user browsing beha- viours for each session, hence this process is activated after every session. The learning and the adaptation processes consist of adding, forgetting and deleting concepts from a user profile, computing the interest weights for the visited concepts and preparing a list of candidate concepts to be analyzed by the short-term and long-term layers. This layer consists of a number of agents each responsible for one or more tasks. Session-based Agent: This is a key agent in the multi-agent system as it is responsible for four main tasks: (i) collecting data from the P-log file; (ii) communicating with other agents to calculate the latest interest weight for all collected concepts from the P-log; (iii) storing all the processed concepts and their interest weights in a session-based profile (SBP); (iv) communicating with the Short-term and Long-term Agents to enable them to discover short-term and long-term interests respectively. Once a session finishes, the data in the P-log file are processed and stored in a SBP. One critical differ- ence between the P-log and the SBP is that the P-log consists of URLs, timestamps, durations and concepts that represent these URLs, while in the SBP there are no URLs but just concepts and their associated at- tributes. These attributes include: (i) The Status which can be: positive status such as Browsed-concept or Confirmed-concept, or negative status such as Forgotten-concept or Deleted-concept. (ii) The Relev- ance size which refers to how much a concept is relevant to a user interest. The relevance size can be measured based on user feedback about each concept. If the user feedback is positive, then the relevance size increases, but if it is negative, then it decreases. (iii) The third attribute is the Frecency which repres- ents the interest weight that indicates how much a user is interested in a concept. (iv) Finally, the Frequency attribute represents how many URLs from the P-log file have been mapped to a particular concept. The Session-based Agent starts by extracting from the P-log file the concepts that represent a web page, the Status and Frequency attributes. Next it communicates with the other agents in order to set the Relevance size and compute the Frecency weight for each concept. We borrow the term frecency, which is a combination of frequency and recency, from the field of web browsing to represent the weight of an interest (MozillaWiki, 2010). This process is performed daily and hence all new interests as well as existing ones 14 are processed daily to adapt them to any change in user behaviour. Insert Agent: This agent is responsible for processing all concepts with positive status that are received from the Session-based Agent. The positive status may be associated with different events where each event has different value to represent its importance. The different events in the system (and depending on the domain of application these may vary), need to be assigned numerical values in order to illustrate their difference and their relative importance. We identify two events: Browsed-concept and Confirmed-concept. The Browsed-concept event is the default one that is assigned to any concept that is browsed by a user. 
If such a concept appears in two subsequent sessions, it is assigned the Confirmed-concept event. This event has a higher weight than the Browsed-concept one, as concepts appearing in more than one session would likely be of more interest to users than those that appear in just one. Other events could be added based on the domain of application, such as Purchased-item, Printed-concept or Bookmarked-concept. Whenever a web page is browsed that has a classified concept, the interest weight (i.e. frecency) for that concept is accumulated using equation 4. The frecency value is computed based on two factors: the event and the duration that are associated with the web page. For instance, if the concept has been browsed, the event weight is assigned to be 100, while if it is a Confirmed-concept, then we assign a weight of 150. We could have assigned different values for the weights at this stage (e.g. 1 and 1.5 respectively); the important thing is to be able to differentiate between these two events and to reflect the difference in their importance. The second factor, the duration, is the time spent on a web page and as it is in seconds, it is normalized. We consider the duration when computing the interest weight for a concept since time spent on web pages has been shown by studies including (Konstan et al., 1997; Claypool et al., 2001) to be a good indicator of user interest.

Fre = Σ_{c_i ∈ C} ( c_i.k / 100 ) * E.weight,    where E.weight = 100 if Browsed-concept, 150 if Confirmed-concept     (4)

Where c_i.k is the time a user spends reading c_i and E.weight is a weight that is added based on the event of the concept. The interest value of a concept c_i is computed by summing up the frecency values of all web pages mapped to this concept. After assigning a frecency weight to each new concept, the Insert Agent adds this concept and its properties to the SBP.

Forget Agent: This agent handles the behaviour that occurs when a user loses interest in a concept. An effective personalization system should be able to adapt to this change in user behaviour by forgetting such an interest from the user profile. However, such a concept should not be deleted immediately, but instead it should be forgotten gradually until the concept is confirmed as no longer being of interest to the user. Of course, not all interests are forgotten at the same pace. To account for this, the pace of the forgetting process depends on three factors. The first is the Relevance size that is associated with each concept and is an indicator of the user’s strength of interest. The Relevance size is increased/decreased depending on whether a user shows positive feedback towards a concept (i.e. by either searching for or browsing web pages related to this concept) or negative. The second factor is the recency of a concept as old concepts are forgotten faster than new ones. The final factor is related to the introduction of new interests to a user profile. If a user has started to lose his/her interest in a concept, and at the same time started to acquire new interests, then this behaviour might indicate that there is a drift in the user interests. In related work, and in order to encode the assumption that the user’s preferences gradually decay as time passes, Sugiyama et al. (2004) introduced a forgetting factor for interests in the user’s profile:

e^( −(log 2 / hl) * (d − d_{t_init}) )     (5)

Where d_{t_init} is the day when term t occurs initially and d is the number of days after the occurrence on day d_{t_init}.
The parameter hl is used to indicate the half-life span and is set to 7 under the assumption that user interests reduce to 1/2 in a week (Sugiyama et al., 2004). In order to apply a forgetting process to our profiles, we borrow the idea of the time-based forgetting factor from (Sugiyama et al., 2004). In contrast to using a predefined half-life span, our forgetting factor is more dynamic in order to account for the three aforementioned factors and hence the new frecency of a concept (New_Fre) is calculated as follows:

c_i.New_Fre = c_i.Old_Fre * e^( −(log 2 / (c_i.Relevance_size * 2)) * ((G_m − G_l) + #new_interests) )     (6)

Where c_i.Relevance_size is the relevance size of a concept c_i, G_m is today’s day, G_l is the day of the last occurrence and #new_interests is the number of new interests that are introduced on G_m. Unlike (Sugiyama et al., 2004) who divided log 2 by a predefined half-life, in this equation we divide log 2 by (c_i.Relevance_size * 2) and use the recency and the number of new concepts, so the larger the value of these elements, the faster the forgetting process. This makes our forgetting factor more dynamic and we take into account individual user behaviour as users, for instance, may acquire new interests at different rates. In Equation 6, we rely on the concept’s Relevance size, recency and the number of new concepts introduced to the user profile to handle the user’s interest change and interest drift. As explained in (Godoy and Amandi, 2009), interest drift occurs when an interest experiences a gradual change. The proposed mechanism handles such behaviour by initiating the forgetting process once a user shows no more interest in a concept. The pace of forgetting further depends on whether a user has become interested in new concepts or not. The more new concepts in the SBP, the faster the forgetting process. This is particularly important in coping with interest drift as the introduction of new concepts at the time of losing interest in older concepts indicates implicitly that there is a drift in the user interests. Hence, older concepts should be forgotten and replaced by new ones. Finally, it is important to note that the forgetting process for a concept might be paused when a user shows interest again. The Relevance size would also increase to reflect this change. But if a user continues not to show interest in the concept, it will be gradually forgotten until the Relevance size reaches zero. Then this concept is sent to the Delete Agent.

Delete Agent: This agent manages the gradual deletion of a concept from a user profile. The difference in the operation of the Forget and Delete Agents is that a concept with a Forgotten status can still be an interesting concept to a user and appear in his/her profile, while a concept with a Deleted status means that this concept is no longer interesting and it should not appear in the profile. When a concept is passed on to the Delete Agent, it is removed much faster based on the time of the last appearance of the concept, until the weight reaches a predefined threshold and it is then removed altogether. We use the next equation to compute the new frecency of a Deleted concept:

c_i.New_Fre = c_i.Old_Fre / (G_m − G_l)     (7)

Where G_m is the current date and G_l is the date of the last occurrence of c_i. Hence, if there has been no interest shown for long periods of time, the frecency will be reduced accordingly. Once all concepts are processed, the Session-based Agent stores the final results into the SBP.
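Before these results are passed on to the short-term and long-term layers, the following sketch summarises how the three weight updates described above (Equations 4, 6 and 7) could be implemented. It is illustrative only: the function names and the day-count arguments are our own, and Equation 6 is coded following the reconstruction given above.

import math

EVENT_WEIGHT = {'Browsed-concept': 100, 'Confirmed-concept': 150}

def frecency(page_visits):
    # page_visits: [(duration_in_seconds, event)] for the pages mapped to one concept (Eq. 4).
    return sum((duration / 100.0) * EVENT_WEIGHT[event] for duration, event in page_visits)

def forget(old_fre, relevance_size, days_since_last_seen, new_interests_today):
    # Forget Agent (Eq. 6): a larger Relevance size slows the decay, while recency and the
    # number of newly introduced interests (a sign of interest drift) speed it up.
    exponent = -(math.log(2) / (relevance_size * 2)) * (days_since_last_seen + new_interests_today)
    return old_fre * math.exp(exponent)

def delete_decay(old_fre, days_since_last_seen):
    # Delete Agent (Eq. 7): a much faster reduction once a concept is no longer of interest.
    return old_fre / max(days_since_last_seen, 1)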
Sub- sequently these are used to discover short-term and long-term user interests, therefore, the last task of the Session-based Agent is to communicate with the Short-term and Long-term Agents. 16 In this work, we distinguish between long-term and short-term interests as they are different in nature. Discovering the short-term interests helps a personalization system to shed light on users’ current interests which are likely to change frequently. To illustrate, a user might be interested in a holiday in France. After returning from the holiday, the user would most likely lose interest in France and therefore this should con- stitute a short-term interest. A typical personalization system would provide relative personalized contents to this user that match his/her interest in France. The long-term interests on the other hand, can play an important role in providing more effective personalization services to users that match their long-term pref- erences. If the same user has a long-term interest in science fiction movies, this can be combined with the short-term one by suggesting new science fiction movies and cinemas while in France. However, the process of discovering short and long-term interests presents challenges. The main one arises from the question of how long (short) is long (short). Users may exhibit very different browsing behaviours: some may change their interests frequently, while others may rarely change. Therefore, identi- fying length/duration is subjective to each user. We address this issue in our work by incorporating long and short-term layers that consist of adaptable mechanisms to discover short and long-term interests specific to each user. 3.2.2. Short-term Layer Our goal in this section is to develop a new mechanism to learn the user’s short-term interests and to be able to adapt to the various browsing behaviours listed in the beginning of section 3.2. This layer includes two components: the short-term profile (STP) that is used for storing all the concepts identified as short- term interests, and the short-term agent (STA) that is responsible for tasks such as discovering, maintaining and storing short-term interests in the STP. The STA discovers short-term interests by examining the concepts stored in the SBP. As each concept in the SBP is associated with a frecency that represents its importance, the STA uses these values to determine which of these concepts are short-term interests. One way to discover short-term interests is to pre-define a threshold so all concepts with weights above it are selected as short-term interests. This method has been adopted by studies such as (Li et al., 2007) and (Grcar et al., 2005). However, users are different and using the same threshold for every user would very likely result in modelling inaccurate interests. We overcome this problem by developing a self-tuning mechanism to calculate a threshold which is ad- aptable to different users as well as to changes in each user’s browsing behaviour. To identify this threshold, we first observe the frecency average of a user’s browsing behaviour for each session s. Normally, when browsing, the user would navigate to both interesting and uninteresting pages. The uninteresting topics would be assigned low frecency weights, while interesting ones would be assigned high weights (i.e. based on the time spent on each page). 
The threshold is computed as the ratio of the total frecency values of the concepts visited in each session to the total number of concepts that have been visited in all sessions:

Threshold = ( Σ_{all s ∈ u_i} Frecency values ) / | Total number of concepts |     (8)

Hence, all the concepts in the SBP that have weights above this threshold are stored in the STP. As this threshold adapts to various browsing behaviours, each user would have a different threshold. Relying solely on a threshold is not adequate, as we also need to control the size of the STP. This is because users might be interested in a large number of topics on one day, but on another they may have fewer interesting topics or even no interests at all. Unlike some studies such as (Li et al., 2007) and (Grcar et al., 2005) which rely on fixed size profiles to store interests, in this layer we propose a mechanism to adjust the STP size by either expanding or limiting it to reflect the actual user interests. Taking into account the user’s browsing behaviour, we compare the average of the current session to the average of the last session and if it is larger/smaller, then the size of the STP is increased/decreased accordingly, otherwise it remains the same. We use the average of each session as it provides an indication of a user’s behaviour. To prevent the STP size from being increased indefinitely or diminished we set a maximum/minimum STP size which can be identified based on the domain of application at hand. The frecency average for a session is computed by:

Fre_average = ( Σ_{c ∈ s_i} Frecency values ) / | Total number of concepts in s_i |     (9)

Where Σ_{c ∈ s_i} Frecency values is the total of the frecency values of all concepts c that have been visited during session s_i. Finally, the STA also needs to be able to deal with the problem of interest shift. An interest shift can be defined as an abrupt change in user interests (Godoy and Amandi, 2009). Unlike interest drift which is a gradual interest change and which has been dealt with by the Forget Agent in section 3.2.1, the interest shift is more difficult to recognize as it happens suddenly (Gorgoglione et al., 2006). Users can sometimes become suddenly interested in some topics and then once these topics lose their importance, they simply abandon them. For instance, when the Football World Cup is about to start, sports fans would become interested in this and might not exhibit interest in other sports. But when the event finishes, they would likely shift their interest to other sports, or new events. As it is not easy to recognize such shifts at the time of their occurrence, personalization systems can suffer from lower performance. The STA handles interest shift by adding any new concept with a frecency weight above the computed threshold to the STP. However, if the STP is already full, then the concept would not be added. To deal with this problem, the STA replaces the concept that has the lowest frecency weight in the STP with this new concept, even if the new concept has a lower frecency weight than the replaced one. We replace existing concepts in the STP with new ones when it is full in order to discover a user’s interest shift. If this new concept appears again in the subsequent session, the Insert Agent in the session-based layer would process it and it would be treated similarly to the other concepts, while if the concept does not get confirmed, then it will be forgotten. In this way, our system would recognize and handle the interest shift when it happens. We call this the Replacement-task as older concepts are replaced by new ones.
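The sketch below illustrates how the adaptive threshold (Equation 8) and the Replacement-task could be combined when updating the STP. It is a simplification under our own assumptions: the profiles are plain dictionaries, and the maximum STP size is passed in as a parameter rather than being derived from the session averages (Equation 9) as described above.

def short_term_threshold(sessions):
    # sessions: one list of frecency values per browsing session (Eq. 8).
    values = [fre for session in sessions for fre in session]
    return sum(values) / len(values) if values else 0.0

def update_stp(stp, sbp, threshold, max_size):
    # stp, sbp: {concept_id: frecency}. Candidates are SBP concepts above the adaptive threshold.
    for cid, fre in sbp.items():
        if fre <= threshold:
            continue
        if cid in stp or len(stp) < max_size:
            stp[cid] = fre
        else:
            # Replacement-task: evict the weakest short-term interest so that a possible
            # interest shift can still be captured, even if the newcomer currently has a
            # lower frecency than the evicted concept.
            weakest = min(stp, key=stp.get)
            del stp[weakest]
            stp[cid] = fre
    return stp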
3.2.3. Long-term Layer

The Long-term layer is responsible for learning, recognizing and storing long-term interests that are confirmed with time and consists of the long-term profile (LTP), which stores all interests that are recognized as long-term ones, and the long-term agent (LTA), whose task is to recognize, maintain and store long-term interests in the LTP. Long-term interests are different in nature from short-term ones as they do not change as frequently, but instead they appear more regularly over a longer period in a user profile. For instance, users who work as programmers might have a long-term interest in programming languages which might appear regularly in their profiles. Unlike the short-term interests which rely mostly on recency when computing their interest weights, the long-term interests should rely more on the frequency of occurrence. The LTA computes the frequency weights for each concept in the SBP using the following equation:

c_i.FW = (N_occ * N_day) * ( (G_m − G_f) / P.log_age )     (10)

Where c_i.FW is the frequency weight of a concept c_i, N_occ is the number of times the concept c_i occurs in the P-log file, N_day is the number of days that the concept c_i occurs in, G_m is the day when the process is launched (i.e. today’s day), G_f is the day of the first appearance of the concept c_i and finally, P.log_age is the age of the P-log. By using the age of the P-log we can detect which concepts have been appearing for longer. In this equation, the more frequent the occurrences of a concept, the higher the weight assigned to it. Moreover, the interests that occur over a longer period receive higher weights than those that occur in a short period as they seem to be more stable. Unlike the STA which computes the short-term interests’ frecency after each session, the LTA computes the frequency weights for long-term interests periodically, such as once or twice a month based on the domain of application. As long-term interests do not change as frequently as short-term ones, there is no need to compute them as frequently. Once the frequency weights for all concepts in the SBP are computed, the LTA needs to identify the interests that should be treated as long-term interests. We introduce a new mechanism to compute a threshold so that all concepts above it are stored in the LTP. Similar to the short-term layer, we propose a dynamic threshold that is based on each user’s browsing behaviour. We first compute the frequency weight for each concept using equation 10 and then we compute the standard deviation over all the concepts in the SBP:

σ = sqrt( (1/N) * Σ_{i=1..N} (c_i.FW − AV)^2 )     (11)

Where σ is the standard deviation, N is the total number of concepts, c_i.FW is the frequency weight of a concept c_i, and AV is the average of the frequency weights of all concepts. The threshold is then:

Threshold = σ + ( Σ_{i=1..N} c_i.FW ) / N     (12)

Where Σ_{i=1..N} c_i.FW is the total of all the frequency weights for all concepts. Finally, all the concepts with frequency weights higher than the computed threshold would be stored in the LTP (see the sketch below). Again, instead of predetermining a threshold and fixing it for all users at the same level, we utilize the information on the frequency weights of the user’s interests and their standard deviation and this enables us to tailor the threshold to the specific user’s pattern of behaviour. Long-term interests are more stable than short-term ones, so the process of discovering them is much easier.
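As an illustration, the following sketch computes the frequency weights of Equation 10 and the user-specific long-term threshold of Equations 11 and 12. Again this is a minimal, assumed implementation: day counts are passed as plain integers and the dictionaries are our own simplification of the SBP and LTP.

import math

def frequency_weight(n_occ, n_day, days_since_first_seen, plog_age_days):
    # Eq. 10: frequent concepts that have been present for a larger share of the
    # P-log's lifetime receive higher weights.
    return (n_occ * n_day) * (days_since_first_seen / plog_age_days)

def long_term_threshold(frequency_weights):
    # Eqs. 11-12: mean of all frequency weights plus one standard deviation.
    values = list(frequency_weights.values())
    mean = sum(values) / len(values)
    sigma = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return mean + sigma

def select_long_term(frequency_weights):
    # All concepts above the user-specific threshold are promoted to the LTP.
    threshold = long_term_threshold(frequency_weights)
    return {cid: fw for cid, fw in frequency_weights.items() if fw > threshold}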
It is not appropriate to use the same technique to discover the short-term interests, as they are unstable, change frequently, and are also subject to interest shift or drift. When finding the short-term interests, we should not use the standard deviation with the frecency average as a threshold: we do not want to limit the threshold to selecting just the outstanding concepts, but rather need to select all the concepts with frecency weights larger than the frecency average. This is because, when a user changes his/her interest, the new interest would likely have a low frecency weight in the beginning, so using just the frecency average allows us to capture such interests. However, in order to deal with the problem of selecting too many short-term interests, in the short-term layer we limit the size of the STP by setting a maximum number of interests. For this reason, we do not have to set a specific size for the long-term layer, as using the long-term threshold limits the long-term interests to only the most frequent and outstanding ones.

Figure 7: The re-ranking process for the personalized web search system.

3.3. Personalization System Phase

We integrate the work described so far with an experimental web search personalization system in the third and final phase. As part of this, we also introduce an experimental re-ranking approach. We aim to demonstrate that our approach to dynamic user profiles can effectively capture user interests as well as the changes in user behaviour, including interest shift and drift, and can be used as part of a personalization system to provide more effective and accurate services to users. However, the methods that we have developed are generic enough to be integrated and utilized by diverse personalization systems. In the proposed system, which is illustrated in Figure 7, when a user submits a query, two processes take place simultaneously:

1. The query is mapped to the ontological user profile to identify the most similar concept that represents it. To this end, the cosine similarity is used to compute the similarity between the query and the documents of all concepts in the user profile. We also pre-identify a threshold, so that the most similar concept which is above this threshold is selected to represent the query.

2. The query is passed to any search engine (e.g. Google or Yahoo) in order to retrieve initial search results. The contents of each retrieved result are then extracted and stored in a separate document using various text analysis techniques such as the ones that have been used in section 3.1.1.

We subsequently compute the similarity between each retrieved search document and the contents of the interesting concept that represents the initial user query. Finally, we re-rank all the retrieved search results in descending order and present them to the user. The ranking approach is described in Figure 8; a simplified sketch of this re-ranking step is also shown below.
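As a minimal sketch of the two steps above, the Python code below assumes that each profile concept and each retrieved result has already been reduced to a plain-text document; the use of scikit-learn and the results[i]["text"] field are our own illustrative assumptions, not part of the system described in this paper.

# Illustrative sketch of query-to-concept mapping and re-ranking.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rerank(query, concept_docs, results, sim_threshold=0.1):
    """Map the query to its most similar profile concept, then order the
    retrieved results by similarity to that concept's document."""
    names = list(concept_docs)
    docs = [concept_docs[n] for n in names]
    vec = TfidfVectorizer(stop_words="english")
    matrix = vec.fit_transform(docs + [query])
    sims = cosine_similarity(matrix[len(docs)], matrix[:len(docs)]).ravel()
    best = int(sims.argmax())
    if sims[best] < sim_threshold:
        return results                      # no concept matches: keep order
    res_matrix = vec.transform([docs[best]] + [r["text"] for r in results])
    scores = cosine_similarity(res_matrix[0], res_matrix[1:]).ravel()
    return [r for _, r in sorted(zip(scores, results), key=lambda sr: -sr[0])]

In practice the similarity threshold, like the other parameters in our model, would be tuned to the application at hand.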
4. Evaluation

In general, the evaluation of recommender and web personalization systems, or of systems that deploy user profiles, is recognized to be difficult and expensive (Yang et al., 2005). As such systems may have different purposes and tend to be very complex, making direct comparisons is difficult. The evaluation strategies commonly used can be divided into three main categories (Shani and Gunawardana, 2011): offline evaluations, user-centered studies and online evaluations.

In offline evaluations, existing real datasets or artificial datasets are employed to assess the performance of the system. The use of such datasets minimizes the cost of the evaluation and can facilitate reuse. Although there are datasets such as MovieLens (http://MovieLens.umn.edu) that can be used for collaborative filtering systems, rich user datasets that can be used for personalization systems are almost non-existent.

Figure 8: The re-ranking algorithm.

In user-centered evaluations, real users are recruited in order to collate realistic behaviours to test the performance of the system (Shani and Gunawardana, 2011). Such evaluations involve three steps: (i) recruiting users; (ii) creating a set of tasks to be completed by the users; and (iii) collecting and analyzing user interactions and behaviours. Such evaluations are expensive to design and carry out and inevitably tend to be limited in scale, as it is very difficult and costly to recruit large numbers of users. In particular, the tasks need to be carefully drawn so that they are able to provide appropriate and relevant results. To be able to draw meaningful comparisons among the systems being tested, users need to be able to repeat the same or very similar tasks, hence these usually take the form of simulated work tasks. Borlund (2003) made a number of recommendations for creating tasks for such evaluations. User evaluations are able to provide a better indication of the effectiveness, accuracy and efficiency of personalization systems than artificial datasets, as systems are tested in realistic settings. They can also provide answers to a wide range of questions directly from the users and therefore afford a more well-rounded evaluation.

In online evaluations, a system is essentially deployed and evaluated in a real environment. This is considered to be the best way to evaluate any system, as in a real environment a system can be thoroughly tested against real users' behaviours (Montaner, 2001; Shani and Gunawardana, 2011). However, online evaluations present a number of challenges. A system needs to be sufficiently well developed to make it available to real users, which is expensive in terms of time and effort. The experimental algorithms deployed may have low performance, and hence exposing them as part of an online evaluation may affect the user experience; if the evaluation is carried out by a company, the users' trust in the company, product, etc. may be negatively affected.

To evaluate our work, we have undertaken a user-centered evaluation that enabled us to collect real user data which we subsequently analyzed to examine the performance of the various aspects of our work. We have split the evaluation into two parts: (i) evaluation of the mapping and user profile modelling methods; (ii) evaluation of the methods within a search personalization system.

5. Evaluation I: Mapping & User Profile Modelling Methods

In setting up our user-centered study, we chose the domain of computers as the application domain. Our methods require the use of a reference ontology. Though we do not make any assumptions about the ontology used, for the purposes of this study we first created a reference ontology using information extracted from the computer category of the Open Directory Project (ODP) ontology. We used the ODP as it is considered to be the largest manually constructed directory consisting of websites categorized in related topics. The computer directory contains more than 110,000 websites categorized in more than 7,000 categories. In order to train the classifier, all the websites under each category were fetched.
Furthermore, all the contents of all websites were extracted and combined into one document. Hence, each category c_i ∈ O is represented by one document d_i that includes the textual contents t_x from the web pages W associated with c_i. All the non-semantically related classes (e.g. alphabetical order) were removed from this ontology. This resulted in a total of 4,116 categories and about 100,000 training websites whose contents were extracted and combined into 4,116 documents in our reference ontology. For each document, we apply the dimensionality reduction techniques explained in section 3.1.2 to minimize the number of terms. After applying these techniques, a vector space model is used to convert the terms in these documents to vectors. For this purpose, the TF-IDF classifier is used, which gives each term t_j ∈ d_i a weight from 0 to 1 using equation 1.

Next, five scenarios were created to address different user browsing behaviours with respect to the heuristics that were discussed in section 3.2. The aim of these scenarios is to test the learning and adaptation ensuing from our methods when different behaviours occur, and in particular interest shift or drift. For each scenario, a set of tasks was created to simulate an environment where users interact with the system as they perform their own tasks. We selected 35 random topics from the computer ontology (e.g. XML, C#, Computer security) as the basis for the experimental tasks. A total of 90 unique tasks and sub-tasks were created based on these 35 topics. The tasks take the form of finding a specific piece of information or writing a short paragraph. Finally, 30 participants with a computer science background were invited to participate in our experiment over a 20 day period. The participants were divided so that each scenario was undertaken by six participants. These scenarios are described in more detail in appendix A.

To track user browsing behaviour, the Firefox browser was used with a modified add-on component called Meetimer (https://addons.mozilla.org/en-US/firefox/addon/meetimer/). Meetimer was modified to implicitly track user browsing behaviour including all the visited web pages, timestamps and durations. An SQLite database was used to store all the user sessions. After 20 days, 30 log files from 30 different users were collected. These 30 users together surfed 9,360 web pages, at an average of 15.6 web pages per day per user. We then processed the collected data to create processed log files (P-log files) by fetching all the visited web pages and extracting all their contents. Moreover, we removed all the HTML tags and other noise data such as advertisements. All the stop words were removed and then the Porter stemming algorithm was applied (Porter, 1997) to reduce the terms' dimensionality. In the next stage, we analyzed the collected data by first examining the performance of the GEW and 3C algorithms, and then the performance of the learning and adaptation processes.

5.1. Evaluation of the GEW and 3C Algorithms

To evaluate the proposed mapping algorithms GEW and 3C, we conducted three experiments to analyze different aspects that impact on their overall performance.

Tuning of the GEW Algorithm. In this experiment, we examine the best values for the parameters α and γ. The former controls how much weight is transferred from one concept to its upper-concept. The latter refers to the number of most similar concepts to a web page from the reference ontology that GEW is applied to.
Determining the best γ value allows us to remove the concepts that add no value when applying the GEW algorithm. We have experimented with α ∈ {0.1, ..., 1} and γ ∈ {5, 10, 20, 50, 70, 100, n}, where n refers to considering all concepts when applying the GEW. We examine all of these settings to measure the accuracy of mapping the web pages visited by users to the reference ontology using equation 13. The accuracy is computed as the ratio of the number of web pages mapped to the correct concept from the reference ontology to the total number of web pages visited by all users:

\text{Accuracy} = \frac{|\text{Positively mapped web pages}|}{|\text{All web pages}|} \quad (13)

Figure 9 shows the accuracy percentages for applying the GEW on all the web pages visited by all users in all scenarios using different α and γ values. It can be clearly seen that the accuracy of applying the GEW algorithm to all the similar concepts is relatively low, and that the accuracy increases as γ decreases. In particular, when applying the GEW to the top 5, 10 and 20 concepts, the accuracy of the mapping process shows considerable improvement. Surprisingly, the best accuracy result is achieved when γ = 10, not when γ = 5. A possible explanation is that the top 10 results most similar to a web page could hold the most important concepts that are likely to be related to this page, while considering just the top 5 results may miss some very important concepts. When it comes to α, interestingly, the accuracy results improve when α is close to 0.5, and clearly tend to decrease when α tends to its maximum or minimum. This is because when α = 1, most of the inaccurately mapped concepts are too general to represent user interests, while when α = 0.1 most web pages were mapped to leaf nodes of the reference ontology which happen to be too specific to represent these interests. For α = 0.5, a balance between general and specific interests is maintained. Based on these results, we assign α = 0.5 and γ = 10 for all subsequent experiments.

Figure 9: The accuracy results of applying the GEW algorithm on all web pages visited by all users with different α and γ values.

Tuning the 3C Algorithm. To apply the 3C algorithm, we need to set β, which acts as a threshold and determines the number of concepts that are used as candidates. To identify the best β value, we apply the 3C algorithm to the top 40, 30, 20, 10 and 5 concepts. For each value, the 3C algorithm attempts to cluster web pages with common candidate concepts together in order to identify the most accurate concept that correctly represents them. Table 1 shows the accuracy results using the measure in equation 13.

β value     5        10       20       30       40
Accuracy    0.7805   0.7448   0.731    0.7108   0.669

Table 1: Accuracy results of the 3C algorithm with different β values.

Surprisingly, it can be seen from Table 1 that all thresholds achieved very close accuracy results. The top 5 threshold achieved the highest accuracy of 0.7805, while the top 10 and top 20 performed slightly worse with 0.7448 and 0.731 respectively. This result may be interpreted as suggesting that the most appropriate concept for a web page is likely to be among the top 5 concepts most similar to that page. Finally, these results support our hypothesis that mapping web pages as groups can enhance the mapping process, as the accuracy of applying just the GEW is 0.6779, while by using both the GEW and 3C algorithms the accuracy increased to 0.7805. Based on these results, we assign β = 5 in all subsequent experiments. A compact sketch of the accuracy measure and of this kind of parameter sweep is given below.
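The following Python sketch shows equation 13 and the form of grid search used to tune α and γ; map_page stands in for the GEW/3C mapping of a single page and is a hypothetical placeholder, not a function of the actual system.

# Illustrative sketch of the accuracy measure (Eq. 13) and parameter tuning.
def mapping_accuracy(mapped, gold):
    """Eq. 13: fraction of pages mapped to the concept judged correct."""
    correct = sum(1 for page, concept in mapped.items()
                  if gold.get(page) == concept)
    return correct / len(gold)

def tune_gew(pages, gold, map_page):
    """Grid search over alpha and gamma; map_page(page, alpha, gamma) is a
    placeholder for mapping a single page under the given parameters."""
    best_acc, best_params = 0.0, None
    for alpha in [round(0.1 * i, 1) for i in range(1, 11)]:
        for gamma in (5, 10, 20, 50, 70, 100, None):   # None: all concepts
            mapped = {p: map_page(p, alpha, gamma) for p in pages}
            acc = mapping_accuracy(mapped, gold)
            if acc > best_acc:
                best_acc, best_params = acc, (alpha, gamma)
    return best_params, best_acc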
Comparing Against Other Mapping Techniques. So far, we have examined different values to find the best possible configuration for the GEW and 3C algorithms. However, it is essential to examine how these algorithms perform against other techniques in the literature. Hence, in this experiment, we compare our mapping algorithms (GEW and 3C) to three different mapping techniques from the literature. The first is the original cosine similarity (OCS), which computes the similarity between each collected URL and the documents in the reference ontology; each URL is then represented by the most similar concept from the ODP. The second technique, suggested by (Middleton et al., 2004) and (Kim et al., 2007), adds 50% of each sub-concept's weight to its super-concept and repeats this process until reaching the root. The third technique is the Sub-class Aggregation Scheme (SAS) proposed in (Daoud et al., 2008), where user interests are represented using level three of the ontology; the weight for each level-three concept is computed by adding the weights of its sub-concepts. In this experiment, each web page visited by each user is mapped by applying all four techniques. Table 2 shows the overall average accuracy of each mapping technique for all the users.

Method        Accuracy
GEW & 3C      0.7805
OCS           0.4489
Adding 50%    0.6158
SAS           0.4223

Table 2: Comparison of four different mapping approaches.

According to Table 2, the OCS and SAS performed poorly: the former achieved an accuracy of 0.4489 while the latter achieved a slightly lower result of just 0.4223. It is somewhat surprising that the OCS performed slightly better than the SAS. The poor performance of the SAS can be attributed to the fact that all the visited web pages are mapped only to concepts in level three of the reference ontology, which results in web pages being mapped to concepts that are too general to represent them. On the other hand, adding 50% of each sub-class's weight to its super-class shows considerable improvement in the mapping, with an accuracy of 0.6158. This could be due to the main underlying assumption that if a user is interested in any concept, then s/he is likely to have some interest in its super-concept. Although this technique improved the overall mapping accuracy, the percentage added to the super-classes is very high (50%). As a result, 44% of all the incorrectly mapped web pages were mapped to level 1 and 2 super-concepts that are too general and broad to represent user interests. The overall accuracy of the GEW and 3C algorithms shows a marked improvement over all other techniques, achieving an accuracy of 0.7805. This improvement shows that GEW and 3C can overcome some of the drawbacks of the other techniques as well as keep a balance between the general and the more specific interests.

5.2. Evaluation of the Dynamic User Profiling Methods

Here we aim to evaluate the performance of the proposed dynamic user profile and the effectiveness of the learning and adaptation processes. Hence, we first evaluate different features of our system, and then compare its overall performance against other modelling approaches in the literature.
For all of these evaluations, we measure the precision of learning and adapting the user profiles for each day of the experiment duration (20 days) using equation 14:

\text{Precision} = \frac{|\text{Correctly learned and adapted interests}|}{|\text{Total number of actual interests}|} \quad (14)

The precision in this equation is computed as the ratio of correctly modelled interests in the user profiles to the actual number of interests in the scenarios.

Modelling Dynamic User Profiles. First, we examine the effectiveness of the Replacement-task feature, which was proposed in section 3.2.2 and is aimed at dealing with the interest-shift behaviour that might occur when a user browses the Internet. We compare how our system performs when the Replacement-task is enabled and when it is disabled, for each day of the experiment and for each user profile. As Figure 10 illustrates, when the feature is enabled the average precision for all user profiles is 0.7649, while when it is disabled it drops slightly to 0.7228. In more detail, we can see that on some days, such as days 5 and 11, a sudden drop in the precision is observed both when the Replacement-task is enabled and when it is disabled. These drops were somewhat expected as on these days the user profiles experienced a shift of interests. Interestingly, when the Replacement-task is enabled, the precision results during the interest-shifts are considerably better than when it is disabled. Moreover, after each interest-shift, such as after days 5 and 11, the system with the feature enabled managed to learn and adapt to the changes in user behaviour quickly, as the precision consistently improved. These results show that the Replacement-task feature can detect and handle the interest-shift quite well.

Figure 10: Average precision for each user over the 20 day period.

Next, we evaluate the performance of different threshold techniques that are used to discover short-term and long-term interests. First, we evaluate three such techniques for discovering short-term interests:

• Fixed number of interests (Fixed): A fixed number of topics in the SBP is selected and considered as short-term interests. This threshold technique has been applied by studies such as (Grcar et al., 2005) and (Li et al., 2007). In this experiment, we select just the top 5 topics from the SBP, as suggested in (Grcar et al., 2005).

• Largest gap (LGap): In this technique, we first find the largest gap among the frecency values that are associated with the topics in the SBP, and then consider all the topics with frecency values larger than the largest gap as short-term interests.

• Our approach: We employ the system described with all the aforementioned features and use the average frecency and the Replacement-task feature to select the correct short-term interests for a user.

Figure 11: Comparison of three techniques for discovering short-term interests.

Figure 11 shows the overall comparison between the above three threshold techniques over 20 days. The LGap technique achieved the lowest precision, at an average of 0.649. A possible explanation for this is that when a new interesting concept is browsed by a user, its interest weight would not be high at first. As a result, when the largest gap is computed, only the existing interests with high interest weights are considered as short-term interests, while the newly browsed interests are not (as their interest weights are likely to be less than the largest gap value).
In addition, when using LGap, sharp drops in the precision can be observed after days 5 and 11. These drops occur because of the interest-shift and drift in the users' interests in the tested scenarios, which cannot be captured by LGap.

When it comes to the Fixed number of interests threshold, surprisingly the performance is slightly better than LGap, with an average precision of 0.683. What is interesting is that this technique is not affected by the interest-shift and drift that occur in the tested scenarios. On days 10, 12 and 13 this technique managed to achieve the best precision in comparison to the others, which shows that it can handle all the interest-shifts and drifts quite well. Since the five most recent concepts are considered as short-term interests on each day, regardless of their interest weights, it manages to capture the interest-shift and drift; when these occur, the new interests are simply added to the user's short-term profile. Although this technique has no problem with the interest-shift and drift behaviours, it has two limitations that affect its overall precision. The first limitation is that the size of the short-term profile is limited to just five interests per session. Hence, if a user has more than five interests at the same time, this technique would eliminate and ignore some of the interesting concepts. The second limitation is that the interest weights of concepts are not considered when selecting short-term interests. As a result, any concept, even an uninteresting one, would be considered interesting to the user by this technique.

The last threshold technique is our approach of using the average frecency with the Replacement-task feature. As can be seen from Figure 11, our technique achieves an average precision of 0.764, which is a significant improvement over the other techniques. This result demonstrates that our approach provides an effective mechanism to learn and adapt to the changes in user browsing behaviours.

When it comes to discovering long-term interests, we evaluate and compare the performance of the following four techniques:

• Fixed number of interests (Fixed): We apply the threshold technique in (Grcar et al., 2005) where the short-term profile stores the most recently visited pages, while the long-term one stores the last viewed ones.

• Largest gap (LGap): In this technique, we first find the largest gap among the frequency values that are associated with the topics in the SBP, and then consider all the topics with frequency values larger than the largest gap as long-term interests.

• Our technique: This is in essence our proposed approach of modelling a dynamic user profile by using the frequency average and the standard deviation to discover long-term interests.

• Just the frequency average (JFA): In this technique, we deploy our proposed approach of modelling a dynamic user profile to learn user interests by using just the average frequency as a threshold.

Unlike short-term interests, which are discovered and learnt daily, long-term interests are discovered and learnt at the end of the experiment period (i.e. day 20). In Table 3, we show just the average precision of modelling long-term interests for all the user profiles for the four different threshold techniques.

Threshold technique    Fixed   LGap   Our technique   JFA
Average precision      0.411   0.86   0.91            0.84

Table 3: Comparison of four techniques for discovering long-term interests.

The results show that the Fixed technique recorded the lowest performance.
This is because in (Grcar et al., 2005), all the last visited concepts are considered as long-term interests, which means that uninteresting and short-term interesting concepts are also treated as long-term interests. On the other hand, LGap surprisingly achieved a relatively good average precision of 0.86, which is better than both the Fixed technique and JFA. This result can be attributed to the fact that most of the long-term interests in the user profiles have frequency weights that are much higher than those of the short-term interests. However, a limitation of this technique, which may not be obvious in our scenarios, arises when the list of concepts has more than one largest gap, or when the largest gap is between the uninteresting and short-term concepts rather than between short-term and long-term interests. In such cases, the largest gap would likely capture inaccurate long-term interests from the list. The results show that using both the frequency average and the standard deviation instead of just the frequency average provides better average precision, 0.91 and 0.84 respectively. This is because when using both the frequency average and the standard deviation, only the long-term interests with the highest frequency weights are selected as long-term interests, and not the short-term interests with high interest weights, as happens when using just the frequency average as a threshold.

Comparison Against Other State-of-the-Art Works. In this experiment, we compare the performance of our modelling system against two other approaches, namely (Grcar et al., 2005) and (Li et al., 2007) (both were discussed in section 2). We test all five scenarios, which are described in more detail in appendix A.

Figure 12: Comparison of three user profile modelling approaches.

Figure 12 shows the comparison results between our work and these two approaches for all users over 20 days. It can be clearly seen that the approach of (Grcar et al., 2005) achieved the lowest result, with an average precision of 0.41 over all days. This is due to a number of limitations. The main limitation is that, as all visited web pages are treated as interesting, even uninteresting ones, inevitably inaccurate interests are captured and modelled. Another problem is that the user's short-term profile is limited to a small number of interests, which means that when this gets full, many interesting concepts are left out. This is further accentuated by the fact that abandoned interests are not removed from the profile. The approach of (Li et al., 2007) performs slightly better, with an average precision over all days of 0.534. This result is due to different limitations. One problem is associated with the representation of the user profile, where the interests are represented as concepts while the interest weights are the total number of visits. Even if the user has stopped visiting such interests, the number of visits does not change, and so the interests are not removed from the profile. Another issue is that the size of the buffer in the short-term profile is fixed, which leads to the same problem of capturing just a limited number of interests as in (Grcar et al., 2005). As a consequence, this approach could not model correct user profiles when a user is interested in a relatively large number of topics, such as on day 7 in scenario 3. Our approach, on the other hand, managed to achieve a good average precision of about 0.75 over all days.
This result is due to the fact that user profiles are not limited to a fixed number of interests: the size adapts in the case of short-term profiles and is unlimited in long-term profiles. This performance can also be attributed to the ability to recognize the shifts and drifts in user interests as well as to forget uninteresting concepts more effectively. Overall, although all the designed scenarios are different and exhibit behaviours that are highly changeable, our dynamic user profile managed to maintain good precision overall for all users during all days. This shows that our proposed methods are able to effectively capture user interests and adapt to different browsing behaviours.

6. Evaluation II: Dynamic User Profiling for Personalized Search

We wish to evaluate the performance of the learning and adaptation processes in the context of a personalization system. To this end, we implemented the search personalization system proposed in section 3.3, which utilizes the dynamic user profiles to provide re-ranked search results based on user interests. We used the same reference ontology that we created for the first evaluation and used Google (http://www.google.com) as the search engine to retrieve the search results. As before, the Firefox browser with the Meetimer component was used to collect user behaviour. We invited 30 users to participate in an experiment over six days. All users were postgraduate computer science students, and hence expert computer users. They were asked to search and browse for interesting programming languages during the first three days. During the following three days, users were asked to search and browse for any interesting topic in the field of computer science. All users were asked to search and browse for at least three different interests during each of the three days. We do this so that we can examine the shift of interests and how our models adapt to this. For each day, users were asked to write down all the visited interests in order to compare them to the modelled dynamic user profiles. After six days, we collected the 30 log files for all the 30 participants. Table 4 summarizes the collected information.

Number of queries submitted by all users     723
Number of interests visited by all users     216
Number of web pages visited by all users     1,290

Table 4: Collected data from all users.

In the next step, we retrieved all the visited web pages for all users and extracted all their contents. All the web pages that belong to one interest were aggregated into one document. Then, in order to examine the personalization performance of our proposed search system, we asked each user to select three queries related to programming languages visited during the first three days, and another three queries related to the interests visited during the last three days of the experiment. For each of these queries, we retrieved the top 30 results from the Google search engine. We then randomized all of these results and asked users to select just the results that were relevant to each query. Users were free to click and navigate to any result for more detail. Finally, we recorded the users' responses with regard to all of their queries.

6.1. Evaluating the Accuracy of Modelling User Profiles

We are interested in examining the accuracy of modelling user profiles and, in particular, the quality of identifying and mapping visited web pages to concepts from the reference ontology.
As users were asked to write down all the topics that they visited each day, we compare these recorded topics to the concepts that were discovered in the dynamic user profiles using equation 13. Figure 13 shows the average mapping accuracy for all users on each day of the experiment for two approaches. The first approach is our system with the GEW and 3C algorithms, while the second is our system with the simple cosine similarity algorithm. According to Figure 13, the average accuracy for all users when our system uses the GEW and 3C algorithms is better than with the simple cosine similarity algorithm (0.791 and 0.602 respectively). In more detail, the accuracy of our model with the GEW and 3C on day 1 was just 0.52. This can be attributed to the fact that the dynamic user profiles had just started to learn and adapt to the users' browsing behaviour, which means that they only held a small amount of information. However, accuracy improved significantly on days 2 and 3, reaching 0.877 and 0.91 respectively. This is because after three days of learning, the profiles had adapted effectively to the new interests, which in turn improved the accuracy results. On day 4, unsurprisingly, a sudden decrease occurred in the overall accuracy, from 0.91 to 0.692. This drop was expected as users started to shift their browsing behaviour from navigating through programming languages to navigating through their own interests. Once users started to navigate more interests and browse more web pages on days 5 and 6, the overall accuracy experienced a considerable improvement, from 0.692 on day 4 to 0.89 on day 6. This demonstrates that a sudden change of user interests lowers the performance of the dynamic user profile. Nevertheless, our system shows it can recover quickly as it effectively adapted to the abrupt changes. Finally, it is clear from this figure that using the GEW and 3C outperforms the cosine similarity on all days.

Figure 13: Average accuracy for each user profile over the six day period.

6.2. Evaluating the Re-ranking Approach

In this experiment, our goal is to evaluate the performance of the developed search personalization system that utilizes dynamic user profiles to re-rank search results. An effective re-ranking approach should place the most relevant results at the top of the retrieved results. For this experiment, we evaluate the quality of our proposed system by comparing its performance against three other systems. The first is the Google original ranking (GOR), which retrieves results for each query without any further processing. The second is our proposed search system (PSS), which utilizes dynamic user profiles. The third computes the simple cosine similarity (SCS) between a user query and all the results retrieved from the Google search engine and then re-ranks these results based on their weights. The last system uses the rich concept descriptions in the reference ontology (RCD). We use a range of metrics to measure the quality of these re-ranking methods.
The Average Ranking Error (ARE) is calculated as follows:

ARE(u_q) = \frac{\sum_{p \in R} p.position}{|\text{Total number of } p|} \quad (15)

where u_q is a query sent by a user u, R is the set of all the retrieved results for the query u_q, p.position is the position of a retrieved page p in the ranking list, and |Total number of p| is the total number of results that are selected by users as related to their queries. This metric, which has been used in many studies such as (Li et al., 2007; Dou et al., 2007), has one important advantage in that it computes the error of the ranking order of any system in comparison to the actual ranking provided by users, taking into account the order of all retrieved results. Therefore, for this metric, a smaller error indicates better performance.

Figure 14 shows the ARE for all users for each system. During the first three days, the PSS achieved the best overall performance in terms of ARE over the other three methods, while the RCD achieved the second best. The SCS and GOR, on the other hand, recorded lower results during the first three days. In more detail, the performance of our method (PSS) improved steadily from 13.53 on day 1 to 12.92 on day 3. This improvement is due to the fact that during the second and third days, our system managed to learn and collect sufficient information about the users' interests. The RCD, on the other hand, achieved lower performance than the PSS, but at the same time better performance than the SCS. This is because, in this approach, the concepts in the reference ontology are associated with documents that represent these concepts. These documents contributed to the good performance as they hold rich information about the associated concepts, which is used to re-rank the retrieved results. Finally, the GOR system showed the worst performance and was the only approach whose performance kept getting worse during the second and third days. This is because during days 2 and 3, users started to navigate more interests, as well as visiting more web pages regarding the interests that they had visited earlier on day 1. As a result, new concepts were discovered, and more information about each concept was recorded. The emergence of new concepts resulted in the GOR mechanism scoring the lowest ranking averages of about 14.8 and 14.9 on days 2 and 3 respectively, while all the other methods managed to improve their overall performance. On day 4, all four methods experienced a sudden drop in their performance. This trend was expected as on that day users started to shift their interests from navigating topics related to programming languages to topics related to their own interests in the field of computers. Similar to the results recorded on day 1, the PSS recorded better performance than the other methods, and this performance kept improving during days 5 and 6 as a result of collecting more information about user interests.

Figure 14: ARE across all four systems for all users over the six day period.

Next, we look at the performance of all of the tested methods using the precision at N results (P@N), in order to examine the proportion of relevant results ranked within the top N. Unlike the previous metric, the order of the results is not important; what matters is the users' satisfaction with the top N results retrieved for their submitted queries. We also use the Mean Average Precision (MAP) metric to report the mean of the ranking precision for all the queries submitted by all users. An illustrative sketch of these three metrics is given below.
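The following Python sketch shows the three metrics under our own assumptions about how the relevance judgements are stored; in particular, it reads equation 15 as averaging the positions of the results the user marked as relevant, and the data layout (lists of ranked result identifiers and sets of relevant ones) is illustrative only.

# Illustrative sketch of ARE (Eq. 15), P@N and MAP.
def average_ranking_error(relevant_positions):
    """Eq. 15: mean rank of the user-relevant results (lower is better)."""
    return sum(relevant_positions) / len(relevant_positions)

def precision_at_n(ranked, relevant, n):
    """Share of the top-n ranked results that the user marked as relevant."""
    return sum(1 for r in ranked[:n] if r in relevant) / n

def mean_average_precision(queries):
    """MAP over queries; each query supplies (ranked list, relevant set)."""
    aps = []
    for ranked, relevant in queries:
        hits, precisions = 0, []
        for i, r in enumerate(ranked, start=1):
            if r in relevant:
                hits += 1
                precisions.append(hits / i)
        aps.append(sum(precisions) / len(relevant) if relevant else 0.0)
    return sum(aps) / len(aps)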
Figure 15 shows the precision performance of the four methods at different top N cut-offs, namely P@5, P@10, P@15, P@20 and P@25, while Table 5 shows the MAP for these four methods. The results in Figure 15 show that the PSS system achieved the best precision at all P@N cut-offs. RCD and SCS scored the second and third best performances, while GOR achieved the lowest precision. At P@5, the PSS method achieved the best precision result of 0.881, which represents about 12%, 13% and 18% improvement over the RCD, SCS and GOR respectively. This indicates the effectiveness of a system that re-ranks search results based on the proposed method of modelling dynamic user profiles. We also note that the RCD achieved better performance than the SCS and GOR because all the retrieved results were re-ranked based on the information associated with the concepts in the reference ontology. Finally, the SCS and GOR achieved the lowest precision performance at all P@N cut-offs.

Figure 15: P@N results for the four different ranking methods.

When it comes to the MAP metric, the results reflect those of Figure 15. That is, the PSS has the highest MAP of 0.807, the RCD and SCS scored the second and third best MAP (0.748 and 0.73 respectively), while GOR achieved the lowest MAP of 0.677.

Method   MAP
GOR      0.677
PSS      0.807
SCS      0.73
RCD      0.748

Table 5: MAP results for the four ranking methods.

Overall, PSS effectively outperformed the three other methods, and was able to provide more accurate and reliable results when the user profiles held sufficient amounts of information. This demonstrates that the proposed dynamic user profile adapted effectively to the users' interests, and managed to provide an improved personalized search service.

7. Conclusions

In this paper, we presented a dynamic user profile modelling approach for web personalization. The starting point of our work was the identification of a number of issues in existing approaches, namely, the limitations of such systems in dealing with the user's changing interests over time, the lack of a clear delineation of the short-term and long-term aspects of user interests, and weaknesses in modelling dynamic aspects of the user's behaviour. In our work, we aimed to address these issues. Firstly, the developed methods are able to deal with the constant change and transitions in a user's interests, including sudden change. Secondly, we have developed methods that are able to capture to a satisfactory extent the differences in a user's short-term and long-term interests and the transition from one to another. These methods are also able to adapt based on the patterns of behaviour of individual users, and hence we do not apply a one-size-fits-all approach. Finally, the methods developed have wider applicability and can be used in a number of domains and applications in which ongoing user behaviour can be captured and information about user preferences and interests can be extracted and modelled. In more detail, our theoretical contribution is three-fold.
Firstly, we introduce two mapping algorithms to improve the mapping process between web pages visited by the user, which contain implicit information about his/her interests, and a reference ontology that explicitly represents these interests. Secondly, we introduce novel techniques to construct ontological short-term and long-term profiles that are tailored to the users, and to adapt them based on their ongoing behaviour. Thirdly, within the methods for the construction and adaptation of user profiles, we introduce techniques to recognize and handle potential interest drift and interest shift in the user interests. Unlike other approaches in the literature such as (Challam et al., 2007; Zhang et al., 2007; Trajkova and Gauch, 2004), the methods developed are able to handle the dynamic nature of user interests and behaviours that change frequently. Our methods can handle the addition of new interests to a user profile, but also other more involved processes such as updating, gradually forgetting and deleting user interests. We also distinguish between long-term and short-term interests. Unlike approaches that suggest pre-defined thresholds to discover and store such interests (Grcar et al., 2005; Li et al., 2007), we presented mechanisms that are able to adapt to each user according to his/her behaviour. Another contribution is that the proposed methods are flexible and can be imported and used with diverse personalization systems. In this paper, we have illustrated this contribution by developing a re-ranking algorithm that uses our system for modelling dynamic user profiles to provide personalized search results to users.

In order to evaluate our work, we introduced a two-part evaluation. The first part aimed at evaluating the mapping and user profile modelling methods, while the second part aimed at evaluating the methods within a re-ranking personalization system. Both of these user-centered evaluations used the same reference ontology, which was extracted from the ODP. The first evaluation showed that the methods developed are capable of mapping web pages to the reference ontology more accurately than other methods in the literature. In addition, we showed, by examining different user behaviours, that the learning and adaptation strategies are capable of dynamically capturing user interests and forgetting the ones that are no longer of interest. We also demonstrated that our methods are able to deal with the interest-shift and interest-drift behaviours by minimizing their negative impact when modelling user profiles. In the second set of evaluations, we examined our work within a personalized search system. We found that our system achieved higher performance than the other search approaches. Our evaluation results demonstrated that the proposed methods can effectively capture user interests, adapt to the changes occurring in user behaviours and enhance the performance of a personalization system.

Although in this work we have addressed some of the limitations of existing personalization methods and we have shown that our system is effective at learning and adapting dynamic user profiles, our proposed approach has a number of limitations. It is essential for our system to collect sufficient information about each user in order to learn user profiles, and without such information, the system would not be able to provide effective personalization services.
This problem of having just a small amount of information is known as the cold start problem and is typical of content-based personalization systems. In addition, in order for our adaptation methods to work effectively, the user needs to be using the system on an ongoing basis. If the user stops using the system for a period of time and returns after their interests have changed significantly, the system will still be producing recommendations based on older information. As is often the case with content-based systems, our system could suffer from the problem of over-specialisation: as the user profile adapts, the profile information becomes ever more specialised and the system may be unable to make novel suggestions to the user. Another issue is that the effectiveness of the techniques that we have proposed to infer and exploit semantic information from user profiles depends heavily on the richness and the quality of the reference ontology. We assumed the existence of such an ontology, but it is essential to note that the quality of the recommendations will depend on the quality and richness of the underlying reference ontology deployed. Another weakness of our work is that some of the models and mechanisms depend on parameters and settings that need to be pre-optimized. For example, in order to compute the semantic relatedness between ontological concepts, we need to pre-identify the weights of each relation type in an ontology. Similarly, the processes of learning, adapting and exploiting dynamic user profiles have a number of parameters that need to be identified. For our evaluation, such parameters were defined by running experiments using training datasets, where the values and settings that provide the optimum results were selected. However, such a mechanism is slow and requires conducting a large number of experiments. Hence, when our methods are applied in different domains, these settings need to be identified and carefully selected.

Ontological user profiles comprise a rich representation of user interests which can help us to understand user needs in a more effective way and hence deliver better services. There are a number of avenues for future work and extensions that we would like to explore. Currently, although we are using ontologies to model the user profiles, we are not making use of the axioms that may be encoded into an ontology. Such axioms may allow more useful information to be extracted from the user profiles and more complex inferences to be made. We would like to explore how such axioms can play a role in further shaping the user profiles and also as part of the recommendation process.

With the huge growth of social networks, we believe that some of the developed methods can be extended and applied in order to provide social recommendations to users. In particular, we could extend the use of our methods to combine information included in the social network profiles of individual users, but also make use of information in their connections' profiles, in order to understand user preferences and interests better and also identify similarities with connections that can help enrich recommendations further. The latter could also help alleviate the problem of over-specialisation that our profiles may suffer from. In addition, such social network profiles contain multi-modal information which includes tags, videos, pictures, etc.
and another direction that we could extend our work would be toward developing methods that could make use of different types of information. In order to deal with the cold start issue, additional information can be used as suggested in (Middleton et al., 2004) which could be an external ontology that includes more information about users such as their job positions, publications, or the projects that they are working on. Such information could for instance be collated from sites like LinkedIn8 (even if not available in an ontology format). Another potential solution would be to develop a hybrid recommendation technique which combines our ontological user profile with a collaborative filtering approach such as in (Basiri et al., 2010). It would be interesting to use such hybrid system to also take advantage of the features of both techniques in order to extend and enhance our system to provide more accurate and diverse services to users. Acknowledgements Dr Ahmad Hawalah was sponsored for his PhD studies by the University of Taibah, Medina, Kingdom of Saudi Arabia. Anand, S.S., Kearney, P., Shapcott, M., 2007. Generating semantically enriched user profiles for web personalization. ACM Transactions on Internet Technology 7. Baeza-Yates, R., Ribeiro-Neto, B., 2011. Modern Information Retrieval: The Concepts and Technology behind Search (2nd Edition). 2 ed., Addison-Wesley Professional. Basiri, J., Shakery, A., Moshiri, B., Hayat, M., 2010. Alleviating the cold-start problem of recommender systems using a new hybrid approach, in: Proceedings of the 5th International Symposium on Telecommunications (IST 2010), pp. 962–967. Blanco-Fernandez, Y., Nores, M.L., Gil-Solla, A., Cabrer, M.R., Arias, J.J.P., 2011. Exploring synergies between content-based filtering and spreading activation techniques in knowledge-based recommender systems. Information Sciences 181, 4823–4846. 8https://uk.linkedin.com 35 Borlund, P., 2003. The IIR evaluation model: a framework for evaluation of interactive information retrieval systems. Information Research 8. Cantador, I., Fernandez, M., Vallet, D., Castells, P., Picault, J., Ribire, M., 2008. A multi-purpose ontology-based approach for personalised content filtering and retrieval, in: Advances in Semantic Media Adaptation and Personalization. Springer Verlag. volume 93, pp. 25—51. Volume title: Advances in Semantic Media Adaptation and Personalization. Challam, V., Gauch, S., Chandramouli, A., 2007. Contextual search using ontology-based user profiles, in: Large Scale Semantic Access to Content (Text, Image, Video, and Sound) (RIAO ’07), Pittsburgh, Pennsylvania. pp. 612—617. Chen, C.C., Chen, M.C., Sun, Y., 2002. PVA: A Self-Adaptive personal view agent. Journal of Intelligent Information Systems 18, 173—194. Claypool, M., Le, P., Wased, M., Brown, D., 2001. Implicit interest indicators, in: Proceedings of the 6th International Conference on Intelligent User Interfaces, ACM. pp. 33—40. Daoud, M., Tamine-Lechani, L., Boughanem, M., 2008. Learning user interests for a session-based personalized search, in: Proceedings of the Second International Symposium on Information Interaction in Context, ACM, New York, NY, USA. pp. 57—64. Dou, Z., Song, R., Wen, J., 2007. A large-scale evaluation and analysis of personalized search strategies, in: Proceedings of the 16th International Conference on the World Wide Web, ACM, New York, NY, USA. pp. 581—590. Eirinaki, M., Mavroeidis, D., Tsatsaronis, G., Vazirgiannis, M., 2006. 
Introducing semantics in web personalization: The role of ontologies, in: Ackermann, M., Berendt, B., Grobelnik, M., Hotho, A., Mladeni, D., Semeraro, G., Spiliopoulou, M., Stumme, G., Svtek, V., Someren, M. (Eds.), Semantics, Web and Mining. Springer. volume 4289, pp. 147–162. Felden, C., Linden, M., 2007. Ontology-Based user profiling, in: Abramowicz, W. (Ed.), Business Information Systems. Springer Berlin / Heidelberg. volume 4439 of Lecture Notes in Computer Science, pp. 314—327. Gao, Q., Yan, J., Liu, M., 2008. A semantic approach to recommendation system based on user ontology and spreading activation model, in: Proceedings of the 2008 IFIP International Conference on Network and Parallel Computing, pp. 488–492. Gauch, S., Speretta, M., Chandramouli, A., Micarelli, A., Brusilovsky, P., Kobsa, A., Nejdl, W., 2007. User profiles for personalized information access the adaptive web, in: The Adaptive Web. Springer Berlin / Heidelberg. volume 4321, pp. 54—89. Godoy, D., Amandi, A., 2009. Interest drifts in user profiling: A Relevance-Based approach and analysis of scenarios. The Computer Journal 52, 771—788. Gorgoglione, M., Palmisano, C., Tuzhilin, A., 2006. Personalization in context: Does context matter when building personalized customer models?, in: Proceedings of the IEEE International Conference on Data Mining, pp. 222—231. Grcar, M., Mladeni, D., Grobelnik, M., 2005. User profiling for interest-focused browsing history, in: Proceedings of the Workshop on End User Aspects of the Semantic Web (in conjunction with the 2nd European Semantic Web Conference), pp. 99–109. Hawalah, A., Fasli, M., 2011a. Improving the mapping process in ontology-based user profiles for web personalization systems, in: Proceedings of the International Conference or Agents and Artificial Intelligence ICAART, pp. 321—328. Hawalah, A., Fasli, M., 2011b. A multi-agent system using ontological user profiles for dynamic user modelling, in: Proceedings of the 2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology, pp. 430—437. Huang, Z., Chung, W., Chen, H., 2004. A graph model for E-Commerce recommender systems. Journal of the American Society for Information Science and Technology 55, 259—274. Kim, J., Kim, J., Kim, C., 2007. Ontology-Based user preference modeling for enhancing interoperability in personalized services, in: Stephanidis, C. (Ed.), Universal Access in Human-Computer Interaction. Applications and Services. Springer, pp. 903—912. Konstan, J.A., Miller, B.N., Maltz, D., Herlocker, J.L., Gordon, L.R., Riedl, J., Volume, H., 1997. Grouplens: Applying collabor- ative filtering to usenet news. Communications of the ACM 40, 77—87. Li, L., Yang, Z., Wang, B., Kitsuregawa, M., 2007. Dynamic adaptation strategies for long-term and short-term user profile to personalize search, in: Proceedings of the Joint 9th Asia-Pacific Web and 8th International Conference on Web-age Information Management Conference on Advances in Data and Web Management, pp. 228—240. Liang, T.P., Yang, Y.F., Chen, D.N., Ku, Y.C., 2008. A semantic-expansion approach to personalized knowledge recommendation. Decision Support Systems 45, 401–412. Liu, F., Yu, C., Meng, W., 2002. Personalized web search by mapping user queries to categories, in: Proceedings of the Eleventh International Conference on Information and Knowledge Management (CIKM), ACM, New York, NY, USA. pp. 558—565. Liu, W., Jin, F., Zhang, X., 2008. 
Ontology-based user modeling for e-commerce system, in: Proceedings of the Third International Conference on Pervasive Computing and Applications (ICPCA) 2008, IEEE. pp. 260–263. Middleton, S.E., Shadbolt, N.R., De Roure, D.C., 2004. Ontological user profiling in recommender systems. ACM Transactions on Information Systems (TOIS) 22, 54—88. Montaner, M., 2001. A taxonomy of personalized agents on the internet. Technical Report, TR-2001-05-1, Departament d’Electronica, Informatica i Automatica. Universitat de . Mooney, R.J., Bennett, P.N., Roy, L., 1998. Book recommending using text categorization with extracted information, in: Papers from 1998 Workshop on Recommender systems, AAAI Press. pp. 49–54. 36 Mooney, R.J., Roy, L., 2000. Content-based book recommending using learning for text categorization, in: Proceedings of the Fifth ACM Conference on Digital Libraries, ACM, New York, NY, USA. pp. 195–204. MozillaWiki, 2010. User:Mconnor/Past/PlacesFrecency - MozillaWiki. See https://wiki.mozilla.org/User:Mconnor/PlacesFrecency. Pan, X., Wang, Z., Gu, X., 2007. Context-Based adaptive personalized web search for improving information retrieval effectiveness, in: Proceedings of the International Conference on Wireless Communications, Networking and Mobile Computing, IEEE. pp. 5427—5430. Picault, J., Ribiere, M., 2008. An empirical user profile adaptation mechanism that reacts to shifts of interests, in: Proceedings of the European Conference in Artificial Intelligence ECAI. Pignotti, E., Edwards, P., Grimnes, G.A., 2004. Context aware personalised service delivery, in: Proceedings of the European Conference in Artificial Intelligence ECAI 2004, Valencia. pp. 1077–1078. Porter, M.F., 1997. An algorithm for suffix stripping, in: Sparck Jones, K., Willett, P. (Eds.), Readings in Information Retrieval. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, pp. 313—316. Razmerita, L., Lytras, M.D., 2008. Ontology-Based user modelling personalization: Analyzing the requirements of a semantic learning portal, in: Lytras, M.D., Carroll, J.M., Damiani, E., Tennyson, R.D. (Eds.), Emerging Technologies and Information Systems for the Knowledge Society. Springer. volume 5288, pp. 354—363. Shani, G., Gunawardana, A., 2011. Evaluating recommendation systems, in: Ricci, F., Rokach, L., Shapira, B., Kantor, P.B. (Eds.), Recommender Systems Handbook. Springer, pp. 257–297. Sieg, A., Mobasher, B., Burke, R., 2007. Ontological user profiles for representing context in web search, in: 2007 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology Workshops, IEEE. pp. 91—94. Speretta, M., Gauch, S., 2005. Personalized search based on user search histories, in: Proceedings of the 2005 IEEE/WIC/ACM International Conference on Web Intelligence, IEEE. pp. 622—628. Sugiyama, K., Hatano, K., Yoshikawa, M., 2004. Adaptive web search based on user profile constructed without any effort from users, in: Proceedings of the 13th International Conference on the World Wide Web WWW ’04, New York, NY, USA. p. 675. Sun, J., Xie, Y., 2009. A recommender system based on web data mining for personalized E-Learning, in: Proceedings of the International Conference on Information Engineering and Computer Science, 2009. ICIECS, IEEE. pp. 1—4. Trajkova, J., Gauch, S., 2004. Improving Ontology-Based user profiles, in: Proceedings of Large Scale Semantic Access to Content (Text, Image, Video, and Sound) RIAO, pp. 380—389. Weng, S.S., Chang, H.L., 2008. 
Wu, K.L., Aggarwal, C.C., Yu, P.S., 2001. Personalization with dynamic profiler, in: Proceedings of the Third International Workshop on Advanced Issues of E-Commerce and Web-Based Information Systems (WECWIS 2001), IEEE. pp. 12–20.
Xu, K., Zhu, M., Zhang, D., Gu, T., 2008. Context-aware content filtering & presentation for pervasive & mobile information systems, in: Proceedings of the 1st International Conference on Ambient Media and Systems, pp. 1–20.
Yang, Y., Padmanabhan, B., Rajagopalan, B., Deshmukh, A., 2005. Evaluation of online personalization systems: A survey of evaluation schemes and a knowledge-based approach. Journal of Electronic Commerce Research 6, 112–122.
Zhang, H., Song, Y., Song, H., 2007. Construction of ontology-based user model for web personalization, in: Proceedings of the User Modeling Conference, pp. 67–76.
Zhuhadar, L., Nasraoui, O., 2008. Personalized cluster-based semantically enriched web search for e-learning, in: Proceedings of the 2nd International Workshop on Ontologies and Information Systems for the Semantic Web, ACM, New York, NY, USA. pp. 105–112.

Appendix A. The Evaluation Scenarios

In this appendix, we present the five scenarios that were used as part of the first evaluation to address different user browsing behaviours.

Scenario (1): 'Normal behaviour': In the first scenario, we look at how our model would learn and adapt to normal behaviour that consists of uninteresting topics as well as short-term and long-term interests. Figure A.16 represents this scenario. This scenario consists of a total of 12 topics and 126 tasks (i.e. some of the tasks are repeated on different days). Users were asked to complete as many tasks as possible over 20 days. All the tasks in this scenario have been designed to simulate three browsing behaviours. The first behaviour is browsing uninteresting topics (e.g. topics 2, 4, 6, 8, 10 and 12 in Table 1); the tasks in these topics occur over a short period and their occurrence is very low. The second behaviour simulates a user who has long-term interests in some topics; in this scenario, topics 1 and 3 simulate this behaviour as both topics occur over a long period and their occurrence is very high. The final behaviour simulates a user who has short-term interests in some topics; examples of this behaviour are topics 5, 7, 9 and 11 in Figure A.16, which appear for a short period and their occurrence is high.

Figure A.16: Scenario (1): Normal Behaviour.
Figure A.17: Scenario (2): Browsing-stopped.

Scenario (2): 'Browsing-stopped behaviour': The second scenario again examines how our system would learn and adapt to normal behaviour. However, the difference in this scenario is that in the second third of the experiment's duration (i.e. from day 8 to day 12) users were not asked to complete any tasks. The main aim of this behaviour is to see how our system reacts when users suddenly stop browsing the Internet for a short while and then return to their normal browsing behaviour (i.e. from day 13). Figure A.17 shows the 'browsing-stopped' behaviour scenario.

Scenario (3): 'Gradual interest-drift behaviour': Scenario (3) examines interest drift. For this purpose, we simulate a browsing behaviour in which user interests gradually change from one set of topics to new topics.
In order for the interest drift to be gradual, we designed the tasks so that short-term topics (i.e. topics 3, 5, 7, 9 and 12) start as interesting topics and then become less interesting after a while, just before the user's interests drift to new topics. Figure A.18 shows the gradual interest-drift scenario.

Figure A.18: Scenario (3): Gradual interest-drift.
Figure A.19: Scenario (4): Sudden interest-shift.

Scenario (4): 'Sudden interest-shift behaviour': Scenario (4) has been designed to test our system when user interests suddenly shift to new ones. In this scenario, the interest shift occurs three times: from day 1 to day 6, from day 7 to day 11 and from day 12 to day 20 (see Figure A.19). In order to simulate the interest-shift behaviour, we designed the tasks so that they stop suddenly and new tasks on new topics begin abruptly. Our system has to adapt to such behaviour by recognizing new interests and rapidly forgetting uninteresting concepts.

Scenario (5): 'Interrupted browsing behaviour': In scenario (5), we simulate a situation in which a user's regular behaviour is interrupted. A user might have regular interests in some topics, but in some situations he might lose this interest temporarily because he becomes interested in new topics. After a while, the user returns to his regular behaviour and becomes interested again in the previous topics. For example, a user might have a long-term interest in reading topics related to his work, but if he decides to take a holiday in France, his interests would probably shift to topics related to France. However, once he returns from his holiday, he would most likely return to his previous behaviour and browse again for the earlier topics related to his work. Figure A.20 shows this scenario.

Figure A.20: Scenario (5): Interrupted browsing behaviour.

Appendix B. Example of adaptation of ontological user profile

In this appendix, we present a brief example of how a user's ontological profile adapts based on the user's interaction with the system and the topics browsed. This example has been taken from a real user profile that was created during our experiment as described in section 5. The profile shown in Figures B.21–B.23 represents only a fraction of the actual user profile and is provided for illustrative purposes. The numbers indicated alongside the concepts are the interest weights.

Figure B.21 illustrates a fragment of the user's profile after it has been mapped to the reference ontology ODP on Day 1 of the experiment. The user has the following short-term interests: HCI, Dell, CSS and Data Mining. The average accuracy of the user profile on that day was 0.66, due to the limited information present in the user's profile. In Figure B.22, we can see the adapted fragment of the user profile on Day 5 of the experiment. Our approach captured the user's interests as being in: HCI, Dell, CSS and XML. By Day 5, certain concepts that were ascertained as being of no interest to the user, such as Data Mining and Database, were forgotten as they appeared only once, during Day 1. The concept Directories was also forgotten as the user showed more interest in the HCI concept. The accuracy during Day 5 was in fact close to 1 for this specific user. Finally, Figure B.23 shows the adapted user profile by Day 15 of the experiment.
By Day 15, the user profile has experienced a number of changes:

• As the user developed new short-term interests (AI and AJAX), the accuracy dropped. The log files reveal that there was considerable noise in the collected user interests, which caused the Insert Agent to add wrong concepts to the user profile (i.e. Javascript, Agent and Fuzzy).

• XML and HCI were recognized as long-term interests as they recorded high frequency weights (0.793 and 0.778 respectively) compared to the short-term interests.

• The Forget Agent processed and forgot three concepts that used to be short-term interests (i.e. Data Mining, Dell and CSS).

• Two concepts, Database and Directories, were deleted from the user profile as the user did not show any interest in them; their frecency weights were less than the calculated threshold for concepts to remain in the user profile (even as forgotten concepts).

Figure B.21: Day 1: Ontological user profile after mapping user interests to reference ontology.
Figure B.22: Day 5: Adapted ontological user profile after interaction with the system.
Figure B.23: Day 15: Adapted ontological user profile after further interaction with the system.
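To make the bookkeeping illustrated by this example more concrete, the following minimal sketch (in Python) shows one way such behaviour could be realized: concepts accumulate a frecency-style weight when visited pages map to them, weights decay over time, concepts whose weight falls below a forgetting threshold are marked as forgotten and later removed, and concepts that remain active over many days are treated as long-term interests. The class names, parameter values and thresholds below are illustrative assumptions introduced only for this sketch; they are not the actual implementation or the values used by the Insert and Forget Agents described in the paper.

from dataclasses import dataclass, field

# Illustrative parameters only; the real system derives its weights and
# thresholds differently (see the body of the paper).
DECAY = 0.8              # assumed per-day decay of accumulated interest
FORGET_THRESHOLD = 0.3   # assumed weight below which a concept is "forgotten"
DELETE_THRESHOLD = 0.05  # assumed weight below which even forgotten concepts are removed
LONG_TERM_DAYS = 10      # assumed number of active days marking a long-term interest


@dataclass
class Concept:
    name: str
    weight: float = 0.0                    # frecency-style interest weight
    forgotten: bool = False                # set by the forgetting step
    active_days: set = field(default_factory=set)


class OntologicalProfileSketch:
    """Toy stand-in for the profile fragment shown in Figures B.21-B.23."""

    def __init__(self) -> None:
        self.concepts: dict[str, Concept] = {}

    def record_visit(self, name: str, day: int, evidence: float = 1.0) -> None:
        """Insert-Agent-like step: boost a concept when a visited page maps to it."""
        concept = self.concepts.setdefault(name, Concept(name))
        concept.weight += evidence
        concept.forgotten = False
        concept.active_days.add(day)

    def end_of_day(self) -> None:
        """Forget-Agent-like step: decay weights, mark or remove weak concepts."""
        for concept in list(self.concepts.values()):
            concept.weight *= DECAY
            if concept.weight < DELETE_THRESHOLD:
                del self.concepts[concept.name]   # e.g. Database, Directories by Day 15
            elif concept.weight < FORGET_THRESHOLD:
                concept.forgotten = True          # e.g. Data Mining, Dell, CSS by Day 15

    def long_term_interests(self) -> list:
        return [c.name for c in self.concepts.values()
                if not c.forgotten and len(c.active_days) >= LONG_TERM_DAYS]

    def short_term_interests(self) -> list:
        return [c.name for c in self.concepts.values()
                if not c.forgotten and len(c.active_days) < LONG_TERM_DAYS]


if __name__ == "__main__":
    profile = OntologicalProfileSketch()
    # Days 1-15: HCI and XML are visited every day, Data Mining only on Day 1.
    for day in range(1, 16):
        profile.record_visit("HCI", day)
        profile.record_visit("XML", day)
        if day == 1:
            profile.record_visit("Data Mining", day)
        profile.end_of_day()
    print("long-term:", profile.long_term_interests())    # HCI, XML
    print("short-term:", profile.short_term_interests())  # Data Mining has been removed

With the assumed parameters above, a concept visited only on Day 1 decays below the forgetting threshold within a few days and is eventually removed, while concepts visited daily keep a high weight and, once active for enough days, are classified as long-term interests, mirroring the progression from Figure B.21 to Figure B.23.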