Douglas Véras e Silva
CD-CARS: CROSS-DOMAIN CONTEXT-AWARE RECOMMENDER SYSTEMS
Universidade Federal de Pernambuco
posgraduacao@cin.ufpe.br
www.cin.ufpe.br/~posgraduacao
Recife
2016

Douglas Véras e Silva
"CD-CARS: Cross-Domain Context-Aware Recommender Systems"
A Ph.D. Thesis presented to the Centro de Informática of Universidade Federal de Pernambuco in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Ciência da Computação.
Advisor: Prof. Dr. Carlos André Guimarães Ferraz
Co-Advisor: Prof. Dr. Ricardo Bastos Cavalcante Prudêncio
Recife
2016

Catalogação na fonte
Bibliotecária Monick Raquel Silvestre da S. Portes, CRB4-1217
S586c Silva, Douglas Véras e
CD-CARS: cross-domain context-aware recommender systems / Douglas Véras e Silva. – 2016.
240 f.: il., fig., tab.
Orientador: Carlos André Guimarães Ferraz.
Tese (Doutorado) – Universidade Federal de Pernambuco. CIn, Ciência da Computação, Recife, 2016.
Inclui referências.
1. Inteligência artificial. 2. Sistemas de recomendação. 3. Filtragem colaborativa. I. Ferraz, Carlos André Guimarães (orientador). II. Título.
006.3 CDD (23. ed.) UFPE-MEI 2016-139

Douglas Véras e Silva
CD-CARS: CROSS-DOMAIN CONTEXT-AWARE RECOMMENDER SYSTEMS
Tese apresentada ao Programa de Pós-Graduação em Ciência da Computação da Universidade Federal de Pernambuco, como requisito parcial para obtenção do título de Doutor em Ciência da Computação.
Aprovado em: 21/07/2016.
___________________________________
Prof. Carlos André Guimarães Ferraz
Orientador do Trabalho de Tese

BANCA EXAMINADORA
_________________________________________________
Prof. Dra. Patricia Cabral de Azevedo Restelli Tedesco
Centro de Informática/UFPE
_________________________________________________
Prof. Dr. Kiev Santos da Gama
Centro de Informática/UFPE
_________________________________________________
Prof. Dr. Sérgio Ricardo de Melo Queiroz
Centro de Informática/UFPE
_________________________________________________
Prof. Dr. Byron Leite Dantas Bezerra
Escola Politécnica/UPE
_________________________________________________
Prof. Dr. Evandro de Barros Costa
Instituto de Computação/UFAL

I dedicate this thesis to my parents and fiancée.

Acknowledgements

First of all, I thank God for giving me the health and strength necessary to conclude this work. Without Him, the conclusion of this work would not have been possible. I thank my parents, Maria Aparecida and Jailson Antônio, and my brother, Matheus Véras, who have always given me love and support throughout my life. Also, I would like to thank my fiancée, Laura Regina, her brothers (Diego Dermeval and Amauri Junior) and her parents, Laura Maria and Amauri Campos, for their patience, understanding, love and support. My sincere gratitude to professors Carlos Ferraz and Ricardo Prudêncio, respectively my advisor and co-advisor, for their encouragement, trust, support, friendship and the knowledge they shared. I also thank Alysson Bispo, Thiago Prota and Rafael Ferreira, my friends and colleagues at Universidade Federal de Pernambuco (UFPE) and Universidade Federal Rural de Pernambuco (UFRPE), for their encouragement and help in the development of this thesis. I am very grateful to UFPE and UFRPE for the opportunity and support to develop my research in their facilities. Also, my sincere gratitude to Fundação de Amparo a Ciência e Tecnologia de Pernambuco (FACEPE) for the financial support to the development of this research.
I thank professors Byron Leite, Patricia Tedesco, Kiev Gama, Evandro Costa and Sérgio Queiroz for providing constructive reviews, corrections and suggestions to improve my thesis. Finally, I would like to thank all my family members and friends for their friendship and encouragement, as well as all the people who contributed directly or indirectly to this research.

"And if I have a prophet's power, and have knowledge of all secret things; and if I have all faith, by which mountains may be moved from their place, but have not love, I am nothing." (I Corinthians 13:2 - Holy Bible)

Resumo

Tradicionalmente, "sistemas de recomendação de domínio único" (SDRS) têm alcançado bons resultados na recomendação de itens relevantes para usuários, a fim de resolver o problema da sobrecarga de informação. Entretanto, "sistemas de recomendação de domínio cruzado" (CDRS) têm surgido visando melhorar os SDRS ao atingir alguns objetivos, tais como: "melhoria de precisão", "melhor diversidade", abordar os problemas de "novo usuário" e "novo item", dentre outros. Ao invés de tratar cada domínio independentemente, CDRS usam conhecimento adquirido em um domínio fonte (e.g. livros) a fim de melhorar a recomendação em um domínio alvo (e.g. filmes). Assim como acontece na área de pesquisa sobre SDRS, a filtragem colaborativa (CF) é considerada a técnica mais popular e amplamente utilizada em CDRS, pois sua implementação para qualquer domínio é relativamente simples. Além disso, sua qualidade de recomendação é geralmente maior do que a dos algoritmos baseados em filtragem de conteúdo (CBF). De fato, a maioria dos "sistemas de recomendação de domínio cruzado" baseados em filtragem colaborativa (CD-CFRS) podem oferecer melhores recomendações em comparação a "sistemas de recomendação de domínio único" baseados em filtragem colaborativa (SD-CFRS), aumentando o nível de satisfação dos usuários e abordando problemas tais como: "início frio", "esparsidade" e "diversidade". Entretanto, os CD-CFRS podem não ser mais precisos do que os SD-CFRS. Por outro lado, "sistemas de recomendação sensíveis a contexto" (CARS) tratam de outro tópico relevante na área de pesquisa de sistemas de recomendação, também visando melhorar a qualidade das recomendações. Diferentes informações contextuais (e.g. localização, tempo, humor, etc.) podem ser utilizadas a fim de prover recomendações que são mais adequadas e precisas para um usuário dependendo de seu contexto. Desta forma, nós acreditamos que a integração de técnicas desenvolvidas separadamente (de "domínio cruzado" e "sensíveis a contexto") pode ser útil em uma variedade de situações, nas quais as recomendações podem ser melhoradas a partir de informações obtidas em diferentes fontes além de refinadas considerando informações contextuais específicas. Nesta tese, nós definimos uma nova formulação do problema de recomendação, considerando tanto a disponibilidade de informações de diferentes domínios (fonte e alvo) quanto o uso de informações contextuais. Baseado nessa formulação, nós propomos a integração de abordagens de "domínio cruzado" e "sensíveis a contexto" para um novo sistema de recomendação (CD-CARS). Para avaliar o CD-CARS proposto, nós realizamos avaliações experimentais através de dois "conjuntos de dados" com três diferentes dimensões contextuais e três domínios distintos.
Os resultados dessas avaliações mostraram que o uso de técnicas sensíveis a contexto pode ser considerado como uma boa abordagem a fim de melhorar a qualidade de recomendações de "domínio cruzado" em comparação às recomendações de CD-CFRS tradicionais.

Palavras-Chave: Recomendação de Domínio Cruzado. Recomendação Sensível a Contexto. Filtragem Colaborativa. Recomendação de Domínio Cruzado Sensível a Contexto.

Abstract

Traditionally, single-domain recommender systems (SDRS) have achieved good results in recommending relevant items to users in order to solve the information overload problem. However, cross-domain recommender systems (CDRS) have emerged aiming to enhance SDRS by achieving goals such as accuracy improvement, better diversity, and addressing the new-user and new-item problems, among others. Instead of treating each domain independently, CDRS use knowledge acquired in a source domain (e.g. books) to improve the recommendation in a target domain (e.g. movies). As in SDRS research, collaborative filtering (CF) is considered the most popular and widely adopted approach in CDRS, because its implementation for any domain is relatively simple. In addition, its recommendation quality is usually higher than that of content-based filtering (CBF) algorithms. In fact, the majority of cross-domain collaborative filtering RS (CD-CFRS) can give better recommendations than single-domain collaborative filtering recommender systems (SD-CFRS), leading to higher user satisfaction and addressing the cold-start, sparsity, and diversity problems. However, CD-CFRS are not necessarily more accurate than SD-CFRS. On the other hand, context-aware recommender systems (CARS) deal with another relevant topic of research in the recommender systems area, also aiming to improve the quality of recommendations. Different contextual information (e.g., location, time, mood, etc.) can be leveraged in order to provide recommendations that are more suitable and accurate for a user depending on his/her context. In this way, we believe that the integration of techniques developed in isolation (cross-domain and context-aware) can be useful in a variety of situations, in which recommendations can be improved with information from different sources as well as refined by considering specific contextual information. In this thesis, we define a novel formulation of the recommendation problem, considering both the availability of information from different domains (source and target) and the use of contextual information. Based on this formulation, we propose the integration of cross-domain and context-aware approaches into a novel recommender system (CD-CARS). To evaluate the proposed CD-CARS, we performed experimental evaluations on two real datasets with three different contextual dimensions and three distinct domains. The results of these evaluations have shown that the use of context-aware techniques can be considered a good approach to improve cross-domain recommendation quality in comparison to traditional CD-CFRS.

Keywords: Cross-domain Recommendation. Context-Aware Recommendation. Collaborative Filtering Recommendation. Cross-Domain Context-Aware Recommendation.

List of Figures
Figure 1 – Cross-domain collaborative filtering recommendation (based on (CREMONESI; TRIPODI; TURRIN, 2011)(SANTOS et al., 2012)). . . . . 25 Figure 2 – Context-aware collaborative filtering recommendation. . . . . . . . . . 26 Figure 3 – Cross-domain context-aware recommendation. . . .
. . . . . . . . . . . 29 Figure 4 – “Domain” definitions according to attributes and types of recommended items (CANTADOR et al., 2015). . . . . . . . . . . . . . . . . . . . . . 39 Figure 5 – Cross-domain recommendation tasks (CANTADOR et al., 2015). . . . 40 Figure 6 – Possible scenarios of user and/or item overlap between the source and target domains (CREMONESI; TRIPODI; TURRIN, 2011). . . . . . . 43 Figure 7 – Cross-domain recommendation approaches taxonomy (CANTADOR et al., 2015). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 Figure 8 – Partitioning of data: (left) hold-out; (middle) leave-some-users-out; and (right) leaveall (CANTADOR et al., 2015). . . . . . . . . . . . . . . . . 46 Figure 9 – Paradigms for incorporating context in recommender systems (ADO- MAVICIUS; TUZHILIN, 2015). . . . . . . . . . . . . . . . . . . . . . . 55 Figure 10 – Merging user preferences approach (CANTADOR et al., 2015). . . . . . 57 Figure 11 – A contextual feature represented by dimensions, attributes and values. 70 Figure 12 – The pre-filtering cross-domain recommendation is made by filtering the target contextual user-rating tensor for a given context. . . . . . . . . . 76 Figure 13 – The cross-domain post-filtering recommendation is made over the aggre- gated user-rating matrices and then post-filtered according to contextual user preferences. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 Figure 14 – Category preferences tensor enhancement from association rules. . . . . 80 Figure 15 – The cross-domain modelling recommendation uses contextual informa- tion directly in the recommendation function as an explicit predictor of a user rating for an item. . . . . . . . . . . . . . . . . . . . . . . . . . . 81 Figure 16 – The cross-domain PreF algorithm can be used before the PostF algo- rithm in a possible combination. . . . . . . . . . . . . . . . . . . . . . . 83 Figure 17 – The cross-domain modelling algorithm can be used before the PostF algorithm in a possible combination. . . . . . . . . . . . . . . . . . . . 84 Figure 18 – Original (a) and enhanced (b) item-to-item connections. Solid circles represent items belonging to a single domain, whereas blank circles represent cross items that act as a bridge among different domains (CREMONESI; TRIPODI; TURRIN, 2011). . . . . . . . . . . . . . . . 89 Figure 19 – Example of a temporal dimension with its possible contextual attributes and values in a hierarchical view. . . . . . . . . . . . . . . . . . . . . . 94 Figure 20 – Process for gathering the location contextual information from the user information. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 Figure 21 – Example of a location dimension with its possible contextual attributes and values in a hierarchical view. . . . . . . . . . . . . . . . . . . . . . 98 Figure 22 – Example of a companion dimension with its possible contextual at- tributes and values in a hierarchical view. . . . . . . . . . . . . . . . . 101 Figure 23 – Data model class diagram focusing contextual aspects of the CD-CARS implementation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 Figure 24 – Data model class diagram focusing dataset aspects of the CD-CARS implementation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 Figure 25 – Class diagram illustrating entities used by the pre-filtering class. . . . . 114 Figure 26 – Example of the pre-filtering process considering the context of user- ratings and the recommendation context. 
. . . . . . . . . . . . . . . . . 115 Figure 27 – Example of selected categories in the post-filtering recommendation. . . 116 Figure 28 – A class diagram illustrating the main post-filtering entities. . . . . . . . 117 Figure 29 – Splitting training and test sets considering the target domain and context under test. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129 Figure 30 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the temporal dimension (source domain: book, and target domain: television). . . . . . . . . . . . . . . . . . . . . . . . . . 133 Figure 31 – Overall prediction performance (MAE) boxplots for television domain in the temporal dimension with different user overlap levels (source domain: book). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134 Figure 32 – F-metric performance x top ‘N’ items for the television domain in the temporal dimension with different user overlap levels (source domain: book). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135 Figure 33 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the temporal dimension (target domain: television, and source: book). . . . . . . . . . . . . . . . . . . . . . . . 136 Figure 34 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the location dimension (source domain: book, and target domain: television). . . . . . . . . . . . . . . . . . . . . . . . . . 137 Figure 35 – Overall prediction error (RMSE) for cross-domain algorithms by varying user overlap level in the location dimension (source domain: book, and target domain: television). . . . . . . . . . . . . . . . . . . . . . . . . . 138 Figure 36 – Overall prediction performance (MAE) boxplots for television domain in the location dimension with different user overlap levels (source domain: book). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139 Figure 37 – F-metric performance x top ‘N’ items for the television domain in the location dimension with different user overlap levels (source domain: book). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140 Figure 38 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the location dimension (target domain: television, and source: book). . . . . . . . . . . . . . . . . . . . . . . . 141 Figure 39 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the companion dimension (source domain: book, and target domain: television). . . . . . . . . . . . . . . . . . . . . . . 142 Figure 40 – Overall prediction performance (MAE) boxplots for television domain in the companion dimension with different user overlap levels (source domain: book). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143 Figure 41 – F-metric performance x top ‘N’ items for the television domain in the companion dimension with different user overlap levels (source domain: book). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144 Figure 42 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the companion dimension (target domain: television, and source: book). . . . . . . . . . . . . . . . . . . . . . . . 
145 Figure 43 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the temporal and location dimensions (source domain: book, and target domain: television). . . . . . . . . . . . . . . 147 Figure 44 – Overall prediction performance (MAE) boxplots for television domain in the temporal and location dimensions with different user overlap levels (source domain: book). . . . . . . . . . . . . . . . . . . . . . . . 148 Figure 45 – F-metric performance x top ‘N’ items for the television domain in the temporal and location dimensions with different user overlap levels (source domain: book). . . . . . . . . . . . . . . . . . . . . . . . . . . . 149 Figure 46 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the temporal and location dimensions (target domain: television, and source: book). . . . . . . . . . . . . . . 150 Figure 47 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the temporal dimension (source domain: television, and target domain: book). . . . . . . . . . . . . . . . . . . . . . . . . . 151 Figure 48 – Overall prediction performance (MAE) boxplots for book domain in the temporal dimension with different user overlap levels (source domain: television). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152 Figure 49 – F-metric performance x top ‘N’ items for the book domain in the temporal dimension with different user overlap levels (source domain: television). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153 Figure 50 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the temporal dimension (target domain: book, and source: television). . . . . . . . . . . . . . . . . . . . . . . . 154 Figure 51 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the location dimension (source domain: television, and target domain: book). . . . . . . . . . . . . . . . . . . . . . . . . . 156 Figure 52 – Overall prediction performance (MAE) boxplots for book domain in the location dimension with different user overlap levels (source domain: television). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157 Figure 53 – F-metric performance x top ‘N’ items for the book domain in the location dimension with different user overlap levels (source domain: television). 158 Figure 54 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the location dimension (target domain: book, and source: television). . . . . . . . . . . . . . . . . . . . . . . . 159 Figure 55 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the companion dimension (source domain: television, and target domain: book). . . . . . . . . . . . . . . . . . . . . . . . . . 160 Figure 56 – Overall prediction performance (MAE) boxplots for book domain in the companion dimension with different user overlap levels (source domain: television). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161 Figure 57 – F-metric performance x top ‘N’ items for the book domain in the companion dimension with different user overlap levels (source domain: television). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
162 Figure 58 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the companion dimension (target domain: book, and source: television). . . . . . . . . . . . . . . . . . . . . . . . 163 Figure 59 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the temporal and location dimensions (source domain: television, and target domain: book). . . . . . . . . . . . . . . 164 Figure 60 – Overall prediction performance (MAE) boxplots for book domain in the temporal and location dimensions with different user overlap levels (source domain: television). . . . . . . . . . . . . . . . . . . . . . . . . 165 Figure 61 – F-metric performance x top ‘N’ items for the book domain in the temporal and location dimensions with different user overlap levels (source domain: television). . . . . . . . . . . . . . . . . . . . . . . . . 166 Figure 62 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the temporal and location dimensions (target domain: book, and source: television). . . . . . . . . . . . . . . 167 Figure 63 – Predictive performance (MAE) for the algorithms by varying target domain (book and TV), contextual dimension and user overlap levels (dispersion diagram). . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170 Figure 64 – Predictive performance (RMSE) for the algorithms by varying target domain (book and TV), contextual dimension and user overlap levels (dispersion diagram). . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171 Figure 65 – Classification performance (F-metric with N=5) for the algorithms by varying target domain (book and TV), contextual dimension and user overlap levels (dispersion diagram). . . . . . . . . . . . . . . . . . . . . 172 Figure 66 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the temporal dimension (source domain: book, and target domain: Music). . . . . . . . . . . . . . . . . . . . . . . . . . . . 176 Figure 67 – Overall prediction error (RMSE) for cross-domain algorithms by varying user overlap level in the temporal dimension (source domain: book, and target domain: Music). . . . . . . . . . . . . . . . . . . . . . . . . . . . 176 Figure 68 – Overall prediction performance (MAE) boxplots for Music domain in the temporal dimension with different user overlap levels (source domain: book). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177 Figure 69 – F-metric performance x top ‘N’ items for the Music domain in the temporal dimension with different user overlap levels (source domain: book). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178 Figure 70 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the temporal dimension (target domain: Music, and source: book). . . . . . . . . . . . . . . . . . . . . . . . . . 179 Figure 71 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the location dimension (source domain: book, and target domain: Music). . . . . . . . . . . . . . . . . . . . . . . . . . . . 180 Figure 72 – Overall prediction performance (MAE) boxplots for Music domain in the location dimension with different user overlap levels (source domain: book). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
181 Figure 73 – F-metric performance x top ‘N’ items for the Music domain in the location dimension with different user overlap levels (source domain: book). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183 Figure 74 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the location dimension (target domain: Music, and source: book). . . . . . . . . . . . . . . . . . . . . . . . . . 184 Figure 75 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the companion dimension (source domain: book, and target domain: Music). . . . . . . . . . . . . . . . . . . . . . . . . 185 Figure 76 – Overall prediction error (RMSE) for cross-domain algorithms by varying user overlap level in the companion dimension (source domain: book, and target domain: Music). . . . . . . . . . . . . . . . . . . . . . . . . 185 Figure 77 – Overall prediction performance (MAE) boxplots for Music domain in the companion dimension with different user overlap levels (source domain: book). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186 Figure 78 – F-metric performance x top ‘N’ items for the Music domain in the companion dimension with different user overlap levels (source domain: book). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187 Figure 79 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the companion dimension (target domain: Music, and source: book). . . . . . . . . . . . . . . . . . . . . . . . . . 188 Figure 80 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the temporal and location dimensions (source domain: book, and target domain: music). . . . . . . . . . . . . . . . . 189 Figure 81 – Overall prediction error (RMSE) for cross-domain algorithms by varying user overlap level in the temporal and location dimensions (source domain: book, and target domain: music). . . . . . . . . . . . . . . . . 189 Figure 82 – Overall prediction performance (MAE) boxplots for Music domain in the temporal and location dimensions with different user overlap levels (source domain: book). . . . . . . . . . . . . . . . . . . . . . . . . . . . 191 Figure 83 – F-metric performance x top ‘N’ items for the Music domain in the temporal and location dimensions with different user overlap levels (source domain: book). . . . . . . . . . . . . . . . . . . . . . . . . . . . 192 Figure 84 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the temporal and location dimensions (target domain: Music, and source: book). . . . . . . . . . . . . . . . . 193 Figure 85 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the temporal dimension (source domain: Music, and target domain: book). . . . . . . . . . . . . . . . . . . . . . . . . . 195 Figure 86 – Overall prediction performance (MAE) boxplots for book domain in the temporal dimension with different user overlap levels (source domain: Music). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196 Figure 87 – F-metric performance x top ‘N’ items for the book domain in the temporal dimension with different user overlap levels (source domain: Music). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
197 Figure 88 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the temporal dimension (target domain: book, and source: Music). . . . . . . . . . . . . . . . . . . . . . . . . . 198 Figure 89 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the location dimension (source domain: book, and target domain: Music). . . . . . . . . . . . . . . . . . . . . . . . . . . . 199 Figure 90 – Overall prediction performance (MAE) boxplots for Music domain in the location dimension with different user overlap levels (source domain: book). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200 Figure 91 – F-metric performance x top ‘N’ items for the book domain in the location dimension with different user overlap levels (source domain: Music). . . 201 Figure 92 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the location dimension (target domain: book, and source: Music). . . . . . . . . . . . . . . . . . . . . . . . . . 202 Figure 93 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the companion dimension (source domain: Music, and target domain: book). . . . . . . . . . . . . . . . . . . . . . . . . . 203 Figure 94 – Overall prediction performance (MAE) boxplots for book domain in the companion dimension with different user overlap levels (source domain: Music). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205 Figure 95 – F-metric performance x top ‘N’ items for the book domain in the companion dimension with different user overlap levels (source domain: Music). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206 Figure 96 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the companion dimension (target domain: book, and source: Music). . . . . . . . . . . . . . . . . . . . . . . . . . 207 Figure 97 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the temporal and location dimensions (source domain: music, and target domain: book). . . . . . . . . . . . . . . . . 208 Figure 98 – Overall prediction error (RMSE) for cross-domain algorithms by varying user overlap level in the temporal and location dimensions (source domain: music, and target domain: book). . . . . . . . . . . . . . . . . 208 Figure 99 – Overall prediction performance (MAE) boxplots for book domain in the temporal and location dimensions with different user overlap levels (source domain: Music). . . . . . . . . . . . . . . . . . . . . . . . . . . 209 Figure 100 –F-metric performance x top ‘N’ items for the book domain in the temporal and location dimensions with different user overlap levels (source domain: Music). . . . . . . . . . . . . . . . . . . . . . . . . . . 211 Figure 101 –Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the temporal and location dimensions (target domain: book, and source: Music). . . . . . . . . . . . . . . . . 212 Figure 102 –Predictive performance (MAE) for the algorithms by varying target domain (book and music), contextual dimension and user overlap levels (dispersion diagram). . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
214 Figure 103 –Predictive performance (RMSE) for the algorithms by varying target domain (book and music), contextual dimension and user overlap levels (dispersion diagram). . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215 Figure 104 –Classification performance (F-metric with N=5) for the algorithms by varying target domain (book and music), contextual dimension and user overlap levels (dispersion diagram). . . . . . . . . . . . . . . . . . . . . 216 List of Tables Table 1 – Summary of techniques for representation of context (VIEIRA; TEDESCO; SALGADO, 2009). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 Table 2 – Cross-Domain CF-based RS using the Merging user preferences approach. 58 Table 3 – Classification of context-aware-based related works regarding cross- domain RS aspects. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 Table 4 – Classification of context-aware-based related works with respect to CARS aspects. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 Table 5 – Main limitations of context-aware-based related works in comparison to our proposed CD-CARS. . . . . . . . . . . . . . . . . . . . . . . . . . . 67 Table 6 – Classification accuracy of the companion extraction. . . . . . . . . . . . 102 Table 7 – Information gain of contextual attributes in different target domains for the book-television dataset. . . . . . . . . . . . . . . . . . . . . . . . . . 103 Table 8 – Information gain of contextual attributes in different target domains for the book-music dataset. . . . . . . . . . . . . . . . . . . . . . . . . . . . 103 Table 9 – Cross-domain and single-domain “book-television dataset” properties with 100% of user overlap. . . . . . . . . . . . . . . . . . . . . . . . . . 105 Table 10 – “book-television dataset” properties with 50% of user overlap when “TV” is the target domain. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106 Table 11 – “book-television dataset” properties with 10% of user overlap when “TV” is the target domain. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106 Table 12 – “book-television dataset” properties with 50% of user overlap when “Book” is the target domain. . . . . . . . . . . . . . . . . . . . . . . . . 106 Table 13 – “book-television dataset” properties with 10% of user overlap when “Book” is the target domain. . . . . . . . . . . . . . . . . . . . . . . . . 107 Table 14 – Cross-domain and single-domain “book-music dataset” properties with 100% of user overlap. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107 Table 15 – “book-music dataset” properties with 50% of user overlap when “Music” is the target domain. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107 Table 16 – “book-music dataset” properties with 10% of user overlap when “Music” is the target domain. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108 Table 17 – “book-music dataset” properties with 50% of user overlap when “Book” is the target domain. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108 Table 18 – “book-music dataset” properties with 10% of user overlap when “Book” is the target domain. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
108 Table 19 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual values from the Temporal dimension (source domain: Book, and target domain: Television).132 Table 20 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual values from the Location dimension (source domain: Book, and target domain: Television).137 Table 21 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual values from the Companion dimension (source domain: Book, and target domain: Television). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141 Table 22 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual value combina- tions from the temporal and location dimensions (source domain: Book, and target domain: Television). . . . . . . . . . . . . . . . . . . . . . . . 146 Table 23 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual values from the Temporal dimension (source domain: Television, and target domain: Book). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150 Table 24 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual values from the Location dimension (source domain: Television, and target domain: Book).155 Table 25 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual values from the Companion dimension (source domain: television, and target domain: book). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159 Table 26 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual value com- binations from the temporal and location dimensions (source domain: Television, and target domain: Book). . . . . . . . . . . . . . . . . . . . 164 Table 27 – Overall predictive performance (MAE) of the proposed algorithms in comparison to the best baseline one by varying target domain (book and TV), contextual dimension and user overlap levels. . . . . . . . . . . . . 173 Table 28 – Overall classification performance (F-metric with N=5) of the proposed algorithms in comparison to the best baseline one by varying target domain (book and TV), contextual dimension and user overlap levels. . 174 Table 29 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual values from the Temporal dimension (source domain: Book, and target domain: Music). 175 Table 30 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual values from the Location dimension (source domain: Book, and target domain: Music). 
180 Table 31 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual values from the Companion dimension (source domain: Book, and target domain: Music).184 Table 32 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual value combina- tions from the temporal and location dimensions (source domain: Book, and target domain: Music). . . . . . . . . . . . . . . . . . . . . . . . . . 188 Table 33 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual values from the Temporal dimension (source domain: Music, and target domain: Book). 194 Table 34 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual values from the Location dimension (source domain: Book, and target domain: Music). 198 Table 35 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual values from the Companion dimension (source domain: Music, and target domain: Book).203 Table 36 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual value combina- tions from the temporal and location dimensions (source domain: Music, and target domain: Book). . . . . . . . . . . . . . . . . . . . . . . . . . 207 Table 37 – Overall predictive performance (MAE) of the proposed algorithms in comparison to the best baseline one by varying target domain (book and music), contextual dimension and user overlap levels. . . . . . . . . . . . 213 Table 38 – Overall classification performance (F-metric with N=5) of the proposed algorithms in comparison to the best baseline one by varying target domain (book and music), contextual dimension and user overlap levels. 217 Contents 1 INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 1.1 Contextualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 1.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 1.3 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 1.4 Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 1.5 Proposal Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 1.6 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 1.7 Thesis Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 2 BACKGROUND AND RELATED WORK . . . . . . . . . . . . . . . 32 2.1 Recommender Systems . . . . . . . . . . . . . . . . . . . . . . . . . . 32 2.1.1 Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 2.1.2 User Profiling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35 2.1.3 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 2.2 Cross-Domain Recommender Systems . . . . . . . . . . . . . . . . . . 38 2.2.1 Definition of Domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 2.2.2 Cross-Domain Recommendation Tasks . . . . . . . . . . . . . . . . . . . . 39 2.2.3 Cross-Domain Recommendation Goals . . . . . . . . . . . . . . . . . . . . 41 2.2.4 Cross-Domain Recommendation Scenarios . . . . . . . . . . . . . . . . . . 42 2.2.5 Cross-Domain Approaches . . . . . . . . . . . . . . . . . . . . . . . . . . 
43 2.2.6 Cross-Domain Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 2.2.6.1 Evaluation Data Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 2.2.6.2 Evaluation Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 2.2.6.3 Sensitivity Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 2.3 Context-Aware Recommender Systems . . . . . . . . . . . . . . . . . 47 2.3.1 Definition of Context . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 2.3.2 Modelling Contextual Information . . . . . . . . . . . . . . . . . . . . . . 48 2.3.3 Obtaining Contextual Information . . . . . . . . . . . . . . . . . . . . . . 51 2.3.4 Contextual Information Relevance . . . . . . . . . . . . . . . . . . . . . . 52 2.3.5 Context-Aware Approaches . . . . . . . . . . . . . . . . . . . . . . . . . . 54 2.3.6 CARS Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 2.4 Related Works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 2.4.1 Cross-Domain Recommendation based on Collaborative Filtering . . . . . . 56 2.4.2 Cross-Domain Recommendation based on Context-Awareness . . . . . . . . 61 2.5 Final Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 3 CD-CARS PROPOSAL . . . . . . . . . . . . . . . . . . . . . . . . . 68 3.1 CD-CARS Problem Formalization . . . . . . . . . . . . . . . . . . . . 68 3.2 Modelling Contextual Information . . . . . . . . . . . . . . . . . . . . 69 3.2.1 Contextual Features Formalization . . . . . . . . . . . . . . . . . . . . . . 69 3.2.2 Obtaining and Selecting Relevant Contextual Information . . . . . . . . . . 72 3.3 CD-CARS Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 3.3.1 Proposed Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 3.3.1.1 Cross-Domain PreF Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 75 3.3.1.2 Cross-Domain PostF Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 76 3.3.1.3 Cross-Domain Modelling Algorithm . . . . . . . . . . . . . . . . . . . . . . . . 80 3.3.1.4 Cross-Domain Hybrid Contextual Algorithms . . . . . . . . . . . . . . . . . . . . 82 3.3.2 Base Cross-Domain Algorithms . . . . . . . . . . . . . . . . . . . . . . . . 83 3.3.2.1 Single-Domain as Cross-domain Algorithms . . . . . . . . . . . . . . . . . . . . . 84 3.3.2.1.1 Neighborhood-based Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 3.3.2.1.2 Matrix factorization algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86 3.3.2.2 Cross-Domain Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88 3.4 Final Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 4 CD-CARS IMPLEMENTATION . . . . . . . . . . . . . . . . . . . . 92 4.1 Dataset Acquisition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 4.1.1 Obtaining Contextual Information . . . . . . . . . . . . . . . . . . . . . . 94 4.1.1.1 Temporal Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 4.1.1.2 Location Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 4.1.1.3 Companion Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98 4.1.2 Selecting Relevant Contextual Attributes and Values . . . . . . . . . . . . 102 4.1.3 Cross-Domain Datasets Description . . . . . . . . . . . . . . . . . . . . . 104 4.1.3.1 Book-Television dataset . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . 104 4.1.3.2 Book-Music dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105 4.2 Contextual Model Implementation . . . . . . . . . . . . . . . . . . . . 108 4.3 Proposed Algorithms Implementation . . . . . . . . . . . . . . . . . . 112 4.3.1 Pre-filtering Implementation . . . . . . . . . . . . . . . . . . . . . . . . . 112 4.3.2 Post-filtering Implementation . . . . . . . . . . . . . . . . . . . . . . . . . 115 4.4 Base Cross-domain Algorithm Implementation . . . . . . . . . . . . . 123 4.5 Final Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126 5 CD-CARS EVALUATION . . . . . . . . . . . . . . . . . . . . . . . . 127 5.1 Evaluation Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . 127 5.1.1 Settings of the Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . 127 5.1.2 Predictive Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128 5.1.3 Classification Performance . . . . . . . . . . . . . . . . . . . . . . . . . . 129 5.1.4 Sensitivity Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130 5.1.5 Statistical Significance Analysis . . . . . . . . . . . . . . . . . . . . . . . 131 5.2 Evaluation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131 5.2.1 Book-Television Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132 5.2.1.1 Television as Target Domain . . . . . . . . . . . . . . . . . . . . . . . . . . . 132 5.2.1.1.1 Temporal Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132 5.2.1.1.2 Location Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136 5.2.1.1.3 Companion Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141 5.2.1.1.4 Combining Contextual Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . 145 5.2.1.2 Book as Target Domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148 5.2.1.2.1 Temporal Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148 5.2.1.2.2 Location Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154 5.2.1.2.3 Companion Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156 5.2.1.2.4 Combining Contextual Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . 161 5.2.1.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167 5.2.2 Book-Music Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173 5.2.2.1 Music as Target Domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173 5.2.2.1.1 Temporal Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174 5.2.2.1.2 Location Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179 5.2.2.1.3 Companion Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182 5.2.2.1.4 Combining Contextual Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . 188 5.2.2.2 Book as Target Domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193 5.2.2.2.1 Temporal Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193 5.2.2.2.2 Location Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196 5.2.2.2.3 Companion Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202 5.2.2.2.4 Combining Contextual Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . 204 5.2.2.3 Summary . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . 210 5.2.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217 5.3 Final Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220 6 CONCLUSION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222 6.1 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222 6.2 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224 6.3 Lines for Further Work . . . . . . . . . . . . . . . . . . . . . . . . . . 224 REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227

1 Introduction

In this chapter, we contextualize and motivate the problem addressed in this thesis, respectively, in Section 1.1 and Section 1.2. The problem statement is described in Section 1.3. In Section 1.4, we define the objectives of this thesis, and in Section 1.5 we describe an overview of the proposal. Finally, we highlight the expected contributions (Section 1.6) and describe the structure of this thesis (Section 1.7).

1.1 Contextualization

In recent years, the growth of the Internet has increased the amount of information available to users. Consequently, the task of finding relevant information among a myriad of options has become a problem, traditionally known as the information overload problem (RESNICK et al., 1994)(HILL et al., 1995)(SHARDANAND; MAES, 1995)(ADOMAVICIUS; TUZHILIN, 2005)(RICCI; ROKACH; SHAPIRA, 2011). Taking into account the variety of information provided by applications on the Internet, we can consider information to be any item capable of being consumed and rated by a user (e.g. movies, books, music, and so on). Given that, the information overload problem makes it difficult for a user to find relevant items among an extensive number of options. Fortunately, this problem has received notable attention from researchers in the Artificial Intelligence area, and recommender systems (RSs) have been designed in order to solve it (ADOMAVICIUS; TUZHILIN, 2005)(RICCI; ROKACH; SHAPIRA, 2011). For example, a recommender system (RS) can be used to suggest to users interesting movies to watch, books to read, music to listen to, etc. The suggestions (recommendations) are provided according to the user's profile, which could be inferred from the consumption log, for instance. Recently, a large number of Web sites and applications have adopted recommender systems to provide their users with more relevant items, such as Amazon (http://www.amazon.com), Netflix (http://www.netflix.com), YouTube (https://www.youtube.com), Last.fm (http://www.last.fm), BookCrossing (http://bookcrossing.com) and Buscape (a price-comparison service for products and services, http://www.buscape.com.br), among many others. However, most of these systems are developed to recommend items in a specific domain, such as movies, videos, music, books, and so on. Thus, they are known as single-domain RS, since they consider only the user profile from a single domain to recommend relevant items in that same domain. For example, a single-domain RS could recommend a movie based on the movies previously watched, or recommend a book similar to others the user has read.
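To make the single-domain, profile-based recommendation described above more concrete, the listing below gives a deliberately tiny user-based collaborative filtering sketch in Python; the users, items and ratings are invented for illustration only, and this is not the implementation developed in this thesis (see Chapter 4):

from math import sqrt

# Toy single-domain rating data: each user's ratings (1-5) for movies.
# All names and numbers are invented purely for illustration.
ratings = {
    "alice": {"Matrix": 5, "Titanic": 2, "Up": 4},
    "bob":   {"Matrix": 4, "Titanic": 1, "Up": 5, "Alien": 4},
    "carol": {"Matrix": 1, "Titanic": 5, "Up": 2, "Alien": 2},
}

def cosine_similarity(u, v):
    # Compare two users on the items they have both rated.
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(u[i] ** 2 for i in common))
    norm_v = sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)

def predict(user, item):
    # Similarity-weighted average of the neighbours' ratings for `item`.
    num = den = 0.0
    for other, prefs in ratings.items():
        if other == user or item not in prefs:
            continue
        sim = cosine_similarity(ratings[user], prefs)
        num += sim * prefs[item]
        den += sim
    return num / den if den > 0 else None

# Alice has not rated "Alien"; the prediction is pulled towards Bob's rating,
# since Bob's ratings on the movies Alice did rate are the closest to hers.
print(round(predict("alice", "Alien"), 2))  # ~3.2 on the 1-5 scale

A neighborhood scheme of this kind is one of the two CF families (alongside matrix factorization) revisited in Chapter 3.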
It is important to mention that, although there are several definitions of "domain", our notion of domain can be expressed as a "kind of item" (e.g. movies, music, books, news, among others) (FERNÁNDEZ-TOBÍAS et al., 2012). Although single-domain RSs have achieved good quality in suggesting relevant items to users, some issues remain significant for the information overload problem (RICCI; ROKACH; SHAPIRA, 2011), such as:

• cold-start: situations in which an RS is unable to generate recommendations due to an initial lack of user preferences;
• sparsity: when the average number of ratings per user and item is low, which may negatively affect the quality of the recommendations;
• diversity: when very similar or redundant items are recommended, which may not satisfy the users;
• accuracy: even when the issues above are resolved, the RS may still not be accurate, which means that incorrect rating predictions may be made and the list of recommended items may not satisfy a user;

among others.

In order to alleviate these problems, cross-domain recommender systems (WINOTO; TANG, 2008)(CREMONESI; TRIPODI; TURRIN, 2011) have arisen, aiming to improve the quality of single-domain recommendations (FERNÁNDEZ-TOBÍAS et al., 2012). Instead of treating each domain independently, cross-domain RSs use knowledge acquired in a source domain (e.g. books) to improve the recommendation in a target domain (e.g. movies). To illustrate this, consider a user for whom no information about favorite movie genres is available. This missing information can be inferred either from his/her favorite book genres or from his/her similarity to users across different domains, for example. One of the first studies on this emerging research topic was presented in (WINOTO; TANG, 2008), which investigated whether consumption behavior on related items from different domains could be useful to make recommendations in a target domain. As shown by Winoto and Tang (2008), joint recommendations of items from multiple domains may be less accurate, but more diverse, than recommendations of items in a single domain. Since the first cross-domain recommender systems arose, several approaches have been proposed to deal with different goals (FERNÁNDEZ-TOBÍAS et al., 2012). For instance, knowledge-based recommender systems try to exploit knowledge about users and items, besides the relationships between them, in order to produce recommendations (TREWIN, 2000). In this way, knowledge-based recommender systems demand a great amount of knowledge about users and items (and their domains), which must be stored and organized in a way that enables inference and reasoning. However, such knowledge acquisition is a very difficult process, and a knowledge engineer is required to construct the knowledge base, which creates a bottleneck for knowledge-based recommender systems (AZAK, 2010). As described above, knowledge-based recommender systems rely on an "ad-hoc" approach, which may be difficult to customize to new situations, since they are usually designed for a specific domain (TREWIN, 2000). However, other approaches have been successfully adopted for cross-domain RS and, in general, require little domain knowledge (FERNÁNDEZ-TOBÍAS et al., 2012), since they are based on simple information obtained from user ratings. Fernández-Tobías et al.
(2012) stated that domains can be explicitly or implicitly linked by means of content-based (CBF) or collaborative filtering (CF) characteristics associated with users and/or items, such as ratings, social tags, and latent factors. Cremonesi, Tripodi and Turrin (2011) surveyed and categorized cross-domain collaborative filtering recommender systems (CD-CFRS), which recommend items from the target domain by exploring the similarities between users, considering ratings from both the source and target domains, as illustrated in Figure 1. As in single-domain RS research, collaborative filtering is considered the most popular and most widely implemented approach in cross-domain RS, because its implementation and integration into existing domains is relatively easy, and its quality is generally higher than that of other approaches (ADOMAVICIUS; TUZHILIN, 2005)(FERNÁNDEZ-TOBÍAS et al., 2012).

Figure 1 – Cross-domain collaborative filtering recommendation (based on (CREMONESI; TRIPODI; TURRIN, 2011)(SANTOS et al., 2012)).

1.2 Motivation

In fact, the majority of CD-CFRS can give better recommendations than single-domain RS, leading to higher user satisfaction and addressing the cold-start, sparsity, and diversity problems; however, they are not necessarily more accurate than single-domain collaborative filtering RS (WINOTO; TANG, 2008)(FERNÁNDEZ-TOBÍAS et al., 2012). Meanwhile, context-aware recommender systems (CARS) constitute another relevant research topic in RS and have been used to enhance the quality of recommendations (ADOMAVICIUS; TUZHILIN, 2015), especially by providing accurate recommendations that take the user's context into account.

The context-aware approach uses different contextual information (e.g., location, time, mood, etc.) to improve the accuracy of recommendations (ADOMAVICIUS; TUZHILIN, 2015), as illustrated in Figure 2. In many applications, such as recommending a vacation package or a TV program, among others, it may not be sufficient to consider only users and items; it is also important to incorporate contextual information into the recommendation process in order to recommend items to users under certain circumstances (ADOMAVICIUS; TUZHILIN, 2015). For example, using the temporal context, a travel recommender system could provide a recommendation in the winter that is very different from the one in the summer. More specifically, on weekdays a user might prefer to watch news programs when he/she turns on his/her TV in the morning, or to watch soccer games at night, and on weekends to watch comedy movies.

Figure 2 – Context-aware collaborative filtering recommendation.

Researchers in the recommender systems area have recognized the importance of contextual information (ADOMAVICIUS; TUZHILIN, 2015). In addition, in the cross-domain RS field, Fernández-Tobías et al. (2012) and Cantador et al. (2015) highlighted that context can be treated as a bridge between different domains and that only a few works have considered context-aware techniques in cross-domain recommender systems. The use of context-aware techniques in cross-domain RS is an interesting open research direction, since the majority of works on cross-domain recommender systems adopt only CBF and CF approaches, considering only users and item attributes, without taking any additional contextual information into account.
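To make the role of context concrete, the contrast can be written as a change in the signature of the rating function that a recommender estimates. The notation below is a minimal sketch in the spirit of Adomavicius and Tuzhilin (2015); the symbols are illustrative and not the formalization adopted later in this thesis (Section 3.1):

R: Users × Items → Ratings (classical, context-free collaborative filtering)
R: Users × Items × Contexts → Ratings (context-aware recommendation)

In the cross-domain context-aware setting investigated here, the second, multidimensional function is estimated for the target domain while also exploiting the ratings, and their contexts, observed in the source domain.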
Accurate prediction of user preferences undoubtedly depends upon the degree to which the recommender system has incorporated the relevant contextual information into its recommendation method (ADOMAVICIUS et al., 2005)(ADOMAVICIUS; TUZHILIN, 2015). In this way, using a context-aware approach in cross-domain recommender systems may be useful for making suggestions under difficult conditions, such as the cold-start, sparsity, and diversity problems, while improving the accuracy of recommendations in comparison to traditional CD-CFRSs, by using context and knowledge from different domains. We believe that the integration of techniques developed in isolation for cross-domain and context-aware RSs can be useful in a variety of situations, in which recommendations can be improved by information from different sources and refined by considering specific contextual information.
1.3 Problem Statement
In this thesis, we address the cross-domain recommendation problem under the CF and context-awareness perspectives. Thus, besides considering user ratings, we also consider the context of these ratings in the recommendations, which implies one more dimension (context) in the user-rating matrices, or, in this case, user-rating-context tensors (User x Item x Context). In our cross-domain context-aware recommendation problem, there exists a user-rating-context tensor for each domain (source and target). In the source and target tensors, there is no item in common (no item overlap), but the same set of contexts is observed (context overlap) and at least one user must have ratings in both tensors (user overlap).
Therefore, our problem is to explore the user-rating-context tensors from the source and target domains to improve recommendations in the target domain, i.e., to improve the estimation of unknown ratings for items in the target domain by exploiting the user-rating-context tensors from these domains. Based on the above problem, we state the hypothesis of this thesis: The application of context-aware techniques can improve the accuracy of cross-domain collaborative filtering recommendations.
It is important to mention that accuracy, in that hypothesis, is one quality aspect of recommender systems. Other aspects can also be addressed in this thesis, such as alleviating the cold-start and sparsity issues, or improving the recommendation diversity or coverage, but they are not our main goals.
1.4 Objectives
The main objective of this thesis is to improve the accuracy of cross-domain collaborative filtering recommendations by adding context-aware techniques. In order to achieve this, we aim to build a cross-domain context-aware recommender system (CD-CARS) through the proposal of novel algorithms for cross-domain recommendations. The specific objectives of this thesis are:
• To allow traditional CF-based algorithms to be used in combination with the proposed algorithms.
• To allow the proposed algorithms to be extended and configured.
• To allow cross-domain context-aware recommendations to be made either for more related domains (e.g. Book and Television) or for less related ones (e.g. Book and Music). We consider the relatedness of distinct domains according to their sets of item genres: the more item genres two domains have in common, the more related they are considered (e.g. Book and Television have several item genres in common, such as “romance”, “educational”, “religion”, etc.).
• To provide a solution for identifying the most relevant contextual features that can be used for a specific domain.
• To build a method responsible for generating contextual user profiles in different domains.
• To develop a solution for extracting relevant information (e.g. contextual information, user profiles in different domains, etc.) from the datasets used in the CD-CARS evaluation.
1.5 Proposal Overview
As mentioned previously, this thesis aims to improve the quality of cross-domain collaborative filtering recommendations by adding context-aware techniques. In order to achieve this, it is necessary to understand the current approaches available in the literature, especially in two emerging research areas: context-aware recommender systems (CARS) and cross-domain recommender systems (CDRS).
After searching and exploring several approaches in these areas, we propose CD-CARS algorithms based on distinct context-aware paradigms (pre-filtering, post-filtering, and modelling) (ADOMAVICIUS; TUZHILIN, 2015) combined with a CD-CFRS, aiming to improve its quality (the accuracy of the recommendations). Thus, this CD-CFRS is transformed into a CD-CARS, which recommends items from the target domain by exploring the similarities between users, considering the ratings, and also their contexts, from the source and target domains, as illustrated in Figure 3. The CD-CFRS algorithms adopted in the proposal belong to two traditional CF-based categories: neighborhood-based and matrix factorization.
Figure 3 – Cross-domain context-aware recommendation.
To illustrate the CD-CARS, suppose that a user X, who enjoys reading romance books on weekdays and has no known movie preferences, is very similar to another user Y, who also enjoys romance books on weekdays and likes to watch action movies on weekdays and comedy movies on weekends. A CD-CARS could then prioritize movies enjoyed by user Y at the top of the recommended item list for user X in those particular contexts (comedy movies on weekends and action movies on weekdays), just by knowing user X’s book preferences, without his/her movie preferences.
Although many cross-domain RSs have used contextual information in an ad-hoc way as part of their knowledge-based approaches (see Section 2.4.2) (BLANCO-FERNÁNDEZ et al., 2011)(MOE; AUNG, 2014b)(KAMINSKAS et al., 2014), to the best of our knowledge, there are no works in the literature addressing the cross-domain recommendation task by means of contextual features using systematic paradigms (pre-filtering, post-filtering, and modelling) (FERNÁNDEZ-TOBÍAS et al., 2012)(CANTADOR et al., 2015). Those knowledge-based cross-domain RSs may be difficult to customize to new situations (domains), and their performance may be hard to compare with that of other approaches. For instance, the knowledge-based framework proposed in (KAMINSKAS et al., 2014) is specific to the considered domains (points of interest and music). Our CD-CARS, in turn, relies on the use of systematic context-aware approaches (ADOMAVICIUS; TUZHILIN, 2015), which have been successfully adopted for single-domain recommendation, in general require little knowledge about the domain (e.g. user ratings), and can be customized to new domains in a simpler way.
1.6 Contributions
The contributions of this thesis are multiple, mainly in the recommender systems area. The main ones are listed below:
1. The formalization of the cross-domain context-aware recommendation problem, based on a survey of two emergent research fields: cross-domain and context-aware RS;
2. The improvement of the quality of CD-CFRSs, through the realization of a CD-CARS by the proposal of novel algorithms based on three distinct and systematic paradigms of context-aware recommendation, which were chosen instead of ad-hoc context-aware approaches;
3. The provision of real datasets for evaluating CD-CARS, taking into account different domains and contextual information;
4. The provision of a CD-CARS that can be used to recommend items from any domain (e.g. books, music, movies, etc.), which allows generating cross-selling or bundle recommendations for items from multiple domains (e.g. the recommendation of a song accompanied by a movie to watch or a book to read).
Through the findings of this thesis, we expect to contribute to the cross-domain RS area, pointing towards future research and challenges in cross-domain context-aware recommendations.
1.7 Thesis Outline
This thesis is structured as follows:
• Chapter 1 introduces and states the cross-domain context-aware problem and describes the proposal of this thesis as well as its objectives and contributions.
• Chapter 2 reviews the literature about recommender systems, focusing on cross-domain RS and context-aware RS. Also in this chapter, we compare related work to our thesis.
• Chapter 3 presents the proposed CD-CARS. For that, we describe the formalization of the CD-CARS problem, how the contextual information is modelled, the proposed recommendation algorithms, and the cross-domain CF-based algorithms adopted in combination with the CD-CARS algorithms.
• Chapter 4 describes particular details of an implementation of the CD-CARS proposal.
• Chapter 5 presents the results of an experimental evaluation of the implemented CD-CARS as well as a discussion about the findings of this research. Besides, we describe details about the experiments’ settings and evaluation metrics.
• Chapter 6 presents the conclusions, limitations and future work of this thesis.
2 Background and Related Work
In this chapter, we provide an explanation of concepts related to this thesis. Initially, we describe the main concepts of recommender systems (Section 2.1), such as approaches, algorithms, user profiling, and evaluation metrics. Then, we introduce concepts of cross-domain recommender systems (Section 2.2) and their most common approaches. Finally, we describe the foundations of context-aware recommender systems (Section 2.3).
These concepts are necessary as background for understanding the proposed CD-CARS and for positioning the proposal of this thesis within state-of-the-art research. However, since the union of the cross-domain and context-aware fields has not been deeply explored, we also describe and classify cross-domain RSs and CD-CARSs, both related to the proposal of this thesis (Section 2.4).
2.1 Recommender Systems
In recent years, recommender systems (RSs) have been crucial for dealing with the information overload problem (ADOMAVICIUS; TUZHILIN, 2005). This issue is related to the explosive growth and variety of information available on the Web, which frequently overwhelms users with a myriad of options. RSs are software tools and techniques providing suggestions of relevant items to users (RESNICK; VARIAN, 1997)(BURKE, 2002)(BURKE, 2007).
The suggestions relate to various decision-making processes, such as what products to buy, what music to listen to, or what TV programs to watch. Therefore, recommender systems can help people to identify contents of their interest among a large set of options available. These systems became an important research area since the publication of landmark papers in the 1990’s, when the term “collaborative filtering” was coined (RESNICK; VARIAN, 1997). Since then, the number of research papers published has increased significantly in many application fields (books, documents, images, movies, music, shopping, TV programs, and others) (PARK et al., 2012), as well as the amount of commercial applications of recommender systems by large companies such as Amazon.com (LINDEN; SMITH; YORK, 2003), Google (DAS et al., 2007), Last.fm (EYKE, 2009), Netflix (BENNETT; LANNING, 2007), among others. For a better understanding of RSs, we describe some perspectives of them in the following subsections. 2.1. Recommender Systems 33 2.1.1 Strategies There are several RS strategies (or approaches) in the literature. A strategy can be seen as a type or category of RS, and it may vary according to the paradigm of its recommendation algorithm, i.e., how the recommendation is made. This variation of strategies leads to different classifications of RS. Below, we describe eight categories of RS, which are based on (BURKE, 2007), (RICCI; ROKACH; SHAPIRA, 2011) and (VÉRAS et al., 2015) classifications. 1. Non-personalized: Non-personalized recommender algorithms present any user a predefined list of items. Such algorithms usually serve as a baseline for more advanced personalized algorithms. For example, one non-personalized algorithm, called Top Popular (TopPop), recommends the top-N items (e.g. movies) with the highest popularity (largest number of ratings) (CREMONESI; GARZOTTO; TURRIN, 2012). 2. Content-based filtering (CBF): Content-based recommendation systems try to recom- mend similar items to those that a user has liked or consumed in the past (PAZZANI, 1999). Indeed, the basic process performed by a content-based recommender consists in matching up the attributes of a user profile, in which preferences and interests are stored, with the attributes of a item’s content, in order to recommend to that user new interesting items. For example, TV contents that are similar (based on their genres, actors, and so on) to those the user preferred in the past are recommended. Since such RSs tend to recommend items with the same characteristics as the ones that a user liked in the past, the recommended items typically lack novelty, meaning the RS proposes a limited variety of unexpected (but relevant) recommendations (ADOMAVICIUS; TUZHILIN, 2005). 3. Collaborative filtering (CF): Collaborative recommender systems ignore content and exploit collective preferences of the crowd, i.e., they generate recommendations using different users’ rating profiles and suggest items that other users with similar tastes liked in the past (PAZZANI, 1999). The degree to which two users’ preferences are considered similar is based on a similarity measure of their rating histories. This approach can be illustrated by the expression: “people who watched this TV program also watched...”. In addition, CF is considered the most popular and widely implemented approach in RS, because its implementation and integration in existing domains are relatively easy, and its quality is usually higher than that of CBF algorithms. 
A common criticism of CF recommenders is that they tend to be biased toward popularity, constraining the degree of diversity, and that they are not able to recommend unrated items (which is related to the cold-start and sparsity issues) (ADOMAVICIUS; TUZHILIN, 2005). The “community-filtering” strategy can be considered a specialization of the collaborative filtering one (KAMAHARA et al., 2005). The “community-filtering” strategy recommends items based on the preferences of the user’s friends (or friends of friends) (KAMAHARA et al., 2005)(BOURKE; MCCARTHY; SMYTH, 2011)(HAN et al., 2015). Evidence suggests that people tend to rely more on recommendations from their friends than on recommendations from similar but anonymous individuals (KAMAHARA et al., 2005). This observation, combined with the growing popularity of social networks, is generating a rising interest in community-based systems or, as they are usually referred to, social recommender systems (KAMAHARA et al., 2005).
4. Data mining: Many researchers have used data mining techniques to improve recommender system performance (PARK et al., 2012). Data mining techniques are concerned with extracting or mining knowledge from data. These techniques are used for the exploration and analysis of large amounts of data in order to discover meaningful patterns and rules. They can be used to guide decision making and to predict the effect of decisions. For example, TV programs can be classified into two classes, “watched” and “not watched”, and a user profile is then a collection of attributes together with the number of times they occur in positive and negative examples. Hence, the RS computes the prior probability that a TV program belongs to one of those two classes and the conditional probability that a feature is present given that a TV program is classified into the positive or the negative class. It must be noted that the features are, in this case, related to content (e.g. genre) or not (e.g. time of the day).
5. Context-awareness: Context-awareness aims to give applications the advantage of using contextual information, such as the user’s location, to offer proactive services to the user without any explicit request (ABOWD et al., 1999). In this way, a personalization system based on context-awareness could adapt its functionality or behavior so that it reacts differently depending on the user’s context (location, friends, family, time, among others) and the resources available at that moment, in accordance with his/her personal preferences (RICCI; ROKACH; SHAPIRA, 2011). For example, a RS could recommend a TV program that suits the user and his current situation: staying at home or on the train, at noon or in the evening, being in front of his TV or smartphone. For each situation (context), the user’s preference may be different, so some contextual patterns could be found and exploited by recommendation algorithms, e.g. knowing that a user particularly likes watching sports in the evening, at home (MOON et al., 2009).
6. Semantic-based: The Semantic Web is based on describing Web resources by semantic annotations (meta-data), formalizing these annotations in an ontology, and applying reasoning processes aimed at discovering new knowledge (BERNERS-LEE; HENDLER, 2001).
The synergy between recommender systems and Semantic Web has already been explored in many domains (including the TV domain), showing significant in- creases in the recommendation accuracy (BLANCO-FERNÁNDEZ; PAZOS-ARIAS, 2008). Instead of employing traditional syntactic approaches, Semantic-based strat- egy discovers semantic relationships between the users’ preferences and the items available in the domain ontology through semantic similarity metrics. For example, using the semantic approach, a RS could recommend a place to visit (e.g. offering a tourist package) according to the places showed in a movie or sports game that a user liked (BLANCO-FERNÁNDEZ et al., 2011). The approaches described above focused on making recommendations for individual users and do not consider the problem of group recommendation. The problem of group recommendation has also been investigated recently (QUEIROZ; CARVALHO, 2004)(RICCI; ROKACH; SHAPIRA, 2011). Various techniques have been proposed, targeting different types of recommendation items (e.g., movie, TV program, music) and different groups (e.g., family, friends, dynamic social groups). Most group recommendation techniques consider the preferences of individual users and propose various strategies to either combine the individual user profiles into a single group profile (a pseudo user) and make recommendations for that pseudo user (BRUSILOVSKY; KOBSA; NEJDL, 2007), or generate recommendation lists for individual group members and merge the lists for group recommendation(MARILLY et al., 2011). This kind of approach is usually adopted in the TV domain, because watching TV activity is, traditionally, performed by a group of people (e.g. a family). For example, a TV program could be recommended according to the average rating of a group of users (based on their individual preferences) (BRUSILOVSKY; KOBSA; NEJDL, 2007). 2.1.2 User Profiling The user profile, which usually is composed of preferences and personal character- istics, is one of the most important aspects of the recommendation process. Recommender systems necessarily make use of user profiles in order to recommend items related to those profiles. However, there are different approaches for creating a user profile. Each approach has its benefits and its limitations (UBERALL; MUTTUKRISHNAN, 2009). Therefore, this is an important perspective of any recommender system, which can be categorized into three categories as follows (VÉRAS et al., 2015): 1. Explicit Profiling: An explicit profile can be created by a user in the first time that he/she logs in the recommendation system. In this case, users set their preferences (interests) such as favorite TV shows or genre of TV shows, favorite actors of movies, favorite channels, ratings for movies (e.g. rating “four” for a TV show on a scale of 36 Chapter 2. Background and Related Work zero to five), among others. Furthermore, users can modify any information of their profiles at any moment through the system (UBERALL; MUTTUKRISHNAN, 2009). However, the explicit profiling approach could bother and tire users (REICHLING; WULF, 2009) in order to fulfill their profiles every time that they find something interesting on TV, for example. 2. Implicit Profiling: On other hand, an implicit profile can be created automatically by the recommender system. In this case, the RS logs and saves the viewing behaviour of a user (UBERALL; MUTTUKRISHNAN, 2009). 
Through the user’s log, like watched programs (watching time, watching duration, genre, etc.), his/her preferences (interests) are inferred (instead of explicitly set). This inference may result in user’s favorite TV shows (or genre of TV shows), favorite actors of movies, favorite channels, and even a rating for a movie (e.g. calculated using the ratio between the watching duration and the TV show duration), among others. Sometimes, this approach could have the problem of incorrectly expressing the user profile (HU; KOREN; VOLINSKY, 2008), because the user could be sleeping or doing something else while TV is on, for example. 3. Contextual Profiling: The “contextual profile” usually is generated by “Context- Aware Recommendation Systems”(ABBAR; BOUZEGHOUB; LOPEZ, 2009). In this approach, the user’s profile is created through the relationship between contexts and “common” user profiles (explicit or implicit profiling). Thus, the recommendation process is based on the contextual profile, which contains contextual information besides user preferences (MUKHERJEE et al., 2011). Users’ contextual information can be obtained explicitly or implicitly (most common), such as location, friends, family members, watching day/time, activity, and so on. Therefore, according to the manner that the contextual information is obtained - explicitly or implicitly, the same issues from these approaches are applied for the contextual profile. 2.1.3 Evaluation Evaluation of recommender systems is fundamental in assessing the quality of their recommendations. However, many different measures have been defined in the literature with the aim of making better choices in general or for a specific application area. Likewise the RS algorithms, we describe evaluation metrics in a top-level way, as follows (VÉRAS et al., 2015): 1. Qualitative measures: these measures are used when we want a model to minimize the number of errors. Hence, these metrics are usual in many direct applications of recommenders. Inside this category, some of these measures are more appropriate 2.1. Recommender Systems 37 for some kinds of recommenders, predictors or information retrieval tasks. Exam- ples of these measures are Accuracy (LEE; YANG, 2003), F-measure (LEKAKOS; GIAGLIS, 2004), Coverage (LEKAKOS; CARAVELAS, 2008), Diversity (CRE- MONESI; TURRIN, 2010), among others (RICCI; ROKACH; SHAPIRA, 2011). During the evaluation, items are commonly labeled as relevant or irrelevant for a user and a metric is adopted to measure the quality of the items once classified by the RSs. The recommendation problem is treated as a classification task. 2. Probabilistic (Predictive) measures: these measures are especially useful when we want an assessment of the reliability of the predictions returned by a RS, whether they have recommended a non-relevant item with high or low probability. The main examples of these measures are Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) (SHANI; GUNAWARDANA, 2011)(HERLOCKER et al., 2004). The recommendation problem, in this case, is usually treated as a regression task when the actual rate of an item is compared to a rate or score predicted by the RS. 3. Ranking (Classification) measures: these measures are very common in the RSs area because they are based on “how well” recommender systems rank the rec- ommended items. 
Thus, there are many examples of evaluation metrics in this category, such as Precision and Recall curves (ZHIWEN; XINGSHE, 2003), Normalized Discounted Cumulative Gain (NDCG) (BALTRUNAS; MAKCINSKAS; RICCI, 2010), Mean Average Precision (MAP) (HOPFGARTNER; JOSE, 2010), hit rate (HR) (O’SULLIVAN; SMYTH; WILSON, 2004), Fall-out (JOJIC; SHUKLA; BHOSAREKAR, 2011), Area under the ROC Curve (AUC) (ZHANG; ZHENG, 2005), Breese score (BREESE; HECKERMAN; KADIE, 1998), among others (RICCI; ROKACH; SHAPIRA, 2011). Unlike the previous category, these metrics assess the quality of a ranking of items returned by the RS instead of the average quality of the raw scores returned by the RS. The recommendation problem is treated in this case as a ranking task.
4. User satisfaction: in this category, empirical experiments are carried out with users in order to verify their satisfaction with the RS (BLANCO-FERNÁNDEZ et al., 2008). Although this method is used in many RSs and collects personal feedback from users (LÓPEZ-NORES et al., 2009), this kind of evaluation may have some problems, such as biases, the lack of an objective measure for RS quality assessment, difficulty in comparing different systems, and so on (RICCI; ROKACH; SHAPIRA, 2011).
2.2 Cross-Domain Recommender Systems
Nowadays, the majority of recommender systems provide recommendations for items belonging to a single domain. For instance, Netflix recommends movies (BENNETT; LANNING, 2007), Last.fm recommends songs (EYKE, 2009), among others. These single-domain recommender systems have been successfully adopted by several websites; however, some websites, such as Amazon (http://www.amazon.com) and eBay (http://www.ebay.com), usually maintain user preferences for items from multiple domains. In addition, it is common for users of social networks to provide their preferences and interests for a variety of items from distinct domains (e.g. music, books, movies, etc.) (SHAPIRA; ROKACH; FREILIKHMAN, 2013). Leveraging all the user preferences available in several systems or domains may be useful for generating better recommendations, e.g., by alleviating the cold-start and sparsity problems in a target domain, or by providing recommendations for items from multiple domains. Thus, cross-domain recommender systems aim to generate or improve recommendations in a target domain (e.g. music) by exploiting knowledge from a source domain (e.g. books) (FERNÁNDEZ-TOBÍAS et al., 2012).
The cross-domain approach is a challenging and emergent field of recommender systems (CANTADOR et al., 2015). Since it has been addressed from distinct perspectives, there are several distinct definitions of the cross-domain recommendation task. Based on surveys about cross-domain RS (FERNÁNDEZ-TOBÍAS et al., 2012)(CANTADOR et al., 2015), we describe some of its perspectives in the following subsections.
2.2.1 Definition of Domain
In the literature, researchers have considered different definitions of “domain”. For instance, some of them have considered items like movies and books as belonging to distinct domains, while others have considered different item genres as different item domains (e.g. “action movies” and “comedy movies”). Cantador et al. (2015) define the “domain” concept in terms of the attributes and types of the recommended items. They consider that a domain may be defined at four levels (see Figure 4):
• (Item) Attribute level. Recommended items have the same type and the same attributes, but they differ in the value of a certain attribute.
For instance, two movies of different genres (e.g. “action movies” and “comedy movies”) belong to distinct domains (CAO; LIU; YANG, 2010).
• (Item) Type level. At this level, recommended items have similar types and share some attributes. For example, movies and TV programs belong to distinct domains, since they have some attributes in common (title, genre, etc.), but they also have different ones (e.g., airtime, channel, etc.) (HU et al., 2013)(LONI et al., 2014).
• Item level. Recommended items have different types and different attributes (or mostly different ones). For instance, movies and books belong to distinct domains even with some attributes in common (title, release/publication year, etc.) (GAO et al., 2013)(ENRICH; BRAUNHOFER; RICCI, 2013).
• System level. At this level, recommended items come from different systems, which are considered distinct domains. For example, a user could rate a movie in MovieLens (https://movielens.org/) as well as in Netflix (https://www.netflix.com) (PAN; XIANG; YANG, 2012)(PAN; YANG, 2013).
Figure 4 – “Domain” definitions according to attributes and types of recommended items (CANTADOR et al., 2015).
It is important to mention that the notion of domain adopted in this thesis is based on the Item level, considering, for example, that movies and books belong to different domains.
2.2.2 Cross-Domain Recommendation Tasks
In the literature about cross-domain RSs, the works usually aim to exploit knowledge from a source domain to generate better recommendations in a target domain. Although there is no unified definition of cross-domain recommender systems, Cantador et al. (2015) identified three recommendation tasks for them:
• Multi-domain recommendation. The task is to recommend items in both the source and target domains by exploiting knowledge from both domains. In this case, a significant user overlap may be necessary (CARMAGNOLA; CENA, 2009). This recommendation task is becoming feasible since users maintain profiles in several interconnected social networks or websites (CARMAGNOLA; CENA; GENA, 2011).
• Linked-domain recommendation. In this task, items are recommended only in the target domain by exploiting knowledge from the source and target domains. This recommendation task has mainly been explored to improve the recommendations in a target domain where there is a lack of user preferences caused either by cold-start or by sparsity problems (LOW; AGARWAL; SMOLA, 2011). A minimal user overlap may be necessary to perform this task (VERAS et al., 2015), and some approaches aim to establish knowledge-based links between the source and target domains (MORENO et al., 2012).
• Cross-domain recommendation. This task aims to recommend items only in the target domain by exploiting knowledge only from the source domain. In this case, the task is to provide recommendations in a target domain where there is no information about the users in that domain. Therefore, there is no user overlap between domains, and approaches intend to establish knowledge-based links between domains (TIROSHI; KUFLIK, 2012) or to transfer knowledge from the source domain to the target domain (STEWART et al., 2009).
Figure 5 illustrates the three cross-domain recommendation tasks identified by Cantador et al. (2015). In the figure, IS and IT are the sets of items from the source (DS) and target (DT) domains, respectively.
US and UT are the sets of users from the source and target domains, respectively. Grey filled areas represent the target users and recommended items, and hatched areas represent the data exploited for generating recommendations.
Figure 5 – Cross-domain recommendation tasks (CANTADOR et al., 2015).
As mentioned before in Section 1.3, the problem of this thesis is to explore the user-rating-context tensors (also called “multidimensional matrices” or, informally, “cubes”) from the source and target domains to improve recommendations in the target domain. Given this issue and the definitions of cross-domain RS tasks mentioned in this section, we can state that problem as: “how to improve the quality of the Linked-domain recommendation task?”.
However, although the task that we aim to perform is classified as a Linked-domain recommendation, we refer to the recommender system proposed in this thesis as a Cross-domain recommender system. This choice is made for simplicity and is based on the cross-domain RS literature (CANTADOR et al., 2015), in which the majority of the papers that perform Linked-domain and Multi-domain recommendation tasks refer to themselves as Cross-domain RSs.
2.2.3 Cross-Domain Recommendation Goals
Like the cross-domain recommendation tasks, the cross-domain recommendation goals can vary. Following (CANTADOR et al., 2015), we present some of the most common goals addressed by cross-domain RSs:
• Alleviating the cold-start problem. This issue may occur when a RS is unable to generate recommendations due to an initial lack of user preferences. A possible solution is to obtain the user preferences from another domain (the source) in order to enrich the user preferences in the target domain (SHAPIRA; ROKACH; FREILIKHMAN, 2013).
• Alleviating the new user problem. This issue may happen when a user begins using a RS that initially has no knowledge about his/her preferences. In this case, the RS cannot make recommendations. This issue may be alleviated by exploiting the user’s preferences from a different domain (the source) (CREMONESI; TRIPODI; TURRIN, 2011)(WINOTO; TANG, 2008)(SANTOS et al., 2012).
• Improving accuracy. Recommender systems may have to deal with a low average number of ratings per user or item, which may negatively affect the quality of the recommendations. Ratings obtained from another domain (the source) could increase the rating density in the target domain, which may improve the recommendation quality (STEWART et al., 2009)(MORENO et al., 2012).
• Increasing diversity. Recommender systems may provide similar or redundant items to users. Thus, their satisfaction may be compromised. In this case, the diversity of recommendations could be increased by considering item preferences from multiple domains (WINOTO; TANG, 2008).
Again, as mentioned in Section 1.3, the problem addressed in this thesis is to improve the quality of cross-domain collaborative filtering recommender systems (CD-CFRS). This quality refers to accuracy improvement through the addition of context-aware techniques, while maintaining the advantages of CD-CFRSs regarding the cold-start and sparsity issues.
2.2.4 Cross-Domain Recommendation Scenarios
CD-CFRSs are based on the set of ratings provided by users for items of the source and/or target domains. According to the overlap among users and/or items of both domains, Cremonesi, Tripodi e Turrin (2011) identified four different cross-domain scenarios:
• No overlap.
There is no overlap between users and items in the domains. In other words, each item belongs to only one domain, and each user only has preferences for items of one domain. In this case, traditional single-domain CF-based RSs cannot make recommendations due to the lack of common data between the domains (ABEL et al., 2011)(SZOMSZOR et al., 2008).
• User overlap. Some users have preferences for items of at least two domains (source and target), but each item belongs to only a single domain. For instance, this scenario may happen when a dataset has ratings of the same user for two domains (e.g. movies and books) (SAHEBI; BRUSILOVSKY, 2013)(CREMONESI; TRIPODI; TURRIN, 2011).
• Item overlap. In this scenario, there are items belonging to distinct domains (source and target). Users can give different ratings to these items depending on their domains. For example, this scenario may occur when a user rates a TV program in multiple systems (e.g. MovieLens and Netflix), which could be considered as domains by the RS (CREMONESI; TRIPODI; TURRIN, 2011). In this case, the domain can be classified according to the System level (BERKOVSKY; KUFLIK; RICCI, 2007), as described in Section 2.2.1.
• User and item overlap. In this scenario, there is overlap between the users as well as between the items (BERKOVSKY; KUFLIK; RICCI, 2007)(TIROSHI; KUFLIK, 2012).
Figure 6 illustrates the possible scenarios of user or item overlap between the source and target domains. In the figure, IS and IT are the sets of items from the source (DS) and target (DT) domains, respectively. US and UT are the sets of users from the source and target domains, respectively. Grey filled areas represent the target users and recommended items, and hatched areas represent the user and/or item overlap.
Figure 6 – Possible scenarios of user and/or item overlap between the source and target domains (CREMONESI; TRIPODI; TURRIN, 2011).
As stated in the problem of this thesis (Section 1.3), a User overlap between the source and target domains is necessary, whereas an Item overlap is not. In addition to the scenarios found in the literature, we can say that, in our problem, a Contextual overlap between such domains is also necessary, i.e., the same contexts observed in the source domain are observed in the target domain.
2.2.5 Cross-Domain Approaches
As discussed earlier, cross-domain recommendation has been addressed from various perspectives. This fact led to the development of a variety of recommendation approaches. In many cases, these approaches are difficult to compare, since each one may be based on a different algorithm and adopt a different data model of user preferences. Cantador et al. (2015) analyzed some surveys about cross-domain RS (CREMONESI; TRIPODI; TURRIN, 2011)(FERNÁNDEZ-TOBÍAS et al., 2012) and unified the different categorizations from these surveys by proposing a two-level taxonomy of cross-domain RS approaches, focusing on the exploitation of knowledge in cross-domain recommendations. This taxonomy is presented below:
• Aggregating knowledge. Knowledge from one or more source domains is aggregated to perform recommendations in a target domain. Three approaches are considered:
1. Merging user preferences – user preferences of different forms and scales are aggregated in a single set. These preferences may be ratings, tags, “like/dislike” binary preferences, among others (BERKOVSKY; KUFLIK; RICCI, 2007)(SAHEBI; BRUSILOVSKY, 2013).
2. Mediating user modeling data – user modeling data from several recommender systems are aggregated in a single model, for instance, user similarities and user neighborhoods (SHAPIRA; ROKACH; FREILIKHMAN, 2013)(STEWART et al., 2009).
3. Combining recommendations – recommendations or predictions of single-domain RSs are aggregated in a single RS (ZHUANG et al., 2010)(GIVON; LAVRENKO, 2009).
• Linking and transferring knowledge. In this approach, the knowledge is linked or transferred between domains (source and target). Three possible approaches are:
1. Linking domains – the source and target domains are linked by common knowledge, e.g., item attributes, association rules, semantic networks, among others (SHI; LARSON; HANJALIC, 2011)(CHUNG; SUNDARAM; SRINIVASAN, 2007)(AZAK, 2010).
2. Sharing latent features – the source and target domains are related using implicit latent features (ENRICH; BRAUNHOFER; RICCI, 2013)(PAN et al., 2010).
3. Transferring rating patterns – explicit or implicit rating patterns from the source domains are exploited in the target domain (LI; YANG; XUE, 2009b)(GAO et al., 2013).
Figure 7 illustrates the taxonomy of cross-domain approaches proposed in (CANTADOR et al., 2015).
Figure 7 – Cross-domain recommendation approaches taxonomy (CANTADOR et al., 2015).
2.2.6 Cross-Domain Evaluation
Basically, two types of evaluation can be used to compare recommender systems in general (FREYNE; BERKOVSKY, 2013). Offline experiments evaluate a RS by analyzing past user preferences. They are typically the easiest evaluation type to perform, since they do not require interaction with real users. Online experiments demand that a group of real users use the RS in a controlled environment and give feedback about their experience with it. Cantador et al. (2015) compared the corresponding evaluation methods based on the cross-domain recommendation goals (see Section 2.2.3) and verified that most of the works about cross-domain RS adopt offline experiments. In this way, we only describe the aspects of offline experiments, according to (CANTADOR et al., 2015), in the following subsections.
2.2.6.1 Evaluation Data Partitioning
In the offline evaluation of traditional RSs, different data partitions can be adopted (e.g. Hold-out, Leave-some-users-out, and Leave-all-users-out) (CANTADOR et al., 2015). Thus, the dataset is usually divided into three subsets of ratings:
• Training profiles: the set of ratings from users for items that are used to train the algorithms under evaluation;
• Test profiles: the set of users and their known ratings for items, which are used as input by the trained algorithms under evaluation; and
• Test ratings: the set of users and their hidden ratings for items, whose actual values the algorithms under evaluation must estimate.
Regarding the offline evaluation of cross-domain RSs, the same partitions used for evaluating traditional RSs can be adopted. However, for evaluating a cross-domain RS, it is necessary to consider the use of data from both the source and target domains. Depending on that use and on the cross-domain RS goal, a certain partition may be more suitable, as described in (CANTADOR et al., 2015) and mentioned below:
• Hold-out (see Figure 8-left) can be used when the test profiles set is a subset of the training profiles set and contains ratings from the source and target domains.
The test profiles set is sampled and hidden from the original dataset, taking into account both domains, without partitioning the users. This kind of partitioning may be suitable to evaluate linked- and multi-domain RSs with the accuracy goal (SAHEBI; BRUSILOVSKY, 2013)(PAN; YANG, 2013).
• Leave-some-users-out (see Figure 8-middle) can be adopted when there is no intersection between the training profiles set and the test profiles set. Note that both of these sets, as well as the test ratings set, contain ratings from the source and target domains. This partition type may be suitable to evaluate a cross-domain RS with the new user goal (ABEL et al., 2013)(LI; YANG; XUE, 2009b).
• Leave-all-users-out (see Figure 8-right) can be adopted when there is no intersection between the training profiles set and the test profiles set, but in this case there is also no intersection between the training profiles set and the entire target domain data set. Besides, the test profiles and test ratings sets contain only data from the target domain. Thus, this partition may be suitable to evaluate a cross-domain RS with the cold-start and new item goals (JAIN; KUMARAGURU; JOSHI, 2013)(GOGA et al., 2013).
Figure 8 – Partitioning of data: (left) hold-out; (middle) leave-some-users-out; and (right) leave-all-users-out (CANTADOR et al., 2015).
2.2.6.2 Evaluation Metrics
As described in Section 2.1.3, there are several metrics for evaluating recommender systems in general. All these metrics can be used in the cross-domain context, depending on the cross-domain recommendation goals and tasks (CANTADOR et al., 2015). For instance, Probabilistic measures are preferred when the goal is to reduce the sparsity of the target domain; Ranking measures are adopted for testing user models, especially in cold-start situations; and Qualitative measures are best suited for the top-N recommendation task.
Finally, the majority of works about cross-domain recommendations adopt prediction metrics (CANTADOR et al., 2015). This is motivated by the fact that the addressed goal is to reduce sparsity and increase accuracy, and the algorithms designed for this are often based on error-metric optimization techniques, which are naturally evaluated using the category of predictive metrics.
2.2.6.3 Sensitivity Analysis
The performance of a cross-domain recommender is mainly affected by three parameters (CANTADOR et al., 2015): the overlap between the source and target domains (SHI; LARSON; HANJALIC, 2011)(CREMONESI; TRIPODI; TURRIN, 2011)(ZHAO et al., 2013)(ABEL et al., 2013), the density of the target domain data (CREMONESI; TRIPODI; TURRIN, 2011)(SHAPIRA; ROKACH; FREILIKHMAN, 2013)(CAO; LIU; YANG, 2010)(PAN et al., 2010), and the size of the target user’s profile (SAHEBI; BRUSILOVSKY, 2013)(BERKOVSKY; KUFLIK; RICCI, 2008)(SHI; LARSON; HANJALIC, 2011)(LI; YANG; XUE, 2009b). Thus, it is important to consider the sensitivity of the cross-domain algorithms with respect to these three parameters. According to (CANTADOR et al., 2015), the majority of the works have assumed a full overlap of users between the source and target domains, whereas only a few have been evaluated by varying the percentage level of user overlap, e.g., in the range 0%-50% (CREMONESI; TRIPODI; TURRIN, 2011) or in the range 0%-100% (ZHAO et al., 2013).
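To make the partitioning schemes above concrete, the sketch below illustrates a leave-some-users-out split over a cross-domain rating set. It is a minimal sketch in Python, assuming ratings are stored as (user, item, domain, rating) tuples; the function name, the split policy, and the default test fraction are illustrative and do not correspond to the exact evaluation protocol adopted in Chapter 5.

import random


def leave_some_users_out(ratings, target_domain, test_fraction=0.2, seed=42):
    """Split a cross-domain rating set into training profiles, test profiles,
    and test ratings, following a leave-some-users-out policy.

    A fraction of the users is held out: their source-domain ratings become
    test profiles (known input for the trained model), while their
    target-domain ratings become test ratings (hidden ground truth). The
    remaining users' ratings form the training profiles.
    """
    rng = random.Random(seed)
    users = sorted({user for (user, _, _, _) in ratings})
    rng.shuffle(users)
    held_out = set(users[:int(len(users) * test_fraction)])

    training_profiles, test_profiles, test_ratings = [], [], []
    for (user, item, domain, rating) in ratings:
        if user not in held_out:
            training_profiles.append((user, item, domain, rating))
        elif domain == target_domain:
            test_ratings.append((user, item, domain, rating))   # hidden, to be estimated
        else:
            test_profiles.append((user, item, domain, rating))  # known input at test time
    return training_profiles, test_profiles, test_ratings

In this sketch the held-out users keep their source-domain ratings as known input, which mirrors the new user evaluation goal discussed above; a hold-out split would instead hide a sample of ratings from both domains without partitioning the users.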
2.3 Context-Aware Recommender Systems
As mentioned before, the context-aware approach uses different contextual information (e.g., location, time, mood, etc.) to improve the accuracy of recommendations (SETTEN; POKRAEV; KOOLWAAIJ, 2004)(ADOMAVICIUS; TUZHILIN, 2015). For some applications it may not be sufficient to consider only users and items. For example, using the temporal context, a travel recommender system could provide a recommendation in the winter that is very different from the one in the summer. In another example, a user could prefer to watch news programs in the morning and soccer games at night. Therefore, accurate prediction of user preferences might depend on the use of relevant contextual information by recommender systems (ADOMAVICIUS et al., 2005).
In recent years, researchers and companies have developed context-aware recommender systems (CARS) and applied them in a variety of different domains, such as movies (SHEPSTONE; TAN; JENSEN, 2014), restaurants (PESSEMIER; DOOMS; MARTENS, 2014), tourism (MAHMOOD; RICCI; VENTURINI, 2009), music (KAMINSKAS; RICCI, 2012)(BALTRUNAS et al., 2011), mobile information (CHURCH et al., 2007), news (LEE; PARK, 2007), among others. Like the cross-domain approach, CARS is a challenging and emergent field of recommender systems (ADOMAVICIUS; TUZHILIN, 2015). In this way, we describe some of its perspectives in the following subsections.
2.3.1 Definition of Context
The definition of “context” varies among different research areas, including Computer Science. Since context has been studied in multiple disciplines, there is no standard definition of “context”. In Computer Science, one of the best-known definitions is given by Dey, Abowd e Salber (2001). They refer to “context” as:
(...) any information that can be used to characterize the situation of entities (i.e., whether a person, place or object) that are considered relevant to the interaction between a user and application, including the user and the application themselves.
(BAZIRE; BRÉZILLON, 2005) identified 150 different definitions of context from different fields and made the following observation:
... it is difficult to find a relevant definition satisfying in any discipline. Is context a frame for a given object? Is it the set of elements that have any influence on the object? Is it possible to define context a priori or just state the effects a posteriori? Is it something static or dynamic? Some approaches emerge now in Artificial Intelligence [...]. In Psychology, we generally study a person doing a task in a given situation. Which context is relevant for our study? The context of the person? The context of the task? The context of the interaction? The context of the situation? When does a context begin and where does it stop? What are the real relationships between context and cognition?
In the recommender systems area, there is also no standard definition of “context”. However, some authors (PALMISANO; TUZHILIN; GORGOGLIONE, 2008)(ADOMAVICIUS; TUZHILIN, 2015) have a similar point of view about “context” for recommender systems, which is the focus of this thesis. These authors consider “context” as dimensions (e.g. location, time, mood, etc.) and their attributes (e.g. country, city, year, day, sadness, happiness, etc.), which can be used to adapt the recommendations. Based on this definition, we model contextual information in our CD-CARS (Section 3.2).
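For concreteness, the dimensions-and-attributes view of context adopted above can be represented, for example, as a simple nested structure attached to each rating. The sketch below is a minimal illustration in Python; the dimension names, attribute names, and values are hypothetical and do not correspond to the actual contextual model described in Section 3.2.

# Context represented as dimensions (location, temporal, mood) whose
# attributes hold the values observed when the rating was given.
context = {
    "location": {"country": "Brazil", "city": "Recife"},
    "temporal": {"day_of_week": "saturday", "day_type": "weekend", "year": 2016},
    "mood": {"state": "happiness"},
}

# A context-aware rating is then a user-item-rating triple annotated with
# the context under which the item was consumed or rated.
contextual_rating = {
    "user": "u42",
    "item": "movie_317",
    "rating": 4,
    "context": context,
}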
In the next section, we describe how the contextual information can be modelled. 2.3.2 Modelling Contextual Information Contextual models represent which contextual information is considered in a domain or application, and how this information affects the system’s behavior (VIEIRA; TEDESCO; SALGADO, 2009). In general, contextual models define the elements of a particular domain that are considered as context (e.g. location context, temporal context, 2.3. Context-Aware Recommender Systems 49 etc.). They structure entities of a domain and indicate features of these entities, which are managed by the system. However, only this definition of elements does not provide the notion of the context’s dynamic. Production rules are usually adopted for this purpose (VIEIRA; TEDESCO; SALGADO, 2009). Generic contextual models aim to describe the information that must be considered as the context in a generic way. These models provide a classification for an initial set of elements that compose the context in a certain domain (VIEIRA; TEDESCO; SALGADO, 2009). Different applications can reuse the modeled information by extending the model in order to deal with particularities of a particular application. Generic contextual models have been proposed in several areas such as pervasive systems (CHAARI et al., 2007), collaborative systems (VIEIRA; TEDESCO; SALGADO, 2005), data integration (SOUZA et al., 2008), and intelligent systems (GU; PUNG; ZHANG, 2005). In this direction, researchers have investigated the adoption of several techniques for representation of information and knowledge about context (STRANG; LINNHOFF- POPIEN, 2004)(BETTINI et al., 2010). Vieira, Tedesco e Salgado (2009) summarize some of these techniques, as adapted and described in Table 1. Each representation technique described in Table 1 has advantages and disadvan- tages. Thus, there is not a technique that is universally considered as suitable for a certain context-aware system, since different systems have different restrictions and capabilities (VIEIRA; TEDESCO; SALGADO, 2009). A hybrid approach, which combines two or more techniques, may also be adopted as a contextual model. For example, Henricksen e Indulska (2006) proposed a hybrid model that combines ontologies and a graph model based on Object-Role Modeling (ORM). (VIEIRA et al., 2008) proposed a hybrid model that combines ontologies and contextual graphs (BRÉZILLON, 2007) to represent the structure of the contextual information and context-aware behavior. With respect to CARSs, in general they deal with modelling and predicting user preferences by incorporating contextual information into the recommendation process. These preferences are usually modeled as user ratings for items under specific contexts. In this way, the user ratings can be accompanied by contextual information that may be modelled of different types, each type defining a certain aspect of context such as time, location, companion, mood, and so on (ADOMAVICIUS; TUZHILIN, 2015). For instance, by considering movie recommender system, its users and movies can be described according to the following attributes (ADOMAVICIUS; TUZHILIN, 2015): • Movie: id, title, length, release year, director, genre, among others. • User: id, name, address, age, gender, profession, and so on. 50 Chapter 2. Background and Related Work Table 1 – Summary of techniques for representation of context (VIEIRA; TEDESCO; SALGADO, 2009). 
• “Key-Value” pair. Brief description: a linear search with exact matching of terms. Advantages: simple structure, easy to implement and use. Disadvantages: it does not consider hierarchy and is not suitable for applications with complex structures.
• Markup language. Brief description: a query language based on marking. Advantages: it provides hierarchy and, moreover, a markup scheme that implements the model itself. Disadvantages: it does not solve incompleteness and ambiguity; also, it is not suitable for applications with complex structures.
• Topic maps. Brief description: navigation over semantic networks. Advantages: it facilitates the navigation between the contextual elements and human reading. Disadvantages: it is an immature technique with low tool support.
• Ontologies. Brief description: inference engines and query languages based on OWL or frames. Advantages: it aggregates rules, concepts, and facts in a single model; standards make reuse and sharing easier; it allows semantic comprehension between humans and machines. Disadvantages: it does not allow modelling the behavior of the context-aware system; also, it is a recent technology with a low number of tools.
• Graph models. Brief description: it can be translated to XML and supports XML processing. Advantages: it facilitates the specification of concepts and the definition of the context-aware system behavior. Disadvantages: it does not allow processing the concepts (mapping to data structures).
In addition, the contextual information may consist of the following three dimensions, which can also be defined according to attributes:
• Location. The user’s location when he/she is watching a movie. It may be composed of the attributes: id, name, street, city, state, country, among others.
• Temporal. The time when a movie is watched. It may be composed of the attributes: date, day of week (“monday”, “tuesday”, “wednesday”, “thursday”, “friday”, “saturday”, “sunday”), day type (“weekday” or “weekend”), month, year, etc.
• Companion. With whom the user watches the movie. It may be composed of the attributes: companion type (e.g. “alone”, “friends”, “girlfriend/boyfriend”, “family”, “co-workers”, etc.), companion name (e.g. “Joseph”, “Paul”, “Laura”, etc., whose values could have an associated id), and so on.
Given that, a user may rate (or watch) a movie depending on “where” he/she will be, “when” he/she will watch it, and/or “whom” he/she will be with. Beyond the three contextual dimensions illustrated in the example above, Neto e Freitas (2007) identified another three basic dimensions, referred to as “5W+1H”, which represents “who”, “what”, “where”, “when”, “why”, and “how”.
In addition, each contextual dimension can have a complex structure and a hierarchy of attributes and their corresponding values. Although this complexity may be modelled in different forms, traditional models adopt a hierarchical structure of contextual information represented as trees (ADOMAVICIUS et al., 2005)(PALMISANO; TUZHILIN; GORGOGLIONE, 2008). For instance, suppose two contextual dimensions: location and temporal. These dimensions could have the following hierarchies associated with them:
• Location: Street → City → State → Country;
• Time: Date → Day of Week → Month → Year.
Besides the traditional trees for representing the hierarchical structure of contextual information, other approaches have been adopted, such as Online Analytical Processing (OLAP) (ADOMAVICIUS et al., 2005) and ontologies (KAMINSKAS et al., 2014).
2.3.3 Obtaining Contextual Information
An important aspect of CARS is how to obtain contextual information. Adomavicius e Tuzhilin (2015) mention three of the most common methods:
• Explicitly. The contextual information is obtained directly from users (LEE; KWON, 2014)(COLOMBO-MENDOZA et al., 2015).
A CARS could have this information by asking direct questions about the users’ contexts. For example, a user could select one of the possible contexts provided by the CARS together with the item rating. • Implicitly. In this case, users are not aware about the contextual information gathering process by the CARS. This information can be implicitly obtained in several ways (OH et al., 2014)(PHAM; JUNG; VU, 2014). For instance, a CARS could detect the user location from his/her mobile device location. Another manner could be through temporal information that can be implicitly obtained from the ratings’ timestamps. Therefore, the CARS does not need to interact with users to obtain their contexts. 7 “alone”, “friends”, “girlfriend/boyfriend”, “family”, “co-workers”, etc. 8 “Joseph”, “Paul”, “Laura”, etc. In this case, the values could have an associated id. 52 Chapter 2. Background and Related Work • Inferring. In this method, the contextual information is also obtained implicitly, but the use of statistical or data mining methods is required since the context cannot be obtained in a direct way (SHEPSTONE; TAN; JENSEN, 2014)(WANG; LI; XU, 2015). For example, a CARS could infer the companion (context) of a user from his/her review about a TV program through text mining techniques or observing the kind of the TV program watched by comparing it by using statistical data (e.g. an adult watching a TV program for kids probably is accompanied with kids). Semantic interpretation can also be used for inferring contextual information(BOYTSOV et al., 2015). Recently, some works have used the term “situation” for representing a particular contextual information which is inferred by means of semantic interpre- tation(BOUNEFFOUF, 2013)(BOYTSOV et al., 2015). Usually, the “situation” is inferred from sensor data and characterizes situations in which a user interacts with the CARS(BOUNEFFOUF, 2013). For instance, consider a user associated to: a location defined by the coordinates from his phone’s GPS; the time from his phone’s watch; and the meeting with some person from his agenda. From this knowledge, the CARS could infer that the user is ”in a restaurant, with the general manager of a company, at midday, and it is a workday”. In this way, that inferred contextual information can be called “situation” represented by three contextual dimensions (temporal, location and companion). 2.3.4 Contextual Information Relevance Some contextual dimensions can be more relevant in a given application than some other types (ADOMAVICIUS; TUZHILIN, 2015). For example, the weather may be more relevant for recommending places to visit than for recommending movies to watch. There are several approaches to determine the relevance of a given dimension (or type) of contextual information (ADOMAVICIUS; TUZHILIN, 2015). In particular, this relevance can be verified either manually (e.g. by using domain knowledge of a expert for a given application domain) (BRÉZILLON, 2007) or automatically (e.g. by using several existing feature selection methods from machine learning, data mining, statistics, and so on.) (GUYON; ELISSEEFF, 2003)(LIU; MOTODA, 2012)(CHATTERJEE; HADI, 2015). In a same contextual dimension, there may exist contextual attributes more relevant than others, since a contextual dimension can be modelled as a hierarchical tree (e.g. country x city attributes). In addition, some parts of a contextual dimension may not be known or available. 
Related to this, some authors classify the contextual information according to how much of it the recommender system can observe when acquiring and selecting it. The classification proposed by Adomavicius e Tuzhilin (2015) is divided into three categories:

• Fully observable. The contextual information relevant to the application is known explicitly, as well as its structure and its values, at the moment when recommendations are made (DOURISH, 2004). For example, a product recommender system may consider that only the Temporal, Purchasing Purpose, and Companion dimensions matter for it. In addition, the recommender system may know the entire structure (attributes and values) of all these three contextual dimensions. For instance, the "day type" attribute of the Temporal dimension can have three possible values: "weekday", "weekend", and "holiday".

• Partially observable. In this category, only some of the information about the contextual dimensions is known explicitly (PALMISANO; TUZHILIN; GORGOGLIONE, 2008). For example, the recommender system may consider all the contextual dimensions, such as Temporal, Purchasing Purpose, and Companion, but not know their entire structure (attributes and values). Note that there may exist different levels of "partial observability". For example, a CARS could only have access to the Temporal dimension for a certain user, whereas for another user it knows all the other contextual dimensions (Purchasing Purpose and Companion) as well.

• Unobservable. In this category, no information about contextual dimensions is explicitly available to the CARS, and it makes recommendations by considering only the context inferred in an implicit way. For example, a CARS could build a latent predictive model to estimate unknown ratings, where the unobservable context is modeled using latent variables (KOREN, 2008).

Another aspect of contextual information relevance is whether and how its importance changes over time. Adomavicius e Tuzhilin (2015) also classified the contextual dimension relevance into two categories:

• Static. The relevant contextual dimensions and their structure remain the same (static) over time (PALMISANO; TUZHILIN; GORGOGLIONE, 2008). For example, a product recommender system could have three contextual dimensions (Temporal, Purchasing Purpose, and Companion) that do not change along the entire RS lifetime. In this case, the structure (attributes and values) of the Purchasing Purpose dimension, for example, also does not change over time.

• Dynamic. In this category, contextual dimensions, attributes or values change in some way over time (ANAND; MOBASHER, 2006). For example, a CARS (or a CARS designer) could identify that the Companion dimension is no longer relevant for the CARS and could remove it from the system. Besides, a CARS could change the structure of some of the contextual dimensions (e.g. by adding new attributes to the Purchasing Purpose dimension).

2.3.5 Context-Aware Approaches

According to (ADOMAVICIUS; TUZHILIN, 2015), there are three systematic paradigms (or approaches) found in the CARS literature:

• Contextual pre-filtering. In this recommendation paradigm (illustrated in Figure 9), contextual information guides the data selection for a specific context. In other words, information about the current context is used for selecting the relevant set of data (i.e., user ratings) (ADOMAVICIUS et al., 2005)(VERAS et al., 2015).
Then, ratings can be predicted using any traditional collaborative-filtering recommender system on the pre-filtered data.

• Contextual post-filtering. In this recommendation paradigm (illustrated in Figure 9), contextual information is initially ignored and the ratings are predicted using any traditional collaborative-filtering recommender system on the entire data. Then, the resulting recommendations (or predictions) are adjusted (or filtered) depending on the contextual information of the users (PANNIELLO et al., 2009)(VERAS et al., 2015).

• Contextual modelling. Unlike the pre-filtering and post-filtering paradigms, in this paradigm (illustrated in Figure 9) the contextual information is used directly in the recommendation or predictive process (neither before nor after it). Although the pre-filtering and post-filtering paradigms can use traditional CF-based algorithms, the modelling paradigm actually needs to make "multidimensional" recommendations by considering contextual information as another dimension, beyond users and items. Several approaches can be used in this paradigm, such as predictive models (e.g. decision trees, regression, probabilistic models, among others) (ANSARI; ESSEGAIER; KOHLI, 2000)(OKU et al., 2006), matrix (or tensor) factorization (KARATZOGLOU et al., 2010)(HIDASI; TIKK, 2012)(BALTRUNAS; LUDWIG; RICCI, 2011)(KIM; YOON, 2014), and heuristic calculations (ADOMAVICIUS et al., 2005), among others.

Figure 9 – Paradigms for incorporating context in recommender systems (ADOMAVICIUS; TUZHILIN, 2015).

A minimal illustrative sketch contrasting the pre-filtering and post-filtering paradigms is given at the end of this subsection.

On the other hand, other ad-hoc approaches, which do not necessarily need user ratings, have also been found in the CARS literature and could be used according to the paradigms described above. Véras et al. (2015) describe some of these ad-hoc approaches:

• Contextual rules: this category contains all kinds of rules that allow recommender systems to sense and to react based on their context. In general, these rules follow the same approach as "event-condition-action" (ECA) rules (MOON et al., 2006), "Key-value" rules (SONG; MOUSTAFA; AFIFI, 2012), among others.

• Contextual ontology: contextual ontologies are not algorithms, but they are crucial to other knowledge-based context-awareness techniques. Most studies that used contextual ontologies also adopted some semantic-based inference. Thus, the combination of semantic-based and context-awareness techniques is present in many studies (KAMINSKAS et al., 2014)(MOE; AUNG, 2014b).

• Similarity-based: instead of using similarity metrics to compare users or items, in this approach algorithms compare contexts in order to recommend items (ALHAMID et al., 2015)(VILDJIOUNAITE et al., 2009)(WANG; LI; XU, 2015). The context can be represented in several ways, such as tags, key-value pairs, among others.

• Supervised learning: in this approach, a set of labeled examples is produced, where each example is composed of features extracted from contextual attributes (e.g. time of the day, mood, etc.). The task of supervised learning is, given a training set, to learn a function that predicts the user preferences based on the contextual features. Examples of algorithms adopted in this approach include Support Vector Machines (VILDJIOUNAITE et al., 2009), case-based reasoning (VILDJIOUNAITE et al., 2009), and reinforcement learning (MOON et al., 2009), among others.

However, these ad-hoc approaches are difficult to reproduce in distinct domains, since they are usually designed for specific ones.
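To make the contrast between the two filtering paradigms concrete, the hedged sketch below applies a contextual pre-filter and a simple genre-based contextual post-filter to toy rating data. The data layout, the genre table and the "rating of at least 4 means liked" rule are illustrative assumptions only; these are not the algorithms proposed later in this thesis.

```python
# Toy contextual ratings: (user, item, context, rating).
ratings = [("u1", "m1", "weekend", 5), ("u1", "m2", "weekday", 2),
           ("u2", "m1", "weekend", 4), ("u2", "m3", "weekday", 3)]
item_genre = {"m1": "comedy", "m2": "drama", "m3": "comedy"}

def contextual_prefilter(ratings, target_context):
    """Pre-filtering: keep only ratings given in the target context; any
    traditional 2D CF algorithm is then run on the reduced data."""
    return [(u, i, r) for (u, i, c, r) in ratings if c == target_context]

def contextual_postfilter(predictions, ratings, user, target_context):
    """Post-filtering: predictions come from CF run on ALL the data; afterwards,
    drop items whose genre the user never rated well in the target context."""
    good_genres = {item_genre[i] for (u, i, c, r) in ratings
                   if u == user and c == target_context and r >= 4}
    return {i: p for i, p in predictions.items() if item_genre[i] in good_genres}

print(contextual_prefilter(ratings, "weekend"))                      # only weekend ratings
print(contextual_postfilter({"m2": 3.9, "m3": 4.1}, ratings, "u1", "weekend"))  # keeps m3 only
```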
Thus, the algorithms proposed in this thesis (described in Section 3.3.1) follow the systematic paradigms described in (ADOMAVICIUS; TUZHILIN, 2015).

2.3.6 CARS Evaluation

As described in Section 2.1.3, there are several metrics for evaluating recommender systems in general. All these metrics can be used for evaluating a CARS, depending on its purpose (ADOMAVICIUS; TUZHILIN, 2015). However, evaluation is one of the main research issues and directions for CARS (ADOMAVICIUS; TUZHILIN, 2015). Only a few works have studied in depth the performance evaluation of the various CARS approaches and techniques, besides their benefits and limitations. One of these works is presented in (PANNIELLO; TUZHILIN; GORGOGLIONE, 2014), which performed a categorical evaluation and comparison of several contextual techniques under a variety of situations. For instance, the authors compared different recommendation tasks (e.g. recommending all relevant items, recommending only the top-n relevant items, etc.), different evaluation metrics (e.g. accuracy, diversity, etc.), and the granularity of the processed contextual information, as well as other evaluation perspectives.

Another example is the work presented in (CAMPOS; DÍEZ; CANTADOR, 2014), which focused on exploring "time" as one of the most relevant and widely used contextual dimensions in many CARS. For example, the authors reviewed common evaluation practices and methodological issues related to the comparative evaluation of time-aware recommender systems. They also demonstrated that the choice of the assessment conditions impacts the classification (or ranking) performance of different recommendation strategies. For that, they proposed a methodological framework for a robust and fair evaluation process.

The works mentioned above represent an important step toward a more reproducible and standardized evaluation methodology for CARS. Analogously to (PANNIELLO; TUZHILIN; GORGOGLIONE, 2014), we performed different evaluation tasks (prediction and classification), besides verifying the recommender system performance by using contextual information from distinct dimensions, as we describe in Section 5.1.

2.4 Related Works

In this section, we present works related to this thesis, divided into two subsections. In Section 2.4.1, we present some cross-domain recommender systems based on collaborative filtering that do not consider context-aware techniques, whereas in Section 2.4.2, we describe related cross-domain recommender systems that use contextual information, highlighting their limitations in comparison to our proposed CD-CARS.

2.4.1 Cross-Domain Recommendation based on Collaborative Filtering

As mentioned before, cross-domain recommendation has been addressed from various perspectives. This fact led to the development of a wide range of recommendation approaches, as categorized by Cantador et al. (2015) (see Section 2.2.5). In general, merging user preferences from different domains is the most direct way to address the cross-domain recommendation problem, and it is among the most widely used strategies for cross-domain recommendation (CANTADOR et al., 2015).
The popularity of this approach can be explained by the fact that enriching sparse user preference data in a certain domain with user preference data from other domains has been shown to significantly improve the generated recommendations under cold-start and sparsity conditions (SHAPIRA; ROKACH; FREILIKHMAN, 2013)(SAHEBI; BRUSILOVSKY, 2013). Figure 10 illustrates the Merging user preferences approach, in which user rating matrices from source (DS) and target (DT) domains are merged, and traditional single-domain CF-based recommender systems can be used on the merged data to recommend items from the target domain (IT).

Figure 10 – Merging user preferences approach (CANTADOR et al., 2015).

Given that, and the fact that our proposed CD-CARS is based on the Merging user preferences approach, we describe some works related to it according to the cross-domain RS perspectives described in Section 2.2. Table 2 classifies the related papers along these perspectives.

Table 2 – Cross-Domain CF-based RS using the Merging user preferences approach.
• (BERKOVSKY; KUFLIK; RICCI, 2007): Item Attribute domain level; Cross-Domain task; Accuracy goal; User and Item Overlap scenario; Probabilistic evaluation.
• (WINOTO; TANG, 2008): Item domain level; Linked-Domain task; Diversity and Accuracy goals; User Overlap scenario; Probabilistic evaluation.
• (NAKATSUJI et al., 2010): Item domain level; Cross-Domain task; Accuracy goal; User Overlap scenario; Probabilistic evaluation.
• (CREMONESI; TRIPODI; TURRIN, 2011): System domain level; Linked-Domain and Cross-Domain tasks; Cold-start and Accuracy goals; User Overlap scenario; Ranking evaluation.
• (SANTOS et al., 2012): Item domain level; Multi-Domain and Linked-Domain tasks; Cold-start and New user goals; User Overlap scenario; Ranking evaluation.
• (TIROSHI et al., 2013): Item domain level; Cross-Domain task; Accuracy goal; User Overlap scenario; Ranking evaluation.
• (SAHEBI; BRUSILOVSKY, 2013): Item domain level; Linked-Domain task; New user and Accuracy goals; User Overlap scenario; Probabilistic evaluation.
• (SHAPIRA; ROKACH; FREILIKHMAN, 2013): Item domain level; Linked-Domain task; Cold-start and Accuracy goals; User Overlap scenario; Probabilistic and Ranking evaluation.
• (LONI et al., 2014): Item domain level; Cross-Domain task; Accuracy goal; User Overlap scenario; Probabilistic evaluation.
• Proposed CD-CARS: Item domain level; Linked-Domain task; Cold-start, New user and Accuracy goals; User Overlap scenario; Probabilistic and Ranking evaluation.

Berkovsky, Kuflik e Ricci (2007) focused on cross-domain mediation of user models in CF-based recommendations. In cross-domain mediation, the user modeling data is imported from remote systems (source domains) exploiting the same CF recommendation technique as the target system (domain). Hence, both source and target domains represent the user models as lists of ratings provided by a user in both domains (user overlap). The CD-CFRS imports the complete set of nearest neighbors calculated by the remote system and uses these similarities in the target domain (cross-domain task). The prediction accuracy of that CD-CFRS was measured with the MAE metric (probabilistic) by using the EachMovie dataset (MCJONES, 1997). In this dataset, movies from different genres are considered as belonging to distinct domains (item attribute domain level), and one item can belong to two or more genres (domains), i.e., there is an item overlap.

Winoto e Tang (2008) investigated alternative benefits that cross-domain recommendations may have, such as serendipity and diversity. For that, they applied a traditional single-domain CF-based algorithm for making cross-domain recommendations by considering aggregated ratings from source and target domains.
The only change they made to the algorithm was in the weight of the Pearson correlation, which was modified in order to find neighbors with a higher number of co-rated items in the target domain. The CD-CFRS performance was measured with the MAE metric (probabilistic) by using a real collected dataset. This dataset was composed of several domains such as movies, books, games, among others (item domain level). In it, a user could rate items from all domains (user overlap).

Instead of aggregating user preferences directly, several studies have focused on directed weighted graphs that link user preferences from multiple domains (NAKATSUJI et al., 2010)(CREMONESI; TRIPODI; TURRIN, 2011)(TIROSHI et al., 2013).

Nakatsuji et al. (2010) created a domain-specific-user graph (DSUG) for each domain (source and target). In a DSUG, the nodes are users, and weighted edges are set between user nodes according to the similarity of the users computed in each domain. Also, the DSUGs from distinct domains are connected to create a cross-domain-user graph (CDUG). Thus, the cross-domain RS performs a Random Walk with Restarts (RWR) (LOVÁSZ et al., 1996) on the CDUG from the active user node, and extracts user nodes that are present in DSUGs from the target domain that do not include the node of the active user in the source domain. The authors evaluated the cross-domain RS by using a dataset from two different domains (movies and music) with user overlap, and verified that the accuracy (measured with the MAE metric) of their method is higher than that of a method that predicts user preferences by merging the user ratings from all domains.

Cremonesi, Tripodi e Turrin (2011) built a graph whose nodes are associated with items and whose edges reflect rating-based item similarities. In this case, the inter-domain connections are the edges between pairs of items in different domains. The authors also proposed to enhance inter-domain edges by discovering new edges and strengthening existing ones, through strategies based on transitive closure. Using datasets from different systems (system domain level) with the same item type (movies), they evaluated several CF-based algorithms (nearest neighborhood and latent factor techniques) on the built multi-domain graph. The authors estimated the accuracy of the algorithms in terms of the F-metric (ranking) (CREMONESI; KOREN; TURRIN, 2010), varying the user overlap level in the datasets (sensitivity analysis).

Santos et al. (2012) proposed an architecture for a recommender system in an inter-application environment and compared traditional (single-domain) and inter-application recommendations through the Breese metric (ranking). The proposed recommender system handles different profiles from various applications (with different user-rating scales and forms) by normalizing them. Besides, its recommendation module is based on traditional collaborative filtering techniques, which are adjusted for making cross-domain recommendations (Multi-domain and Linked-domain). The authors developed a web application in order to obtain real user preferences and generate three datasets with different item domains (movies, books, bands and singers) and user-rating scales/forms. In the experiments, 60 users evaluated at least 10 items in each of the analyzed item domains (user overlap), so that each user has at least 30 preferences in the system.
In these experiments, the authors evaluated several scenarios (combining distinct domains as target) in order to verify the quality of the proposed RS with respect to the cold-start and new user issues.

Tiroshi et al. (2013) merged data from source and target domains into a single bipartite user-item graph. From it, several statistical and graph-based features of users and items were extracted. These features were exploited by a machine learning algorithm that addressed the recommendation problem as a binary classification problem. Then, they applied a Random Forest classifier (LIAW; WIENER, 2002) in order to recommend items from the target domain based on the user preferences in the source domain, present in the unified bipartite user-item graph. The authors collected a dataset containing user preferences in multiple domains (book, movie and music) extracted from social network profiles (Facebook (http://www.facebook.com), Last.fm (http://www.last.fm), LinkedIn (http://www.linkedin.com), etc.) with user overlap. They adopted Precision (ranking) as the evaluation metric in order to verify the accuracy of their cross-domain RS.

Sahebi e Brusilovsky (2013) examined the impact of the size of user profiles in the source and target domains on the quality of cross-domain recommendations, and showed that aggregating ratings from a dense source domain increases the accuracy of recommendations in the target domain under cold-start conditions. Basically, the authors applied the k-Nearest Neighbors (k-NN) algorithm (LAROSE, 2005) to the aggregated ratings in order to perform recommendation in the target domain (linked-domain recommendation). For evaluating the CD-CFRS, they adopted a dataset (SAHEBI; COHEN, 2011) with two item domains (book and movie) and user overlap. The accuracy of the system was measured with the RMSE metric (probabilistic).

Similar to (SAHEBI; BRUSILOVSKY, 2013), the work proposed in (SHAPIRA; ROKACH; FREILIKHMAN, 2013) showed significant accuracy improvements by using aggregation-based methods when the user preferences in the target domain are sparse. In this case, the authors used a dataset composed of unary Facebook "likes" as user preferences from several domains (movies, TV shows, and music). Two algorithms were adopted: the k-NN algorithm with Jaccard similarity (AMATRIAIN et al., 2011), since the user ratings are in unary form, and a "Facebook popularity" baseline, which simply lists the top-mentioned user preference items in the Facebook profiles. The CD-CFRS accuracy was mainly measured by the Recall (ranking) and MAE metrics.

Finally, Loni et al. (2014) proposed a CD-CFRS with factorization machines (RENDLE, 2012) capable of transferring knowledge from different auxiliary domains to a target domain in order to improve rating predictions in the target domain. The CD-CFRS encodes rating matrices from multiple domains as real-valued feature vectors. With these vectors, the factorization machine finds patterns between features from the source and target domains, and estimates preferences associated with the input vectors. The CD-CFRS accuracy is evaluated through the MAE and RMSE metrics, and the Amazon dataset (LESKOVEC; ADAMIC; HUBERMAN, 2007) is used for evaluation purposes. This dataset is composed of user ratings from three different domains (book, music, and television) with user overlap.

It is important to notice that only four related papers are based on the "Linked-Domain" task (see Table 2), like our proposed CD-CARS.
While (WINOTO; TANG, 2008), (SANTOS et al., 2012), (SHAPIRA; ROKACH; FREILIKHMAN, 2013) and (SAHEBI; BRUSILOVSKY, 2013) explored only traditional single-domain CF-based algorithms for making cross-domain recommendations, (CREMONESI; TRIPODI; TURRIN, 2011) proposed a cross-domain CF-based algorithm and compared it with those traditional ones. The cross-domain CF-based algorithms presented in Table 2 differ from our proposed CD-CARS in that they do not take into account any contextual information for making recommendations. Therefore, in the next section, we present related works on cross-domain algorithms that use contextual information in order to improve the quality of their recommendations.

2.4.2 Cross-Domain Recommendation based on Context-Awareness

This thesis focuses on investigating the use of contextual information to enhance cross-domain collaborative filtering recommendations. According to Fernández-Tobías et al. (2012), no previous work had addressed the cross-domain recommendation task by deploying contextual features until then. Seminal works have been published more recently (BRAUNHOFER; KAMINSKAS; RICCI, 2013)(ZHANG; YUAN; YU, 2014)(MOE; AUNG et al., 2013)(MOE; AUNG, 2014b)(MOE; AUNG, 2014a)(KAMINSKAS et al., 2014)(TANG; WAN; ZHANG, 2014)(TEKIN; SCHAAR, 2015)(JI; SHEN, 2015), adopting various approaches to this issue, from semantic techniques to supervised learning, for instance.

Taking into account the cross-domain RS and CARS aspects described in Section 2.2 and Section 2.3, respectively, we describe and categorize some related works that make use of context-awareness techniques for providing cross-domain recommendations. Table 3 presents a classification of these works regarding cross-domain RS aspects, whereas Table 4 categorizes them with respect to the CARS perspectives.

Table 3 – Classification of context-aware-based related works regarding cross-domain RS aspects.
• (BLANCO-FERNÁNDEZ et al., 2010)(BLANCO-FERNÁNDEZ et al., 2011)(BLANCO-FERNÁNDEZ et al., 2011): Item domain level; Cross-Domain task; New Item and Diversity goals; User Overlap scenario; User Satisfaction evaluation.
• (BRAUNHOFER; KAMINSKAS; RICCI, 2013): Item domain level; Cross-Domain task; Diversity goal; No Overlap scenario; User Satisfaction evaluation.
• (YUAN et al., 2012)(ZHANG; YUAN; YU, 2014): Item domain level; Multi-Domain task; Diversity goal; User Overlap scenario; Qualitative and Ranking evaluation.
• (MOE; AUNG et al., 2013)(MOE; AUNG, 2014b)(MOE; AUNG, 2014a): Item domain level; Cross-Domain task; Accuracy goal; No Overlap scenario; Ranking evaluation.
• (KAMINSKAS et al., 2014): Item domain level; Cross-Domain task; Diversity goal; No Overlap scenario; Ranking evaluation.
• (TANG; WAN; ZHANG, 2014): Item Attribute domain level; Cross-Domain task; Diversity goal; No Overlap scenario; Ranking evaluation.
• (TEKIN; SCHAAR, 2015): Item domain level; Multi-Domain task; Diversity goal; No Overlap scenario; Qualitative evaluation.
• (JI; SHEN, 2015): Item domain level; Linked-Domain task; Accuracy goal; User Overlap scenario; Probabilistic evaluation.
• Proposed CD-CARS: Item domain level; Linked-Domain task; Cold-start, New user and Accuracy goals; User Overlap scenario; Probabilistic and Ranking evaluation.

Table 4 – Classification of context-aware-based related works with respect to CARS aspects.
• (BLANCO-FERNÁNDEZ et al., 2010)(BLANCO-FERNÁNDEZ et al., 2011)(BLANCO-FERNÁNDEZ et al., 2011): Representation: Ontologies; Obtaining: Implicitly; Relevance: Partially Observable (Static); Approach: Contextual Ontology.
• (BRAUNHOFER; KAMINSKAS; RICCI, 2013): Representation: Key-Value; Obtaining: Explicitly and Implicitly; Relevance: Partially Observable (Dynamic); Approach: Similarity-based.
• (YUAN et al., 2012)(ZHANG; YUAN; YU, 2014): Representation: Key-Value; Obtaining: Explicitly; Relevance: Partially Observable (Dynamic); Approach: Modelling.
• (MOE; AUNG et al., 2013)(MOE; AUNG, 2014b)(MOE; AUNG, 2014a): Representation: Ontologies; Obtaining: Explicitly; Relevance: Fully Observable (Static); Approach: Contextual Ontology.
• (KAMINSKAS et al., 2014): Representation: Ontologies; Obtaining: Explicitly; Relevance: Fully Observable (Static); Approach: Contextual Ontology.
• (TANG; WAN; ZHANG, 2014): Representation: Key-Value; Obtaining: Inferred; Relevance: Fully Observable (Dynamic); Approach: Supervised Learning.
• (TEKIN; SCHAAR, 2015): Representation: Key-Value; Obtaining: Implicitly; Relevance: Partially Observable (Static); Approach: Modelling.
• (JI; SHEN, 2015): Representation: Key-Value; Obtaining: Implicitly; Relevance: Partially Observable (Static); Approach: Modelling.
• Proposed CD-CARS: Representation: Key-Value; Obtaining: Implicitly and Inferred; Relevance: Partially Observable (Static); Approach: Pre-Filtering, Post-Filtering and Modelling.

TripFromTV+ (BLANCO-FERNÁNDEZ et al., 2011) selects personalized tourism resources (target domain) for Digital TV viewers by inferring their particular preferences from the kind of TV programs (source domain) that they enjoyed and from their activity on social networking sites (user overlap). The user profile, the contextual information, and the items (resources) are modelled through ontologies, so the recommendation is made with semantic reasoning methods. Specifically, relevant context information is associated with each tourism resource (e.g.
opening times, dates, duration, location and ticket price) and matched by the recommendation strategy against the user's partially observed and static context (e.g. location, temporal, etc.). The authors performed a simple experiment with 95 users in order to measure their satisfaction with the use of TripFromTV+. That work is difficult to compare with other cross-domain RSs, since its evaluation is empirical and TripFromTV+ adopts a knowledge-based method, which in general is domain-specific and requires extensive knowledge about the domains and their interconnections.

Braunhofer, Kaminskas e Ricci (2013) addressed the cross-domain recommendation task by developing a mobile application that selects music content (target domain) that fits a place of interest (source domain) visited by the user. For that, the application used the users' location and emotional tags (contextual information) assigned to both music tracks and points of interest (POIs), and adopted similarity metrics (e.g. cosine, Jaccard, etc.) to establish a match between music tracks and POIs based on their emotional tags. These tags were given explicitly by users, without any user overlap between the domains. Through a live user study with 10 users, the authors evaluated whether the mobile application is capable of providing recommendations with a certain degree of diversity.

That work is domain-specific and does not take into account the users' preferences in the cross-domain recommendation. Instead, it recommends items from the target domain (music) directly related to the source domain (POI) according to their contextual information. The user's context is only used for identifying in which POI he/she is located. Therefore, the same recommendations may be made for different users located in the same POI.

Yuan et al. (2012) proposed a context-aware feature selection framework for cross-media recommendation in a digital library. The recommended items in that digital library (DL) can be from different domains (e.g. book, movie and music) and have different user-defined tags representing contextual features such as emotions, location, and so on. Thus, the set of items, users and contexts is represented by a user-item-context tensor.
The authors initially applied a tensor factorization method (TUCKER, 1966) to that tensor and then used a k-NN clustering algorithm to recommend the top-n items regardless of the items' domain (multi-domain recommendation task). Finally, the authors performed experiments using the Douban (http://www.douban.com) cross-media dataset with user overlap in order to evaluate the quality of the cross-media recommendations by means of recall and diversity metrics.

As described, the goal of that work is to recommend items in several domains, aiming to improve the diversity of the system. In this case, only items from a specific domain could be recommended, without taking into account the current user's context. Besides, the contextual features considered in that work are based on user tags, which can vary widely among different users and are also applied to items. In this way, the users' contexts are not considered by that work.

In (MOE; AUNG, 2014b), a cross-domain RS was developed to recommend cosmetics (target domain) related to skin care problems (source domain). The developed system represented the contextual information through ontologies. This contextual information was related to cosmetics, such as Place Zone, Age Level, Cosmetics Brand, Season, and Price Range. The system was developed by using Taxonomic conversational case-based reasoning (Taxonomic CCBR) on ontological properties to manage personalization systematically (GUPTA, 2001), the Ford-Fulkerson algorithm (PARAMESWARAN; VENETIS; GARCIA-MOLINA, 2011) to build the bridge of semantic concepts between source and target domains, and a technique for gathering recommendations according to the users' contexts (called TOPSIS) (JADIDI; FIROUZI; BAGLIERY, 2010). The accuracy of the developed system was measured by means of ranking metrics such as Precision, Recall and F-measure on a simple dataset, without user overlap, containing information about cosmetics and skin care problems.

Like (BLANCO-FERNÁNDEZ et al., 2011), the work presented in (MOE; AUNG, 2014b) relies on the extensive use of knowledge about the two domains, whose interconnections must be established a priori by the RS designer. Thus, this domain-specific approach may be difficult to adjust to other domains (e.g. book, movie, music, etc.).

By extending the work proposed in (BRAUNHOFER; KAMINSKAS; RICCI, 2013), Kaminskas et al. (2014) proposed a knowledge-based framework for semantic networks that link concepts from different domains. The framework propagates node weights in order to identify the target concepts that are most related to the source concepts. Based on data from DBpedia (http://wiki.dbpedia.org), without user overlap, the authors evaluated the framework for recommending music (target domain) related to places of interest (source domain) according to location and time as contextual information explicitly defined by the users. Similar to (BRAUNHOFER; KAMINSKAS; RICCI, 2013), the authors evaluated the knowledge-based framework by means of an empirical experiment with some users. Therefore, the same criticism that we mentioned for (BRAUNHOFER; KAMINSKAS; RICCI, 2013) applies to that work as well.

Tang, Wan e Zhang (2014) defined the task of cross-language context-aware citation recommendation (despite being a cross-language task, we can classify it as a cross-domain one due to its approach), aiming to recommend English citations (target domain) for a given part of the text (context) where a citation is made (e.g. introduction, motivation, related work, etc.) in a Chinese paper (source domain).
This task is very challenging because the contexts and citations are written in different languages, and there is a language gap when matching them. To handle this problem, they adopted a method that uses machine translation (MT) to translate contexts and/or citations, so that the problem is reduced to monolingual context-aware citation recommendation. For this reduced problem, they proposed a bilingual context-citation embedding algorithm (called BLSRec-I), which can learn a low-dimensional joint embedding space for both contexts and citations. They evaluated the proposed methods on a real dataset that contains Chinese contexts and English citations. In this case, there is no concept of user or item overlap, given that the paper is treated as a "user". Accordingly, they adopted three ranking measures (Recall, MAP and Mean Reciprocal Rank) in order to evaluate the positions of the right citations in the ranking list for each given context. Therefore, that work is designed especially for the citation domain and intended for matching citations to the correct context, without taking users into account.

Tekin e Schaar (2015) proposed a multimedia content aggregation framework, which gathers content generated by multiple sources in order to provide content on demand for its users. They proposed a content aggregation algorithm, called DIStributed COntent Matching (DISCOM), capable of learning which content to gather and of matching it against users' preferences by exploiting similarities between user types. In that system, each user is represented together with its context, which is considered as the user's type. Based on this user type (context), the content aggregation framework requests content from one of the multimedia sources (multi-domain recommendation). Thus, the context can be represented as user information such as age, gender, among others. In addition, it may also be represented by the type of device that the user is using (e.g. computer, mobile phone, etc.). The authors adopted two datasets without user or item overlap for evaluation purposes: the Yahoo! Today Module (YTM) (LI et al., 2010) and a collected one with music items. Based on these datasets, they evaluated the diversity of the recommendations generated by the content aggregation framework. A limitation of that work is the fact that the contextualized recommendations are provided for user/device types; thus, they are not personalized to a single user.

Ji e Shen (2015) proposed an improved group-aware CF-based algorithm (the authors consider their work a context-aware RS by determining that a group can be viewed as a user type, i.e., a context) which predicts a user rating using a weighted sum of similar ratings from multiple user subgroups. The algorithm is based on matrix factorization and CodeBook Transfer (CBT) (LI; YANG; XUE, 2009a). The user subgroups are defined according to contextual information available from their ratings. This contextual information can be divided into three categories: users' contexts (age, gender, etc.), items' contexts (genre, release date, etc.), and environments' contexts from the user ratings (time, place, etc.). Experiments were done on three datasets with distinct domains (book, movie and music) with user overlap. The accuracy of the proposed algorithm was evaluated through probabilistic measures (MAE and RMSE).
As we can note, the same limitation that we mentioned for (TEKIN; SCHAAR, 2015) also applies to that work.

In summary, the majority of the cross-domain RS described and categorized above rely on ad-hoc CARS approaches (Contextual Ontology, Similarity-based and Supervised Learning), which may be difficult to customize to new situations, since they are usually designed for a specific domain and do not take into account context obtained from user ratings (ADOMAVICIUS; TUZHILIN, 2015). As can be seen from Table 4, our proposed CD-CARS, in turn, relies on the use of systematic context-aware techniques (Pre-Filtering, Post-Filtering and Modelling). These techniques have been successfully adopted for single-domain RS and, in general, require little domain knowledge, since they are based on context obtained from user ratings (ADOMAVICIUS; TUZHILIN, 2015).

It is important to mention that some related works adopted the systematic Modelling context-aware technique (YUAN et al., 2012)(TEKIN; SCHAAR, 2015)(JI; SHEN, 2015), as can be seen from Table 4. However, two of them (YUAN et al., 2012)(TEKIN; SCHAAR, 2015) perform different cross-domain tasks and have distinct cross-domain goals in comparison to our CD-CARS. In addition, the work proposed in (TEKIN; SCHAAR, 2015) is designed for making cross-domain recommendations with no overlap among users, in contrast to our CD-CARS. The related work proposed in (JI; SHEN, 2015) is the most similar to our proposed CD-CARS according to the classification in Table 3 and Table 4, but it differs from ours in that it proposes only a single systematic context-aware technique (Modelling) and is not based on the Merging user preferences approach of cross-domain CF-based algorithms, which allows traditional CF-based algorithms to be reused. Finally, the work proposed in (JI; SHEN, 2015) is intended for making recommendations for groups of users instead of recommending items to a single user.

Table 5 summarizes the main limitations of the related works mentioned above in comparison to our proposed CD-CARS.

Table 5 – Main limitations of context-aware-based related works in comparison to our proposed CD-CARS (criteria, in order: Systematic approach; Accuracy goal; Linked-Domain task; User Overlap; Merging user preferences).
• (BLANCO-FERNÁNDEZ et al., 2010)(BLANCO-FERNÁNDEZ et al., 2011)(BLANCO-FERNÁNDEZ et al., 2011): No; No; No; Yes; No.
• (BRAUNHOFER; KAMINSKAS; RICCI, 2013): No; No; No; No; No.
• (YUAN et al., 2012)(ZHANG; YUAN; YU, 2014): Yes; No; No; Yes; Yes.
• (MOE; AUNG et al., 2013)(MOE; AUNG, 2014b)(MOE; AUNG, 2014a): No; Yes; No; No; No.
• (KAMINSKAS et al., 2014): No; No; No; No; No.
• (TANG; WAN; ZHANG, 2014): No; No; No; No; No.
• (TEKIN; SCHAAR, 2015): Yes; No; No; No; Yes.
• (JI; SHEN, 2015): Yes; Yes; Yes; Yes; No.
• Proposed CD-CARS: Yes; Yes; Yes; Yes; Yes.

2.5 Final Remarks

In this chapter, we presented the main concepts related to this thesis as well as its related works. The research about these concepts provides the background for understanding the proposed CD-CARS, described in the next chapter.

3 CD-CARS Proposal

In this chapter, we describe the CD-CARS proposal. For that, we formalize the cross-domain context-aware recommendation problem (Section 3.1) and model the contextual information (Section 3.2).
In Section 3.3, we describe the proposed CD-CARS algorithms, whereas in Section 3.3.2 we present the cross-domain algorithms that can be adopted as a base, in combination with the proposed CD-CARS. At last, in Section 3.4, we mention the final remarks of this chapter.

3.1 CD-CARS Problem Formalization

As mentioned in Chapter 1, the majority of the proposed approaches to cross-domain recommendation deal with collaborative filtering (CF) (CREMONESI; TRIPODI; TURRIN, 2011)(FERNÁNDEZ-TOBÍAS et al., 2012). CF is more convenient for cross-domain recommendation due to the lack of homogeneous descriptions of item content in different domains. It can rely only on user ratings of items, usually represented by user-rating matrices (User x Item). Also, most of the available cross-domain RS suggest items regardless of the contextual conditions, which can be important to predict the users' preferences in a particular context. Despite the large number of existing cross-domain CF-based recommender systems (CD-CFRS), the synergy between them and context-aware techniques is still little explored (FERNÁNDEZ-TOBÍAS et al., 2012)(CANTADOR; CREMONESI, 2014).

In this way, we address the cross-domain recommendation problem under the CF and context-awareness perspectives. For that, as defined by Adomavicius e Tuzhilin (2015), we consider the user ratings as a function of three dimensions:

$CR : User \times Item \times Context \longrightarrow Contextual\ Ratings$

Thus, user ratings can be stored in a multidimensional user-rating-context tensor for each item domain (e.g. books, movies, music, among others). Notice that the notion of domain adopted in this thesis is based on the "Item level" definition (described in Section 2.2.1), by considering, for example, movies and books as belonging to different domains.

In order to formalize our cross-domain context-aware recommendation problem, we introduce the following definitions, considering a set of 'n' source domains ($S_1, S_2, \ldots, S_n$) and just one target domain (T).

Definition 1.
• $U_{S_1}, U_{S_2}, \ldots, U_{S_n}, U_T$: sets of users for each domain;
• $I_{S_1}, I_{S_2}, \ldots, I_{S_n}, I_T$: sets of items for each domain;
• $C_{S_1}, C_{S_2}, \ldots, C_{S_n}, C_T$: sets of contextual features for each domain;
• $CR_{S_i} : U_{S_i} \times I_{S_i} \times C_{S_i}$ (where $i = 1, 2, \ldots, n$) and $CR_T : U_T \times I_T \times C_T$: contextual user-rating tensors (i.e., multidimensional matrices or cubes) for each domain;
• $U_{S,T} = (U_{S_1} \cup U_{S_2} \cup \ldots \cup U_{S_n}) \cap U_T \neq \emptyset$: at least one user must have preferences for items in the target domain and in at least one source domain (user overlap);
• $I_{S,T} = I_{S_1} \cap I_{S_2} \cap \ldots \cap I_{S_n} \cap I_T = \emptyset$: there is no item overlap between domains;
• $C_{S,T} = C_{S_1} \cap C_{S_2} \cap \ldots \cap C_{S_n} \cap C_T = C_{S_1} \cup C_{S_2} \cup \ldots \cup C_{S_n} \cup C_T \neq \emptyset$: the same set of possible contexts is observed for user ratings in all domains (contextual overlap).

Hence, the problem to be solved in this thesis is to estimate unknown ratings for items in the target domain ($I_T$) by exploiting the user-rating tensors from the source and target domains ($CR_{S_i}$, where $i = 1, 2, \ldots, n$, and $CR_T$), assuming $U_{S,T}$, $I_{S,T}$ and $C_{S,T}$ as defined above.

It is important to mention that the ratings in the contextual user-rating tensors can have different scales or forms in distinct domains. For example, ratings of music could be represented in a binary form such as "Like" or "Dislike", while the ratings of movies and books could be represented, respectively, by five-star or ten-star scales. Therefore, the recommendation algorithms have to deal with this issue. For instance, an algorithm could normalize the different rating scales among distinct domains (SANTOS et al., 2012); a minimal sketch of such a tensor representation and normalization is given below.
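As an illustration only (the thesis does not prescribe a storage format), the contextual user-rating tensors can be kept as sparse dictionaries keyed by user, item and context, and a simple min-max rescaling can bring heterogeneous rating scales onto a common one; the dictionary layout and the [1, 5] target scale are assumptions of this sketch.

```python
# Sparse contextual user-rating tensors: {(user, item, context): rating}.
# (Illustrative layout; the context here is a single-attribute tuple.)
CR_books = {("u1", "b1", ("Weekend",)): 9,      # source domain, ten-star scale
            ("u2", "b2", ("Weekday",)): 6}
CR_movies = {("u1", "m1", ("Weekend",)): 4}     # target domain, five-star scale

def normalize(tensor, old_min, old_max, new_min=1.0, new_max=5.0):
    """Linearly rescale all ratings of one domain onto a common scale."""
    scale = (new_max - new_min) / (old_max - old_min)
    return {key: new_min + (r - old_min) * scale for key, r in tensor.items()}

print(normalize(CR_books, old_min=1, old_max=10))
# ratings 9 and 6 become roughly 4.56 and 3.22 on the common five-star scale
```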
3.2 Modelling Contextual Information

In this section, we describe how the contextual features are formalized (Section 3.2.1), as well as how the contextual information is obtained and selected considering its relevance (Section 3.2.2).

3.2.1 Contextual Features Formalization

As mentioned in Section 2.3.2, contextual information can be of different "types", each one defining a certain contextual dimension, such as time (e.g. "day of week", "period of the day", etc.), location (e.g. "at home", "at work", etc.), companion (e.g. "alone", "with friends", etc.), among others. Furthermore, each contextual dimension can have a hierarchical structure that can be represented as different attributes (e.g. Time: Date → DayOfWeek → TimeOfWeek, or Date → Month → Quarter → Year) (ADOMAVICIUS; TUZHILIN, 2015).

According to the CD-CARS problem formalization described before, we modelled a set of contextual features (illustrated in Figure 11), for each domain ($C_{S_1}, C_{S_2}, \ldots, C_{S_n}, C_T$), as a Cartesian product of k contextual dimensions: $C_d = D_1 \times D_2 \times \ldots \times D_k$ (where $d = S_1, S_2, \ldots, S_n, T$ domains) (ADOMAVICIUS et al., 2005). Each dimension $D_j$ ($j = 1, 2, \ldots, k$) can be represented by l contextual attributes ($A_1, A_2, \ldots, A_l$). Each attribute $A_z$ ($z = 1, 2, \ldots, l$) has a set of m values ($v_1, v_2, \ldots, v_m$) representing a part of the contextual information. Moreover, "Unknown", which represents a missing (or not observable) part of the contextual information, is a default value ($v_1$) for any contextual attribute.

Figure 11 – A contextual feature represented by dimensions, attributes and values.

Thus, the contextual information can be represented as a tuple of w values from different contextual attributes and/or dimensions, i.e., a possible context (c') of a set of contextual features can be denoted as $c' = (v_1, v_2, \ldots, v_w)$, where each value $v_s$ ($s = 1, 2, \ldots, w$) belongs to a different contextual attribute $A_z$ and/or dimension $D_j$. Note that the order of these values in the tuple does not change the meaning of the represented context (c').

For instance, consider three contextual dimensions (k = 3): D1 = Temporal, D2 = Location, D3 = Companion. Each one can have a different hierarchical representation through contextual attributes. Suppose that D1 has two (l = 2) attributes (A1 = Day, A2 = DayType), D2 has three (l = 3) attributes (A1 = City, A2 = State, A3 = Country) and D3 has one (l = 1) attribute (A1 = CompanionType). For each contextual attribute of those dimensions, there is a set of possible values, such as:

• Temporal dimension (D1): A1 = {Unknown, Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday}, with eight possible values (m = 8); A2 = {Unknown, Weekday, Weekend}, with three possible values (m = 3);

• Location dimension (D2): A1 = {Unknown, Aberdeen, ..., Zurich}, with 2839 possible values (m = 2839); A2 = {Unknown, Alabama, ..., Wisconsin}, with 381 possible values (m = 381); A3 = {Unknown, Australia, ..., Zambia}, with 113 possible values (m = 113);

• Companion dimension (D3): A1 = {Unknown, Alone, Accompanied, Family, Friends, Partner, Fellows}, with seven possible values (m = 7).

Given this example, a set of contextual features is the combination of all possible values from the different attributes (six) and dimensions (three); a small sketch of this modelling is shown below.
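A minimal sketch of this modelling, using the dimensions, attributes and value counts of the example above (the dictionary layout and the placeholder city/state/country names are assumptions for illustration):

```python
from math import prod

# The example dimensions, attributes and value sets (each includes "Unknown");
# the long City/State/Country lists are abbreviated with placeholder names.
dimensions = {
    "Temporal": {"Day": ["Unknown", "Sunday", "Monday", "Tuesday", "Wednesday",
                         "Thursday", "Friday", "Saturday"],                      # m = 8
                 "DayType": ["Unknown", "Weekday", "Weekend"]},                  # m = 3
    "Location": {"City": ["Unknown"] + [f"city{i}" for i in range(2838)],        # m = 2839
                 "State": ["Unknown"] + [f"state{i}" for i in range(380)],       # m = 381
                 "Country": ["Unknown"] + [f"country{i}" for i in range(112)]},  # m = 113
    "Companion": {"CompanionType": ["Unknown", "Alone", "Accompanied", "Family",
                                    "Friends", "Partner", "Fellows"]},           # m = 7
}

# Size of the Cartesian product over all attributes of all dimensions.
n_contexts = prod(len(vals) for attrs in dimensions.values() for vals in attrs.values())
print(n_contexts)   # 20534214456 -> roughly twenty billion possible contexts
```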
So, by multiplying the 'm' values of all attributes, in this case, approximately twenty billion different contexts result from the Cartesian product.

Notice that this contextual feature modelling does not guarantee the consistency of the information among different attributes. For example, a context c' = {Sunday, Weekday, Recife, Alagoas, USA, Unknown} would be valid according to our modelling, but inconsistent considering the real contextual information (e.g., a consistent context could be c' = {Sunday, Weekend, Recife, Pernambuco, Brazil, Unknown}). In this way, we make the RS application that uses this modelling responsible for obtaining consistent contextual information.

In fact, despite the huge number of possible contexts in the example given above, a real dataset obtained by the RS application in that example would have approximately one hundred and sixty thousand possible contexts, which is the result of multiplying the 'm' values of only the most fine-grained contextual attribute of each dimension: Day (A1 from D1) with m = 8, City (A1 from D2) with m = 2839, and CompanionType (A1 from D3) with m = 7, instead of the twenty billion ones.

It is important to mention that the proposed contextual feature modelling is based on the "Key-Value" model (referred to in Section 2.3.2). In this case, the matching between the context of the recommendation, which is called the "contextual criteria", and the contextual information represented by this model in the user ratings is made in a linear way. In other words, once the tuple of w contextual values is established, from different contextual attributes and/or dimensions, a contextual criteria can be used as a query term (i.e., the context of the recommendation), as described in Algorithm 1. A contextual criteria can also be represented as a tuple of w contextual values from the same contextual attributes and/or dimensions as the contextual information from the ratings.

Although the "Unknown" value is always a possible one for each contextual attribute, it has distinct meanings in the contextual criteria and in the contextual information from the ratings. In the contextual criteria, "Unknown" (v1) can be viewed as a part of the context to be ignored (i.e., uninformed). In this case, this value means that any value of the contextual information from the ratings is acceptable for that contextual attribute and dimension, including the "Unknown" one, which, for the contextual information from the ratings, represents a missing (or not observable) part of the contextual information, as mentioned before. Therefore, the algorithm described below considers, for contextual matching purposes, only the values different from "Unknown" in the contextual criteria. This mechanism is sufficient for the proposed CD-CARS algorithms.

Algorithm 1 – Matching between the context of the recommendation and the contextual information from ratings.
Input: C, I, n (where C is the contextual criteria array of contextual values, and I is the contextual information array of contextual values, both arrays with the same size n).
Output: isMatched (a boolean value determining whether there is a match between the context of the recommendation and the contextual information).
procedure contextualMatching(C, I, n)
    for v = 1 to n do
        if C[v] ≠ "Unknown" and C[v] ≠ I[v] then
            return false
        end if
    end for
    return true
end procedure

3.2.2 Obtaining and Selecting Relevant Contextual Information

In this thesis, we are not concerned with how to obtain contextual information. We make the RS application responsible for gathering this information and persisting it according to the proposed contextual model. As mentioned in Section 2.3.3, three methods are most often used to acquire contextual information: explicit, implicit, and inferred. So, depending on the data available in the datasets used by the RS application, some of these methods can be more useful than others. Although the source of the contextual information is irrelevant for the proposed CD-CARS algorithms, the quality of the obtained contextual information remains relevant for them.

On the other hand, after obtaining the contextual information, if there are many contextual attributes available for a contextual dimension in the contextual model, then selecting the relevant contextual information is important for the quality and performance of the CD-CARS. Taking the example previously given, if the temporal dimension (D1) has two attributes, Day ({Unknown, Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday}) and DayType ({Unknown, Weekday, Weekend}), then CD-CARS could select just one of these attributes for the recommendation. The same may occur between two distinct contextual dimensions; for example, also considering the location dimension (D2), CD-CARS could select just one of these dimensions for the recommendation.

Some types of contextual information (e.g. temporal, location, companion, etc.) can be more relevant in a given domain (e.g. books, movies, music, etc.) than others. As mentioned in Section 2.3.4, the selection of a relevant contextual dimension or attribute can be made with a feature selection method from data mining (LIU; MOTODA, 1998). In the proposed CD-CARS, for each different target domain where the recommendation takes place, we can apply the information gain measure, InfoGain(Class, Attribute) = H(Class) − H(Class | Attribute), where H denotes entropy as defined in Information Theory, considering the user rating as the class and each contextual attribute as the tested attribute. For instance, the user-rating class could have five possible nominal values representing the ratings of a five-star scale. Note that in this example we assume that all ratings from the source and target domains are in the same scale and form. However, as mentioned before, for different forms or scales of ratings among distinct domains, an algorithm must normalize them (SANTOS et al., 2012). Besides, the information gain calculated for each contextual attribute may vary depending on the target domain in which the data mining method is applied.

So, from the list of most relevant attributes generated by the information gain measure, the CD-CARS could select only the contextual attribute with the highest information gain value. Then, it could execute performance experiments with the selected attribute and, progressively, select the next relevant attribute of a different contextual dimension if the performance difference between the previously selected attribute and the next one is significant. In the case of selecting contextual attributes within the same dimension, however, the CD-CARS could select only the top attribute with the highest information gain value, since the subsequent attributes are nothing more than different representations of the selected top attribute. For instance, if the top contextual attribute were Day, followed by DayType, then selecting only the Day attribute would be sufficient, as it represents all values of the DayType attribute at a finer granularity.
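For reference, a direct Python rendering of Algorithm 1 follows; the only assumption added here is that both the contextual criteria and the contextual information are plain value tuples of the same length.

```python
UNKNOWN = "Unknown"

def contextual_matching(criteria, info):
    """Python rendering of Algorithm 1: every informed (non-"Unknown") value of
    the contextual criteria must equal the corresponding value of the rating's
    contextual information; "Unknown" in the criteria acts as a wildcard."""
    for c_value, i_value in zip(criteria, info):
        if c_value != UNKNOWN and c_value != i_value:
            return False
    return True

print(contextual_matching(("Sunday", UNKNOWN), ("Sunday", "Recife")))    # True
print(contextual_matching(("Sunday", "Recife"), ("Sunday", "Olinda")))   # False
```

In the proposed algorithms, this check is applied whenever the context of a stored rating must be compared with the context of the recommendation.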
3.3 CD-CARS Algorithms

The algorithms proposed in our work rely on the use of a base cross-domain recommender system, in which the predicted rating $\hat{R}(u,i)$ for a particular pair of user u and item i, belonging to the target domain item set ($I_T$), can be formalized as:

$\hat{R}(u,i) = CD(u, i, R_{S_1}, R_{S_2}, \ldots, R_{S_n}, R_T), \quad i \in I_T$   (3.1)

where $R_{S_i}$ ($i = 1, 2, \ldots, n$ domains) and $R_T$ are two-dimensional user-rating matrices in the source and target domains, respectively. Notice that the base cross-domain RS does not take into account the contextual information. In addition, it is possible that different scales and forms of ratings from distinct domains have to be handled by the base cross-domain RS algorithm. As mentioned before, an algorithm could normalize the ratings among distinct domains (SANTOS et al., 2012). In this way, we assume that all ratings from the source and target domains are in the same scale and form.

In our CD-CARS problem, we consider contextual user-rating tensors and we need a function (F) to make rating predictions of items (i) for users (u) in contexts (c), given the tensors from the source ($CR_{S_i}$, where $i = 1, 2, \ldots, n$ domains) and target ($CR_T$) domains, as defined in Equation 3.2.

$\hat{CR}(u,i,c) = F(u, i, c, CR_{S_1}, CR_{S_2}, \ldots, CR_{S_n}, CR_T), \quad i \in I_T$   (3.2)

The function (F) can be implemented using any of the three proposed CD-CARS algorithms described in the following section.

3.3.1 Proposed Algorithms

We designed the algorithms according to three different context-aware RS paradigms (ADOMAVICIUS; TUZHILIN, 2015): Pre-filtering (PreF), Post-filtering (PostF) and Modelling. These paradigms are usually adopted in single-domain RS, but we extended their directives to the cross-domain recommendation task by taking into account the contextual user-rating tensors from different domains.

3.3.1.1 Cross-Domain PreF Algorithm

The PreF algorithm initially uses contextual information to filter the contextual user-rating tensor from the target domain ($CR_T$) in order to obtain a two-dimensional (2D) user-rating matrix. The contextual user-rating tensors from the source domains ($CR_{S_1}, CR_{S_2}, \ldots, CR_{S_n}$), in turn, are collapsed into two-dimensional (2D) user-rating matrices by aggregating ratings for the same user-item pair in different contexts, prioritizing the user ratings given in the context of the recommendation (c). Then, the base cross-domain algorithm is applied to these matrices to produce the predicted ratings ($\hat{CR}(u,i,c)$). Figure 12 illustrates the PreF technique, which is formalized in three steps as follows.

• Step (1) Define the 2D reduced matrix (context-filtered matrix) for the target domain:

$R^c_T(u,i) = CR_T(u,i,c)$   (3.3)

The context-filtered matrix only has ratings according to:

$\hat{R}_T(u,i) = \begin{cases} \hat{CR}_T(u,i,c), & \text{if } c = o \\ \text{not available}, & \text{otherwise} \end{cases}$   (3.4)

where 'o' represents the rating's context.

• Step (2) Define the 2D aggregated matrices (prioritizing the user ratings given in the context of the recommendation) for the source domains:

$R_j(u,i) = CR_j(u,i,c)$   (3.5)

where $j = S_1, S_2, \ldots, S_n$ source domains.
For each source domain 'j', the aggregated ratings are calculated as:

$\hat{R}_j(u,i) = \begin{cases} \hat{CR}_j(u,i,c), & \text{if } c = o \\ \dfrac{\sum_{c \in C_j} \hat{CR}_j(u,i,c)}{|C_j|}, & \text{otherwise} \end{cases}$   (3.6)

where 'o' represents the rating's context.

• Step (3): Apply the base cross-domain technique using the reduced matrices:

$\hat{CR}(u,i,c) = CD(u, i, R_{S_1}, R_{S_2}, \ldots, R_{S_n}, R^c_T), \quad i \in I_T$   (3.7)

In the steps above, the matching between the user-rating context (c) and the context of the recommendation (o) is made according to Algorithm 1 (page 72). Besides, we assume that the user ratings from the source and target domains are in the same scale and form, as mentioned before.

Figure 12 – The pre-filtering cross-domain recommendation is made by filtering the target contextual user-rating tensor for a given context.

Since the PreF algorithm filters the contextual user-rating tensor from the target domain, it could be left with only a few user ratings to make recommendations in very specific contexts; in such cases, the recommendation process would be made almost entirely with user ratings from the source domains. Due to this drawback, we also propose other context-awareness techniques, as described in the next sections.

3.3.1.2 Cross-Domain PostF Algorithm

The PostF algorithm initially produces a single unified (aggregated) user-rating matrix for each domain by aggregating ratings for the same user-item pair in different contexts, prioritizing the user ratings given in the context of the recommendation (c). The base cross-domain algorithm is then applied using the aggregated rating matrices as input. Finally, contextual information is used to filter the ratings produced by the cross-domain algorithm. This filtering is done by considering items contained in the set of item categories (e.g. comedy, action, rock, etc.) preferred by the user in a given context (e.g. considering only comedy movies on weekdays). Figure 13 illustrates this algorithm, which is formalized in the following steps; a small sketch of the aggregation used in Step (1), which is the same aggregation as in Equation (3.6), is given after the steps.

• Step (1) Define the 2D aggregated matrices (prioritizing the user ratings given in the context of the recommendation) for the source and target domains:

$R_j(u,i) = CR_j(u,i,c)$   (3.8)

where $j = S_1, S_2, \ldots, S_n, T$ domains. For each domain 'j', the aggregated ratings are calculated as:

$\hat{R}_j(u,i) = \begin{cases} \hat{CR}_j(u,i,c), & \text{if } c = o \\ \dfrac{\sum_{c \in C_j} \hat{CR}_j(u,i,c)}{|C_j|}, & \text{otherwise} \end{cases}$   (3.9)

where 'o' represents the rating's context.

• Step (2): Apply the base cross-domain technique using the matrices from Step (1) and collect the predicted ratings:

$\hat{R}(u,i) = CD(u, i, R_{S_1}, R_{S_2}, \ldots, R_{S_n}, R_T), \quad i \in I_T$   (3.10)

• Step (3): Given a context of recommendation (o) and a user u as input, a rating produced for an item ($\hat{R}(u,i)$) is discarded if the number of "good" rated items of that item's category ($g_i$) is less than a threshold value (θ) in that context. Otherwise, the rating predicted by the cross-domain algorithm is maintained:

$\hat{CR}(u,i,c) = \begin{cases} \hat{R}(u,i), & \text{if } c = o \text{ and } CP(u,c,g_i) \geq \theta \\ \text{not available}, & \text{otherwise} \end{cases}$   (3.11)

where the category preferences tensor ($CP(u,c,g)$) contains the number of "good" rated items of each item category g, from the different domains, observed in a context c for a user u.

The definition of a "good" rated item can be made according to the scale and form of the user ratings from the distinct domains. As mentioned before, we consider that all user ratings are normalized among the different domains. In this way, a "good" rated item could be one with a rating of at least "four" on a five-star scale, for example.
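The sketch below is a hedged illustration of the aggregation of Equations (3.6) and (3.9): the dictionary layout is an assumption, the average is taken over the contexts in which the pair was actually rated, and an exact context-equality test stands in for Algorithm 1.

```python
from collections import defaultdict

def aggregate_tensor(tensor, rec_context, matches):
    """Collapse a contextual tensor {(u, i, context): r} into a 2D matrix
    {(u, i): r}: a rating given in the context of the recommendation is kept
    as-is; otherwise the ratings of the same user-item pair are averaged over
    the contexts in which the pair was rated (cf. Equations (3.6)/(3.9))."""
    in_context, others = {}, defaultdict(list)
    for (u, i, ctx), r in tensor.items():
        if matches(rec_context, ctx):
            in_context[(u, i)] = r
        else:
            others[(u, i)].append(r)
    matrix = {pair: sum(rs) / len(rs) for pair, rs in others.items()}
    matrix.update(in_context)   # ratings in the recommendation context take priority
    return matrix

CR_T = {("u1", "m1", ("Weekend",)): 5, ("u1", "m1", ("Weekday",)): 2,
        ("u1", "m2", ("Weekday",)): 4}
print(aggregate_tensor(CR_T, ("Weekend",), lambda a, b: a == b))
# {('u1', 'm1'): 5, ('u1', 'm2'): 4.0}
```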
We leave this definition to the CD-CARS implementation, as well as the choice of the optimal θ value, which could be calculated considering the user's overall number of "good" rated items. For instance, if a user has fifty "good" rated items in general (regardless of their categories), then θ could be set to 10% of that number (i.e., θ = 5), which would mean that only categories with at least five "good" rated items are considered.

An alternative way to define Equation (3.11) is:

$$\widehat{CR}(u,i,c) = \begin{cases} \hat{R}(u,i) \times (1+\omega), & \text{if } c = o \text{ and } CP(u,c,g_i) \geq \theta \\ \hat{R}(u,i) \times (1-\omega), & \text{otherwise} \end{cases} \tag{3.12}$$

where ω is a factor to increase or decrease the predicted rating value $\hat{R}(u,i)$. The θ value has the same meaning as in Equation (3.11), defining a threshold that determines which item categories are relevant according to the minimal number of "good" rated items. Thus, relevant categories have their predicted rating values increased whereas irrelevant categories have them decreased. In this case, ω could be an empirically defined value (e.g. a value of 0.1 would increase or decrease the predicted rating by 10%). It could also be calculated proportionally to the relevance of the item category preferred by the user in a given context (e.g. the higher the number of "good" rated items, the higher the ω value). Therefore, the PostF algorithm can adjust the predicted rating instead of filtering it out. As with the θ value, the definition of ω is the responsibility of the CD-CARS implementation.

Figure 13 – The cross-domain post-filtering recommendation is made over the aggregated user-rating matrices and then post-filtered according to contextual user preferences.

It is important to mention that g could also be expressed as a set of attributes which characterize an item (e.g. user tags), instead of item categories such as item genres (e.g. comedy, action, rock, etc.), without loss of generality.

Similar to the PreF algorithm, the PostF algorithm applies a base cross-domain algorithm. However, none of the input tensors (from the source and target domains) is filtered by context. Instead, all contextual user-rating tensors are reduced to matrices (by aggregating user-ratings from different contexts) that serve as input for the base cross-domain algorithm. After applying the base algorithm, only the post-filtered ratings are taken into account.

In this process, an important task is to build the category preferences tensor $CP(u,c,g)$. We build it from the contextual user-rating tensors of the source and target domains. Depending on the θ value, it is possible that a user only has category preferences in source domains. The same situation can happen when a dataset contains just a few overlapping users. In these cases, the PostF algorithm would not be able to recommend items in the target domain. In order to alleviate this problem, some techniques can be used, for example, association rule mining (HIPP; GÜNTZER; NAKHAEIZADEH, 2000) to discover usage patterns across different domains and contexts (e.g. we could infer that users who like to read romance books on weekdays also like to watch romance movies on weekdays). Thus, we propose enhancing the category preferences tensor with association rules that infer other item categories preferred by the users in the possible contexts.
Figure 14 illustrates this idea, which is only required when a user receives a recommendation in the target domain and the category preferences tensor has no information about his/her rated item categories in that domain. As can be seen in Figure 14, the association rules input is generated from the category preferences tensor $CP(u,c,g)$. Each entry of this input is extracted from a user (u) and represents a set of pairs, each composed of an item category (g) and a context (c) in which that user has a number of "good" rated items greater than or equal to the θ value. With that input, we can use an algorithm for generating association rules such as, for example, Apriori (AGRAWAL; IMIELIŃSKI; SWAMI, 1993). After applying that algorithm and obtaining the resulting association rules, we select only the most relevant ones according to their confidence and support levels. Optimal values for these parameters can vary depending on the dataset used in the CD-CARS application. In addition, we are interested only in "cross-domain" rules, i.e., rules that relate item categories of a source domain to item categories of the target domain. We discard rules that relate item categories between two source domains (or within the same domain), since they do not make the PostF recommendation possible when a user only has item category preferences in source domains. Finally, these rules are used to enhance the category preferences tensor, which will then contain inferred item categories and contexts beyond the original preferences.

Figure 14 – Category preferences tensor enhancement from association rules.

Note that source and target domains have different sets of categories (e.g. music vs. books). However, by using association rules in the category preferences tensor enhancement, we can make cross-domain PostF recommendations even for less related domains such as music and movies, making the CD-CARS domain-independent. We recall that we measure the relation among distinct domains according to their sets of item genres: the more item genres two domains have in common, the more related they are considered (e.g. Book and Television share several item genres, such as "romance", "educational", "religion", etc.).

3.3.1.3 Cross-Domain Modelling Algorithm

Unlike the PreF and PostF algorithms, the Modelling algorithm does not need a base cross-domain algorithm. In fact, it makes "multidimensional" recommendations by considering contextual information beyond users and items, without reducing the contextual user-rating tensors.

In this thesis, we propose the extension of two single-domain context-aware Modelling approaches: heuristic calculations (ADOMAVICIUS et al., 2005) and matrix factorization (BALTRUNAS; LUDWIG; RICCI, 2011), as mentioned in Section 2.3.5. This extension goes beyond the inclusion of contextual information in a single-domain CF-based algorithm, since we intend to perform cross-domain context-aware recommendations. Thus, we consider four dimensions (user, item, context, and domain) instead of three (user, item, and context) for the cross-domain context-aware recommendation.
The heuristic calculation approach described in (ADOMAVICIUS et al., 2005) includes contextual information by using an n-dimensional distance metric instead of the user-user or item-item similarity metrics traditionally used in such techniques (RICCI; ROKACH; SHAPIRA, 2011), whereas the matrix factorization approach described in (BALTRUNAS; LUDWIG; RICCI, 2011) can be generalized to consider additional dimensions (e.g. item domain) by representing the data as a tensor of four dimensions (user, item, context, and domain).

Figure 15 – The cross-domain modelling recommendation uses contextual information directly in the recommendation function as an explicit predictor of a user rating for an item.

The Modelling algorithms proposed in this thesis (illustrated in Figure 15) are formalized in a single step as follows.

• Step (1): Apply the extended version of the base cross-domain algorithm using the user-rating-context tensors:

$$\widehat{CR}(u,i,c) = CD(u,i,c,CR_{S_1},CR_{S_2},\ldots,CR_{S_n},CR_T), \quad i \in I_T \tag{3.13}$$

In this way, the Modelling algorithm based on heuristic calculations uses a similarity metric that, instead of only covering user-user and item-item distances, can also include other dimensions such as context and item domain. For example, if the similarity metric is the Euclidean distance, it can be defined as:

$$dist[(u,i,c,d),(u',i',c',d')] = \sqrt{w_1 d_1^2(u,u') + w_2 d_2^2(i,i') + w_3 d_3^2(c,c') + w_4 d_4^2(d,d')} \tag{3.14}$$

where $d_1$, $d_2$, $d_3$, and $d_4$ are distance functions defined for the dimensions User, Item, Context, and Domain, respectively, and $w_1$, $w_2$, $w_3$, and $w_4$ are the weights assigned to each of these dimensions. These weights can be set according to the relevance of the four dimensions or to empirical values.

As mentioned in Section 2.3.4, it is important to properly select the contextual information used in the recommendation dataset. In addition, depending on how the contextual information is obtained, it may be more or less relevant. For example, the contextual relevance might be low in a system where a user rates an item without explicitly considering the context of that rating, i.e., the context is dissociated from the user-rating (ADOMAVICIUS; TUZHILIN, 2015). On the other hand, a system that collects the user's contextual information together with his/her rating tends to be more reliable, yielding more relevant contextual information (ADOMAVICIUS; TUZHILIN, 2015). Therefore, the usefulness of the Modelling algorithm based on heuristic calculations depends on how strongly the context is associated with the user-rating.

To avoid this issue, we also propose to adapt algorithms based on matrix factorization (or, more specifically, tensor factorization) that consider contextual information, such as the ones proposed in (KARATZOGLOU et al., 2010), (HIDASI; TIKK, 2012), and (KIM; YOON, 2014). For these algorithms, the item domain (e.g. book, TV, music, etc.) can be treated as an additional contextual dimension. This adaptation is simple and maintains the original logic of those algorithms.

Finally, it is important to notice that, as opposed to the PreF and PostF algorithms, the Modelling algorithm does not consider the context of the recommendation; it only takes into account the context of the user-ratings. Thus, the Modelling algorithm can recommend items without knowing the context of the user at the moment of the recommendation.
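To make the heuristic variant more concrete, the following minimal sketch computes the four-dimensional weighted distance of Equation (3.14); the per-dimension distance functions and the weights shown are placeholder assumptions, since their actual definitions are left to the CD-CARS implementation.

```python
import math

# Illustrative sketch of the four-dimensional weighted distance of Equation
# (3.14). The per-dimension distance functions and weights are placeholders:
# their actual definitions are left to the CD-CARS implementation.

def exact_match(x, y):
    """Toy per-dimension distance: 0 if the values are equal, 1 otherwise."""
    return 0.0 if x == y else 1.0


def weighted_distance(a, b, dists, weights):
    """a and b are (user, item, context, domain) tuples; dists holds one
    distance function per dimension and weights one weight per dimension."""
    return math.sqrt(sum(w * d(x, y) ** 2
                         for d, w, x, y in zip(dists, weights, a, b)))


dists = [exact_match] * 4          # user, item, context, domain
weights = [0.4, 0.4, 0.1, 0.1]     # e.g. context and domain weighted lower
print(weighted_distance(("u1", "i1", "weekend", "book"),
                        ("u2", "i1", "weekend", "tv"),
                        dists, weights))
```

In practice, the user and item distances would be rating-based (e.g. derived from the similarity metrics of Section 3.3.2.1.1) rather than the exact-match toy used here.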
3.3.1.4 Cross-Domain Hybrid Contextual Algorithms

In the previous sections, we described the three proposed CD-CARS algorithms. In this section, we discuss how these algorithms can be combined.

One possibility is to combine the PreF and PostF algorithms, as illustrated in Figure 16. Naturally, the PreF algorithm can be used before the PostF one, since PreF only filters the recommendation data before the base cross-domain algorithm is applied, whereas PostF filters only the outcome of the base cross-domain algorithm.

Figure 16 – The cross-domain PreF algorithm can be used before the PostF algorithm in a possible combination.

Another possibility is to combine the Modelling and PostF algorithms, as illustrated in Figure 17. Again, the PostF algorithm is applied after the first algorithm, which in this case is the Modelling one. Unlike the utilization of the Modelling algorithm alone, in which the context of the recommendation is not considered, this combination requires the context at the moment of the recommendation, since the PostF algorithm demands this information in order to filter out irrelevant items recommended by the Modelling algorithm.

Figure 17 – The cross-domain modelling algorithm can be used before the PostF algorithm in a possible combination.

In the two hybrid algorithms mentioned above (PreF + PostF and Modelling + PostF), the combinations are made in a direct way, without requiring any adaptation of the proposed algorithms. Other combinations might demand such adaptations, which could be a future direction of our research.

3.3.2 Base Cross-Domain Algorithms

In this thesis, we propose the adoption of single-domain and cross-domain algorithms. According to the taxonomy presented in Section 2.2.5, the adopted algorithms fit in the Aggregating knowledge category and, more specifically, in the Merging user preferences approach. Section 3.3.2.1 describes the single-domain CF-based algorithms whereas Section 3.3.2.2 describes the cross-domain CF-based algorithms.

3.3.2.1 Single-Domain as Cross-Domain Algorithms

As stated by Cremonesi, Tripodi and Turrin (2011), if there is overlap among users and/or items, then standard single-domain CF algorithms can be used to generate cross-domain recommendations by merging the user-rating matrices from different domains, provided that these matrices are normalized (see Section 2.2.5). Thus, these algorithms can also be used to make cross-domain recommendations under our formalization of the CD-CARS problem, since we have an overlap of users among domains (see Section 3.1). Therefore, such CF-based algorithms can be used as the base cross-domain algorithm together with our proposed CD-CARS algorithms: we can apply a single-domain collaborative filtering algorithm as the cross-domain (CD) technique in Equations 3.7 and 3.10, for the PreF and PostF algorithms, respectively. Note that the Modelling algorithm does not require a base cross-domain algorithm, as mentioned before. The following sections describe algorithms from two traditional classes of CF-based algorithms that can be applied as a base cross-domain algorithm.
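Before detailing those classes, the merging step itself can be illustrated. The sketch below is only an illustration under simple assumptions (dictionary-based rating matrices and item identifiers that are unique across domains); it is not the framework code used in this thesis.

```python
# Illustrative sketch of the merging step: normalized per-domain user-rating
# matrices (held as {user: {item: rating}} dictionaries, with item identifiers
# unique across domains) are merged into a single matrix on which a standard
# single-domain CF algorithm can then be run as the base cross-domain
# technique. Recommendations are afterwards restricted to target-domain items.

def merge_domains(*domain_matrices):
    merged = {}
    for matrix in domain_matrices:
        for user, ratings in matrix.items():
            merged.setdefault(user, {}).update(ratings)
    return merged


books = {"u1": {"b1": 5.0, "b2": 3.0}, "u2": {"b1": 4.0}}
tv = {"u1": {"t1": 2.0}, "u3": {"t2": 5.0}}
print(merge_domains(books, tv))
# {'u1': {'b1': 5.0, 'b2': 3.0, 't1': 2.0}, 'u2': {'b1': 4.0}, 'u3': {'t2': 5.0}}
```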
3.3.2.1.1 Neighborhood-Based Algorithms

Neighborhood-based algorithms (RICCI; ROKACH; SHAPIRA, 2011) calculate the similarity between two users or items and produce a rating prediction by averaging the ratings expressed by similar users or items, weighted by the respective similarity values. We describe some of these algorithms below (RICCI; ROKACH; SHAPIRA, 2011):

• NNUserNgbr computes a neighborhood consisting of the nearest n users to a given user. "Nearest" users are defined by a similarity metric; in other words, the recommendations are derived from a neighborhood of the n most similar users. The optimal value of n can be defined through experiments.

• ThresholdUserNgbr computes a neighborhood through a similarity threshold and takes all users that are at least that similar to a given user. The threshold should be between −1 and 1. The higher the threshold value, the more selective the neighborhood. Again, the optimal threshold value can be estimated through experimentation.

• GenericItemBasedCF is simpler than the user-based CF algorithms described above because there is no parameter to be adjusted (such as n or the threshold). This item-based CF algorithm compares series of preferences expressed by many users for one item, rather than by one user for many items (user-based). Some of the similarity metrics used by user-based CF algorithms can also be used to compute the similarity between items.

A crucial aspect of these algorithms is the similarity computation between items or users. Similarity in user-based and item-based CF algorithms can be computed by means of traditional similarity metrics, such as (RICCI; ROKACH; SHAPIRA, 2011):

• Weighted Euclidean distance similarity computes the Euclidean distance (dist) between two such user or item points. The equation below denotes a generic calculation of this metric:

$$dist[(u,i),(u',i')] = \sqrt{w_1 d_1^2(u,u') + w_2 d_2^2(i,i')} \tag{3.15}$$

where $d_1$ and $d_2$ are distance functions defined for the User and Item dimensions, respectively, and $w_1$ and $w_2$ are the weights assigned to each of these dimensions. This metric never returns a negative value, and the more similar two users are (i.e., the larger the similarity value between them), the smaller the distance between them. In addition, if we only need to calculate the distance between users, we can consider i = i′; conversely, if we only need the distance between items, we can consider u = u′.

• Cosine similarity (cos) is a measure of similarity between two vectors of an inner product space that measures the cosine of the angle between them. In the context of item recommendation, this measure can be employed to compute user similarities by considering a user u as a vector $x_u \in \mathbb{R}^{|I|}$, where $x_{ui} = r_{ui}$ if user u has rated item i, and 0 otherwise. The similarity between two users u and v is then computed as

$$\cos(u,v) = \frac{\sum_{i \in I_{uv}} r_{ui} r_{vi}}{\sqrt{\sum_{i \in I_u} r_{ui}^2} \sqrt{\sum_{j \in I_v} r_{vj}^2}} \tag{3.16}$$

where $I_{uv}$ denotes the set of items rated by both users u and v. The same idea can be used to obtain the similarity between two items i and j:

$$\cos(i,j) = \frac{\sum_{u \in U_{ij}} r_{ui} r_{uj}}{\sqrt{\sum_{u \in U_i} r_{ui}^2} \sqrt{\sum_{u \in U_j} r_{uj}^2}} \tag{3.17}$$

Cosine similarity is particularly used in positive space, where the outcome is bounded in the [0,1] interval.

• Pearson correlation (PC) similarity is the ratio of the covariance of two data sets to the product of their standard deviations.
Unlike the cosine similarity, this metric considers the effects of the mean and variance of the ratings made by users u and v. The equation below denotes the calculation of this metric:

$$PC(u,v) = \frac{\sum_{i \in I_{uv}} (r_{ui} - \bar{r}_u)(r_{vi} - \bar{r}_v)}{\sqrt{\sum_{i \in I_{uv}} (r_{ui} - \bar{r}_u)^2 \sum_{i \in I_{uv}} (r_{vi} - \bar{r}_v)^2}} \tag{3.18}$$

The same idea can be used to obtain the similarity between two items i and j, according to the equation below:

$$PC(i,j) = \frac{\sum_{u \in U_{ij}} (r_{ui} - \bar{r}_i)(r_{uj} - \bar{r}_j)}{\sqrt{\sum_{u \in U_{ij}} (r_{ui} - \bar{r}_i)^2 \sum_{u \in U_{ij}} (r_{uj} - \bar{r}_j)^2}} \tag{3.19}$$

While the sign of the outcome of this metric indicates whether the correlation is direct or inverse, its magnitude (ranging from 0 to 1) represents the strength of the correlation.

More sophisticated similarity metrics can also be used, such as the ones proposed in (DIDAY; BOCK, 2000), (BEZERRA; CARVALHO, 2004), and (BEZERRA; CARVALHO, 2011), among others. The recommendation of these single-domain neighborhood-based algorithms, when used for cross-domain purposes, is made as usual (CREMONESI; TRIPODI; TURRIN, 2011), except that only items from the target domain are recommended.
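For illustration, the sketch below implements the user-user versions of the cosine and Pearson similarities (Equations 3.16 and 3.18) over dictionary-based rating data; in this sketch the Pearson means are computed over the co-rated items, which is one common convention, and the names used are illustrative rather than taken from the thesis implementation.

```python
import math

# Illustrative user-user implementations of the cosine (Equation 3.16) and
# Pearson (Equation 3.18) similarities over {user: {item: rating}} data.

def cosine(ratings, u, v):
    common = set(ratings[u]) & set(ratings[v])
    num = sum(ratings[u][i] * ratings[v][i] for i in common)
    den = math.sqrt(sum(r * r for r in ratings[u].values())) * \
          math.sqrt(sum(r * r for r in ratings[v].values()))
    return num / den if den else 0.0


def pearson(ratings, u, v):
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    mean_u = sum(ratings[u][i] for i in common) / len(common)
    mean_v = sum(ratings[v][i] for i in common) / len(common)
    num = sum((ratings[u][i] - mean_u) * (ratings[v][i] - mean_v) for i in common)
    den = math.sqrt(sum((ratings[u][i] - mean_u) ** 2 for i in common)) * \
          math.sqrt(sum((ratings[v][i] - mean_v) ** 2 for i in common))
    return num / den if den else 0.0


ratings = {"u1": {"i1": 5, "i2": 3, "i3": 4}, "u2": {"i1": 4, "i2": 2, "i3": 5}}
print(cosine(ratings, "u1", "u2"), pearson(ratings, "u1", "u2"))
```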
3.3.2.1.2 Matrix Factorization Algorithms

Matrix factorization algorithms (RICCI; ROKACH; SHAPIRA, 2011) map users and items to a latent feature space, commonly of reduced dimensionality f, and model user-item interactions as inner products in that space. The latent space tries to explain ratings by characterizing both items and users through factors automatically inferred from user feedback. The major challenge of these algorithms is computing the mapping of each item and user to its factor vector. Once the recommender system completes this mapping, it can easily estimate the rating a user will give to any item. Unlike neighborhood-based algorithms, matrix factorization algorithms do not use similarity metrics, but rather techniques for identifying latent semantic factors, such as singular value decomposition (SVD) (RICCI; ROKACH; SHAPIRA, 2011).

For the SVD, consider that each item i is associated with a vector $q_i \in \mathbb{R}^f$ and each user u is associated with a vector $p_u \in \mathbb{R}^f$. For a given item i, the elements of $q_i$ measure the extent to which the item possesses those factors, positive or negative. For a given user u, the elements of $p_u$ measure the extent of interest the user has in items that are high on the corresponding factors (again, positive or negative). The resulting dot product $q_i^T p_u$ (the dot product between two vectors $x, y \in \mathbb{R}^f$ is defined as $x^T y = \sum_{k=1}^{f} x_k \cdot y_k$) captures the interaction between user u and item i, i.e., the overall interest of the user in the characteristics of the item. The final rating is created by the rule below:

$$\hat{r}_{ui} = \mu + b_i + b_u + q_i^T p_u \tag{3.20}$$

where µ is the overall average rating and the parameters $b_u$ and $b_i$ indicate the observed deviations of user u and item i, respectively, from the average µ. In order to learn the model parameters ($b_u$, $b_i$, $p_u$ and $q_i$), the regularized squared error can be minimized:

$$\min_{b_*,q_*,p_*} \sum_{(u,i) \in \kappa} (r_{ui} - \mu - b_i - b_u - q_i^T p_u)^2 + \lambda_4 (b_i^2 + b_u^2 + \|q_i\|^2 + \|p_u\|^2) \tag{3.21}$$

The constant $\lambda_4$, which controls the extent of regularization, is usually determined by cross-validation. Minimization is typically performed by either stochastic gradient descent or alternating least squares. Alternating least squares techniques rotate between fixing the $p_u$'s to solve for the $q_i$'s and fixing the $q_i$'s to solve for the $p_u$'s. When one of these is taken as a constant, the optimization problem is quadratic and can be solved optimally (BELL; KOREN; VOLINSKY, 2007). An easy stochastic gradient descent optimization was popularized by Koren (2008). The algorithm loops through all ratings in the training data. For each given rating $r_{ui}$, a prediction $\hat{r}_{ui}$ is made and the associated prediction error $e_{ui} = r_{ui} - \hat{r}_{ui}$ is computed. For a given training case $r_{ui}$, the parameters are modified in the opposite direction of the gradient, producing:

• $b_u \leftarrow b_u + \gamma \cdot (e_{ui} - \lambda_4 \cdot b_u)$
• $b_i \leftarrow b_i + \gamma \cdot (e_{ui} - \lambda_4 \cdot b_i)$
• $q_i \leftarrow q_i + \gamma \cdot (e_{ui} \cdot p_u - \lambda_4 \cdot q_i)$
• $p_u \leftarrow p_u + \gamma \cdot (e_{ui} \cdot q_i - \lambda_4 \cdot p_u)$

Like the single-domain neighborhood-based algorithms, single-domain matrix factorization algorithms can be used as the base of cross-domain recommendations, provided that a single merged user-rating matrix from the different domains contains the information about users and items. The recommendation of these single-domain matrix factorization algorithms, when used for cross-domain purposes, is made as usual (CREMONESI; TRIPODI; TURRIN, 2011), except that only items from the target domain are recommended. It is important to mention that these single-domain matrix factorization algorithms are different from the tensor factorization described in the Modelling approach: instead of considering only the mapping between users and items, matrix factorization (or, more specifically, tensor factorization) algorithms can be generalized to also consider the context by representing the data as a tensor (KIM; YOON, 2014).

3.3.2.2 Cross-Domain Algorithm

In the previous section, we described the single-domain CF-based algorithms adopted as base cross-domain algorithms. One of the advantages of CD-CARS is that the majority of traditional single-domain CF-based algorithms can be used in combination with the proposed CD-CARS algorithms. In this section, we describe an actual cross-domain algorithm adopted in this thesis. Since this algorithm is originally intended to perform cross-domain recommendations, we can directly apply it as a base cross-domain algorithm in combination with our proposed CD-CARS algorithms. For that, we adopted a neighborhood-based (CF-based) cross-domain algorithm, due to its simplicity.

The adopted cross-domain neighborhood-based algorithm is NNUserNgbr-transClosure, proposed by (CREMONESI; TRIPODI; TURRIN, 2011). It enhances the NNUserNgbr algorithm (described in Section 3.3.2.1.1), which is intended for single-domain recommendations, by improving its user-to-user (or item-to-item) similarity calculations with a "transitive closure" method. This improvement is achieved by discovering indirect relations among elements (i.e., the transitive closure discovers all n-step similarity paths between any pair of users, extending their neighborhood), as illustrated in Figure 18. For instance, if there exist two direct links sim(A, B) = 1 (e.g. full similarity by the Pearson metric) and sim(B, C) = 1, then the transitive closure allows setting sim(A, C) = 1.

According to (CREMONESI; TRIPODI; TURRIN, 2011), this transitive closure procedure is described as follows. Given a binary relation S, where $s_{ij}$ is equal to either 1 or 0, the algebraic transitive closure of S is the union of the successive powers of the original matrix, i.e.:

$$S_{trans} = \bigcup_{n \in \mathbb{N}} S^n \tag{3.22}$$

where $\bigcup$ is the union operator. In our case, matrix S represents the weighted connections among a set of items. However, since this matrix does not represent a binary relation, Equation (3.22) has been adapted as follows.
Figure 18 – Original (a) and enhanced (b) item-to-item connections. Solid circles represent items belonging to a single domain, whereas blank circles represent cross items that act as a bridge among different domains (CREMONESI; TRIPODI; TURRIN, 2011).

The "union" operator, which is defined for binary relations, has been replaced by the "maximum" operator, Z = max(X, Y), where the maximum matrix Z between two similarity matrices X and Y is defined so that $z_{ij} = \max(x_{ij}, y_{ij})$. The maximum operator adds the similarities discovered for new links while maintaining the original values of existing connections (since original similarities are generally stronger than derived ones). (CREMONESI; TRIPODI; TURRIN, 2011) limited the transitive closure to only two steps: experiments showed that a transitive closure with more than two steps did not provide any sensible improvement in recommendation accuracy while increasing the computational requirements. Thus, the enhanced item-to-item similarity matrix was computed as:

$$S^* = \max(S, S^2) \tag{3.23}$$

Except for this user-to-user (or item-to-item) similarity calculation, the remaining logic of the NNUserNgbr-transClosure algorithm is the same as that of NNUserNgbr.

3.4 Final Remarks

In this chapter, we described the CD-CARS proposal. For that, we formalized the cross-domain context-aware recommendation problem and modeled the contextual information. In addition, we described the proposed CD-CARS algorithms as well as the base cross-domain algorithms adopted in this thesis.

Regarding the CD-CARS problem formalization, we recall the assumption that all ratings from the source and target domains are in the same scale and form. As mentioned before, an algorithm could normalize the ratings among distinct domains (SANTOS et al., 2012). In the same way, the proposal of the CD-CARS algorithms considers that all ratings from different domains are normalized.

As outlined in Section 3.2, the proposed contextual feature modelling is based on the "Key-Value" model (mentioned in Section 2.3.2), since it is simple and relatively easy to implement and use (VIEIRA; TEDESCO; SALGADO, 2009)(BETTINI et al., 2010). Besides, this kind of contextual model allows a quick matching between the context of the recommendation and the context represented in the user-ratings, so it is sufficient for the proposed CD-CARS algorithms.

One of the advantages of the proposed CD-CARS algorithms is the possibility of using traditional single-domain and cross-domain CF-based algorithms as the base algorithm combined with the proposed ones. The adoption of algorithms from other approaches (e.g. content-based filtering, semantic-based, and so on) as the base cross-domain algorithm may be studied in future research.

A common aspect of the PreF and PostF algorithms is the aggregation of user-ratings from different contexts in order to reduce the contextual user-rating tensors into matrices. Although these algorithms were designed for a scenario in which a user can have multiple ratings for the same item in distinct contexts, this usually does not happen in real datasets, as is the case of the ones adopted in this thesis. Thus, this particular aspect of those algorithms could not be evaluated experimentally.
Taking into account the PreF and Modelling algorithms, we can say that they are capable of making domain-independent cross-domain recommendations, i.e., they do not use any information about item attributes (unlike the PostF approach), so any "kind of item" (domain) can be used in these approaches. Indeed, the pre-filtering approach only reduces the dataset to be used by any traditional CD-CFRS, which is itself capable of making domain-independent cross-domain recommendations (CREMONESI; TRIPODI; TURRIN, 2011)(FERNÁNDEZ-TOBÍAS et al., 2012). Likewise, a heuristic-based modelling approach only changes the user/item similarity metric used by a traditional CD-CFRS, so the domain-independent cross-domain recommendation can be made as well. On the other hand, to make domain-independent cross-domain recommendations with the post-filtering approach, the dataset used must contain descriptions of item categories/genres (or any other attribute that categorizes the items). The post-filtering approach filters out (or adjusts), in the cross-domain recommended item list, the items of a category according to the category preferences of the users. These preferences are obtained according to the users' contexts and can be generated by several approaches, from simple arithmetic calculations to more sophisticated ones, such as the solution described in this chapter.

In the next chapter, we present an implementation of the CD-CARS proposal and an experimental evaluation performed on real datasets.

4 CD-CARS Implementation

This chapter describes particular details of an implementation of the CD-CARS proposal. For that, we investigate the problem outlined in Section 3.1 taking into account three contextual dimensions and three distinct domains from real datasets. Section 4.1 describes the properties of the CD-CARS datasets, with their different contextual information and domains, and how they were obtained; it also presents the process of selecting relevant contextual attributes and values. Section 4.2 presents how the contextual model is implemented through the extension of a traditional framework from the area of single-domain recommender systems. Section 4.3 describes two of the proposed algorithms (see Section 3.3.1) considering the implemented contextual model. Section 4.4 describes implementation details of two base cross-domain algorithms. Finally, Section 4.5 presents the final remarks of this chapter.

4.1 Dataset Acquisition

One of the main difficulties in evaluating cross-domain recommender systems is the lack of publicly available data representing the ratings of the same users on items classified in multiple domains (TANG et al., 2012). Although there are several datasets from different domains (television, music, books, etc.) taken separately, just a few of them have user-overlapped ratings; in most of them, each user has ratings in a single domain only. Our problem, moreover, demands a cross-domain dataset with contextual information. In other words, this dataset must have at least a small number of users with ratings in both the source and target domains (i.e. a cross-domain dataset with some level of user overlap), and some of these ratings must contain contextual information. In order to achieve that, we extracted two datasets from the dataset of (LESKOVEC; ADAMIC; HUBERMAN, 2007), which was not originally designed for evaluating cross-domain context-aware recommendations.
This dataset contains product metadata and review information about different Amazon products (books, music CDs, DVDs, and so on; available at https://snap.stanford.edu/data/amazon-meta.html), and we implemented a method to extract only the information relevant to our problem. For instance, we removed duplicated ratings and irrelevant information (e.g. number of votes, Amazon sales rank, product reviews, etc.), besides creating methods for gathering contextual information of three contextual dimensions (see Section 4.1.1).

One of the extracted datasets was used to evaluate the CD-CARS algorithms in two more related domains (Book and Television, named the "book-television dataset") and the other to evaluate it in two less related domains (Book and Music, named the "book-music dataset"). Both datasets contain a set of ratings, which are composed of:

• User ID: a positive integer value;

• Item ID: a positive integer value;

• Rating value: a positive integer value, defined explicitly by the user on a five-star scale; and

• Contextual information: an array of integer values that represent contextual values. Each index of the array represents a distinct contextual attribute (e.g. country) of a certain contextual dimension (e.g. Location), as described in Section 3.2.1.

Unfortunately, the contextual information was not directly available in the original dataset, so we had to obtain it implicitly from the ratings' dates and the users' Web accounts (from their Amazon IDs), and by inferring it from the ratings' reviews, as detailed in Section 4.1.1. Finally, we discarded from the datasets the users that had fewer than twenty ratings or did not have ratings in both domains (book/television for one dataset, and book/music for the other), which means that only overlapped users were included (full overlap).

From these datasets, we created reduced versions in order to evaluate the sensitivity of the CD-CARS algorithms to different levels of user overlap. Thus, beyond the full-overlap "book-television" and "book-music" datasets, we have four variations of each of them: two datasets with 10% of overlap (one for each domain as a target), and two with 50% of overlap (one for each domain as a target). We generated these reduced versions from the full-overlap datasets by removing all ratings for items from the target domain of users chosen randomly according to the overlap percentage. For example, the "book-television" dataset with 10% of user overlap and the Television domain as the target has 10% of the users with ratings in both domains (source and target), while the remaining 90% have ratings only for items in the Book domain (source). This dataset can thus be used to evaluate cross-domain recommendation in the Television domain as a target when only a small number of users has ratings in the target domain. A minimal sketch of this reduction procedure is shown below.
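The following sketch illustrates the reduction under simple assumptions (ratings held as in-memory tuples, a fixed random seed); it is not the actual extraction code used in this thesis.

```python
import random

# Illustrative sketch (not the actual extraction code) of how a reduced-overlap
# dataset can be derived from the full-overlap one: a fraction `overlap` of the
# users is randomly chosen to keep its target-domain ratings, and the
# target-domain ratings of all remaining users are removed.

def reduce_overlap(ratings, target_items, users, overlap, seed=42):
    """ratings: list of (user, item, rating, context) tuples;
    target_items: set of item ids belonging to the target domain."""
    rng = random.Random(seed)
    kept_users = set(rng.sample(sorted(users), int(len(users) * overlap)))
    return [r for r in ratings
            if r[1] not in target_items or r[0] in kept_users]
```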
Besides the set of ratings, the extracted datasets contain information about the items, such as item ID, title, domain, and categories. We extracted those categories from the original dataset for all items in all domains (book, television, and music). The categories of the Book and Television domains were mapped into a single set of 27 categories (based on Amazon's Movie & TV categories: unknown, action & adventure, international, animation, anime, boxed sets, classics, comedy, documentary, drama, educational, health, religion, fantasy, LGBT, holiday & seasonal, horror, artistical, kids & family, war, musicals, mystery, romance, sci-fi, special, sports, and westerns), while the Music domain had 19 categories (based on Amazon's CD music categories: unknown, jazz, rock, classic rock, international, classical, pop, blues, gospel, dance, new age, country, folk, vocal, alternative rock, hard rock, kids & family, rap, and special). Also, the extracted datasets contain information about the users, such as user ID, Amazon user ID, and address (obtained from the Amazon user ID, as detailed in Section 4.1.1.2). Finally, they contain the user-rating reviews, which are used for obtaining the companion contextual information (as detailed in Section 4.1.1.3).

4.1.1 Obtaining Contextual Information

As mentioned in Section 2.3.3, three methods are most often used to gather contextual information: explicit, implicit, and inferred. According to the contextual information available in the extracted datasets, we considered three contextual dimensions in the CD-CARS implementation. For two of them (Temporal and Location), we implicitly obtained the contextual information from the user-ratings, while for one of them (Companion), we inferred the contextual information from the user-rating reviews.

4.1.1.1 Temporal Dimension

The contextual information of the Temporal dimension can be directly extracted from user-rating timestamps, which are present in the majority of the datasets containing user-ratings. In this way, the timestamps can be transformed into several contextual attributes with different values and hierarchical levels. For example, timestamps could represent the "period of the day" attribute (Dawn, Morning, Afternoon and Night) as well as the "day type" attribute (weekend or weekday), as illustrated in Figure 19. A discussion about the hierarchical levels and possible values of contextual attributes can be found in Section 3.2.2.

Figure 19 – Example of a temporal dimension with its possible contextual attributes and values in a hierarchical view.

In this implementation, the real datasets used in the experiments only had date information for the user-ratings. For that reason, we could not extract contextual attributes related to the rating time (e.g. rating hour) or "period of the day". Hence, only contextual attributes related to the day or month could be extracted, such as "day type" or "period of the year", as sketched below.
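For illustration, the day-level attributes can be derived from a rating date as in the minimal sketch below, which uses Python's standard library; the attribute names are the ones discussed above and the function is merely illustrative.

```python
from datetime import date

# Illustrative extraction of the Temporal attributes available in these
# datasets (only the rating date is known, so only day-level attributes such
# as "day" and "day type" can be derived).

def temporal_context(rating_date: date) -> dict:
    day = rating_date.strftime("%A")  # e.g. "Saturday"
    day_type = "weekend" if rating_date.weekday() >= 5 else "weekday"
    return {"day": day, "day_type": day_type}


print(temporal_context(date(2016, 7, 23)))
# {'day': 'Saturday', 'day_type': 'weekend'}
```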
It is important to mention that the contextual information extracted from the user-rating date is not entirely reliable, because a user can rate an item and consume it at distinct moments, which clearly impacts the temporal context. For instance, a user could watch a movie on Saturday and rate it only on Sunday. This temporal gap would likely be more frequent for the "period of the day" attribute, since a user could start watching a movie in the afternoon and, due to its duration, rate it only at night, when the movie ends. Despite this risk, we believe that there are rating patterns according to the users' contexts in the same way that there are consumption patterns in those contexts. In other words, there are users who usually rate (rather than just watch) comedy movies on Sundays, for example. Thus, we expect the risk of gathering the temporal context implicitly to be minimal, which can be observed in the evaluation of the proposed recommender system.

4.1.1.2 Location Dimension

Usually, the contextual information of the Location dimension can be implicitly collected when the user is using a device with Internet access or with a Global Positioning System (GPS), for example. However, this information is not available in the real datasets adopted in this implementation. On the other hand, all user-ratings in the datasets have information about the user IDs (and the real Amazon user IDs), as mentioned before. From the actual Amazon user IDs, we created a web crawler responsible for extracting the address information from the profile web pages of the users' accounts, which can be accessed at a Uniform Resource Locator (URL) containing the Amazon user ID (e.g. http://www.amazon.com/gp/pdp/profile/AMAZONUSERID/, where "AMAZONUSERID" represents the real Amazon user ID, omitted here for privacy issues). The web crawler simply makes an HTTP (Hypertext Transfer Protocol) GET request and receives an HTML (HyperText Markup Language) page. This page is parsed by a regular expression based on a particular HTML tag in order to extract a string containing the user's address information. However, this string is defined by the user and is not standardized.

After obtaining the non-standardized address information from the user profile web page, we used a Representational State Transfer (REST) web service, Google Maps Geocoding (a developer key is required to use the web service, as described at https://developers.google.com/maps/documentation/geocoding/intro), which provides an Application Programming Interface (API) to retrieve the standardized address information of the users. This service takes a string address as parameter and returns a JavaScript Object Notation (JSON) document with multi-level address information, such as "country", "locality" (representing a city), "administrative_area_level_1" (representing a state, for example), "administrative_area_level_2" (representing a county or district, for example), and so on ("administrative_area_level_N", with N greater than two, representing more specific administrative areas). Listing 4.1 shows a response example for the requested string "recife".

Listing 4.1 – Example of a response (JSON document) from the Google Maps Geocoding API for the input "recife".

{
  "results": [
    {
      "address_components": [
        {
          "long_name": "Recife",
          "short_name": "Recife",
          "types": ["locality", "political"]
        },
        {
          "long_name": "Recife",
          "short_name": "Recife",
          "types": ["administrative_area_level_2", "political"]
        },
        {
          "long_name": "Pernambuco",
          "short_name": "PE",
          "types": ["administrative_area_level_1", "political"]
        },
        {
          "long_name": "Brazil",
          "short_name": "BR",
          "types": ["country", "political"]
        }
      ],
      ...
    }
  ]
}
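For illustration, a response such as the one in Listing 4.1 could be reduced to a standardized (country, state, city) triple as sketched below; only the fields shown in the listing are assumed, and error handling is omitted.

```python
import json

# Illustrative parsing of a Google Maps Geocoding response such as the one in
# Listing 4.1, keeping only country, state and city.

def standardized_address(response_text: str) -> dict:
    wanted = {"country": "country",
              "administrative_area_level_1": "state",
              "locality": "city"}
    address = {}
    doc = json.loads(response_text)
    for component in doc["results"][0]["address_components"]:
        for type_name, label in wanted.items():
            if type_name in component["types"]:
                address[label] = component["long_name"]
    return address
```

Applied to the response of Listing 4.1, this sketch would yield the city "Recife", the state "Pernambuco", and the country "Brazil".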
The standardized address (composed of country, state, and city) is extracted from the JSON document (through a JSON parser) and persisted into a database, which serves as an address catalog. This mechanism avoids unnecessary future requests to the service: for each address searched in the web service, we persist the search string (raw address) and its corresponding standardized address. This information is also persisted together with the users' information. Figure 20 illustrates the process for gathering the location contextual information described above.

Figure 20 – Process for gathering the location contextual information from the user information.

Since we only obtained static address information (country, state, and city were obtained from the users' web profiles), we could not extract contextual attributes related to abstract locations, such as a "place" attribute (at home, at work, in a movie theater, etc.). Therefore, only contextual attributes related to the geographical location could be extracted, as illustrated in Figure 21. It is important to mention that each user has the same location context for all his/her ratings, since this context was extracted from the address defined in his/her static web profile. Although the users' geographical locations are extracted, there is no guarantee that they are actually in those locations when rating (or consuming) an item, given that the locations come from web profiles defined at the users' initial registration in the system. However, we believe that the majority of the items are rated by the users in their registered location. Finally, it is important to say that not all users had address information available in their web profile, so many users did not have any contextual information about their location.

Figure 21 – Example of a location dimension with its possible contextual attributes and values in a hierarchical view.

4.1.1.3 Companion Dimension

In contrast to the previous contextual dimensions, for which we implicitly obtained the contextual information, the contextual information of the Companion dimension was inferred from the user-rating reviews available in the real datasets. For that, we implemented a method based on (BAUMAN; TUZHILIN, 2014).

(BAUMAN; TUZHILIN, 2014) proposed an unsupervised text mining algorithm for discovering relevant contextual information in user-generated reviews. Initially, they observed that contextual information is more likely to appear in specific reviews (those that describe specific details of an item, such as a book or movie) than in generic reviews (those that make overall comments about an item). After clustering the user reviews into two groups (specific and generic), they try to find key-words or topics describing the contextual information in those reviews, stating that such topics appear more frequently in the specific reviews than in the generic ones. In this way, they compare the frequencies of key-words (or topics) appearing in the specific and generic reviews and then select the key-words that have high frequency ratios, assuming that the selected key-words should contain most of the contextual information in the user reviews.
Finally, they inspect the list of selected key-words, manually identifying the relevant context-related topics.

We applied that method to our dataset with some adaptations to account for the different domains, as described in the following. In the first step of the method, we separated the reviews into specific and generic reviews by using the measures proposed in the original method (BAUMAN; TUZHILIN, 2014):

• LogSentences: logarithm of the number of sentences in the review plus one (the authors add one to avoid the logarithm becoming −∞ for empty reviews).

• LogWords: logarithm of the number of words used in the review plus one.

• VBDsum: logarithm of the number of verbs in the past tenses in the review plus one.

• Vsum: logarithm of the number of verbs in the review plus one.

• VRatio: the ratio of VBDsum and Vsum (VBDsum/Vsum).

With those measures, we used the classical K-means clustering method (JAIN, 2010) to separate all the reviews into the "specific" and "generic" clusters, as described in (BAUMAN; TUZHILIN, 2014). However, we applied the clustering separately to each domain (Book, Television, and Music). This was the first adaptation of the original method, which is intended for single-domain reviews. As a result, the vast majority of the reviews (99.8%) were classified as "specific" for all domains in our dataset. This result might be explained by the nature of the user-rating reviews available in the original dataset, which were analyzed by the dataset provider in order to keep only relevant user-rating reviews. It is important to mention that the majority of the user-ratings (76%) did not have reviews (only rating values), considering both datasets.

Given that the great majority of the reviews were classified as "specific", we simplified the word-based and LDA-based (BLEI; NG; JORDAN, 2003) methods proposed in (BAUMAN; TUZHILIN, 2014), since these methods rely on the separation of specific and generic reviews. The adapted word-based method is explained below (a sketch of its frequency computation follows the list):

1. For each review $R_i$, identify the set of nouns $N_i$ appearing in it.

2. For each noun $n_k$, determine its weighted frequency $w_s(n_k)$ with respect to the specific (s) reviews, as follows:

$$w_s(n_k) = \frac{|R_i : R_i \in \text{specific and } n_k \in N_i|}{|R_i : R_i \in \text{specific}|} \tag{4.1}$$

3. Filter out the words $n_k$ that have overall low frequency, i.e.,

$$w(n_k) = \frac{|R_i : n_k \in N_i|}{|R_i : R_i \in \text{specific}|} < \alpha \tag{4.2}$$

where α is a threshold value for the application (e.g., α = 0.005).

4. For each noun $n_k$ left after the filtering in the previous step, find the set of senses synset($n_k$) using WordNet (MILLER, 1995), a large lexical database of English in which words are grouped into sets of cognitive synonyms, each expressing a distinct concept; the function synset(word) returns a list of lemmas of the word that represent distinct concepts.

5. Combine senses into groups $g_t$ of close meanings using the WordNet taxonomy distance. Words with several distinct meanings can be represented in several distinct groups.

6. For each group $g_t$, determine its weighted frequency $w_s(g_t)$ through the frequencies of its members:

$$w_s(g_t) = \frac{|R_i : R_i \in \text{specific and } g_t \cap N_i \neq \emptyset|}{|R_i : R_i \in \text{specific}|} \tag{4.3}$$

7. Sort the groups by their weighted frequencies $w_s(g_t)$ in descending order.
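A minimal sketch of the frequency computation of steps 2 and 3 is given below; it assumes each specific review has already been reduced to its set of nouns (e.g. by a part-of-speech tagger) and, since only specific reviews are used in the adapted method, the frequencies of Equations (4.1) and (4.2) coincide here.

```python
# Illustrative computation of the weighted noun frequencies of Equations (4.1)
# and (4.2). Each specific review is assumed to have already been reduced to
# its set of nouns; since only specific reviews are used in the adapted
# method, w(n_k) and w_s(n_k) coincide in this sketch.

def weighted_frequencies(specific_reviews, alpha=0.005):
    """specific_reviews: list of sets of nouns, one set per specific review.
    Returns {noun: w_s(noun)} for the nouns kept by the filter of Eq. (4.2)."""
    total = len(specific_reviews)
    counts = {}
    for nouns in specific_reviews:
        for noun in nouns:
            counts[noun] = counts.get(noun, 0) + 1
    return {noun: count / total
            for noun, count in counts.items()
            if count / total >= alpha}
```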
The adapted LDA-based method is described below:

1. Build an LDA model on the set of specific reviews.

2. Apply this LDA model to all the user-generated reviews in order to obtain the set of topics $T_i$ of each review $R_i$ with a probability higher than a certain threshold level.

3. For each topic $t_k$ of the generated LDA model, determine its weighted frequency $w_s(t_k)$ with respect to the specific (s) reviews, as follows:

$$w_s(t_k) = \frac{|R_i : R_i \in \text{specific and } t_k \in T_i|}{|R_i : R_i \in \text{specific}|} \tag{4.4}$$

4. Filter out the topics $t_k$ that have overall low frequency, i.e.,

$$w(t_k) = \frac{|R_i : t_k \in T_i|}{|R_i : R_i \in \text{specific}|} < \alpha \tag{4.5}$$

where α is a threshold value for the application (e.g., α = 0.005).

5. Sort the topics by their weighted frequencies $w_s(t_k)$ in descending order.

Note that we did not use the generic reviews in either adapted method described above, unlike the original proposal (BAUMAN; TUZHILIN, 2014). In addition, after generating the sorted lists of key-words (or topics), we manually selected, in the list of topics of each item domain, only the topics related to the Companion contextual dimension. In contrast, the original method imposes no restriction on the contextual dimensions extracted from the key-words (or topics).

In this way, from the selected word groups and topics, we identified six contextual values (alone, accompanied, family, friends, partner, and colleagues) for a single "companion" contextual attribute. User-ratings are classified with the high-level "accompanied" value only when a more particular value, such as "family", could not be inferred; the "colleagues" value, in contrast to "friends", covers only co-workers, classmates, and the like. This contextual attribute is high-level and, like the attributes of other contextual dimensions, it could be expanded into more granular contextual attributes (illustrated in Figure 22), such as specific family members (father, mother, siblings, and so on), or even the actual person (e.g. the companion's name) the user is with. However, we kept the contextual attribute of the Companion dimension at a high level due to the source of the contextual information (inferred from reviews), which has only a few reviews with a particular description of the companion (specific family members or the companion's name).

Figure 22 – Example of a companion dimension with its possible contextual attributes and values in a hierarchical view.

Furthermore, contextual information of other contextual dimensions could be discovered (e.g. the task or purchase purpose). However, we did not consider other contextual dimensions due to the small percentage of user-rating reviews with their respective inferred contexts, in contrast to the Companion dimension, for which 85% of the user-rating reviews had an inferred context. This observation is reasonable because few users mention their temporal context in reviews, and inferring the companion context of user-ratings seems easier than inferring the temporal context, for example.

In order to evaluate the classification performance of the implemented method for companion extraction, we adopted the same methodology described by the authors of the method in (BAUMAN; TUZHILIN, 2014). For each item domain (book, television, and music), we randomly selected 300 reviews from the entire set of user-reviews (900 reviews in total), keeping 50 reviews for each contextual value (i.e., alone, accompanied, family, friends, partner, and colleagues).
Hence, we manually labeled these reviews according to their contextual values and measured the accuracy of the contextual classification by comparing the labeled reviews to the classified ones. The accuracy was calculated as the number of correct classifications over the total number of tested reviews. Table 6 reports the results of this empirical evaluation considering the different domains and contextual values.

Table 6 – Classification accuracy of the companion extraction.

Target Domain   Overall Accuracy   Alone   Accompanied   Family   Friends   Partner   Colleagues
Book            19.67%             94%     3%            8%       2%        8%        3%
TV              17%                76%     9%            5%       4%        6%        2%
Music           10.83%             52%     2%            4%       1%        5%        1%

As can be seen from the table, the implemented method did not perform well in the companion extraction task. Book was the domain with the best results in general, while Music had the worst ones. This result may be associated with the length of the user reviews, which is greater in the Book domain than in the other ones. In addition, the implemented method achieved, on average, better results for the Alone contextual value than for the other values. This result may be explained by the strong presence of the personal pronoun "I" in the user reviews, which can be taken as a "topic" by the implemented method, leading it to infer that the user was "alone".

4.1.2 Selecting Relevant Contextual Attributes and Values

As mentioned in Section 2.3.4, there are several approaches to determine the relevance of a given type of contextual information (contextual dimension). Considering that only relevant dimensions (Location, Temporal, and Companion) are present in the datasets, we still have to determine the relevance of the contextual attributes of these dimensions. For that, each user-rating in the datasets has contextual information about all contextual dimensions and their attribute variations, as described below:

• Temporal: in this dimension, we persisted two contextual attributes: day (values: Sunday to Saturday) and day type (values: weekend and weekday);

• Location: in this dimension, we persisted three contextual attributes: country, state, and city (112 countries, 380 states, and 2838 cities were found in the datasets used in this thesis);

• Companion: in this dimension, we persisted one contextual attribute: companion type (values: alone, accompanied, family, friends, partner, and colleagues).

Given these contextual dimensions and their attributes, we applied a data mining method (as mentioned in Section 3.2.2) to select only the most relevant contextual attributes of each contextual dimension. For that, we adopted the InfoGainAttributeEval method from Weka (HALL et al., 2009), which evaluates the worth of an attribute by measuring its information gain with respect to a "class". The output of this method is a ranking of the attributes indicating their importance in a classification task. In our case, the classification task serves to analyze the influence of the distinct contextual attributes on the user-rating value (the class). In this way, we applied InfoGainAttributeEval (using the weka.attributeSelection.Ranker ranker, configured with threshold -T -1.7976931348623157E308 and -N -1 as the number of attributes to select) with the user-rating value as the class (five possible values: 1 to 5) and six attributes (day, day type, country, state, city, companion type) for the two datasets used in this thesis, considering each target domain separately.
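For reference, the information gain reported by InfoGainAttributeEval corresponds to the entropy of the rating class minus its conditional entropy given the attribute; the following minimal sketch illustrates that computation (it is not the Weka code itself, and the example data is made up).

```python
import math
from collections import Counter

# Illustrative computation of the information gain reported by Weka's
# InfoGainAttributeEval: the entropy of the rating class minus its conditional
# entropy given one contextual attribute. `rows` is a list of
# (attribute_value, rating) pairs for a single contextual attribute.

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(rows):
    base = entropy([rating for _, rating in rows])
    by_value = {}
    for value, rating in rows:
        by_value.setdefault(value, []).append(rating)
    conditional = sum(len(group) / len(rows) * entropy(group)
                      for group in by_value.values())
    return base - conditional


# Made-up example: ratings grouped by "day type".
rows = [("weekend", 5), ("weekend", 4), ("weekday", 3), ("weekday", 5)]
print(round(information_gain(rows), 3))  # 0.5
```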
Table 7 and Table 8 report the information gain values of the contextual attributes in the different target domains of these datasets (these values range between 0 and 1, where a higher value represents a more discriminating attribute).

Table 7 – Information gain of contextual attributes in different target domains for the book-television dataset.

Target Domain   Day (Temporal)   Day Type (Temporal)   Country (Location)   State (Location)   City (Location)   Companion Type (Companion)
Book            2.6e-4           1.1e-5                2.3e-3               7.9e-3             2.8e-2            9.9e-5
TV              2.3e-4           8e-5                  5e-3                 1e-2               3.6e-2            4.6e-5

Table 8 – Information gain of contextual attributes in different target domains for the book-music dataset.

Target Domain   Day (Temporal)   Day Type (Temporal)   Country (Location)   State (Location)   City (Location)   Companion Type (Companion)
Book            3.9e-4           3.1e-5                2.2e-3               9e-3               3.2e-2            1.2e-4
Music           2.4e-4           2.6e-5                4.6e-3               8.3e-3             3.6e-2            2.5e-5

As these tables demonstrate, the information gain was similar in both datasets and their respective domains. In both cases, the "day" attribute was the most relevant in the Temporal dimension, and the "city" and "companion type" attributes were the most relevant in their respective contextual dimensions. Overall, the "city" attribute is the most relevant, followed by the "day" and "companion type" attributes, in that order. Therefore, we have chosen only these three attributes for the evaluation of the proposed CD-CARS. However, there is no guarantee that the quality of recommendation is better with the "city" attribute than with the others; the quality may depend on how well the recommendation algorithms exploit the available contextual information. Thus, this analysis does not eliminate the need for experimental evaluations that measure the quality of recommendations in different contexts (including those with less information gain).

It is important to mention that this analysis considers each attribute independently, and thus does not take into account any correlation between distinct contextual attributes. For that reason, we also performed experimental evaluations combining the selected contextual attributes (see Chapter 5). Moreover, we selected just one contextual attribute per contextual dimension, but other relevant contextual attributes could be used in the Location dimension, since not all user-ratings have information about their particular location (e.g. instead of "city", a user may only have information about "country" or "state"). Therefore, a CD-CARS implementation could consider any contextual information available; for evaluation purposes, however, we selected only the contextual attribute with the highest information gain per dimension, in order to limit the cost of evaluating several contextual attributes and their combinations.

Another aspect of the contextual information selection refers to selecting the most relevant contextual values of the contextual attributes. In order to verify this aspect, we generalized the Companion values into only two categories: "alone" and "not-alone", the latter containing all the other values, such as "accompanied", "family", "friends", "partner", and "colleagues". However, we verified that the information gain was higher when the Companion values were more granular, so we kept all the original companion values separately.
However, we recall that the quality of the inferred contextual information is low, which may affect the calculation of the information gain.

4.1.3 Cross-Domain Datasets Description

In this section, we describe the properties of the two extracted datasets: one for evaluating the CD-CARS in two more related domains (Book and Television, named the "book-television dataset" and described in Section 4.1.3.1), and another considering two less related domains (Book and Music, named the "book-music dataset" and described in Section 4.1.3.2).

4.1.3.1 Book-Television dataset

Table 9 summarizes the properties of the "book-television dataset", which can be split into two single-domain sub-datasets and three or more samples considering ratings from specific contextual dimensions. As the table shows, the Book domain has more ratings (≈64% of the total) than the Television domain (≈36% of the total). The table also summarizes the cross-domain dataset properties according to the three contextual dimensions (Temporal, Location and Companion): 100% of the ratings have information about the Temporal dimension, while almost 45% and 20% of them, respectively, have information about the Location (city) and Companion dimensions. Section 4.1.2 described why these contextual dimensions were chosen.

Table 9 – Cross-domain and single-domain "book-television dataset" properties with 100% of user overlap.

Dataset | Users | Items | Ratings | Ratings per User | Ratings per Item
Cross-domain (both domains) | 15341 | 194615 | 1249949 | 81.47 | 6.42
Books (single-domain) | 15341 | 165896 | 805102 | 52.48 | 4.85
Television (single-domain) | 15341 | 28719 | 444847 | 28.99 | 15.48
Temporal context (both domains) | 15341 | 194615 | 1249949 | 81.47 | 6.42
Location (city) context (both domains) | 7405 | 118020 | 557018 | 75.22 | 4.72
Companion context (both domains) | 13598 | 76295 | 251707 | 18.51 | 3.30
Location (city) AND Companion contexts (both domains) | 6846 | 52032 | 131257 | 19.17 | 2.52

Note that, in Table 9, all ratings from the full cross-domain dataset have a Temporal context associated with them (see the number of ratings in the first and fourth rows of the table), since this information is extracted from the rating dates, as described in Section 4.1. For this reason, we omitted the properties of the combinations between the Temporal and Location (or Companion) dimensions, since the number of ratings for these combinations is the same as the number of ratings considering only the Location (or Companion) dimension alone. For example, if we select the ratings with both Location (city) and Temporal contextual dimensions, we obtain a sample with 557018 ratings, which is the same number of ratings as for the Location (city) dimension alone.

It is important to mention that Table 9 shows the properties of the "book-television dataset" with full overlap between users. However, samples of that dataset are used with other user overlap levels in order to perform a sensitivity evaluation (see Section 2.2.6.3). As mentioned in Section 4.1, we generated reduced versions of the full-overlapped datasets by removing all ratings for items of the target domain from users chosen randomly according to the overlap percentage. In this way, Table 10 and Table 11 present, respectively, the "book-television dataset" properties for user overlap levels of 50% and 10% when Television is the target domain, while Table 12 and Table 13 show the "book-television dataset" properties for the same user overlap levels when Book is the target domain.
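The overlap-reduction procedure just mentioned can be sketched as follows. This is a simplified illustration and not the thesis code: the Rating record and field names are hypothetical, and the actual implementation operates on the ContextualDataModel described in Section 4.2.

    import java.util.*;
    import java.util.stream.Collectors;

    public class OverlapReducer {

        /** Minimal stand-in for a rating; the real data model also carries contextual codes. */
        public record Rating(long userId, long itemId, String domain, float value) {}

        /**
         * Keeps target-domain ratings only for a random fraction of the users
         * (the overlap level); source-domain ratings are always kept.
         */
        public static List<Rating> reduceOverlap(List<Rating> ratings, String targetDomain,
                                                 double overlapLevel, long seed) {
            List<Long> users = ratings.stream().map(Rating::userId).distinct().collect(Collectors.toList());
            Collections.shuffle(users, new Random(seed));
            int keep = (int) Math.round(users.size() * overlapLevel);
            Set<Long> overlappingUsers = new HashSet<>(users.subList(0, keep));

            return ratings.stream()
                    .filter(r -> !r.domain().equals(targetDomain) || overlappingUsers.contains(r.userId()))
                    .collect(Collectors.toList());
        }
    }

For instance, reduceOverlap(ratings, "TV", 0.10, 42L) would produce, in spirit, a 10%-overlap sample such as the one described in Table 11.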
Table 10 – "book-television dataset" properties with 50% of user overlap when "TV" is the target domain.

Dataset | Users | Items | Ratings | Ratings per User | Ratings per Item
Cross-domain (both domains) | 15341 | 188402 | 1011324 | 65.92 | 5.37
Books (single-domain) | 15341 | 165896 | 805102 | 52.48 | 4.85
Television (single-domain) | 7671 | 22506 | 206222 | 26.88 | 9.16
Temporal context (both domains) | 15341 | 188402 | 1011324 | 65.92 | 5.37
Location (city) context (both domains) | 7405 | 113049 | 446297 | 60.27 | 3.95
Companion context (both domains) | 13012 | 73353 | 206512 | 15.87 | 2.82
Location (city) AND Companion contexts (both domains) | 6571 | 49216 | 106664 | 16.24 | 2.17

Table 11 – "book-television dataset" properties with 10% of user overlap when "TV" is the target domain.

Dataset | Users | Items | Ratings | Ratings per User | Ratings per Item
Cross-domain (both domains) | 15341 | 178646 | 851680 | 55.57 | 4.38
Books (single-domain) | 15341 | 165896 | 805102 | 52.48 | 4.85
Television (single-domain) | 1534 | 12750 | 46578 | 30.36 | 3.65
Temporal context (both domains) | 15341 | 178646 | 851680 | 55.57 | 4.38
Location (city) context (both domains) | 7405 | 103862 | 529307 | 71.48 | 5.10
Companion context (both domains) | 12546 | 68084 | 332271 | 26.49 | 4.88
Location (city) AND Companion contexts (both domains) | 6363 | 44830 | 248592 | 39.06 | 5.55

Table 12 – "book-television dataset" properties with 50% of user overlap when "Book" is the target domain.

Dataset | Users | Items | Ratings | Ratings per User | Ratings per Item
Cross-domain (both domains) | 15341 | 131456 | 819335 | 53.41 | 6.23
Books (single-domain) | 7671 | 102737 | 374488 | 48.82 | 3.65
Television (single-domain) | 15341 | 28719 | 444847 | 28.99 | 15.48
Temporal context (both domains) | 15341 | 131456 | 819335 | 53.41 | 6.23
Location (city) context (both domains) | 7405 | 90838 | 581614 | 78.54 | 6.40
Companion context (both domains) | 12425 | 53684 | 361950 | 29.13 | 6.74
Location (city) AND Companion contexts (both domains) | 6298 | 36102 | 281688 | 44.73 | 7.80

Table 13 – "book-television dataset" properties with 10% of user overlap when "Book" is the target domain.

Dataset | Users | Items | Ratings | Ratings per User | Ratings per Item
Cross-domain (both domains) | 15341 | 62722 | 514965 | 33.57 | 8.21
Books (single-domain) | 1534 | 34003 | 70118 | 45.71 | 2.06
Television (single-domain) | 15341 | 28719 | 444847 | 28.99 | 15.48
Temporal context (both domains) | 15341 | 62722 | 514965 | 33.57 | 8.21
Location (city) context (both domains) | 7087 | 43859 | 244913 | 34.56 | 5.58
Companion context (both domains) | 11420 | 24291 | 104353 | 9.14 | 4.30
Location (city) AND Companion contexts (both domains) | 5788 | 17147 | 55803 | 9.64 | 3.25

4.1.3.2 Book-Music dataset

Table 14 summarizes the properties of the "book-music dataset". We can see in the table that the Book domain has more ratings (≈72% of the total) than the Music domain (≈28% of the total). In addition, 100% of the ratings have information about the Temporal dimension, while almost 46% and 11% of them, respectively, have information about the Location (city) and Companion dimensions. Note that, for the same reason mentioned in Section 4.1.3.1, we omitted the properties of the combinations between the other contextual dimensions, keeping only the combination between the Location and Companion dimensions.

Table 14 – Cross-domain and single-domain "book-music dataset" properties with 100% of user overlap.
Dataset | Users | Items | Ratings | Ratings per User | Ratings per Item
Cross-domain (both domains) | 13189 | 219034 | 1031386 | 78.20 | 4.71
Books (single-domain) | 13189 | 162449 | 742844 | 56.32 | 4.57
Music (single-domain) | 13189 | 56585 | 288542 | 21.88 | 5.10
Temporal context (both domains) | 13189 | 219034 | 1031386 | 78.20 | 4.71
Location (city) context (both domains) | 6951 | 132830 | 478510 | 68.84 | 3.60
Companion context (both domains) | 11519 | 75754 | 207010 | 17.97 | 2.73
Location (city) AND Companion contexts (both domains) | 6412 | 53999 | 116100 | 18.11 | 2.15

Regarding the sensitivity analysis, Table 15 and Table 16 present, respectively, the "book-music dataset" properties for user overlap levels of 50% and 10% when "Music" is the target domain, while Table 17 and Table 18 show the "book-music dataset" properties for the same user overlap levels when Book is the target domain.

Table 15 – "book-music dataset" properties with 50% of user overlap when "Music" is the target domain.

Dataset | Users | Items | Ratings | Ratings per User | Ratings per Item
Cross-domain (both domains) | 13189 | 208427 | 897227 | 68.03 | 4.31
Books (single-domain) | 13189 | 162449 | 742844 | 56.32 | 4.57
Music (single-domain) | 6595 | 45978 | 154383 | 23.41 | 3.36
Temporal context (both domains) | 13189 | 208427 | 897227 | 68.03 | 4.31
Location (city) context (both domains) | 6921 | 120509 | 407003 | 58.81 | 3.38
Companion context (both domains) | 11315 | 75233 | 197571 | 17.46 | 2.63
Location (city) AND Companion contexts (both domains) | 6303 | 53474 | 110730 | 17.57 | 2.07

Table 16 – "book-music dataset" properties with 10% of user overlap when "Music" is the target domain.

Dataset | Users | Items | Ratings | Ratings per User | Ratings per Item
Cross-domain (both domains) | 13189 | 177368 | 770030 | 58.38 | 4.34
Books (single-domain) | 13189 | 162449 | 742844 | 56.32 | 4.57
Music (single-domain) | 1319 | 14919 | 27186 | 20.61 | 1.82
Temporal context (both domains) | 13189 | 177368 | 770030 | 58.38 | 4.34
Location (city) context (both domains) | 6897 | 102369 | 350664 | 50.84 | 3.43
Companion context (both domains) | 11141 | 74144 | 190478 | 17.10 | 2.57
Location (city) AND Companion contexts (both domains) | 6203 | 52438 | 106797 | 17.22 | 2.04

Table 17 – "book-music dataset" properties with 50% of user overlap when "Book" is the target domain.

Dataset | Users | Items | Ratings | Ratings per User | Ratings per Item
Cross-domain (both domains) | 13189 | 154329 | 635947 | 48.22 | 4.12
Books (single-domain) | 6595 | 97744 | 347405 | 52.68 | 3.56
Music (single-domain) | 13189 | 56585 | 288542 | 21.88 | 5.10
Temporal context (both domains) | 13189 | 154329 | 635947 | 48.22 | 4.12
Location (city) context (both domains) | 6482 | 104249 | 317421 | 48.97 | 3.04
Companion context (both domains) | 8209 | 51833 | 116360 | 14.17 | 2.25
Location (city) AND Companion contexts (both domains) | 4578 | 36396 | 66499 | 14.53 | 1.83

Table 18 – "book-music dataset" properties with 10% of user overlap when "Book" is the target domain.

Dataset | Users | Items | Ratings | Ratings per User | Ratings per Item
Cross-domain (both domains) | 13189 | 87993 | 347805 | 26.37 | 3.95
Books (single-domain) | 1319 | 31408 | 59263 | 44.93 | 1.89
Music (single-domain) | 13189 | 56585 | 288542 | 21.88 | 5.10
Temporal context (both domains) | 13189 | 87993 | 347805 | 26.37 | 3.95
Location (city) context (both domains) | 5945 | 57465 | 172780 | 29.06 | 3.00
Companion context (both domains) | 5454 | 15949 | 35785 | 6.56 | 2.24
Location (city) AND Companion contexts (both domains) | 3071 | 10310 | 19971 | 6.50 | 1.94

4.2 Contextual Model Implementation

In this section, we describe how the contextual model, outlined in Section 3.2, was implemented. To implement this contextual model, we extended the Mahout framework (OWEN et al., 2011) (the Apache Mahout open source project is a machine learning library under the Apache Software Foundation - http://mahout.apache.org/).
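As background for the class diagrams discussed next, the two Mahout extension points most used in this implementation are the RecommenderBuilder and IDRescorer interfaces. A simplified sketch of a domain-filtering rescorer, in the spirit of the ItemDomainRescorer described below, could look roughly as follows (the class name and the item-ID set are illustrative; the actual class also consults the dataset meta-information):

    import java.util.Set;
    import org.apache.mahout.cf.taste.recommender.IDRescorer;

    /** Keeps only items of the target domain in the recommended item list. */
    public class DomainRescorer implements IDRescorer {

        private final Set<Long> targetDomainItemIds;

        public DomainRescorer(Set<Long> targetDomainItemIds) {
            this.targetDomainItemIds = targetDomainItemIds;
        }

        @Override
        public double rescore(long itemId, double originalScore) {
            return originalScore; // scores are left unchanged; only filtering is applied
        }

        @Override
        public boolean isFiltered(long itemId) {
            return !targetDomainItemIds.contains(itemId); // drop items outside the target domain
        }
    }

A rescorer of this kind is passed to Recommender.recommend(userId, howMany, rescorer), which is what makes cross-domain (and, with categories, post-filtering) recommendation possible on top of a standard Mahout recommender.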
Figure 23 and Figure 24 show two class diagrams that represent the contextual data model, considering the extension/realization of three Mahout entities (RecommenderBuilder, AbstractDataModel and IDRescorer).

Figure 23 – Data model class diagram focusing on contextual aspects of the CD-CARS implementation.

Figure 24 – Data model class diagram focusing on dataset aspects of the CD-CARS implementation.

In the following, we describe the main entities represented in those class diagrams:

• RecommenderBuilder - this Mahout interface guides the implementation of classes responsible for building recommender algorithms to be evaluated based on a given realization of the AbstractDataModel Mahout abstract class. In turn, these recommender algorithms must implement the Recommender Mahout interface to recommend items for a user.

• AbstractDataModel - this Mahout abstract class implements some basic methods defined by the DataModel Mahout interface, which represents a repository of information about users and their associated preferences for items (i.e., user-ratings).

• ContextualRecommenderBuilder - this novel interface extends the RecommenderBuilder Mahout interface in order to guide the building of recommender algorithms capable of taking advantage of context-awareness features.

• PreFilteringContextualRecommenderBuilder - this class implements the ContextualRecommenderBuilder interface. It is responsible for building and preparing the pre-filtering recommender algorithm (see Section 3.3.1.1) for evaluation according to three parameters: the target domain (restricting the source domains declared in the ItemDomainRescorer class, shown in Figure 24), the contextual data model (represented by the ContextualDataModel class), and the context of the recommendation (represented by the ContextualCriteria class).

• PostFilteringContextualRecommenderBuilder - this class implements the ContextualRecommenderBuilder interface and follows the same logic as the PreFilteringContextualRecommenderBuilder, but considering the post-filtering recommender algorithm (see Section 3.3.1.2) instead of the pre-filtering one.

• BaseCrossDomainRecommenderBuilder - this class implements the ContextualRecommenderBuilder interface. It is responsible for building and preparing the base cross-domain recommender algorithm (see Section 3.3.2) for evaluation according to two parameters: the target domain (restricting the source domains declared in the ItemDomainRescorer class, described later) and the contextual data model (represented by the ContextualDataModel class). In contrast to the PreFilteringContextualRecommenderBuilder and the PostFilteringContextualRecommenderBuilder, this class does not take into account the context of the recommendation (represented by the ContextualCriteria class), since it does not use any contextual information to recommend items in the target domain. However, it is used by the PreFilteringContextualRecommenderBuilder and PostFilteringContextualRecommenderBuilder classes, which bring the contextual data model; the BaseCrossDomainRecommenderBuilder, in turn, may use this data model only for evaluation purposes, for example, in specific contexts as a baseline (without interfering in the recommendation process).
• ContextualDataModel - while the AbstractDataModel Mahout abstract class contains only user-ratings, this novel class contains, besides the user-ratings, contextual information about these user-ratings; thus, the ContextualDataModel extends the AbstractDataModel. It is important to mention that the implementation of the AbstractDataModel is concerned with performance issues (OWEN et al., 2011) and, for that reason, represents all preferences of a user through a PreferenceArray object. This object contains a single user ID, an array of item IDs, and an array of preference values. In our implementation, we extended the PreferenceArray with a multidimensional array of contextual feature codes (ContextualPreferenceArray).

• ContextualCriteria - this novel class encapsulates the context of the recommendation, represented by a list of contextual values (according to their contextual dimensions and attributes), which in turn are implemented as enumeration objects that realize the AbstractContextualAttribute interface. The list of contextual values is composed of six enumeration objects, one for each implemented AbstractContextualAttribute.

• IDRescorer - this Mahout interface allows the realization of classes that can filter out items from the recommended item list according to several attributes, such as an item genre (e.g. action, comedy, etc.) or an item domain (e.g. book, music, etc.). The developer who creates a class implementing the IDRescorer is free to choose the appropriate attribute for his purposes. Therefore, this Mahout interface makes the cross-domain recommendation possible, as well as the post-filtering recommendation (see Section 3.3.1.2), since both the item genre and the item domain can be used as a filter.

• ItemDomainRescorer - this class implements the IDRescorer Mahout interface in order to filter out from the recommended item list those items that belong to the set of items from the source domains. These domains are specified through ItemDomain enumeration objects. For that, the ItemDomainRescorer depends on the information from a dataset (e.g. we created an AmazonCrossDataset class, which is an implementation of the AbstractDataset class and contains only three domains: Book, Television and Music).

• ItemCategoryRescorer - this class implements the IDRescorer Mahout interface in order to filter out from the recommended item list those items that belong to the set of categories specified by ItemCategory enumeration objects. Thus, this class is useful for the post-filtering recommendation algorithm and also depends on the information from a dataset, which contains a set of categories retrieved from the set of items.

• AbstractDataset - this abstract class encapsulates, besides the contextual data model, a set of meta-information about its users (UserDatasetInformation), items (ItemDatasetInformation), addresses (AddressDatasetInformation) and association rules (AprioriRuleItemCategoryDomain).

• AprioriRuleItemCategoryDomain - this class generates and persists a set of association rules used by the post-filtering recommendation (see Section 3.3.1.2). These general rules are generated from all user preferences in the contextual data model and relate item categories (ItemCategory enumeration objects) between different domains, e.g. "who likes action movies also likes rock music" - {ACTION, TV} => {ROCK, MUSIC}.
In addition, for each generated rule, the class maintains its confidence and support levels, as well as all combinations of contexts (represented by a ContextualCriteria) from the instances that generated that rule. These combinations of contexts are used a posteriori by the PostFilteringStrategyRecommendation class (described in Section 4.3.2), which generates more specific rules (e.g. {ACTION, TV, WEEKDAY} => {ROCK, MUSIC, WEEKEND}) by considering particular contexts. This decision avoids the generation of several contextualized rules that are not used by the Post-Filtering algorithm, since these rules are only used when a user does not have any category preference in a specific context.

• ItemDatasetInformation - this class represents the set of ItemInformation objects. An ItemInformation object contains information about an item, such as ID, name/title, year released, category, domain, link, and so on.

• UserDatasetInformation - this class represents the set of UserInformation objects. A UserInformation object contains information about a user, such as ID, Amazon user ID, raw address (a non-standardized string), and so on.

• AddressDatasetInformation - this class represents the set of AddressInformation objects. An AddressInformation object contains information about a standardized address, such as the raw address string, city, state, country, and so on.

4.3 Proposed Algorithms Implementation

In this thesis, we implemented two of the proposed algorithms (Section 3.3): Pre-Filtering (PreF) and Post-Filtering (PostF). In the next subsections, we describe some implementation particularities of these algorithms considering the contextual model entities mentioned in Section 4.2 and other Mahout entities.

4.3.1 Pre-filtering Implementation

We implemented the PreF algorithm according to its proposal, described in Section 3.3.1.1. The PreF implementation selects from the ContextualDataModel those target-domain user-ratings whose contexts (ContextualPreferenceArray) are "contained" in the context of the recommendation (defined by the ContextualCriteria), discarding the others. In other words, the user-rating context must be the same as the context of the recommendation, without considering the unknown contextual attributes of the user-rating context. Besides, it is important to remember that the user-ratings are only filtered in the target domain (ItemDomain), which is verified through the ItemDomainRescorer and the ItemDatasetInformation classes (described before).

To illustrate the pre-filtering process, consider a context of the recommendation and the contexts of a set of user-ratings, respectively represented by the ContextualCriteria and ContextualPreferenceArray entities in Figure 25. While the ContextualPreferenceArray contains a multidimensional array representing different user-ratings (first index of the array) and their contextual values in a sequence representing different contextual attributes (second index of the array), the ContextualCriteria contains a list of enumeration objects representing different contextual attributes, each with a contextual code for each contextual value. The same order of contextual attributes declared in the ContextualPreferenceArray is automatically used in the ContextualCriteria, since the ContextualFileAttributeSequence instance determines a unified order of the contextual attributes.
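The matching rule just described can be sketched as a simple comparison of contextual code arrays, where the code -1 stands for an unknown value. This is a minimal illustration under these assumptions; the method name mirrors the contextualMatching procedure referenced later in Algorithm 3, but the code below is not the thesis implementation:

    /**
     * Returns true when every contextual attribute either matches between the
     * user-rating context and the recommendation context or is unknown (-1) in
     * at least one of them.
     */
    public final class ContextMatching {

        public static final int UNKNOWN = -1;

        private ContextMatching() {}

        public static boolean contextualMatching(int[] ratingContext, int[] recommendationContext) {
            // Both arrays follow the attribute order fixed by the ContextualFileAttributeSequence
            for (int a = 0; a < recommendationContext.length; a++) {
                int ratingCode = ratingContext[a];
                int recommendationCode = recommendationContext[a];
                if (ratingCode == UNKNOWN || recommendationCode == UNKNOWN) {
                    continue; // unknown attributes are ignored in the comparison
                }
                if (ratingCode != recommendationCode) {
                    return false; // a known attribute differs, so PreF discards this user-rating
                }
            }
            return true;
        }
    }

Under this rule, a rating context such as {1, 1, 0, -1, -1, -1} matches the recommendation context {1, 1, -1, -1, -1, 2} of the example that follows, exactly as illustrated in Figure 26.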
As can be seen in Figure 25, the user (userId = '1') has ratings for five items (itemIds from '1' to '5'), where each rating for an item has an array of contextual codes in a specific order (e.g. itemIds[0] = 1, ratings[0] = 4.0, contextualPreferences[0] = {1, 1, 0, -1, -1, -1}). Each contextual code represents a contextual value of a different contextual attribute in the sequence defined by the ContextualFileAttributeSequence instance (see Footnote 15). Considering that the ContextualCriteria from the figure can be expressed by the following contextual code sequence: {1 ("SUNDAY"), 1 ("WEEKEND"), -1 ("UNKNOWN"), -1 ("UNKNOWN"), -1 ("UNKNOWN"), 2 ("FAMILY")}, only one user-rating (contextualPreferences[4]) is discarded, whereas the other four (contextualPreferences[0] to contextualPreferences[3]) are maintained by the PreF implementation. This process is illustrated in Figure 26, in which a green square means a matching between the contextual attribute values of the user-rating context and the recommendation context, whereas a red square means a non-matching. Finally, a yellow square means that the contextual attribute was not considered for matching between the user-rating context and the recommendation context, since one (or both) of these contextual attributes is "unknown".

Footnote 15: The six codes are represented, respectively, by the DayContextualAttribute, DayTypeContextualAttribute, LocationCountryContextualAttribute, LocationStateContextualAttribute, LocationCityContextualAttribute, and CompanionContextualAttribute enumerations.

Figure 25 – Class diagram illustrating entities used by the pre-filtering class.

Figure 26 – Example of the pre-filtering process considering the context of user-ratings and the recommendation context.

4.3.2 Post-filtering Implementation

In contrast to the PreF proposal, the PostF one (described in Section 3.3.1.2) allows several strategies to be implemented. Thus, we investigated some strategies to perform the PostF recommendation by varying its threshold value (θ). For instance, we set θ to 2/3 of the frequency of the most preferred category. Suppose that a user has given good ratings (at least 4.0 on a scale from 1.0 to 5.0), in a given context, to thirty religion books, twenty-five educational books, twenty comedy books, nineteen romance books, and ten action books, as illustrated in Figure 27. By applying the threshold strategy, the minimal number of occurrences for an item category to be kept in the recommendation list is twenty (2/3 of 30, which is the frequency of the most preferred category, religion). Thus, only religion (30 occurrences), educational (25 occurrences) and comedy books (20 occurrences) are included in the resulting recommendation, i.e., books of the other categories, such as romance (19 occurrences) and action (10 occurrences), are ignored. The optimal value of θ can be set through experiments, as can the minimal value for considering a rating "good", which varies with the rating value interval (e.g. in an interval of 1-10 we could consider 8.0 or more as a good rating). The higher the θ value, the smaller the number of categories included in the user's preferred categories.

Figure 27 – Example of selected categories in the post-filtering recommendation.

Figure 28 shows the main entities of the post-filtering implementation.
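Before detailing those entities, the threshold strategy exemplified above can be sketched as follows. This is a minimal illustration assuming the per-context category counts are already available as a map; the names are illustrative and this is not the thesis code:

    import java.util.*;

    public class CategoryThreshold {

        /** Keeps only the categories whose frequency reaches theta times the maximum frequency. */
        public static Set<String> selectPreferredCategories(Map<String, Integer> categoryCounts, double theta) {
            int max = categoryCounts.values().stream().mapToInt(Integer::intValue).max().orElse(0);
            double cutoff = theta * max; // e.g. 2/3 of 30 = 20 in the example above
            Set<String> selected = new HashSet<>();
            for (Map.Entry<String, Integer> entry : categoryCounts.entrySet()) {
                if (entry.getValue() >= cutoff) {
                    selected.add(entry.getKey());
                }
            }
            return selected;
        }

        public static void main(String[] args) {
            Map<String, Integer> counts = Map.of(
                    "RELIGION", 30, "EDUCATIONAL", 25, "COMEDY", 20, "ROMANCE", 19, "ACTION", 10);
            // Prints a set containing RELIGION, EDUCATIONAL and COMEDY only
            System.out.println(selectPreferredCategories(counts, 2.0 / 3.0));
        }
    }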
From Figure 28, we can see that the PostFilteringStrategyRecommendation class uses two distinct databases of preferred categories:

• UserCategoriesPrefsInContextsByDomain - this database contains all categories of rated items for each user in the dataset. However, only the categories of well-rated ("good") items were considered (at least 4.0 on a scale from 1.0 to 5.0). Besides, the number of occurrences of each category is associated with its observed contexts and domains (e.g. {RELIGION,BOOK,WEEKDAY} = 5, which means that a user rated five books as "good" on weekdays). Thus, this class contains one contextual preference tensor CP(u,c,g) (described in Section 3.3.1.2) for each item domain.

• CategoryContextDomainRulesMap - when a user does not have any information about preferred categories in a given domain, general association rules are necessary so that the post-filtering algorithm can recommend items even in a domain in which the user has no ratings (see Section 3.3.1.2). In this way, this database increments the set of rules initially generated by the AprioriRuleItemCategoryDomain class with contextual information. For instance, a {RELIGION,BOOK} => {RELIGION,TV} rule from the AprioriRuleItemCategoryDomain could be transformed into a {RELIGION,BOOK,WEEKDAY} => {RELIGION,TV,WEEKEND} rule. Each part of a rule is represented by a RuleTuple class, which contains information about the context (ContextualCriteria), the item category (ItemCategory) and the item domain (ItemDomain). In this case, rules are composed of one precedent RuleTuple and one consequent RuleTuple. Later in this section, we detail the implementation of the association rules generation process.

Figure 28 – A class diagram illustrating the main post-filtering entities.

It is important to mention that these databases are updated periodically on demand, i.e., at execution time and only when necessary. We made this design decision because we are concerned about performance issues, given that the generation of association rules is costly when we consider the multi-dimensional RuleTuple entities (item category, item domain and context). In order to generate these rules, we used the AprioriRuleItemCategoryDomain and CategoryContextDomainRulesMap implementations in a two-step process, described in Algorithm 2 (1-step) and Algorithm 3 (2-step).

Algorithm 2. AprioriRuleItemCategoryDomain algorithm for association rules generation (1-step).
Input: d (dataset), cl (minimal confidence level), sl (minimal support level), gr (minimal "good" rating threshold value)
Output: rs (rules with category and domain information)
1: procedure generatePreferredCategoriesMapByUser(d, gr)
2:   Create a uc data structure containing a list of users where each user (u) has a set of categories (cs)
3:   for each u ∈ d do
4:     Get the array (ur) of user ratings (r) from u
5:     for each r ∈ ur do
6:       v = value from r
7:       if v ≥ gr then
8:         i = item from r
9:         ic = categories from i
10:        acs = cs from uc for u
11:        for each c ∈ ic do
12:          if c ∉ acs then
13:            Add c in the acs
14:            Add acs in uc for u
15:          end if
16:        end for
17:      end if
18:    end for
19:  end for
20:  return uc
21: end procedure

1: procedure generateAprioriCategoryDomainRules(uc, d, cl, sl)
2:   for i = 0; i < size of the categories set from d; i = i + 1 do
3:     for j = 0; j < size of the categories set from d; j = j + 1 do
4:       di = domain from category i
5:       dj = domain from category j
6:       if di ≠ dj then
7:         Create a precedent tuple (pt) composed of category i and di
8:         Create a consequent tuple (ct) composed of category j and dj
9:         ptc = 0
10:        ctc = 0
11:        for each u ∈ uc do
12:          acs = cs from uc for u
13:          if category i ⊂ acs then
14:            ptc = ptc + 1
15:            if category j ⊂ acs then
16:              ctc = ctc + 1
17:            end if
18:          end if
19:        end for
20:        Create a rule (r) composed of pt, ct, ptc and ctc
21:        Add r in rs
22:      end if
23:    end for
24:  end for
25:  for each r ∈ rs do
26:    Get ptc and ctc from r
27:    nu = number of users from uc
28:    rcl = ctc/ptc
29:    rsl = ctc/nu
30:    if rcl < cl or rsl < sl then
31:      Remove r from rs
32:    end if
33:  end for
34:  return rs
35: end procedure
end

Algorithm 3. CategoryContextDomainRulesMap algorithm for association rules generation (2-step).
Input: rs, which is the set of rules from the 1-step algorithm; d (dataset); gr (minimal "good" rating threshold value); uc data structure containing a list of users where each user (u) has a set of categories (cs); a rule tuple condition (rtc) composed of an item category ict, an item domain idt and contextual criteria (cc); cl (minimal confidence level); and sl (minimal support level)
Output: crs (set of rules with category, domain and contextual information)

1: procedure addContextualInformationInAprioriCategoryDomainRules(rs, d, gr, uc)
2:   for each rule ∈ rs do
3:     for each u ∈ d do
4:       Get cs from uc for u
5:       Get the precedent tuple (pt) from rule
6:       Get the consequent tuple (ct) from rule
7:       Get ic1 from pt
8:       Get ic2 from ct
9:       if ic1 ⊂ cs and ic2 ⊂ cs then
10:        Get the array (ur) of user ratings (r) from u
11:        Create an empty set of contexts for the rule precedent category (cpt)
12:        Create an empty set of contexts for the rule consequent category (cct)
13:        for each r ∈ ur do
14:          v = value from r
15:          if v ≥ gr then
16:            i = item from r
17:            ic = categories from i
18:            if ic1 ⊂ ic then
19:              urc = context from r
20:              Add urc in cpt
21:            else
22:              if ic2 ⊂ ic then
23:                Add the context (urc) from r in cct
24:              end if
25:            end if
26:          end if
27:        end for
28:        if cpt ≠ ∅ and cct ≠ ∅ then
29:          for i = 0; i < cpt size; i = i + 1 do
30:            for j = 0; j < cct size; j = j + 1 do
31:              Add context i from cpt in pt for the rule
32:              Add context j from cct in ct for the rule
33:            end for
34:          end for
35:        end if
36:      end if
37:    end for
38:  end for
39:
40: end procedure

1: procedure getAprioriCategoryDomainContextRules(rs, rtc, cl, sl)
2:   for each rule ∈ rs do
3:     Get the precedent tuple (pt) from rule
4:     Get the consequent tuple (ct) from rule
5:     Get the item category (ic) from pt
6:     Get the item domain (id) from pt
7:     Get ict from rtc
8:     Get idt from rtc
9:     Get cc from rtc
10:    if ic = ict and id = idt then
11:      Get the total number of contexts (nc) from pt
12:      ptc = 0
13:      ctc = 0
14:      for i = 0; i < nc; i = i + 1 do
15:        if contextualMatching(cc, context i from pt, size of the array cc) then // according to Algorithm 1
16:          ptc = ptc + 1
17:          if contextualMatching(cc, context i from ct, size of the array cc) then
18:            ctc = ctc + 1
19:          end if
20:        end if
21:      end for
22:      rcl = ctc/ptc
23:      rsl = ctc/nc
24:      if rcl ≥ cl and rsl ≥ sl then
25:        Get the item category (icc) from ct
26:        Get the item domain (idc) from ct
27:        Create a contextual rule (cr) containing the rtc as precedent, and a consequent rule tuple composed of icc, idc and cc
28:        Add cr in crs
29:      end if
30:    end if
31:  end for
32:  return crs
33: end procedure
end

Both algorithms described above are based on the Apriori algorithm (AGRAWAL; IMIELIŃSKI; SWAMI, 1993). They can be seen as simplified versions of it, since we are interested only in a subset of rules. More precisely, since the PostF algorithm only uses the rule base when a user does not have contextual preferences in the recommendation domain, only rules that relate categories between different domains (source and target) are necessary. For example, we could obtain rules like "{RELIGION,BOOK} => {RELIGION,TV}" or "{ACTION,TV} => {ROCK,MUSIC}" from Algorithm 2.

As can be seen in Algorithm 2, first we perform the generatePreferredCategoriesMapByUser procedure considering only good ratings (we set the minGoodRatingValue to 4.0), and then we apply the generateAprioriCategoryDomainRules procedure, resulting in a 1-step rule base that contains rules between different domains considering only their item categories. For that, this procedure makes all possible combinations of precedent and consequent item categories of different domains (see lines 2 and 3 in the generateAprioriCategoryDomainRules procedure).

Furthermore, there are two Apriori parameters (minConfidenceLevel and minSupportLevel) that define a minimal threshold for keeping rules or not. If the minimum confidence and support levels for mining the rules are high, then the algorithm may not obtain enough rules for the PostF recommendation. Again, these rules are only required when a user does not have any category preference in the context of the recommendation for the target domain.

Therefore, for the datasets used in this implementation, we set the minConfidenceLevel to 0.7 and the minSupportLevel to 0.01. Thus, the 1-step algorithm obtained 25 rules (see Footnote 16) for the "book-television dataset" and 23 rules (see Footnote 17) for the "book-music dataset", both with full user overlap. If we take into account the number of categories in the three domains (27 for Book and Television, and 19 for Music), then we can say that almost half of these categories have a rule with an associated category, inferred by the 1-step algorithm. This also means that the PostF algorithm will not be able to recommend items of the categories that do not have any association rule.

Footnote 16: Examples of generated rules: {DRAMA,TV => FANTASY,BOOK} (confidence=0.75 and support=0.5), {ARTISTIC,BOOK => DRAMA,TV} (confidence=0.70 and support=0.12), {WESTERNS,TV => DOCUMENTARY,BOOK} (confidence=0.70 and support=0.09), {HEALTH,TV => EDUCATIONAL,BOOK} (confidence=0.74 and support=0.03), among others.
Footnote 17: Examples of generated rules: {POP,MUSIC => FANTASY,BOOK} (confidence=0.74 and support=0.33), {CLASSICAL,MUSIC => DOCUMENTARY,BOOK} (confidence=0.73 and support=0.14), {NEW AGE,MUSIC => EDUCATIONAL,BOOK} (confidence=0.72 and support=0.05), {KIDS,BOOK => KIDS,MUSIC} (confidence=0.78 and support=0.03), among others.

We used the same values for the minConfidenceLevel and minSupportLevel parameters in Algorithm 3. This 2-step algorithm has the set of rules generated by the 1-step algorithm as an input and generates a rule base containing rules between different domains considering their item categories and also their contexts.
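In terms of the counts computed in these procedures, the confidence and support of a rule {A} => {B} follow the usual definitions, where cs_u denotes the preferred-category set of user u built by generatePreferredCategoriesMapByUser:

    \text{confidence}(\{A\} \Rightarrow \{B\}) = \frac{ctc}{ptc} = \frac{|\{u : A \in cs_u \wedge B \in cs_u\}|}{|\{u : A \in cs_u\}|}, \qquad
    \text{support}(\{A\} \Rightarrow \{B\}) = \frac{ctc}{nu} = \frac{|\{u : A \in cs_u \wedge B \in cs_u\}|}{|U|}

With the thresholds adopted here (cl = 0.7 and sl = 0.01), a rule is therefore kept only if at least 70% of the users who like the precedent category also like the consequent one, and if this co-occurrence covers at least 1% of all users.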
Examples of rules inferred by this 2-step algorithm could be "{ACTION,TV,WEEKEND_FAMILY} => {ROCK,MUSIC,WEEKEND_FAMILY}" or "{RELIGION,BOOK,WEEKDAY_CANADA} => {GOSPEL,MUSIC,WEEKDAY_CANADA}". Note that both the precedent and consequent contexts of these rules are the same (according to line 27 in the getAprioriCategoryDomainContextRules procedure). This kind of rule (with the same precedent and consequent contexts) is sufficient for our purposes, since the PostF algorithm only needs to infer a set of preferred categories in the target domain for a user with preferred categories in the source domain, according to the context of the recommendation. This context is used to filter the user's preferred item categories from the source domain in order to obtain the inferred item categories in the target domain. Finally, it is important to remember that the PostF algorithm is applied after the base cross-domain recommendation, by filtering out recommended items in the target domain according to the user's item category preferences (inferred or not).

4.4 Base Cross-domain Algorithm Implementation

As mentioned in Section 3.3.2, we apply collaborative filtering algorithms as the base cross-domain algorithm. In this implementation, we adopted a neighborhood-based (user-based similarity) algorithm, due to its simplicity and to the results of a preliminary battery of experiments using other CF-based algorithms (e.g. item-based neighborhood and matrix factorization). This algorithm has also been used as a baseline for cross-domain recommendation purposes in (CREMONESI; TRIPODI; TURRIN, 2011), which proposed an enhanced version of that user-based algorithm, aiming to make cross-domain CF recommendations under user overlap conditions. The implementation of these algorithms is described in the following.

Algorithm 4. Item rating estimation with the implementation of the NNUserNgbr algorithm.
Input: ux (the user), un (the user neighborhood calculated by any traditional similarity metric), i (the item, from the target domain)
Output: er (estimated rating)

1: procedure estimatePreferenceInNNUserNgbr(ux, un, i)
2:   if un ≠ ∅ then
3:     p = 0.0
4:     ts = 0.0
5:     c = 0
6:     for each u ∈ un do
7:       if u ≠ ux then
8:         if u has a preference for i then
9:           Get the preference value (pv) of u for i
10:          Get the similarity value (sv) between u and ux
11:          p = p + (sv × pv)
12:          ts = ts + sv
13:          c = c + 1
14:        end if
15:      end if
16:    end for
17:    if c > 0 and ts ≠ 0.0 then
18:      er = p/ts; return er
19:    end if
20:  end if
21: end procedure
end

Algorithm 4 presents the item rating estimation with the implementation of the NNUserNgbr algorithm. It is important to notice that the item must be from the target domain, so the cross-domain recommendation can be seen as a reduced version of a traditional single-domain CF-based recommendation.
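In formula form, the estimation performed by Algorithm 4 is the usual similarity-weighted average over the neighborhood N(u_x) of the active user:

    \hat{r}_{u_x,i} = \frac{\sum_{u \in N(u_x)} sim(u_x, u) \cdot r_{u,i}}{\sum_{u \in N(u_x)} sim(u_x, u)}

where both sums run only over the neighbors that have actually rated the target-domain item i, matching lines 8 to 12 of the procedure.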
Besides, the user neighborhood can be calculated by any similarity metric described in Section 3.3.2.1.1. In addition, we implemented the enhanced version of the user-based algorithm proposed in (CREMONESI; TRIPODI; TURRIN, 2011) (NNUserNgbr-transClosure), as mentioned in Section 3.3.2.1.1. This algorithm differs from the NNUserNgbr only in that the user neighborhood calculation is extended. Thus, Algorithm 4 also represents the item estimation process of the NNUserNgbr-transClosure algorithm, considering the calculation of the extended user neighborhood, which is presented in Algorithm 5.

Algorithm 5. User neighborhood calculation for the NNUserNgbr-transClosure algorithm.
Input: ux (user), un (the user neighborhood calculated by any traditional similarity metric), mn (neighborhood size limit)
Output: unt

1: procedure userNeighborhoodInNNUserNgbr-transClosure(ux, un, mn)
2:   if un ≠ ∅ then
3:     Create a data structure (tm) containing a list of users where each user u has a set of similar users with their respective similarity values (sv)
4:     for each uA ∈ un do
5:       if uA ≠ ux then
6:         Get the uA neighborhood (uAn) according to any traditional similarity metric
7:         for each uB ∈ uAn do
8:           if uB ∉ un and uB ≠ ux and uB ≠ uA then
9:             Get the similarity value (svAB) between uB and uA
10:            if uB ∉ tm then
11:              Create a data structure of users (su) containing uA associated with the svAB value
12:              Add su in tm for uB
13:            else
14:              Get su in tm for uB
15:              Add uA associated with the svAB value in su
16:            end if
17:          end if
18:        end for
19:      end if
20:    end for
21:    unt = un
22:    Get the minimum similarity value (msv) from unt
23:    Get the least similar user (lsu) from unt
24:    for each uA ∈ tm do
25:      s = 0.0
26:      c = 0
27:      Get su from uA
28:      for each uB ∈ su do
29:        Get svAB from su for uB
30:        Get the similarity value between uB and ux (svBU) from unt
31:        s = s + (svAB × svBU)
32:        c = c + 1
33:      end for
34:      if c > 0 then
35:        ns = s/c
36:        Get the number of similar users ∈ unt (nsu)
37:        if mn > nsu then
38:          Add in unt the uA associated with the respective ns value
39:          if ns < msv then
40:            msv = ns
41:            lsu = uA
42:          end if
43:        else
44:          if ns > msv then
45:            Add in unt the uA associated with the respective ns value
46:            Remove lsu from unt
47:            Get the minimum similarity value (msv) from unt
48:            Get the least similar user (lsu) from unt
49:          end if
50:        end if
51:      end if
52:    end for
53:  end if
54:  return unt
55: end procedure
end

Note that the user neighborhood calculation implemented in Algorithm 5 can be seen as a two-step similarity path between two users, as described in Section 3.3.2.1.1. Besides, a similarity metric is still used in this calculation (see line 6 in the userNeighborhoodInNNUserNgbr-transClosure procedure), since the "transclosure" process only extends the discovery of similarities among users. Finally, the maximum number of nearest neighbors is maintained by the "transclosure" process, which eliminates the smallest user similarities from the user neighborhood (see lines 46 to 48 of the userNeighborhoodInNNUserNgbr-transClosure procedure).

4.5 Final Remarks

In this chapter, we presented particular details of an implementation of the two proposed CD-CARS algorithms (PreF and PostF), as well as the implementation of two base cross-domain algorithms (NNUserNgbr and NNUserNgbr-transClosure).
In addition, we showed the properties of the two CD-CARS datasets ("book-television" and "book-music"), with different contextual information (Temporal, Location and Companion) and domains (Book, Television and Music), and how they were obtained. Also in this chapter, we described the process of selecting relevant contextual attributes and values through a data mining method (InfoGainAttributeEval from the Weka tool (HALL et al., 2009)). Finally, we presented the implementation of the contextual model through the extension of the Mahout framework (OWEN et al., 2011). In the next chapter, we describe and discuss experimental evaluations of the implemented algorithms on the CD-CARS datasets.

5 CD-CARS Evaluation

This chapter presents an experimental evaluation of the two proposed CD-CARS algorithms in comparison to cross-domain CF-based ones. For that, Section 5.1 describes the evaluation methodology adopted and the algorithms' settings. Section 5.2 describes the evaluation results for each dataset used in the experiments, as well as a discussion of their findings. Finally, Section 5.3 presents the final remarks of this chapter.

5.1 Evaluation Methodology

In this section, we describe the methodology adopted to evaluate the proposed algorithms, as well as their configurations. Besides, we describe how the statistical significance of the results is verified.

5.1.1 Settings of the Algorithms

Before evaluating the proposed CD-CARS algorithms, we performed a preliminary battery of experiments on the two datasets mentioned in Section 4.1.3 in order to adjust the settings of the base single-domain CF-based algorithm (NNUserNgbr) adopted in the CD-CARS evaluation. As mentioned in Section 3.3.2.1.1, that algorithm can also be used to perform cross-domain recommendations; thus, we intended to verify its performance in both single-domain and cross-domain scenarios.

In this way, we adjusted the NNUserNgbr settings according to several experiments performed in the Book domain for each dataset, i.e., performing a single-domain recommendation, since Book is a domain common to both datasets. We set the 'n' parameter of the NNUserNgbr algorithm to "475" and selected the Euclidean distance as its similarity metric. The same configuration and similarity metric were adopted for the other base cross-domain CF-based algorithm (NNUserNgbr-transClosure) and for the evaluation in the other domains (Television and Music).

Since the proposed CD-CARS algorithms, PreF and PostF, are performed in combination with the base NNUserNgbr and NNUserNgbr-transClosure ones, the base algorithms were used with the same settings described above. In addition to these settings, we set the PostF threshold (θ) value to "2/3" of the frequency of the most preferred category, and only the categories of items that had good ratings (four or more on a five-star scale) were considered in the computation of the frequency of the users' preferred categories (see Section 3.3.1.2). In the PostF algorithm, we also set "0.7" and "0.01", respectively, for the association rule confidence and support levels. This decision was also made based on preliminary experiments with the Book domain as target, by observing the PostF performance in the two datasets. We evaluated the proposed algorithms in comparison to the baseline ones by using predictive and classification measures, as described in the following sections.
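As a reference for these settings, the baseline NNUserNgbr configuration can be assembled from stock Mahout components roughly as follows. This is a minimal sketch assuming a plain FileDataModel over a CSV of user,item,rating triples; the actual implementation builds the recommenders through the ContextualRecommenderBuilder hierarchy and the ContextualDataModel described in Chapter 4.

    import java.io.File;
    import java.util.List;

    import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
    import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
    import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
    import org.apache.mahout.cf.taste.impl.similarity.EuclideanDistanceSimilarity;
    import org.apache.mahout.cf.taste.model.DataModel;
    import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
    import org.apache.mahout.cf.taste.recommender.RecommendedItem;
    import org.apache.mahout.cf.taste.recommender.Recommender;
    import org.apache.mahout.cf.taste.similarity.UserSimilarity;

    public class BaselineNNUserNgbr {
        public static void main(String[] args) throws Exception {
            // Hypothetical file with "userID,itemID,rating" lines covering both domains
            DataModel model = new FileDataModel(new File("ratings.csv"));

            // Settings adopted in the evaluation: Euclidean distance similarity and n = 475 neighbors
            UserSimilarity similarity = new EuclideanDistanceSimilarity(model);
            UserNeighborhood neighborhood = new NearestNUserNeighborhood(475, similarity, model);
            Recommender recommender = new GenericUserBasedRecommender(model, neighborhood, similarity);

            // Top-5 recommendations for user 1; domain and context filtering are handled by the
            // IDRescorer and pre/post-filtering layers in the actual CD-CARS implementation
            List<RecommendedItem> topItems = recommender.recommend(1L, 5);
            topItems.forEach(item -> System.out.println(item.getItemID() + " -> " + item.getValue()));
        }
    }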
5.1.2 Predictive Performance

We measured the predictive performance of the algorithms by using the Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) metrics (SHANI; GUNAWARDANA, 2011). MAE is a measure of the deviation of recommendations from their actual user-rating values. For each rating-prediction pair (p_i, q_i), this metric considers the absolute error between them. The MAE is computed by first summing these absolute errors over the N corresponding rating-prediction pairs and then computing the average. Formally,

    MAE = \frac{\sum_{i=1}^{N} |p_i - q_i|}{N}    (5.1)

Analogously, RMSE computes the square root of the average of the squared errors (thus punishing large errors), by means of the formula

    RMSE = \sqrt{\frac{\sum_{i=1}^{N} (p_i - q_i)^2}{N}}    (5.2)

These metrics evaluate the performance of a RS by comparing the numerical recommendation scores against the actual user ratings for the user-item pairs in the test dataset. In this way, for each dataset adopted in the CD-CARS evaluation, we split it into training and test sets for each target domain (e.g. Music) and context under test (e.g. "on Sunday with friends"). The training set is composed of 100% of the ratings from the source domain, 100% of the ratings from the target domain whose contexts are not under test, and 90% of the ratings from the target domain whose contexts are under test. The test set is composed of the remaining 10% of the ratings from the target domain whose contexts are under test. Figure 29 illustrates the process of splitting the training and test sets considering the target domain and the context under test. This avoids wasting, in the test set, ratings that are not used for the target domain and context under test. The process can be seen as Hold-out, according to the ways of partitioning evaluation data described in Section 2.2.6.1. Finally, for each target domain and context under test, we ran each evaluated algorithm five times in order to verify its standard deviation and apply statistical tests.

Figure 29 – Splitting training and test sets considering the target domain and context under test.

5.1.3 Classification Performance

Regarding the classification performance of the algorithms, we adopted the F-metric proposed by Cremonesi, Tripodi and Turrin (2011), which is calculated from the Precision and Recall values used for evaluating top-N recommendations and obtained through the testing methodology described in (CREMONESI; KOREN; TURRIN, 2010). Analogously to (CREMONESI; KOREN; TURRIN, 2010), we randomly extracted approximately 1.4% of the ratings from the original dataset in order to build a probe set; therefore, the training set was composed of 98.6% of the ratings from the full dataset. The test set, in turn, was composed exclusively of the 5-star ratings (the maximum rating value for that evaluation dataset) from the probe set; thus, the non-5-star ratings from the probe set were discarded. However, we adapted this methodology by considering the target domain and context under test, in order to fill the training and probe sets in a similar way to that of the predictive evaluation, as illustrated in Figure 29. This avoids wasting, in the probe set, ratings that are not used for the target domain and context under test. As in the predictive performance evaluation, the dataset used in the classification evaluation is partitioned as Hold-out, according to the ways of partitioning evaluation data described in Section 2.2.6.1.
After those steps, we trained the algorithms with the training set and, for each rating in the test set, given by a user 'u' for an item 'i' from the target domain:

• We predict the ratings for the item 'i' and for 100 additional items (see Footnote 1) from the target domain, randomly chosen from the ones unrated by the user 'u'; and
• We sort the list of 101 items (see Footnote 2) in decreasing order according to the predicted ratings. If the item 'i' appears in the top-N recommendation list, we have a "hit".

In this way, the Precision, Recall and F-metric values, according to (CREMONESI; KOREN; TURRIN, 2010), are defined as:

    Recall(N) = \frac{\#hits}{|\text{test set}|}    (5.3)

    Precision(N) = \frac{\#hits}{N \cdot |\text{test set}|}    (5.4)

    F\text{-metric}(N) = \frac{2 \cdot Recall(N) \cdot Precision(N)}{Recall(N) + Precision(N)}    (5.5)

As in the predictive evaluation, for each target domain and context under test, we ran each evaluated algorithm five times. The execution of several trials is not specified by the methodology proposed in (CREMONESI; KOREN; TURRIN, 2010), but we believe that the more executions are made, the more reliable the results should be. Finally, in the evaluation results of the algorithms for a particular target domain and user overlap level, we show their classification performance through F-metric curves, varying the number of top 'N' items from one to twenty (see Footnote 3). Besides, given that most online recommender systems (e.g. Amazon (see Footnote 4), IMDB (see Footnote 5), etc.) usually recommend up to five items in their basic layout (CREMONESI; TRIPODI; TURRIN, 2011), we fixed the top 'N' value to "five" to verify the variation of the F-metric value across different user overlap levels (sensitivity evaluation).

Footnote 1: The original method empirically adopts 1000 additional items, but the authors leave this number free to be chosen depending on the dataset used.
Footnote 2: In the original method, the authors adopted 1001 items.
Footnote 3: We have chosen this maximum top 'N' value by observing the convergence of the F-metric curves of the algorithms.
Footnote 4: http://www.amazon.com
Footnote 5: http://www.imdb.com

5.1.4 Sensitivity Evaluation

As mentioned in Section 2.2.6.3, the performance of a cross-domain RS can be affected by the density of the target domain data and by the user overlap between the source and target domains. In this way, we evaluated the quality of the cross-domain algorithms by varying the percentage of user overlap (10%, 50%, and 100%). Section 4.1.3 describes the properties of the datasets adopted in the CD-CARS evaluation regarding the different overlap levels.

In addition, we studied the impact of the density of the target domain data in comparison to the density of the source domain data. Thus, we evaluated the quality of the cross-domain algorithms by varying the target domain in both datasets: one of them covering more related domains (Book and Television), where the Book domain has more data than the Television domain, and the other covering less related domains (Book and Music), where the Book domain also has more data than the Music domain. Therefore, we expect that enriching sparse user preference data in a certain domain by adding user preference data from another domain can significantly improve the quality of cross-domain recommendations (SAHEBI; BRUSILOVSKY, 2013) (FERNÁNDEZ-TOBÍAS et al., 2012).
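For reference, the hit-counting protocol of Section 5.1.3 reduces to a simple ranking check once the predictions are available. The sketch below is a simplified illustration: the Estimator interface stands for any of the evaluated recommenders, and the sampling of the 100 unrated items is assumed to be done elsewhere.

    import java.util.*;

    public class TopNHitCounter {

        /** Predicts a score for an item; stands in for the recommender under evaluation. */
        interface Estimator {
            double estimatePreference(long userId, long itemId);
        }

        /** Returns true when the test item ranks among the top N of the 101 candidates. */
        public static boolean isHit(Estimator estimator, long userId, long testItemId,
                                    List<Long> randomUnratedItems, int n) {
            List<Long> candidates = new ArrayList<>(randomUnratedItems); // 100 random unrated target-domain items
            candidates.add(testItemId);
            candidates.sort(Comparator.comparingDouble(
                    (Long itemId) -> estimator.estimatePreference(userId, itemId)).reversed());
            return candidates.subList(0, Math.min(n, candidates.size())).contains(testItemId);
        }
    }

Recall(N) is then the fraction of 5-star test ratings counted as hits, and Precision(N) equals Recall(N)/N, exactly as in Equations 5.3 and 5.4.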
5.1.5 Statistical Significance Analysis

In order to verify the statistical significance of the evaluation results, we adopted the nonparametric Mann–Whitney U test (WINTER; DODOU, 2010), also called the Mann–Whitney–Wilcoxon (MWW) or Wilcoxon rank-sum test. This test verifies the null hypothesis, which states that two samples are statistically the same, against an alternative hypothesis, which can determine, in particular, whether one population tends to have larger values than the other. In addition, unlike the t-test, it does not require the assumption of normal distributions (WINTER; DODOU, 2010). In this way, we applied the statistical significance tests with a confidence level of 95% for all user overlap levels, contextual dimensions and target domains. These tests were applied with support from the "R" software tool (R Core Team, 2015). For the tests of predictive performance, we verified whether the errors of the baseline algorithms were greater than the errors of the proposed ones (see Footnote 6). For the tests of classification performance, we verified whether the F-metric values of the proposed algorithms were greater than the F-metric values of the baseline ones (see Footnote 7), considering the F-metric values at N=5. In both cases, the applied Wilcoxon tests were not paired, given that the samples were independent among the algorithms.

Footnote 6: wilcox.test(baseline, proposed_algorithm, paired=FALSE, alternative="greater"), using the "R" software tool.
Footnote 7: wilcox.test(proposed_algorithm, baseline, paired=FALSE, alternative="greater"), using the "R" software tool.

5.2 Evaluation Results

According to the datasets and evaluation methodology described before, we present and discuss the results of the proposed CD-CARS algorithms in comparison to the baseline cross-domain CF-based algorithms. For that, we divided the experiments according to the two datasets: Section 5.2.1 shows the evaluation results for two related domains (Book and Television), whereas Section 5.2.2 presents the evaluation results for two less related domains (Book and Music). Finally, we discuss the evaluation results in Section 5.2.3.

5.2.1 Book-Television Results

As mentioned before, we evaluated the quality of the cross-domain algorithms by varying the target domain for each dataset, in order to study the impact of the density of the target domain data in comparison to the density of the source domain data. Thus, the following sections present the results considering each different domain as target.

5.2.1.1 Television as Target Domain

According to the contextual dimensions present in the datasets, we describe the evaluation results for each contextual dimension in the following sections. In addition, we show the results for a combination of contextual dimensions in Section 5.2.1.1.4.

5.2.1.1.1 Temporal Dimension

Table 19 reports the overall predictive performance of the recommender algorithms, considering all contextual values from the Temporal dimension and different user overlap levels for the Television domain as target. Rows 1 and 2 of the table show the NNUserNgbr predictive performance when it is applied to single-domain and cross-domain recommendations. As can be seen, the simple addition of user ratings from the other domain (Book), using the same algorithm for cross-domain recommendation, improved the recommendation performance by approximately 8–21% (MAE) and 2–10% (RMSE), depending on the user overlap level.
Table 19 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual values from the Temporal dimension (source domain: Book, and target domain: Television).

Algorithm | MAE±std (10% overlap) | RMSE±std (10% overlap) | MAE±std (50% overlap) | RMSE±std (50% overlap) | MAE±std (full overlap) | RMSE±std (full overlap)
NNUserNgbr (single-domain) | 0.721 ± 0.024 | 1.020 ± 0.048 | 0.454 ± 0.008 | 0.759 ± 0.020 | 0.412 ± 0.006 | 0.734 ± 0.012
NNUserNgbr (cross-domain) | 0.598 ± 0.022 | 0.922 ± 0.044 | 0.417 ± 0.007 | 0.742 ± 0.018 | 0.324 ± 0.005 | 0.661 ± 0.010
NNUserNgbr-transClosure | 0.251 ± 0.014 | 0.548 ± 0.028 | 0.256 ± 0.005 | 0.556 ± 0.012 | 0.217 ± 0.002 | 0.531 ± 0.004
PreF with NNUserNgbr-transClosure | 0.129 ± 0.020 | 0.382 ± 0.057 | 0.132 ± 0.006 | 0.374 ± 0.016 | 0.151 ± 0.003 | 0.413 ± 0.007
PostF with NNUserNgbr-transClosure | 0.216 ± 0.012 | 0.486 ± 0.024 | 0.210 ± 0.003 | 0.469 ± 0.005 | 0.173 ± 0.004 | 0.445 ± 0.006

In addition, Table 19 presents the overall performance of the PreF and PostF algorithms, besides the base NNUserNgbr-transClosure algorithm, which outperformed the NNUserNgbr algorithm (performed for cross-domain purposes) by achieving an improvement of approximately 33–58% (MAE) and 19–40% (RMSE), depending on the user overlap level.

As can be seen from the table, the PreF predictive performance was better than that of the NNUserNgbr-transClosure and PostF algorithms at all user overlap levels. The improvement achieved by the PreF algorithm in comparison to the NNUserNgbr-transClosure one varied between approximately 30–48% (MAE) and 22–32% (RMSE), depending on the user overlap level. The PostF predictive performance was also better than that of the NNUserNgbr-transClosure algorithm at all user overlap levels; however, its improvement was smaller than the one achieved by the PreF algorithm. Figure 30 illustrates the predictive performance (MAE) of the proposed algorithms over different user overlap levels.

Figure 30 – Overall prediction error (MAE) for cross-domain algorithms by varying the user overlap level in the temporal dimension (source domain: book, and target domain: television).

The statistical significance tests verified that both the PreF and PostF predictive errors (MAE) were lower than those of the NNUserNgbr-transClosure baseline algorithm for all user overlap levels (p-value=0.005413 and W=100 for all tests, except in the comparison between the baseline and the PostF algorithm at the 10% user overlap level, where W=97 and p-value=0.037). Figure 31a, Figure 31b and Figure 31c show the boxplots with the prediction performance (MAE) of the algorithms, respectively, for the 10%, 50%, and 100% user overlap levels.

Figure 31 – Overall prediction performance (MAE) boxplots for the television domain in the temporal dimension with different user overlap levels (source domain: book); panels (a), (b) and (c) correspond to 10%, 50% and 100% of user overlap.

Regarding the classification performance, Figures 32a, 32b, and 32c present the results of the F-metric at different N values (between one and twenty), respectively, for 10%, 50%, and 100% of user overlap for the Television domain as target, considering the Temporal dimension. As can be seen, at all user overlap levels and top 'N' values, the PostF classification performance was better than or similar to that of the baseline algorithms.
The PreF performance was better than the baseline ones for low top ‘N’ values when there were 10% and 50% of user overlap levels, and for any top ‘N’ value when there was 100% of user overlap. In addition, the PostF classification performance was better or similar than the PreF one. Figure 33 shows the variation of the F-metric value in different user overlap levels by fixing the top ‘N’ value to five. The statistical significance tests verified that the PostF F-metric values were greater than the NNUserNgbr-transClosure baseline algorithm for all user overlap levels9. Besides, the PreF F-metric values were greater than the 9 p-value=0.005413 and W=99 for all tests 5.2. Evaluation Results 135 (a) 10% of user overlap. (b) 50% of user overlap. (c) 100% of user overlap. Figure 32 – F-metric performance x top ‘N’ items for the television domain in the temporal dimension with different user overlap levels (source domain: book). 136 Chapter 5. CD-CARS Evaluation Figure 33 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the temporal dimension (target domain: television, and source: book). NNUserNgbr-transClosure algorithm for 50% and 100% user overlap levels10. For 10% of user overlap, the NNUserNgbr-transClosure F-metric value was greater than the PreF algorithm11. 5.2.1.1.2 Location Dimension Table 20 reports the overall predictive performance of the recommender algorithms, considering all contextual values from the Location dimension and different user overlap levels for the Television domain as target. As it can be seen from the table, the addition of user ratings from other domain (Book), by using the same algorithm for cross-domain recommendation (corresponding to the two first rows of the table), improved the predictive performance in, approximately, 20–36% (MAE) and 8–20% (RMSE) depending on the user overlap level. Also, Table 20 presents the overall performance of the NNUserNgbr-transClosure, PreF and PostF algorithms. The NNUserNgbr-transClosure algorithm outperformed the NNUserNgbr one (performed for cross-domain purposes) by achieving an improvement that varied in, approximately, 21–58% (MAE) and 11–36% (RMSE) depending on the user overlap levels. As it can be seen from table, the PostF predictive performance was better than the NNUserNgbr-transClosure algorithm in all user overlap levels, with an improvement that varied in, approximately, 5–16% (MAE) and 4–14% (RMSE) depending on the user overlap level. In addition, if we consider the high standard deviation (std) of the PreF 10 p-value=0.005413 and W=99 for all tests 11 p-value=0.005413 and W=100 5.2. Evaluation Results 137 Table 20 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual values from the Location dimension (source domain: Book, and target domain: Television). 
Algorithm 10% overlap 50% overlap Full overlap MAE±std RMSE±std MAE±std RMSE±std MAE±std RMSE±std NNUserNgbr (single-domain) 0.721 ± 0.024 1.020 ± 0.048 0.454 ± 0.008 0.759 ± 0.020 0.412 ± 0.006 0.734 ± 0.012 NNUserNgbr (cross-domain) 0.573 ± 0.022 0.865 ± 0.044 0.363 ± 0.007 0.691 ± 0.018 0.261 ± 0.005 0.582 ± 0.010 NNUserNgbr- transClosure 0.240 ± 0.017 0.550 ± 0.039 0.247 ± 0.006 0.545 ± 0.013 0.206 ± 0.003 0.513 ± 0.006 PreF with NNUserNgbr- transClosure 0.305 ± 0.385 0.433 ± 0.541 0.212 ± 0.045 0.588 ± 0.123 0.242 ± 0.027 0.602 ± 0.064 PostF with NNUserNgbr- transClosure 0.200 ± 0.010 0.468 ± 0.028 0.233 ± 0.004 0.519 ± 0.011 0.194 ± 0.005 0.484 ± 0.011 algorithm showed in Table 20, then we can say that the PostF outperformed the PreF algorithm for all user overlap levels. Figure 34 (MAE) and Figure 35 (RMSE) illustrate the predictive performance of the proposed algorithms over different user overlap levels. Note that for this case we showed the figures for both predictive metrics, since we observed a difference in the PreF performance depending on the predictive metric used in the evaluation. Figure 34 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the location dimension (source domain: book, and target domain: television). The statistical significance tests verified that the PostF predictive errors (MAE) 138 Chapter 5. CD-CARS Evaluation Figure 35 – Overall prediction error (RMSE) for cross-domain algorithms by varying user overlap level in the location dimension (source domain: book, and target domain: television). were less than the NNUserNgbr-transClosure baseline algorithm for all user overlap levels12. On the other hand, the PreF predictive errors (MAE) were statistically similar to the NNUserNgbr-transClosure algorithm for 10% and 50% of user overlap levels13. For 100% of user overlap, the NNUserNgbr-transClosure predictive error was statistically less than the PreF one14. Figure 36a, Figure 36b and Figure 36c show the boxplots with the prediction performance (MAE) of the algorithms, respectively, for the 10%, 50%, and 100% user overlap levels. With respect to the classification performance, Figures 37a, 37b, and 37c present the results of the F-metric at different top ‘N’ values (between one and twenty), respectively, in 10%, 50%, and 100% of user overlap levels for the Television domain as target, considering the Location dimension. As it can be seen, in all user overlap levels considering ‘N’ up to five, the PostF classification performance was better than the baseline algorithms, whereas for 50% and 100% of user overlap, the PostF outperformed them for ‘N’ up to ten. On the other hand, the PreF classification performance was worse than all other algorithms in all user overlap levels and ‘N’ values. Figure 38 shows the variation of the F-metric value in different user overlap levels by fixing the top ‘N’ value to five. The statistical significance tests verified that the PostF F-metric values were greater than the NNUserNgbr-transClosure baseline algorithm for 12 For 10% of user overlap, W=100 and p-value=0.005413. For 50% of user overlap, W=95 and p-value=0.0001028. Finally, for 100% of user overlap, W=98 and p-value=0.02165 13 For 10% of user overlap, W=30 and p-value=0.09391. For 50% of user overlap, W=71 and p- value=0.0615 14 W=90 and p-value=0.0007523 5.2. Evaluation Results 139 (a) 10% of user overlap. (b) 50% of user overlap. (c) 100% of user overlap. 
50% and 100% of user overlap levels [15], whereas their performances were statistically similar for 10% of user overlap [16]. Besides, the NNUserNgbr-transClosure F-metric values were greater than those of the PreF algorithm for all user overlap levels [17].

[15] p-value=0.005413 and W=99 for all tests.
[16] p-value=0.2 and W=71.
[17] p-value=0.005413 and W=99 for all tests.

Figure 36 – Overall prediction performance (MAE) boxplots for the Television domain in the Location dimension with different user overlap levels (source domain: Book); panels: (a) 10%, (b) 50%, (c) 100% of user overlap.

Figure 37 – F-metric performance x top 'N' items for the Television domain in the Location dimension with different user overlap levels (source domain: Book); panels: (a) 10%, (b) 50%, (c) 100% of user overlap.

Figure 38 – Overall classification performance (F-metric at 5) for the algorithms by varying the user overlap level in the Location dimension (target domain: Television; source domain: Book).

5.2.1.1.3 Companion Dimension

Table 21 shows the overall predictive performance of the recommender algorithms, considering all contextual values from the Companion dimension and different user overlap levels for the Television domain as target. As can be seen from the table, adding user ratings from another domain (Book), while using the same algorithm for cross-domain recommendation (the two first rows of the table), improved the predictive performance by approximately 16–28% (MAE) and 10–14% (RMSE), depending on the user overlap level.

Table 21 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual values from the Companion dimension (source domain: Book; target domain: Television).

Algorithm                          | 10% overlap              | 50% overlap              | Full overlap
                                   | MAE±std      RMSE±std    | MAE±std      RMSE±std    | MAE±std      RMSE±std
NNUserNgbr (single-domain)         | 0.721±0.024  1.020±0.048 | 0.454±0.008  0.759±0.020 | 0.412±0.006  0.734±0.012
NNUserNgbr (cross-domain)          | 0.583±0.022  0.881±0.044 | 0.380±0.007  0.680±0.018 | 0.295±0.005  0.625±0.010
NNUserNgbr-transClosure            | 0.249±0.029  0.539±0.048 | 0.279±0.011  0.587±0.024 | 0.246±0.003  0.574±0.006
PreF with NNUserNgbr-transClosure  | 0.931±0.111  1.308±0.158 | 0.858±0.021  1.204±0.026 | 0.842±0.010  1.169±0.012
PostF with NNUserNgbr-transClosure | 0.232±0.035  0.492±0.065 | 0.256±0.013  0.540±0.025 | 0.221±0.006  0.523±0.010

Also, Table 21 presents the overall performance of the NNUserNgbr-transClosure, PreF and PostF algorithms. The NNUserNgbr-transClosure algorithm outperformed the NNUserNgbr one (applied in the cross-domain setting) with an improvement of approximately 16–57% (MAE) and 8–38% (RMSE), depending on the user overlap level.

Figure 39 – Overall prediction error (MAE) for cross-domain algorithms by varying the user overlap level in the Companion dimension (source domain: Book; target domain: Television).

As can be seen from the table, the PostF predictive performance was better than that of the NNUserNgbr-transClosure algorithm at all user overlap levels, with an improvement of approximately 6–10% (MAE) and 8–9% (RMSE), depending on the user overlap level. In addition, the predictive performance of the PreF algorithm was worse than that of all other algorithms in the Companion dimension at all user overlap levels, as shown in Table 21.
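The relative improvements quoted in this chapter (e.g. the "16–57% (MAE)" above) follow directly from the table values as (baseline − proposed) / baseline. The short R fragment below reproduces this calculation for the MAE column of Table 21, comparing NNUserNgbr (cross-domain) with NNUserNgbr-transClosure.

# Relative improvement of NNUserNgbr-transClosure over NNUserNgbr (cross-domain),
# using the MAE values of Table 21 for the 10%, 50% and 100% user overlap levels.
mae_cross <- c(0.583, 0.380, 0.295)   # NNUserNgbr (cross-domain)
mae_trans <- c(0.249, 0.279, 0.246)   # NNUserNgbr-transClosure

round(100 * (mae_cross - mae_trans) / mae_cross, 1)
# 57.3 26.6 16.6  -> reported in the text as approximately 16-57% (MAE)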
Figure 39 illustrates the predictive performance (MAE) of the proposed algorithms over the different user overlap levels.

Except for the 10% user overlap level, the statistical significance tests verified that the PostF predictive errors (MAE) were less than those of the NNUserNgbr-transClosure algorithm for all other user overlap levels [18]. On the other hand, the NNUserNgbr-transClosure predictive errors (MAE) were statistically less than the PreF ones for all user overlap levels [19]. Figures 40a, 40b and 40c show the boxplots with the prediction performance (MAE) of the algorithms for the 10%, 50%, and 100% user overlap levels, respectively.

[18] For 10% of user overlap, W=65 and p-value=0.1399. For 50% of user overlap, W=89 and p-value=0.001045. Finally, for 100% of user overlap, W=100 and p-value=0.005413.
[19] W=89 and p-value=0.001045 for all tests.

Figure 40 – Overall prediction performance (MAE) boxplots for the Television domain in the Companion dimension with different user overlap levels (source domain: Book); panels: (a) 10%, (b) 50%, (c) 100% of user overlap.

Figures 41a, 41b, and 41c present the results of the F-metric at different top 'N' values (between one and twenty) for 10%, 50%, and 100% of user overlap, respectively, for the Television domain as target, considering the Companion dimension. As can be seen, the proposed algorithms only outperformed the baseline for 50% and 100% of user overlap with low values of top 'N'.

Figure 41 – F-metric performance x top 'N' items for the Television domain in the Companion dimension with different user overlap levels (source domain: Book); panels: (a) 10%, (b) 50%, (c) 100% of user overlap.

Figure 42 shows the variation of the F-metric value over the different user overlap levels with the top 'N' value fixed at five. The statistical significance tests verified that the NNUserNgbr-transClosure F-metric values were greater than the PostF ones for 10% and 50% of user overlap [20], whereas their performances were statistically similar for 100% of user overlap [21]. Besides, the NNUserNgbr-transClosure F-metric values were greater than those of the PreF algorithm for all user overlap levels [22].

[20] p-value=0.005413 and W=99 for all tests.
[21] p-value=0.75 and W=60.
[22] p-value=0.003913 and W=100 for all tests.

Figure 42 – Overall classification performance (F-metric at 5) for the algorithms by varying the user overlap level in the Companion dimension (target domain: Television; source domain: Book).

5.2.1.1.4 Combining Contextual Dimensions

In the previous sections, we presented the evaluation results for each contextual dimension separately. In this section, we report the results for a combination of two contextual dimensions, using the same evaluation metrics and methodology described before. As mentioned in Section 4.1.2, an important aspect of context-aware recommender systems is determining the relevance of contextual dimensions, attributes (or even values), in order to select only the contextual features that actually matter for evaluation (or recommendation) purposes. We have seen in Section 4.1.2 that two of the contextual dimensions (Temporal and Location) provide a greater information gain than the Companion dimension, which is confirmed by the results presented in the previous sections. Therefore, we evaluated the combination of these two contextual dimensions (Temporal and Location), aiming to compare its performance with their performances when evaluated separately.
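The relevance assessment mentioned above is based on information gain; its exact computation is described in Section 4.1.2 and is not reproduced here. As an illustration only, the sketch below estimates the information gain of a contextual dimension over a small set of hypothetical contextualized ratings, assuming the usual formulation IG(R, C) = H(R) − H(R | C).

# Illustrative sketch (not the exact procedure of Section 4.1.2) of the information
# gain of a contextual dimension with respect to the ratings, on toy data.
entropy <- function(x) {
  p <- table(x) / length(x)
  -sum(p * log2(p))
}

cond_entropy <- function(x, ctx) {
  sum(sapply(split(x, ctx), function(g) length(g) / length(x) * entropy(g)))
}

ratings  <- c(5, 4, 5, 2, 1, 2, 5, 4, 1, 2)   # hypothetical rating values
temporal <- c("weekend", "weekend", "weekend", "weekday", "weekday",
              "weekday", "weekend", "weekend", "weekday", "weekday")

info_gain <- entropy(ratings) - cond_entropy(ratings, temporal)
info_gain   # higher values suggest a more informative contextual dimension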
Table 22 reports the overall predictive performance of the recommender algorithms, considering all contextual value combinations from the Temporal and Location dimensions with different user overlap levels for the Television domain as target. As it was observed 20 p-value=0.005413 and W=99 for all tests 21 p-value=0.75 and W=60 22 p-value=0.003913 and W=100 for all tests 146 Chapter 5. CD-CARS Evaluation in the previous sections, the addition of user ratings from the Book domain also improved the predictive performance of the NNUserNgbr in, approximately, 20–37% (MAE) and 9–20% (RMSE) depending on the user overlap level (rows 1 and 2 from the table). Table 22 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual value combinations from the temporal and location dimensions (source domain: Book, and target domain: Television). Algorithm 10% overlap 50% overlap Full overlap MAE±std RMSE±std MAE±std RMSE±std MAE±std RMSE±std NNUserNgbr (single-domain) 0.721 ± 0.024 1.020 ± 0.048 0.454 ± 0.008 0.759 ± 0.020 0.412 ± 0.006 0.734 ± 0.012 NNUserNgbr (cross-domain) 0.571 ± 0.022 0.860 ± 0.044 0.360 ± 0.007 0.689 ± 0.018 0.259 ± 0.005 0.580 ± 0.010 NNUserNgbr- transClosure 0.224 ± 0.017 0.482 ± 0.039 0.250 ± 0.006 0.552 ± 0.013 0.207 ± 0.003 0.515 ± 0.006 PreF with NNUserNgbr- transClosure 0.396 ± 0.390 0.761 ± 0.560 0.720 ± 0.045 1.050 ± 0.123 0.333 ± 0.027 0.739 ± 0.064 PostF with NNUserNgbr- transClosure 0.226 ± 0.010 0.503 ± 0.028 0.190 ± 0.004 0.433 ± 0.011 0.161 ± 0.005 0.437 ± 0.011 Also, Table 22 presents the overall performance of the NNUserNgbr-transClosure, PreF and PostF algorithms. The NNUserNgbr-transClosure algorithm outperformed the NNUserNgbr one (performed for cross-domain purposes) by achieving an improvement that varied in, approximately, 20–60% (MAE) and 11–43% (RMSE) depending on the user overlap levels. As it can be seen from table, except when the user overlap level was 10%, the PostF predictive performance was better than the NNUserNgbr-transClosure algorithm in all other user overlap levels, with an improvement that varied in, approximately, 22–24% (MAE) and 15–21% (RMSE) depending on the user overlap level. Despite the NNUserNgbr- transClosure algorithm have outperformed the PostF when the user overlap level was 10%, they had a similar performance, separated only by their standard deviations. Figure 43 illustrates the predictive performance (MAE) of the proposed algorithms over different user overlap levels. The statistical significance tests verified that the PostF predictive errors (MAE) were less than the NNUserNgbr-transClosure algorithm for the 50% and 100% user overlap levels23. When the user overlap level was 10%, the applied tests could not determine any statistical difference between the NNUserNgbr-transClosure and PostF predictive 23 In both cases, with W=100 and p-value=0.005413 5.2. Evaluation Results 147 Figure 43 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the temporal and location dimensions (source domain: book, and target domain: television). errors (MAE)24. On the other hand, the applied tests verified that the NNUserNgbr- transClosure predictive errors (MAE) were less than the PreF ones for all user overlap levels25. Figure 44a, Figure 44b and Figure 44c show the boxplots with the prediction performance (MAE) of the algorithms, respectively, for the 10%, 50%, and 100% user overlap levels. 
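The per-run MAE distributions summarized in boxplots such as Figure 44 can be generated with a few lines of R. The sketch below is illustrative only: the per-run MAE vectors (and the number of runs) are hypothetical placeholders, not the thesis data.

# Minimal sketch of an MAE-per-run boxplot for three algorithms (hypothetical values).
mae_runs <- list(
  "NNUserNgbr-transClosure" = c(0.22, 0.23, 0.21, 0.24, 0.22, 0.23, 0.22, 0.21, 0.23, 0.22),
  "PreF"                    = c(0.40, 0.75, 0.25, 0.55, 0.35, 0.80, 0.30, 0.60, 0.45, 0.50),
  "PostF"                   = c(0.22, 0.23, 0.23, 0.22, 0.21, 0.23, 0.22, 0.22, 0.23, 0.22)
)
boxplot(mae_runs, ylab = "MAE", las = 2,
        main = "Prediction error per evaluation run (hypothetical values)")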
Taking into account the classification performance, Figures 45a, 45b, and 45c present the results of the F-metric at different top ‘N’ values, respectively, in 10%, 50%, and 100% of user overlap level for the Television domain as target, considering the combination between the Temporal and Location dimensions. For all user overlap levels and top ‘N’ values, the PostF classification performance was better or similar than the baseline algorithms, while the PreF classification performance was worse than all other algorithms. Figure 46 shows the variation of the F-metric value in different user overlap levels by fixing the top ‘N’ value to five. The statistical significance tests verified that the PostF F-metric values were greater than the NNUserNgbr-transClosure baseline algorithm for all user overlap levels26. On the other hand, the applied tests also verified that NNUserNgbr-transClosure F-metric values were greater than the PreF ones for all user overlap levels27. 24 W=41 and p-value=0.5 25 For all cases, with W=100 and p-value=0.003914 26 p-value=0.005413 and W=97 for all tests 27 p-value=0.003968 and W=100 for all tests 148 Chapter 5. CD-CARS Evaluation (a) 10% of user overlap. (b) 50% of user overlap. (c) 100% of user overlap. Figure 44 – Overall prediction performance (MAE) boxplots for television domain in the temporal and location dimensions with different user overlap levels (source domain: book). 5.2.1.2 Book as Target Domain In the Section 5.2.1.1, we presented the results for the Television target domain, which had fewer ratings in the cross-domain dataset in comparison to Book source domain (as described in Section 4.1.3.1). In this section, we present the results when Book is the target domain and Television is the source domain. According to the contextual dimensions present in Section 4.1.3.1, we describe the evaluation results for each contextual dimension in the following sections. In addition, we show the results for a combination of contextual dimensions in Section 5.2.1.2.4. 5.2.1.2.1 Temporal Dimension Table 23 reports the overall predictive performance of the recommender algorithms, considering all contextual values from the Temporal dimension and different user overlap 5.2. Evaluation Results 149 (a) 10% of user overlap. (b) 50% of user overlap. (c) 100% of user overlap. Figure 45 – F-metric performance x top ‘N’ items for the television domain in the temporal and location dimensions with different user overlap levels (source domain: book). 150 Chapter 5. CD-CARS Evaluation Figure 46 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the temporal and location dimensions (target domain: television, and source: book). Table 23 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual values from the Temporal dimension (source domain: Television, and target domain: Book). 
Algorithm 10% overlap 50% overlap Full overlap MAE±std RMSE±std MAE±std RMSE±std MAE±std RMSE±std NNUserNgbr (single-domain) 0.643 ± 0.022 0.963 ± 0.040 0.500 ± 0.008 0.794 ± 0.020 0.401 ± 0.006 0.719 ± 0.010 NNUserNgbr (cross-domain) 0.497 ± 0.020 0.836 ± 0.042 0.367 ± 0.007 0.674 ± 0.018 0.297 ± 0.005 0.620 ± 0.008 NNUserNgbr- transClosure 0.133 ± 0.007 0.361 ± 0.021 0.180 ± 0.002 0.437 ± 0.006 0.181 ± 0.002 0.459 ± 0.005 PreF with NNUserNgbr- transClosure 0.122 ± 0.009 0.340 ± 0.009 0.116 ± 0.002 0.340 ± 0.006 0.114 ± 0.001 0.330 ± 0.005 PostF with NNUserNgbr- transClosure 0.120 ± 0.008 0.319 ± 0.019 0.153 ± 0.003 0.375 ± 0.009 0.155 ± 0.003 0.399 ± 0.007 levels for the Book domain as target. The rows 1 and 2 from the table show the NNUser- Ngbr predictive performance when it is applied in the single-domain and cross-domain recommendations. As it can be seen, the simple addition of user ratings from other domain (Television), by using the same algorithm for cross-domain recommendation, improved the recommendation performance in, approximately, 22–26% (MAE) and 13–15% (RMSE) depending on the user overlap level. In addition, Table 23 presents the overall performance of the PreF and PostF algorithms, besides the base NNUserNgbr-transClosure algorithm, which outperformed the NNUserNgbr algorithm (performed for cross-domain purposes) by achieving an im- 5.2. Evaluation Results 151 provement that varied in, approximately, 38–73% (MAE) and 25–56% (RMSE) depending on the user overlap levels. As it can be seen from the table, the PreF predictive performance was better than the NNUserNgbr-transClosure for all user overlap levels, and better than the PostF algorithm for 50% and 100% of user overlap levels. The improvement achieved by the PreF algorithm in comparison to the NNUserNgbr-transClosure one varied in, approximately, 8–37% (MAE) and 5–28% (RMSE) depending on the user overlap level. The PostF predictive performance was better than the PreF algorithm for 10% of user overlap, and better than the NNUserNgbr-transClosure for all user overlap levels. Figure 47 illustrates the predictive performance (MAE) of the proposed algorithms over different user overlap levels. Figure 47 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the temporal dimension (source domain: television, and target domain: book). The statistical significance tests verified that both the PreF and PostF predictive errors (MAE) were less than the NNUserNgbr-transClosure baseline algorithm for all user overlap levels28. Figure 48a, Figure 48b and Figure 48c show the boxplots with the prediction performance (MAE) of the algorithms, respectively, for the 10%, 50%, and 100% user overlap levels. Regarding the classification performance, Figures 49a, 49b, and 49c present the results of the F-metric at different N values (between one and twenty), respectively, in 10%, 50%, and 100% of user overlap level for the Book domain as target, considering the Temporal dimension. As it can be seen, in all user overlap levels and top ‘N’ values, the 28 p-value=0.005413 and W=100 for all tests, except when there was 10% of user overlap, where W=79 and p-value=0.0144 (PreF x NNUserNgbr-transClosure), and W=87 and p-value=0.001943 (PostF x NNUserNgbr-transClosure) 152 Chapter 5. CD-CARS Evaluation (a) 10% of user overlap. (b) 50% of user overlap. (c) 100% of user overlap. 
Figure 48 – Overall prediction performance (MAE) boxplots for book domain in the tem- poral dimension with different user overlap levels (source domain: television). PostF classification performance was better or similar than the NNUserNgbr-transClosure baseline algorithm, whereas the PreF was only better than that baseline for low top ‘N’ values with 50% and 100% of user overlap levels. In addition, the PostF classification performance was better than the PreF one for all user overlap levels and top ‘N’ values. Figure 50 shows the variation of the F-metric value in different user overlap levels by fixing the top ‘N’ value to five. The statistical significance tests verified that the PostF F-metric values were greater than the NNUserNgbr-transClosure baseline algorithm for 5.2. Evaluation Results 153 (a) 10% of user overlap. (b) 50% of user overlap. (c) 100% of user overlap. Figure 49 – F-metric performance x top ‘N’ items for the book domain in the temporal dimension with different user overlap levels (source domain: television). 154 Chapter 5. CD-CARS Evaluation Figure 50 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the temporal dimension (target domain: book, and source: television). 50% and 100% of user overlap levels29, whereas their performances were statistically similar for 10% of user overlap30. The PreF F-metric value was greater than the NNUserNgbr- transClosure algorithm for 100% of user overlap31, whereas the opposite occurred when the user overlap levels were 10% and 50%32. 5.2.1.2.2 Location Dimension Table 24 reports the overall predictive performance of the recommender algorithms, considering all contextual values from the Location dimension and different user overlap levels for the Book domain as target. As it can be seen from the table, the addition of user ratings from other domain (Television), by using the same algorithm for cross-domain recommendation (corresponding to the two first rows of the table), improved the predictive performance in, approximately, 25–30% (MAE) and 13–18% (RMSE) depending on the user overlap level. Also, Table 24 presents the overall performance of the NNUserNgbr-transClosure, PreF and PostF algorithms. The NNUserNgbr-transClosure algorithm outperformed the NNUserNgbr one (performed for cross-domain purposes) by achieving an improvement that varied in, approximately, 38–73% (MAE) and 25–60% (RMSE) depending on the user overlap levels. As it can be seen from table, the PostF predictive performance was better or similar than the NNUserNgbr-transClosure algorithm in all user overlap levels, with an 29 p-value=0.003968 and W=100 for all tests 30 p-value=0.35 and W=60 31 W=77 and p-value=0.02163 32 p-value=0.003968 and W=100 for both tests 5.2. Evaluation Results 155 Table 24 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual values from the Location dimension (source domain: Television, and target domain: Book). 
Algorithm 10% overlap 50% overlap Full overlap MAE±std RMSE±std MAE±std RMSE±std MAE±std RMSE±std NNUserNgbr (single-domain) 0.643 ± 0.022 0.963 ± 0.040 0.500 ± 0.008 0.794 ± 0.020 0.401 ± 0.006 0.719 ± 0.010 NNUserNgbr (cross-domain) 0.470 ± 0.020 0.830 ± 0.042 0.349 ± 0.007 0.645 ± 0.018 0.297 ± 0.005 0.617 ± 0.008 NNUserNgbr- transClosure 0.125 ± 0.015 0.329 ± 0.034 0.177 ± 0.006 0.427 ± 0.013 0.182 ± 0.003 0.460 ± 0.006 PreF with NNUserNgbr- transClosure 0.184 ± 0.285 0.447 ± 0.503 0.121 ± 0.045 0.314 ± 0.087 0.179 ± 0.019 0.432 ± 0.040 PostF with NNUserNgbr- transClosure 0.127 ± 0.004 0.330 ± 0.012 0.173 ± 0.004 0.410 ± 0.011 0.175 ± 0.003 0.438 ± 0.009 improvement that varied in, approximately, 2–4% (MAE) and 3–4% (RMSE) depending on the user overlap level. In addition, we can see in Table 24 that the PostF outperformed the PreF algorithm for the majority of the user overlap levels (10% and 100%). As the results presented in Section 5.2.1.1.2 (source domain: Book, target domain: Television, and Location dimension), the PreF predictive performance had a high standard deviation. As mentioned in that section, this issue may be caused by the PreF feature of filtering ratings from the target domain for untested contexts, especially in the Location dimension, where there are several cities with a low number of ratings. Figure 51 illustrates the predictive performance (MAE) of the proposed algorithms over different user overlap levels. The statistical significance tests verified that both the PreF and PostF predictive errors (MAE) were less than the NNUserNgbr-transClosure baseline algorithm for 50% of user overlap level33. For 10% and 100% of user overlap, there was not a significant difference between the performance of the PostF algorithm and the NNUserNgbr-transClosure baseline algorithm34. The same occurred between the baseline and PreF algorithms for 100% of user overlap35, whereas the NNUserNgbr-transClosure predictive errors was less than the PreF one36 for 10% of user overlap. Figure 52a, Figure 52b and Figure 52c show the boxplots with the prediction performance (MAE) of the algorithms, respectively, for the 10%, 50%, and 100% user overlap levels. With respect to the classification performance, Figures 53a, 53b, and 53c present 33 W=95 and p-value=0.0001028 (PreF x NNUserNgbr-transClosure), while W=77 and p-value=0.02163 (PostF and NNUserNgbr-transClosure) 34 W=44 and p-value=0.6847 (10% of user overlap), while W=96 and p-value=0.06495 (100% of user 156 Chapter 5. CD-CARS Evaluation Figure 51 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the location dimension (source domain: television, and target domain: book). the results of the F-metric at different top ‘N’ values (between one and twenty), respectively, in 10%, 50%, and 100% of user overlap level for the Book domain as target, considering the Location dimension. As it can be seen, for 10% and 50% of overlap levels and low top ‘N’ values, the PostF classification performance was better or similar than the NNUserNgbr- transClosure baseline algorithm, whereas for 100% of user overlap this can be seen for any top ‘N’ value. On the other hand, the PreF classification performance was worse than all other algorithms in all user overlap levels and ‘N’ values. Figure 54 shows the variation of the F-metric value in different user overlap levels by fixing the top ‘N’ value to five. 
The statistical significance tests verified that the PostF F-metric values were greater than the NNUserNgbr-transClosure baseline algorithm for 50% and 100% of user overlap levels37, whereas the opposite from this was observed for 10% of user overlap level38. The applied tests also verified that the NNUserNgbr-transClosure F-metric values were greater than the PreF ones for all user overlap levels39. 5.2.1.2.3 Companion Dimension Table 25 shows the overall predictive performance of the recommender algorithms, considering all contextual values from the Companion dimension and different user overlap levels for the Book domain as target. As it can be seen from the table, the addition of user overlap) 35 W=96 and p-value=0.06495 36 W=77 and p-value=0.02163 37 p-value=0.005814 and W=97 for all tests 38 p-value=0.005814 and W=97 39 p-value=0.003913 and W=100 for all tests 5.2. Evaluation Results 157 (a) 10% of user overlap. (b) 50% of user overlap. (c) 100% of user overlap. Figure 52 – Overall prediction performance (MAE) boxplots for book domain in the loca- tion dimension with different user overlap levels (source domain: television). ratings from other domain (Television), by using the same algorithm for cross-domain recommendation (corresponding to the two first rows of the table), improved the predictive performance in, approximately, 12% (MAE) and 11% (RMSE) for 10% of user overlap, and in, approximately, 11% (MAE) and 7% (RMSE) for 50% of user overlap. Also, Table 25 presents the overall performance of the NNUserNgbr-transClosure, PreF and PostF algorithms. The NNUserNgbr-transClosure algorithm outperformed the NNUserNgbr one (performed for cross-domain purposes) by achieving an improvement that varied in, approximately, 28–59% (MAE) and 17–38% (RMSE) depending on the user overlap level. 158 Chapter 5. CD-CARS Evaluation (a) 10% of user overlap. (b) 50% of user overlap. (c) 100% of user overlap. Figure 53 – F-metric performance x top ‘N’ items for the book domain in the location dimension with different user overlap levels (source domain: television). 5.2. Evaluation Results 159 Figure 54 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the location dimension (target domain: book, and source: television). Table 25 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual values from the Companion dimension (source domain: television, and target domain: book). 
Algorithm 10% overlap 50% overlap Full overlap MAE±std RMSE±std MAE±std RMSE±std MAE±std RMSE±std NNUserNgbr (single-domain) 0.643 ± 0.022 0.963 ± 0.040 0.500 ± 0.008 0.794 ± 0.020 0.401 ± 0.006 0.719 ± 0.010 NNUserNgbr (cross-domain) 0.565 ± 0.020 0.851 ± 0.042 0.442 ± 0.007 0.739 ± 0.018 0.409 ± 0.005 0.747 ± 0.008 NNUserNgbr- transClosure 0.229 ± 0.039 0.519 ± 0.087 0.265 ± 0.014 0.563 ± 0.031 0.293 ± 0.007 0.616 ± 0.014 PreF with NNUserNgbr- transClosure 0.611 ± 0.289 0.872 ± 0.346 0.757 ± 0.051 1.069 ± 0.059 0.789 ± 0.016 1.104 ± 0.022 PostF with NNUserNgbr- transClosure 0.241 ± 0.043 0.520 ± 0.079 0.264 ± 0.014 0.544 ± 0.039 0.277 ± 0.007 0.575 ± 0.010 As it can be seen from table, the PostF predictive performance was better than the NNUserNgbr-transClosure algorithm in, approximately, 0.4% (MAE) and 3% (RMSE) for 50% of user overlap, and in, approximately, 5% (MAE) and 6% (RMSE) for 100% of user overlap, however, the predictive performance of these algorithms were similar when the user overlap level was 10%. In addition, the predictive performance of the PreF algorithm was worse than all other algorithms in the Companion dimension, as showed in Table 25. Figure 55 illustrates the predictive performance (MAE) of the proposed algorithms over different user overlap levels. 160 Chapter 5. CD-CARS Evaluation Figure 55 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the companion dimension (source domain: television, and target domain: book). The statistical significance tests verified that the PostF predictive errors (MAE) were less than the NNUserNgbr-transClosure algorithm for 100% of user overlap40. For 10% and 50% of user overlap levels, the applied tests could not verify a statistical difference between the NNUserNgbr-transClosure and PostF predictive errors (MAE)41. The applied tests also verified that the NNUserNgbr-transClosure predictive errors (MAE) were less than the PreF ones for all user overlap levels42. Figure 56a, Figure 56b and Figure 56c show the boxplots with the prediction performance (MAE) of the algorithms, respectively, for the 10%, 50%, and 100% user overlap levels. Figures 57a, 57b, and 57c present the results of the F-metric at different top ‘N’ values (between one and twenty), respectively, in 10%, 50%, and 100% of user overlap level for the Book domain as target, considering the Companion dimension. As it can be seen, the proposed algorithms only outperformed the baseline for 100% of user overlap with low values of top ‘N’. Figure 58 shows the variation of the F-metric value in different user overlap levels by fixing the top ‘N’ value to five. The statistical significance tests verified that the NNUserNgbr-transClosure F-metric values were greater than the both proposed algorithms for all user overlap levels43. 40 W=99 and p-value=0.01083 41 For 10% of user overlap, W=60 and p-value=0.2406, while for 50%, W=55 and p-value=0.3697 42 W=99 and p-value=0.01083 for all tests 43 p-value=0.005814 and W=97 for all tests between the PostF and baseline algorithms, while p- value=0.003913 and W=100 for all tests between the PreF and baseline algorithms 5.2. Evaluation Results 161 (a) 10% of user overlap. (b) 50% of user overlap. (c) 100% of user overlap. Figure 56 – Overall prediction performance (MAE) boxplots for book domain in the com- panion dimension with different user overlap levels (source domain: television). 
5.2.1.2.4 Combining Contextual Dimensions In the previous sections, we presented the evaluation results regarding the contextual dimensions separately. In this section, we report the results for a combination of two contextual dimensions considering the same evaluation metrics and methodology described before. We have seen in Section 4.1.2 that two of the contextual dimensions (Temporal and Location) provide a greater information gain than the Companion dimension, which is confirmed by the results presented in the previous sections. In this way, we evaluated 162 Chapter 5. CD-CARS Evaluation (a) 10% of user overlap. (b) 50% of user overlap. (c) 100% of user overlap. Figure 57 – F-metric performance x top ‘N’ items for the book domain in the companion dimension with different user overlap levels (source domain: television). 5.2. Evaluation Results 163 Figure 58 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the companion dimension (target domain: book, and source: television). the combination of those two contextual dimensions (Temporal and Location), aiming to verify its performance in comparison to the performance of them evaluated separately. Table 26 reports the overall predictive performance of the recommender algorithms, considering all contextual value combinations from the Temporal and Location dimensions with different user overlap levels for the Book domain as target. As it was observed in the previous sections, the addition of user ratings from the Television domain also improved the predictive performance of the NNUserNgbr in, approximately, 25–30% (MAE) and 13–18% (RMSE) depending on the user overlap level (rows 1 and 2 from the table). Also, Table 26 presents the overall performance of the NNUserNgbr-transClosure, PreF and PostF algorithms. The NNUserNgbr-transClosure algorithm outperformed the NNUserNgbr one (performed for cross-domain purposes) by achieving an improvement that varied in, approximately, 39–75% (MAE) and 25–60% (RMSE) depending on the user overlap levels. As it can be seen from table, except when the user overlap level was 10% (by considering only the MAE metric), the PostF predictive performance was better than the NNUserNgbr-transClosure algorithm in all other user overlap levels, with an improvement that varied in, approximately, 10–20% (MAE) and 8–17% (RMSE) depending on the user overlap level. Despite the NNUserNgbr-transClosure algorithm have outperformed the PostF when the user overlap level was 10%, they had a similar performance, separated only by their standard deviations. Figure 59 illustrates the predictive performance (MAE) of the proposed algorithms over different user overlap levels. The statistical significance tests verified that the PostF predictive errors (MAE) were less than the NNUserNgbr-transClosure algorithm for the 50% and 100% user overlap 164 Chapter 5. CD-CARS Evaluation Table 26 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual value combinations from the temporal and location dimensions (source domain: Television, and target domain: Book). 
Algorithm 10% overlap 50% overlap Full overlap MAE±std RMSE±std MAE±std RMSE±std MAE±std RMSE±std NNUserNgbr (single-domain) 0.643 ± 0.022 0.963 ± 0.040 0.500 ± 0.008 0.794 ± 0.020 0.401 ± 0.006 0.719 ± 0.010 NNUserNgbr (cross-domain) 0.470 ± 0.020 0.830 ± 0.042 0.349 ± 0.007 0.645 ± 0.018 0.297 ± 0.005 0.617 ± 0.008 NNUserNgbr- transClosure 0.117 ± 0.017 0.328 ± 0.039 0.176 ± 0.006 0.418 ± 0.013 0.180 ± 0.003 0.457 ± 0.006 PreF with NNUserNgbr- transClosure 0.344 ± 0.396 0.713 ± 0.545 0.200 ± 0.045 0.423 ± 0.126 0.331 ± 0.027 0.699 ± 0.064 PostF with NNUserNgbr- transClosure 0.121 ± 0.010 0.309 ± 0.028 0.158 ± 0.004 0.382 ± 0.011 0.143 ± 0.005 0.378 ± 0.011 Figure 59 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the temporal and location dimensions (source domain: televi- sion, and target domain: book). levels44. When the user overlap level was 10%, the NNUserNgbr-transClosure predictive error (MAE) was statistically similar to the PostF one45. On the other hand, the applied tests verified that the NNUserNgbr-transClosure predictive errors (MAE) were less than the PreF ones for all user overlap levels46. Figure 99a, Figure 99b and Figure 99c show the boxplots with the prediction performance (MAE) of the algorithms, respectively, for the 10%, 50%, and 100% user overlap levels. 44 In both cases, with W=99 and p-value=0.005413 45 W=41 and p-value=0.5 46 For all cases, with W=100 and p-value=0.003914 5.2. Evaluation Results 165 (a) 10% of user overlap. (b) 50% of user overlap. (c) 100% of user overlap. Figure 60 – Overall prediction performance (MAE) boxplots for book domain in the temporal and location dimensions with different user overlap levels (source domain: television). Taking into account the classification performance, Figures 61a, 61b, and 61c present the results of the F-metric at different top ‘N’ values, respectively, in 10%, 50%, and 100% of user overlap level for the Television domain as target, considering the Temporal and Location dimensions. For all user overlap levels and top ‘N’ values, the PreF classification performance was worse than the all other algorithms. This result also occurred when the evaluation was performed for the Television as target by combining the two contextual dimensions (see Section 5.2.1.1.4). On the other hand, the PostF classification performance was better or similar than the NNUserNgbr-transClosure baseline algorithm in all user overlap levels and top ‘N’ values. Figure 62 shows the variation of the F-metric value in different user overlap levels by fixing the top ‘N’ value to five. The statistical significance tests verified that the PostF F-metric values were greater than the NNUserNgbr-transClosure baseline algorithm for 166 Chapter 5. CD-CARS Evaluation (a) 10% of user overlap. (b) 50% of user overlap. (c) 100% of user overlap. Figure 61 – F-metric performance x top ‘N’ items for the book domain in the temporal and location dimensions with different user overlap levels (source domain: television). 5.2. Evaluation Results 167 Figure 62 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the temporal and location dimensions (target domain: book, and source: television). 50% and 100% of user overlap levels47, whereas their performances were statistically similar for 10% of user overlap48. The applied tests also verified that NNUserNgbr-transClosure F-metric values were greater than the PreF ones for all user overlap levels49. 
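Before summarizing the Book-Television results, the sketch below illustrates how an F-metric at N value of the kind reported above can be computed, assuming the usual precision/recall-based definition over a top-N list; the exact evaluation protocol is the one defined in Section 5.1, and the item identifiers used here are hypothetical.

# Sketch of an F-metric at N computation, assuming F = 2PR/(P+R) over a top-N list
# and a set of relevant items for one user in one context.
f_metric_at_n <- function(recommended, relevant, n = 5) {
  top_n <- head(recommended, n)
  hits  <- length(intersect(top_n, relevant))
  precision <- hits / n
  recall    <- hits / length(relevant)
  if (precision + recall == 0) return(0)
  2 * precision * recall / (precision + recall)
}

recommended <- c("tv042", "tv007", "tv113", "tv001", "tv089", "tv023")  # hypothetical ids
relevant    <- c("tv007", "tv001", "tv055")
f_metric_at_n(recommended, relevant, n = 5)   # precision 2/5, recall 2/3 -> F = 0.5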
[47] p-value=0.005814 and W=97 for all tests.
[48] p-value=0.5 and W=51.
[49] p-value=0.003968 and W=100 for all tests.

5.2.1.3 Summary

In this section, we provide a summary of the results from the evaluation of the "book-television dataset". Figure 63 shows a dispersion diagram illustrating the predictive performance (MAE) of the algorithms by varying the target domain (Book and Television), contextual dimension and user overlap level, whereas Figure 64 shows the same diagram for the RMSE metric. It is important to mention that these figures do not take into account the standard deviation or the statistical significance of the results.

Table 27 presents the predictive performance (MAE) achieved by the PreF and PostF algorithms in comparison to the best baseline algorithm (NNUserNgbr-transClosure), taking into account their statistical significance [50] and the different target domains, contextual dimensions and user overlap levels.

[50] In the table, "**" means that the result could not be considered statistically significant.

Regarding the classification performance, Figure 65 presents a dispersion diagram illustrating the F-metric performance (with N=5) of the algorithms by varying the target domain, contextual dimension and user overlap level. Once again, the standard deviation and the statistical significance of the results are not considered in that figure.

Table 28 shows the classification performance improvement (F-metric with N=5) obtained by the PreF and PostF algorithms in comparison to the best baseline algorithm (NNUserNgbr-transClosure), taking into account their statistical significance [51] and the different target domains, contextual dimensions and user overlap levels.

[51] In the table, "**" means that the result could not be considered statistically significant.

As can be seen, at least one proposed algorithm (PreF or PostF) achieved the best predictive performance among the algorithms (or was similar to the best one) in all scenarios (with distinct target domains, contextual dimensions, and user overlap levels). Considering the classification metric, the PostF algorithm achieved the best performance among the algorithms (or was similar to the best one) in the majority of the scenarios. Summarizing the main findings from the evaluation results described in this section:

• In all scenarios (with different target domains, contextual dimensions and user overlap levels), the addition of user ratings from an auxiliary (source) domain improved the predictive performance of the NNUserNgbr algorithm, which was not designed for making cross-domain recommendations. The same can be observed in almost all scenarios for the classification performance of that algorithm. Note that this occurred even when the source domain had fewer ratings than the target domain.

• In all scenarios (with different target domains, contextual dimensions and user overlap levels), the NNUserNgbr-transClosure algorithm outperformed the NNUserNgbr one in terms of predictive performance. Regarding their classification performances, the same occurred in almost all scenarios.

• The proposed algorithms (PreF and PostF) had better predictive and classification performances in the Temporal dimension than in the other dimensions. This contrasts with the information gain calculated in Section 4.1.2.
In this contextual dimension, the PostF outperformed the NNUserNgbr-transClosure algorithm in all scenarios (user overlap levels and target domains) in terms of both predictive and classification performance. The PreF outperformed the NNUserNgbr-transClosure algorithm in all scenarios (user overlap levels and target domains) in terms of predictive performance. With respect to classification performance, the PreF outperformed the NNUserNgbr-transClosure algorithm for 100% of user overlap (regardless of the target domain) and for 50% of user overlap when Television was the target domain.

• If we compare the proposed algorithms in the Temporal dimension using the different evaluation metrics (predictive or classification), we see distinct relations between their results. While the PostF algorithm outperforms the PreF one in terms of classification performance in almost all scenarios (user overlap levels and target domains), the opposite happens when we take the predictive performance into account.

• The higher the user overlap level, the better the classification performance of the PostF algorithm, especially in the Temporal and Location dimensions (or their combination). The same can be observed for the PreF algorithm, but only in the Temporal dimension. Note that this effect did not seem to occur when we considered the predictive performance of the algorithms.

• The predictive and classification performances of the PreF algorithm were more affected than those of the PostF algorithm by the quantity of contextual information present in the user ratings (see Section 4.1.3). The more particular the tested contexts, the worse the PreF performances (e.g. in the Location dimension and in the combination of the Location and Temporal dimensions). Consequently, the PreF performances varied widely depending on the contextual information present in the user ratings, whereas the PostF performances were more uniform and similar to those of the NNUserNgbr-transClosure algorithm (a schematic contrast between pre- and post-filtering is sketched after this list).

• Regarding the low quality of the contextual information obtained in the Companion dimension (see Section 4.1.1.3), both proposed algorithms had low predictive and classification performances in comparison to the other dimensions. This could also be due to the low quantity of contextual information present in the user ratings for that contextual dimension. In particular, the PostF algorithm achieved a good performance only in terms of the predictive metrics.

• With respect to the combination of contextual dimensions (Temporal and Location), the PostF predictive and classification performances in that combination were close to its own performances using only the Temporal dimension as the single source of contextual information, whereas the PreF predictive and classification performances were similar to its own performances using only the Location dimension. The classification performances of both algorithms were reduced by the addition of contextual information from the other contextual dimension. In particular, the predictive performance of the PostF algorithm was slightly improved, depending on the user overlap level and target domain, whereas the predictive performance of the PreF algorithm decreased in every case.
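As referenced in the list above, the fragment below sketches the generic contrast between contextual pre-filtering (restricting the ratings to the target context before prediction) and contextual post-filtering (predicting on all ratings and then adjusting by contextual relevance). It is a schematic, toy-sized illustration under simplified assumptions; it is not the PreF and PostF algorithms proposed in this thesis, whose exact definitions are given earlier, and predict_cf is only a stand-in for a two-dimensional CF predictor.

# Schematic pre-filtering vs. post-filtering on a toy contextualized rating set.
ratings <- data.frame(
  user    = c("u1", "u1", "u2", "u2", "u3"),
  item    = c("i1", "i2", "i1", "i3", "i2"),
  rating  = c(5, 3, 4, 2, 5),
  context = c("weekend", "weekday", "weekend", "weekend", "weekday")
)

predict_cf <- function(data, user, item) {
  mean(data$rating[data$item == item])   # placeholder for a real 2D CF predictor
}

# Pre-filtering: keep only the ratings observed in the target context, then predict.
pre_filter_predict <- function(data, user, item, ctx) {
  predict_cf(data[data$context == ctx, ], user, item)
}

# Post-filtering: predict on all ratings, then weight by a simple contextual relevance.
post_filter_predict <- function(data, user, item, ctx) {
  relevance <- mean(data$context[data$item == item] == ctx)
  predict_cf(data, user, item) * relevance
}

pre_filter_predict(ratings, "u3", "i1", "weekend")
post_filter_predict(ratings, "u3", "i1", "weekend")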
Figure 63 – Predictive performance (MAE) for the algorithms by varying target domain (Book and TV), contextual dimension and user overlap levels (dispersion diagram).

Figure 64 – Predictive performance (RMSE) for the algorithms by varying target domain (Book and TV), contextual dimension and user overlap levels (dispersion diagram).

Figure 65 – Classification performance (F-metric with N=5) for the algorithms by varying target domain (Book and TV), contextual dimension and user overlap levels (dispersion diagram).

Table 27 – Overall predictive performance (MAE) of the proposed algorithms in comparison to the best baseline one by varying target domain (Book and TV), contextual dimension and user overlap levels.

Contextual dimension  | Target domain | User overlap level | PreF improvement | PostF improvement
Temporal              | TV            | 10%                | 48.6%            | 13.9%
Temporal              | Book          | 10%                | 8%               | 9.7%
Temporal              | TV            | 50%                | 48.4%            | 18%
Temporal              | Book          | 50%                | 35.7%            | 15%
Temporal              | TV            | 100%               | 30.4%            | 20.3%
Temporal              | Book          | 100%               | 37.4%            | 14.6%
Location              | TV            | 10%                | -26.8%**         | 16.7%**
Location              | Book          | 10%                | -46.7%           | -1.8%**
Location              | TV            | 50%                | 14.2%**          | 5.7%
Location              | Book          | 50%                | 31.9%            | 2.5%
Location              | TV            | 100%               | -17.2%           | 5.9%
Location              | Book          | 100%               | 1.7%**           | 4%**
Companion             | TV            | 10%                | -273.7%          | 6.7%**
Companion             | Book          | 10%                | -166.8%          | -5.2%**
Companion             | TV            | 50%                | -207.9%          | 8.2%
Companion             | Book          | 50%                | -185.2%          | 0.4%**
Companion             | TV            | 100%               | -242.4%          | 10.2%
Companion             | Book          | 100%               | -169.4%          | 5.6%
Temporal and Location | TV            | 10%                | -76.8%           | -0.9%**
Temporal and Location | Book          | 10%                | -194%            | -3.4%**
Temporal and Location | TV            | 50%                | -188%            | 24%
Temporal and Location | Book          | 50%                | -13.6%           | 10.2%
Temporal and Location | TV            | 100%               | -60.9%           | 22.2%
Temporal and Location | Book          | 100%               | -83.9%           | 20.6%

Table 28 – Overall classification performance (F-metric with N=5) of the proposed algorithms in comparison to the best baseline one by varying target domain (Book and TV), contextual dimension and user overlap levels.

Contextual dimension  | Target domain | User overlap level | PreF improvement | PostF improvement
Temporal              | TV            | 10%                | -38%             | 22.4%
Temporal              | Book          | 10%                | -113.4%          | 4.7%**
Temporal              | TV            | 50%                | 35.4%            | 38%
Temporal              | Book          | 50%                | -27.2%           | 16.7%
Temporal              | TV            | 100%               | 45%              | 41.2%
Temporal              | Book          | 100%               | 7.3%             | 26.7%
Location              | TV            | 10%                | -435.2%          | 1.9%**
Location              | Book          | 10%                | -329.4%          | -8.1%
Location              | TV            | 50%                | -491.7%          | 30.4%
Location              | Book          | 50%                | -496.7%          | 3.8%
Location              | TV            | 100%               | -414%            | 32.7%
Location              | Book          | 100%               | -467.1%          | 18.7%
Companion             | TV            | 10%                | -148.4%          | -39.1%
Companion             | Book          | 10%                | -142.4%          | -26.8%
Companion             | TV            | 50%                | -39%             | -2.8%
Companion             | Book          | 50%                | -112.2%          | -36.9%
Companion             | TV            | 100%               | -6.2%            | -5.8%**
Companion             | Book          | 100%               | -60.3%           | -7.5%
Temporal and Location | TV            | 10%                | -457.2%          | 20%
Temporal and Location | Book          | 10%                | -404.2%          | 0.05%**
Temporal and Location | TV            | 50%                | -532.5%          | 35.7%
Temporal and Location | Book          | 50%                | -482.4%          | 18.1%
Temporal and Location | TV            | 100%               | -488.4%          | 41.6%
Temporal and Location | Book          | 100%               | -456%            | 28.4%

5.2.2 Book-Music Results

As mentioned before, we evaluated the quality of the cross-domain algorithms by varying the target domain within each dataset, in order to study the impact of the density of the target domain data in comparison to the density of the source domain data. Thus, the following sections present the results considering a different domain as target (Music and Book).

5.2.2.1 Music as Target Domain

According to the contextual dimensions present in the datasets, we describe the evaluation results for each contextual dimension in the following sections. In addition, we show the results for a combination of contextual dimensions in Section 5.2.2.1.4.

5.2.2.1.1 Temporal Dimension

Table 29 reports the overall predictive performance of the recommender algorithms, considering all contextual values from the Temporal dimension and different user overlap levels for the Music domain as target. Rows 1 and 2 of the table show the NNUserNgbr predictive performance when it is applied to single-domain and cross-domain recommendation.
As it can be seen, the simple addition of user ratings from other domain (Book), by using the same algorithm for cross-domain recommendation, improved the recommendation performance in, approximately, 5–7% (MAE) and 1–2% (RMSE) depending on the user overlap level. In addition, Table 29 presents the overall performance of the PreF and PostF algorithms, besides the base NNUserNgbr-transClosure algorithm, which outperformed the NNUserNgbr algorithm (performed for cross-domain purposes) by achieving an im- 5.2. Evaluation Results 175 Table 29 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual values from the Temporal dimension (source domain: Book, and target domain: Music). Algorithm 10% overlap 50% overlap Full overlap MAE±std RMSE±std MAE±std RMSE±std MAE±std RMSE±std NNUserNgbr (single-domain) 0.684 ± 0.026 0.972 ± 0.053 0.654 ± 0.016 0.970 ± 0.030 0.588 ± 0.010 0.903 ± 0.017 NNUserNgbr (cross-domain) 0.634 ± 0.022 0.949 ± 0.051 0.610 ± 0.013 0.943 ± 0.028 0.557 ± 0.008 0.891 ± 0.014 NNUserNgbr- transClosure 0.307 ± 0.010 0.631 ± 0.013 0.458 ± 0.013 0.793 ± 0.020 0.480 ± 0.006 0.816 ± 0.011 PreF with NNUserNgbr- transClosure 0.185 ± 0.054 0.616 ± 0.166 0.171 ± 0.005 0.476 ± 0.007 0.212 ± 0.007 0.550 ± 0.016 PostF with NNUserNgbr- transClosure 0.257 ± 0.031 0.519 ± 0.038 0.400 ± 0.004 0.707 ± 0.002 0.423 ± 0.005 0.725 ± 0.010 provement that varied in, approximately, 13–51% (MAE) and 8–33% (RMSE) depending on the user overlap levels. As it can be seen from table, the PreF predictive performance was better than the NNUserNgbr-transClosure and PostF algorithms in all user overlap levels, except when the user overlap level was 10% if we only consider the RMSE metric instead of the MAE one. The improvement achieved by the PreF algorithm in comparison to the NNUserNgbr- transClosure one varied in, approximately, 39–55% (MAE) and 2–40% (RMSE) depending on the user overlap level. The PostF predictive performance was also better than the NNUserNgbr-transClosure algorithm in all user overlap levels, however, its improvement was smaller than the achieved by the PreF algorithm. Figure 66 (MAE) and Figure 67 (RMSE) illustrate the predictive performance of the proposed algorithms over different user overlap levels. Note that for this case we showed the figures for both predictive metrics, since we observed a difference in the PreF performance depending on the predictive metric used in the evaluation. The statistical significance tests verified that both the PreF and PostF predictive errors (MAE) were less than the NNUserNgbr-transClosure baseline algorithm for all user overlap levels52. Figure 68a, Figure 68b and Figure 68c show the boxplots with the prediction performance (MAE) of the algorithms, respectively, for the 10%, 50%, and 100% user overlap levels. Regarding the classification performance, Figures 69a, 69b, and 69c present the results of the F-metric at different N values (between one and twenty), respectively, in 52 p-value=0.005413 and W=97 for all tests. 176 Chapter 5. CD-CARS Evaluation Figure 66 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the temporal dimension (source domain: book, and target domain: Music). Figure 67 – Overall prediction error (RMSE) for cross-domain algorithms by varying user overlap level in the temporal dimension (source domain: book, and target domain: Music). 
10%, 50%, and 100% of user overlap level for the Music domain as target, considering the Temporal dimension. As it can be seen, in all user overlap levels and top ‘N’ values, the PostF classification performance was better or similar than the baseline algorithms, whereas the PreF was only better than the baseline ones for 50% and 100% of user overlap levels with low top ‘N’ values. In addition, the PostF classification performance was better or similar than the PreF one for all user overlap levels and top ‘N’ values. 5.2. Evaluation Results 177 (a) 10% of user overlap. (b) 50% of user overlap. (c) 100% of user overlap. Figure 68 – Overall prediction performance (MAE) boxplots for Music domain in the temporal dimension with different user overlap levels (source domain: book). Figure 70 shows the variation of the F-metric value in different user overlap levels by fixing the top ‘N’ value to five. The statistical significance tests verified that the PostF F-metric values were greater than the NNUserNgbr-transClosure baseline algorithm for 50% and 100% user overlap levels53, whereas for 10% of user overlap the tests could not verify any statistical difference between their performances 54. Besides, the PreF F-metric value was greater than the NNUserNgbr-transClosure one for 100% of user overlap55, 53 p-value=0.005413 and W=99 in both cases 54 W=80 and p-value=0.1 55 p-value=0.005413 and W=99 178 Chapter 5. CD-CARS Evaluation (a) 10% of user overlap. (b) 50% of user overlap. (c) 100% of user overlap. Figure 69 – F-metric performance x top ‘N’ items for the Music domain in the temporal dimension with different user overlap levels (source domain: book). 5.2. Evaluation Results 179 Figure 70 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the temporal dimension (target domain: Music, and source: book). whereas the opposite from this was observed for 10% and 50% of user overlap levels56. 5.2.2.1.2 Location Dimension Table 30 reports the overall predictive performance of the recommender algorithms, considering all contextual values from the Location dimension and different user overlap levels for the Music domain as target. As it can be seen from the table, the addition of user ratings from other domain (Book), by using the same algorithm for cross-domain recommendation (corresponding to the two first rows of the table), improved the predictive performance in, approximately, 6–7% (MAE) and 3–4% (RMSE) depending on the user overlap level (50% or 100%). Note that the NNUserNgbr predictive performance was better than its own performance, considering the cross-domain scenario, when the user overlap level was 10%. Also, Table 30 presents the overall performance of the NNUserNgbr-transClosure, PreF and PostF algorithms. The NNUserNgbr-transClosure algorithm outperformed the NNUserNgbr one (performed for cross-domain purposes) by achieving an improvement that varied in, approximately, 11–54% (MAE) and 6–32% (RMSE) depending on the user overlap levels. As it can be seen from table, the PostF predictive performance was better than the NNUserNgbr-transClosure algorithm in all user overlap levels, with an improvement that varied in, approximately, 3–16% (MAE) and 4–19% (RMSE) depending on the user overlap level. In addition, if we consider the high standard deviation of the PreF algorithm showed in Table 30, then we can say that the PostF outperformed the PreF algorithm for 56 p-value=0.006812 and W=97 for both tests 180 Chapter 5. 
Table 30 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual values from the Location dimension (source domain: Book; target domain: Music).

Algorithm | 10% overlap (MAE±std / RMSE±std) | 50% overlap (MAE±std / RMSE±std) | Full overlap (MAE±std / RMSE±std)
NNUserNgbr (single-domain) | 0.684±0.026 / 0.972±0.053 | 0.654±0.016 / 0.970±0.030 | 0.588±0.010 / 0.903±0.017
NNUserNgbr (cross-domain) | 0.710±0.022 / 1.082±0.051 | 0.602±0.013 / 0.932±0.028 | 0.552±0.008 / 0.877±0.014
NNUserNgbr-transClosure | 0.326±0.065 / 0.731±0.084 | 0.468±0.008 / 0.808±0.013 | 0.487±0.004 / 0.822±0.008
PreF with NNUserNgbr-transClosure | 0.246±0.238 / 0.528±0.321 | 0.284±0.022 / 0.514±0.020 | 0.208±0.033 / 0.541±0.130
PostF with NNUserNgbr-transClosure | 0.273±0.011 / 0.591±0.027 | 0.437±0.007 / 0.775±0.003 | 0.473±0.005 / 0.791±0.014

In addition, taking into account the high standard deviation of the PreF algorithm shown in Table 30, we can say that the PostF outperformed the PreF algorithm at the 10% user overlap level. For the 50% and 100% user overlap levels, the PreF algorithm had a low standard deviation and achieved the best predictive performance among the algorithms. Figure 71 illustrates the predictive performance (MAE) of the proposed algorithms over the different user overlap levels.

Figure 71 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the Location dimension (source domain: Book; target domain: Music).

The statistical significance tests verified that both the PreF and PostF predictive errors (MAE) were lower than those of the NNUserNgbr-transClosure baseline algorithm for the 50% and 100% user overlap levels (W = 99 and p-value = 0.005413 for all tests). For 10% of user overlap, the applied tests could not verify a statistical difference between the performance of the proposed algorithms and NNUserNgbr-transClosure (W = 60 and p-value = 0.35 for PreF versus NNUserNgbr-transClosure; W = 80 and p-value = 0.1 for PostF versus NNUserNgbr-transClosure). Figures 72a, 72b and 72c show the boxplots with the prediction performance (MAE) of the algorithms for the 10%, 50%, and 100% user overlap levels, respectively.

Figure 72 – Overall prediction performance (MAE) boxplots for the Music domain in the Location dimension with different user overlap levels (source domain: Book); panels (a), (b) and (c) correspond to 10%, 50%, and 100% of user overlap, respectively.

With respect to the classification performance, Figures 73a, 73b, and 73c present the results of the F-metric at different top 'N' values (between one and twenty) for the 10%, 50%, and 100% user overlap levels, respectively, for the Music domain as target, considering the Location dimension. As can be seen, for 50% and 100% of user overlap the PostF outperformed the baseline algorithms for 'N' up to ten. On the other hand, the PreF classification performance was worse than that of all other algorithms at all user overlap levels and 'N' values.

Figure 73 – F-metric performance versus top 'N' items for the Music domain in the Location dimension with different user overlap levels (source domain: Book); panels (a), (b) and (c) correspond to 10%, 50%, and 100% of user overlap, respectively.

Figure 74 shows the variation of the F-metric value across the user overlap levels with the top 'N' value fixed at five. The statistical significance tests verified that the PostF F-metric values were greater than those of the NNUserNgbr-transClosure baseline algorithm for the 50% and 100% user overlap levels (p-value = 0.005413 and W = 99 for all tests), whereas the opposite was observed for 10% of user overlap (p-value = 0.006314 and W = 97). The applied tests also verified that the NNUserNgbr-transClosure F-metric values were greater than the PreF ones for all user overlap levels (p-value = 0.003913 and W = 100 for all tests).

Figure 74 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the Location dimension (target domain: Music; source domain: Book).
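The F-metric at top 'N' used above combines precision and recall over the top-N recommended items. A minimal sketch of such a computation is shown below; it is illustrative only (the exact protocol is the one defined in the evaluation setup), and it assumes that an item is considered relevant when the user rated it as "good" (four stars or more).

```python
def f_metric_at_n(recommended, relevant, n=5):
    """F-metric for a top-N recommendation list: harmonic mean of
    precision@N and recall@N.

    `recommended` is the ordered list of item ids produced by the recommender;
    `relevant` is the set of test items the user considered "good" (assumed
    here to be items rated with four stars or more).
    """
    top_n = recommended[:n]
    hits = len(set(top_n) & set(relevant))
    if hits == 0 or not relevant:
        return 0.0
    precision = hits / len(top_n)
    recall = hits / len(relevant)
    return 2 * precision * recall / (precision + recall)

# Toy usage: three of the five recommended items are relevant -> F ~= 0.67.
print(f_metric_at_n(["i1", "i2", "i3", "i4", "i5"], {"i2", "i4", "i5", "i9"}, n=5))
```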
5.2.2.1.3 Companion Dimension

Table 31 shows the overall predictive performance of the recommender algorithms, considering all contextual values from the Companion dimension and different user overlap levels for the Music domain as target. As can be seen from the table, adding user ratings from the other domain (Book), while using the same algorithm for cross-domain recommendation (the first two rows of the table), improved the predictive performance by approximately 3–17% (MAE) and 6–16% (RMSE), depending on the user overlap level.

Table 31 also presents the overall performance of the NNUserNgbr-transClosure, PreF and PostF algorithms. The NNUserNgbr-transClosure algorithm outperformed the NNUserNgbr one (performed for cross-domain purposes) with an improvement of approximately 9–68% (MAE) and 4–47% (RMSE), depending on the user overlap level. As the table shows, the PostF predictive performance was better than that of the NNUserNgbr-transClosure algorithm for the 50% and 100% user overlap levels, if the standard deviation of the algorithms at 50% of user overlap is disregarded. Its improvement over the NNUserNgbr-transClosure performance varied between approximately 3–11% (MAE) and 6–11% (RMSE), depending on the user overlap level. In addition, the predictive performance of the PreF algorithm was worse than that of all other algorithms in the Companion dimension for the 50% and 100% user overlap levels, as shown in Table 31.

Table 31 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual values from the Companion dimension (source domain: Book; target domain: Music).

Algorithm | 10% overlap (MAE±std / RMSE±std) | 50% overlap (MAE±std / RMSE±std) | Full overlap (MAE±std / RMSE±std)
NNUserNgbr (single-domain) | 0.684±0.026 / 0.972±0.053 | 0.654±0.016 / 0.970±0.030 | 0.588±0.010 / 0.903±0.017
NNUserNgbr (cross-domain) | 0.660±0.022 / 0.818±0.051 | 0.537±0.013 / 0.823±0.028 | 0.535±0.008 / 0.845±0.014
NNUserNgbr-transClosure | 0.205±0.122 / 0.430±0.188 | 0.479±0.004 / 0.785±0.014 | 0.486±0.018 / 0.797±0.022
PreF with NNUserNgbr-transClosure | 0.438±0.221 / 0.540±0.283 | 0.666±0.058 / 1.076±0.024 | 0.711±0.052 / 1.079±0.090
PostF with NNUserNgbr-transClosure | 0.381±0.076 / 0.807±0.279 | 0.465±0.044 / 0.736±0.102 | 0.431±0.016 / 0.707±0.016

Figure 75 (MAE) and Figure 76 (RMSE) illustrate the predictive performance of the proposed algorithms over the different user overlap levels. Note that in this case we show figures for both predictive metrics, since we observed a difference in the PreF and PostF performances depending on the metric used in the evaluation.
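Such metric-dependent differences are expected: RMSE penalizes a few large errors much more heavily than MAE does. The following toy example (with made-up error values, not taken from the experiments) illustrates how two error profiles can be ranked differently by the two metrics.

```python
import numpy as np

# Algorithm A: mostly small errors with one large outlier.
errors_a = np.array([0.2, 0.2, 0.2, 0.2, 2.0])
# Algorithm B: uniformly moderate errors.
errors_b = np.array([0.6, 0.6, 0.6, 0.6, 0.6])

print(errors_a.mean(), errors_b.mean())            # MAE: 0.56 vs 0.60 -> A looks better
print(np.sqrt((errors_a ** 2).mean()),
      np.sqrt((errors_b ** 2).mean()))             # RMSE: ~0.91 vs 0.60 -> B looks better
```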
The statistical significance tests verified that the PostF predictive error (MAE) was lower than that of the NNUserNgbr-transClosure algorithm for 100% of user overlap (W = 100 and p-value = 0.003968), whereas no statistically significant difference between their performances was observed when the user overlap levels were 10% (W = 55 and p-value = 0.5) and 50% (W = 30 and p-value = 0.8). The applied tests also verified that the NNUserNgbr-transClosure predictive errors (MAE) were lower than the PreF ones for all user overlap levels (W = 100 and p-value = 0.003968 for all tests). Figures 77a, 77b and 77c show the boxplots with the prediction performance (MAE) of the algorithms for the 10%, 50%, and 100% user overlap levels, respectively.

Figure 75 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the Companion dimension (source domain: Book; target domain: Music).

Figure 76 – Overall prediction error (RMSE) for cross-domain algorithms by varying user overlap level in the Companion dimension (source domain: Book; target domain: Music).

Figure 77 – Overall prediction performance (MAE) boxplots for the Music domain in the Companion dimension with different user overlap levels (source domain: Book); panels (a), (b) and (c) correspond to 10%, 50%, and 100% of user overlap, respectively.

Figures 78a, 78b, and 78c present the results of the F-metric at different top 'N' values (between one and twenty) for the 10%, 50%, and 100% user overlap levels, respectively, for the Music domain as target, considering the Companion dimension. As can be seen, the proposed algorithms only outperformed the baseline ones for 50% and 100% of user overlap with very low values of top 'N'. Figure 79 shows the variation of the F-metric value across the user overlap levels with the top 'N' value fixed at five. The statistical significance tests verified that the NNUserNgbr-transClosure F-metric values were greater than those of both proposed algorithms for all user overlap levels (p-value = 0.005814 and W = 97 for all tests between PostF and the baseline algorithm; p-value = 0.003913 and W = 100 for all tests between PreF and the baseline algorithm).

Figure 78 – F-metric performance versus top 'N' items for the Music domain in the Companion dimension with different user overlap levels (source domain: Book); panels (a), (b) and (c) correspond to 10%, 50%, and 100% of user overlap, respectively.

Figure 79 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the Companion dimension (target domain: Music; source domain: Book).

5.2.2.1.4 Combining Contextual Dimensions

In the previous sections, we presented the evaluation results for each contextual dimension separately. In this section, we report the results for a combination of two contextual dimensions, considering the same evaluation metrics and methodology described before.

Table 32 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual value combinations from the Temporal and Location dimensions (source domain: Book; target domain: Music).
Algorithm | 10% overlap (MAE±std / RMSE±std) | 50% overlap (MAE±std / RMSE±std) | Full overlap (MAE±std / RMSE±std)
NNUserNgbr (single-domain) | 0.684±0.026 / 0.972±0.053 | 0.654±0.016 / 0.970±0.030 | 0.588±0.010 / 0.903±0.017
NNUserNgbr (cross-domain) | 0.708±0.022 / 1.076±0.051 | 0.597±0.013 / 0.929±0.028 | 0.548±0.008 / 0.874±0.014
NNUserNgbr-transClosure | 0.302±0.060 / 0.628±0.078 | 0.474±0.009 / 0.818±0.014 | 0.489±0.004 / 0.825±0.008
PreF with NNUserNgbr-transClosure | 0.302±0.278 / 0.755±0.426 | 0.485±0.040 / 0.741±0.036 | 0.265±0.036 / 0.642±0.156
PostF with NNUserNgbr-transClosure | 0.304±0.021 / 0.632±0.036 | 0.338±0.005 / 0.621±0.002 | 0.376±0.004 / 0.706±0.010

Table 32 reports the overall predictive performance of the recommender algorithms, considering all contextual value combinations from the Temporal and Location dimensions with different user overlap levels for the Music domain as target. As observed in the previous sections, adding user ratings from the Book domain also improved the predictive performance of the NNUserNgbr algorithm, by approximately 6–8% (MAE) and 3–4% (RMSE), for the 50% and 100% user overlap levels.

Figure 80 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the Temporal and Location dimensions (source domain: Book; target domain: Music).

Figure 81 – Overall prediction error (RMSE) for cross-domain algorithms by varying user overlap level in the Temporal and Location dimensions (source domain: Book; target domain: Music).

Furthermore, Table 32 presents the overall performance of the NNUserNgbr-transClosure, PreF and PostF algorithms. The NNUserNgbr-transClosure algorithm outperformed the NNUserNgbr one (performed for cross-domain purposes) with an improvement of approximately 10–57% (MAE) and 5–41% (RMSE), depending on the user overlap level. As the table shows, except at the 10% user overlap level, the PostF predictive performance was better than that of the NNUserNgbr-transClosure algorithm, with an improvement of approximately 23–28% (MAE) and 14–24% (RMSE), depending on the user overlap level. Although the NNUserNgbr-transClosure algorithm outperformed the PostF when the user overlap level was 10%, their performances were similar, separated only by their standard deviations.

Note that the PreF predictive performance had a high standard deviation, as observed in Table 32 and as also occurred in the evaluation of the Location dimension alone. Taking this high standard deviation into account, we can say that the PostF outperformed the PreF algorithm for the 10% and 50% user overlap levels. For 100% of user overlap, the PreF algorithm had a low standard deviation and achieved the best predictive performance among the algorithms. Figure 80 (MAE) and Figure 81 (RMSE) illustrate the predictive performance of the proposed algorithms over the different user overlap levels; in this case we again show figures for both predictive metrics, since we observed a difference in the PreF and PostF performances depending on the metric used in the evaluation.

The statistical significance tests verified that the PostF predictive errors (MAE) were lower than those of the NNUserNgbr-transClosure algorithm for the 50% and 100% user overlap levels (W = 99 and p-value = 0.005413 in both cases).
The applied tests also verified that the PreF predictive errors (MAE) were lower than those of the NNUserNgbr-transClosure algorithm for 100% of user overlap (W = 100 and p-value = 0.003968), whereas the opposite was observed for 50% of user overlap (W = 99 and p-value = 0.005413). When the user overlap level was 10%, the applied tests could not verify a statistical difference between the performance of the proposed algorithms and NNUserNgbr-transClosure (W = 61 and p-value = 0.5 for PostF versus NNUserNgbr-transClosure; W = 71 and p-value = 0.8 for PreF versus NNUserNgbr-transClosure). Figures 82a, 82b and 82c show the boxplots with the prediction performance (MAE) of the algorithms for the 10%, 50%, and 100% user overlap levels, respectively.

Figure 82 – Overall prediction performance (MAE) boxplots for the Music domain in the Temporal and Location dimensions with different user overlap levels (source domain: Book); panels (a), (b) and (c) correspond to 10%, 50%, and 100% of user overlap, respectively.

Taking into account the classification performance, Figures 83a, 83b, and 83c present the results of the F-metric at different top 'N' values for the 10%, 50%, and 100% user overlap levels, respectively, for the Music domain as target, considering the Temporal and Location dimensions. For the 50% and 100% user overlap levels, the PostF classification performance was better than or similar to that of the baseline algorithms for all top 'N' values, while for 10% of user overlap it was better than or similar to them only for very low top 'N' values (1 to 3). On the other hand, the PreF classification performance was worse than that of all other algorithms.

Figure 83 – F-metric performance versus top 'N' items for the Music domain in the Temporal and Location dimensions with different user overlap levels (source domain: Book); panels (a), (b) and (c) correspond to 10%, 50%, and 100% of user overlap, respectively.

Figure 84 shows the variation of the F-metric value across the user overlap levels with the top 'N' value fixed at five. The statistical significance tests verified that the PostF F-metric values were greater than those of the NNUserNgbr-transClosure baseline algorithm for all user overlap levels (W = 97 and p-value = 0.037 for 10% of user overlap; p-value = 0.005413 and W = 97 for the other levels). On the other hand, the applied tests also verified that the NNUserNgbr-transClosure F-metric values were greater than the PreF ones for all user overlap levels (p-value = 0.003914 and W = 99 for all tests).

Figure 84 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the Temporal and Location dimensions (target domain: Music; source domain: Book).

5.2.2.2 Book as Target Domain

In Section 5.2.2.1, we presented the results for the Music target domain, which had fewer ratings in the cross-domain dataset than the Book source domain (as described in Section 4.1.3.2). In this section, we present the results when Book is the target domain and Music is the source domain. Following the contextual dimensions presented in Section 4.1.3.2, we describe the evaluation results for each contextual dimension in the following sections. In addition, we show the results for a combination of contextual dimensions in Section 5.2.2.2.4.
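The experiments in this chapter vary the user overlap level (10%, 50%, and 100%) between the source and target domains. The exact construction of these overlap levels follows the thesis' evaluation setup; the sketch below is only an assumed illustration of how a source-domain rating set could be restricted to a sampled fraction of the users it shares with the target domain (all names, such as sample_overlap, are hypothetical).

```python
import random

def sample_overlap(source_ratings, target_users, overlap_level, seed=42):
    """Keep source-domain ratings only for a sampled fraction of the users
    shared with the target domain, simulating a given user overlap level.

    `source_ratings` is an iterable of (user, item, context, rating) tuples;
    `target_users` is the set of users present in the target domain.
    """
    shared = sorted({u for (u, _, _, _) in source_ratings} & set(target_users))
    rng = random.Random(seed)
    kept = set(rng.sample(shared, int(round(overlap_level * len(shared)))))
    return [r for r in source_ratings if r[0] in kept]

# e.g. keep ratings of 10%, 50%, or 100% of the shared users (illustrative call):
# reduced_source = sample_overlap(music_ratings, book_users, overlap_level=0.10)
```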
5.2.2.2.1 Temporal Dimension

Table 33 reports the overall predictive performance of the recommender algorithms, considering all contextual values from the Temporal dimension and different user overlap levels for the Book domain as target. The first two rows of the table show the NNUserNgbr predictive performance when it is applied to single-domain and cross-domain recommendation. As can be seen, and as in the results shown in Section 5.2.2.1.1 (source domain: Book, target domain: Music, Temporal dimension), simply adding user ratings from the other domain (Music), while using the same algorithm for cross-domain recommendation, improved the recommendation performance by approximately 9–13% (MAE) and 6–7% (RMSE), depending on the user overlap level.

Table 33 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual values from the Temporal dimension (source domain: Music; target domain: Book).

Algorithm | 10% overlap (MAE±std / RMSE±std) | 50% overlap (MAE±std / RMSE±std) | Full overlap (MAE±std / RMSE±std)
NNUserNgbr (single-domain) | 0.671±0.024 / 1.002±0.048 | 0.505±0.008 / 0.792±0.020 | 0.419±0.006 / 0.735±0.012
NNUserNgbr (cross-domain) | 0.610±0.022 / 0.942±0.044 | 0.443±0.007 / 0.740±0.018 | 0.363±0.005 / 0.679±0.010
NNUserNgbr-transClosure | 0.130±0.012 / 0.368±0.034 | 0.197±0.001 / 0.451±0.005 | 0.202±0.003 / 0.473±0.006
PreF with NNUserNgbr-transClosure | 0.109±0.018 / 0.346±0.063 | 0.087±0.005 / 0.340±0.006 | 0.090±0.002 / 0.300±0.005
PostF with NNUserNgbr-transClosure | 0.114±0.004 / 0.349±0.011 | 0.176±0.002 / 0.399±0.004 | 0.175±0.003 / 0.418±0.007

In addition, Table 33 presents the overall performance of the PreF and PostF algorithms, besides the base NNUserNgbr-transClosure algorithm, which outperformed the NNUserNgbr algorithm (performed for cross-domain purposes) with an improvement of approximately 44–78% (MAE) and 30–60% (RMSE), depending on the user overlap level.

As can be seen from the table, the PreF predictive performance was better than that of the NNUserNgbr-transClosure for all user overlap levels. It was also better than the PostF algorithm for all user overlap levels if the standard deviation is not considered. The improvement achieved by the PreF algorithm over the NNUserNgbr-transClosure one varied between approximately 16–56% (MAE) and 6–36% (RMSE), depending on the user overlap level. The PostF predictive performance was also better than that of the NNUserNgbr-transClosure for all user overlap levels. Figure 85 illustrates the predictive performance (MAE) of the proposed algorithms over the different user overlap levels.

The statistical significance tests verified that the PostF predictive errors (MAE) were lower than those of the NNUserNgbr-transClosure baseline algorithm for all user overlap levels (p-value = 0.003968 and W = 97 for all tests). The tests also verified that, except for 10% of user overlap (W = 79 and p-value = 0.1), the PreF predictive errors (MAE) were lower than those of the NNUserNgbr-transClosure algorithm for all other user overlap levels (p-value = 0.003968 and W = 97 for all tests). Figures 86a, 86b and 86c show the boxplots with the prediction performance (MAE) of the algorithms for the 10%, 50%, and 100% user overlap levels, respectively.
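The W statistics and p-values quoted throughout this chapter come from the statistical significance tests defined in the thesis' evaluation setup. Assuming a Wilcoxon-style paired comparison of the per-fold errors, which is consistent with the reported statistic name but is stated here as an assumption, a comparison like the ones above could be run as sketched below (the per-fold MAE values are made up for illustration).

```python
from scipy import stats

# Hypothetical per-fold MAE values for the baseline and a proposed algorithm.
baseline_mae = [0.208, 0.201, 0.199, 0.205, 0.203, 0.200, 0.204, 0.202, 0.198, 0.206]
postf_mae    = [0.177, 0.174, 0.176, 0.178, 0.173, 0.175, 0.176, 0.174, 0.177, 0.175]

# One-sided test of whether the proposed algorithm's per-fold errors are lower.
statistic, p_value = stats.wilcoxon(postf_mae, baseline_mae, alternative="less")
print(statistic, p_value)
```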
Figure 85 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the Temporal dimension (source domain: Music; target domain: Book).

Regarding the classification performance, Figures 87a, 87b, and 87c present the results of the F-metric at different N values (between one and twenty) for the 10%, 50%, and 100% user overlap levels, respectively, for the Book domain as target, considering the Temporal dimension. As can be seen, the PostF and PreF classification performances were better than or similar to that of the NNUserNgbr-transClosure baseline algorithm for the 50% and 100% user overlap levels (at any top 'N' value for the PostF algorithm and only at very low top 'N' values for the PreF one). For 10% of user overlap, the PostF classification performance was better than the NNUserNgbr-transClosure baseline only at very low top 'N' values (1 and 2), whereas the PreF classification performance was worse than that of all other algorithms. Finally, the PostF classification performance was better than the PreF one for all user overlap levels and top 'N' values.

Figure 88 shows the variation of the F-metric value across the user overlap levels with the top 'N' value fixed at five. The statistical significance tests verified that the PostF F-metric values were greater than those of the NNUserNgbr-transClosure baseline algorithm for the 50% and 100% user overlap levels (p-value = 0.005413 and W = 97 for all tests), whereas their performances were statistically similar for 10% of user overlap (p-value = 0.35 and W = 60). The applied tests also verified that the NNUserNgbr-transClosure F-metric values were greater than the PreF ones for all user overlap levels (W = 77 and p-value = 0.02163 for 10% of user overlap; p-value = 0.005413 and W = 97 for 50% and 100%).

Figure 86 – Overall prediction performance (MAE) boxplots for the Book domain in the Temporal dimension with different user overlap levels (source domain: Music); panels (a), (b) and (c) correspond to 10%, 50%, and 100% of user overlap, respectively.

Figure 87 – F-metric performance versus top 'N' items for the Book domain in the Temporal dimension with different user overlap levels (source domain: Music); panels (a), (b) and (c) correspond to 10%, 50%, and 100% of user overlap, respectively.

Figure 88 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the Temporal dimension (target domain: Book; source domain: Music).

5.2.2.2.2 Location Dimension

Table 34 reports the overall predictive performance of the recommender algorithms, considering all contextual values from the Location dimension and different user overlap levels for the Book domain as target. As can be seen from the table, adding user ratings from the other domain (Music), while using the same algorithm for cross-domain recommendation (the first two rows of the table), improved the predictive performance by approximately 10–12% (MAE) and 6–10% (RMSE), depending on the user overlap level.

Table 34 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual values from the Location dimension (source domain: Music; target domain: Book).
Algorithm | 10% overlap (MAE±std / RMSE±std) | 50% overlap (MAE±std / RMSE±std) | Full overlap (MAE±std / RMSE±std)
NNUserNgbr (single-domain) | 0.671±0.024 / 1.002±0.048 | 0.505±0.008 / 0.792±0.020 | 0.419±0.006 / 0.735±0.012
NNUserNgbr (cross-domain) | 0.587±0.022 / 0.897±0.044 | 0.450±0.007 / 0.739±0.018 | 0.369±0.005 / 0.678±0.010
NNUserNgbr-transClosure | 0.120±0.010 / 0.362±0.036 | 0.200±0.002 / 0.443±0.006 | 0.199±0.004 / 0.467±0.007
PreF with NNUserNgbr-transClosure | 0.122±0.206 / 0.349±0.480 | 0.149±0.102 / 0.406±0.276 | 0.096±0.050 / 0.293±0.151
PostF with NNUserNgbr-transClosure | 0.109±0.016 / 0.317±0.036 | 0.193±0.004 / 0.428±0.009 | 0.193±0.003 / 0.449±0.010

Table 34 also presents the overall performance of the NNUserNgbr-transClosure, PreF and PostF algorithms. The NNUserNgbr-transClosure algorithm outperformed the NNUserNgbr one (performed for cross-domain purposes) with an improvement of approximately 45–79% (MAE) and 31–59% (RMSE), depending on the user overlap level. As the table shows, the PostF predictive performance was better than that of the NNUserNgbr-transClosure algorithm at all user overlap levels, with an improvement of approximately 3–9% (MAE) and 3–12% (RMSE), depending on the user overlap level. In addition, we can see in Table 34 that the PostF outperformed the PreF algorithm for the majority of the user overlap levels (10% and 50%) once the high standard deviation of the PreF algorithm is taken into account. Figure 89 illustrates the predictive performance (MAE) of the proposed algorithms over the different user overlap levels.

Figure 89 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the Location dimension (source domain: Music; target domain: Book).

The statistical significance tests verified that, except when the user overlap level was 10% (W = 76 and p-value = 0.2), the PostF predictive errors (MAE) were lower than those of the NNUserNgbr-transClosure baseline algorithm for all other user overlap levels (W = 22 and p-value = 0.02778 for all tests). The applied tests also verified that the PreF predictive error (MAE) was lower than the NNUserNgbr-transClosure one for 100% of user overlap (W = 97 and p-value = 0.003968), whereas their performances were statistically similar when the user overlap level was 10% (W = 40 and p-value = 0.65) or 50% (W = 30 and p-value = 0.8). Figures 90a, 90b and 90c show the boxplots with the prediction performance (MAE) of the algorithms for the 10%, 50%, and 100% user overlap levels, respectively.

Figure 90 – Overall prediction performance (MAE) boxplots for the Book domain in the Location dimension with different user overlap levels (source domain: Music); panels (a), (b) and (c) correspond to 10%, 50%, and 100% of user overlap, respectively.

With respect to the classification performance, Figures 91a, 91b, and 91c present the results of the F-metric at different top 'N' values (between one and twenty) for the 10%, 50%, and 100% user overlap levels, respectively, for the Book domain as target, considering the Location dimension. As can be seen, the PostF classification performance was better than or similar to that of the NNUserNgbr-transClosure baseline algorithm for 100% of user overlap at any top 'N' value, whereas for 50% of user overlap it was better than the baseline only at low top 'N' values (1 to 3). For 10% of user overlap, the PostF was outperformed by the baseline algorithm.
On the other hand, the PreF classification performance was worse than that of all other algorithms at all user overlap levels and 'N' values.

Figure 92 shows the variation of the F-metric value across the user overlap levels with the top 'N' value fixed at five. The statistical significance tests verified that the PostF F-metric value was greater than that of the NNUserNgbr-transClosure baseline algorithm for 100% of user overlap (W = 97 and p-value = 0.037), whereas the opposite was observed for the 10% and 50% user overlap levels (W = 97 and p-value = 0.005413 in both cases). The applied tests also verified that the NNUserNgbr-transClosure F-metric values were greater than the PreF ones for all user overlap levels (W = 99 and p-value = 0.003914 for all tests).

Figure 91 – F-metric performance versus top 'N' items for the Book domain in the Location dimension with different user overlap levels (source domain: Music); panels (a), (b) and (c) correspond to 10%, 50%, and 100% of user overlap, respectively.

Figure 92 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the Location dimension (target domain: Book; source domain: Music).

5.2.2.2.3 Companion Dimension

Table 35 shows the overall predictive performance of the recommender algorithms, considering all contextual values from the Companion dimension and different user overlap levels for the Book domain as target. As can be seen from the table, adding user ratings from the other domain (Music), while using the same algorithm for cross-domain recommendation (the first two rows of the table), improved the predictive performance by approximately 2% (MAE) and 1% (RMSE) for 10% of user overlap.

Table 35 also presents the overall performance of the NNUserNgbr-transClosure, PreF and PostF algorithms. The NNUserNgbr-transClosure algorithm outperformed the NNUserNgbr one (performed for cross-domain purposes) with an improvement of approximately 33–74% (MAE) and 23–52% (RMSE), depending on the user overlap level. The PostF predictive performance was better than the NNUserNgbr-transClosure one for all user overlap levels, with an improvement of approximately 5–10% (MAE) and 5–11% (RMSE), depending on the user overlap level.

Table 35 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual values from the Companion dimension (source domain: Music; target domain: Book).

Algorithm | 10% overlap (MAE±std / RMSE±std) | 50% overlap (MAE±std / RMSE±std) | Full overlap (MAE±std / RMSE±std)
NNUserNgbr (single-domain) | 0.671±0.024 / 1.002±0.048 | 0.505±0.008 / 0.792±0.020 | 0.419±0.006 / 0.735±0.012
NNUserNgbr (cross-domain) | 0.653±0.022 / 0.994±0.044 | 0.513±0.007 / 0.814±0.018 | 0.461±0.005 / 0.785±0.010
NNUserNgbr-transClosure | 0.168±0.035 / 0.470±0.085 | 0.290±0.013 / 0.576±0.017 | 0.306±0.004 / 0.604±0.136
PreF with NNUserNgbr-transClosure | 0.984±0.251 / 1.232±0.154 | 0.693±0.011 / 1.035±0.030 | 0.721±0.014 / 1.041±0.016
PostF with NNUserNgbr-transClosure | 0.151±0.028 / 0.417±0.119 | 0.264±0.017 / 0.522±0.032 | 0.289±0.006 / 0.573±0.019

In addition, the predictive performance of the PreF algorithm was worse than that of all other algorithms in the Companion dimension, as shown in Table 35. Figure 93 illustrates the predictive performance (MAE) of the proposed algorithms over the different user overlap levels.
Figure 93 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the Companion dimension (source domain: Music; target domain: Book).

The statistical significance tests verified that the PostF predictive error (MAE) was lower than that of the NNUserNgbr-transClosure algorithm for 100% of user overlap (W = 99 and p-value = 0.003968). For 10% and 50% of user overlap, the NNUserNgbr-transClosure predictive errors (MAE) were statistically similar to the PostF ones (W = 60 and p-value = 0.35 for 10%; W = 71 and p-value = 0.2 for 50%). On the other hand, the applied tests also verified that the NNUserNgbr-transClosure predictive errors (MAE) were lower than the PreF ones for all user overlap levels (W = 99 and p-value = 0.003968 for all tests). Figures 94a, 94b and 94c show the boxplots with the prediction performance (MAE) of the algorithms for the 10%, 50%, and 100% user overlap levels, respectively.

Figure 94 – Overall prediction performance (MAE) boxplots for the Book domain in the Companion dimension with different user overlap levels (source domain: Music); panels (a), (b) and (c) correspond to 10%, 50%, and 100% of user overlap, respectively.

Figures 95a, 95b, and 95c present the results of the F-metric at different top 'N' values (between one and twenty) for the 10%, 50%, and 100% user overlap levels, respectively, for the Book domain as target, considering the Companion dimension. As can be seen, the proposed algorithms were outperformed by the NNUserNgbr-transClosure baseline algorithm at all user overlap levels and for any value of top 'N'. Figure 96 shows the variation of the F-metric value across the user overlap levels with the top 'N' value fixed at five. The statistical significance tests verified that the NNUserNgbr-transClosure F-metric values were greater than those of both proposed algorithms for all user overlap levels (p-value = 0.005814 and W = 97 for all tests between PostF and the baseline algorithm; p-value = 0.003913 and W = 100 for all tests between PreF and the baseline algorithm).

5.2.2.2.4 Combining Contextual Dimensions

In the previous sections, we presented the evaluation results for each contextual dimension separately. In this section, we report the results for a combination of two contextual dimensions, considering the same evaluation metrics and methodology described before.

Table 36 reports the overall predictive performance of the recommender algorithms, considering all contextual value combinations from the Temporal and Location dimensions with different user overlap levels for the Book domain as target. As observed in the previous sections, adding user ratings from the Music domain also improved the predictive performance of the NNUserNgbr algorithm, by approximately 10–12% (MAE) and 6–10% (RMSE), depending on the user overlap level (first two rows of the table).

Table 36 also presents the overall performance of the NNUserNgbr-transClosure, PreF and PostF algorithms. The NNUserNgbr-transClosure algorithm outperformed the NNUserNgbr one (performed for cross-domain purposes) with an improvement of approximately 46–81% (MAE) and 31–59% (RMSE), depending on the user overlap level.
As the table shows, the PostF predictive performance was better than that of the NNUserNgbr-transClosure algorithm at all user overlap levels, with an improvement of approximately 7–23% (MAE) and 8–18% (RMSE), depending on the user overlap level. In addition, if the standard deviations of the PreF and PostF algorithms shown in Table 36 are not considered, the PreF predictive performance (measured by the MAE metric) was better than the PostF one when the user overlap level was 100%.

Figure 95 – F-metric performance versus top 'N' items for the Book domain in the Companion dimension with different user overlap levels (source domain: Music); panels (a), (b) and (c) correspond to 10%, 50%, and 100% of user overlap, respectively.

Figure 96 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the Companion dimension (target domain: Book; source domain: Music).

Table 36 – Overall predictive performance (MAE/RMSE) with standard deviation (std) by varying the user overlap level for all contextual value combinations from the Temporal and Location dimensions (source domain: Music; target domain: Book).

Algorithm | 10% overlap (MAE±std / RMSE±std) | 50% overlap (MAE±std / RMSE±std) | Full overlap (MAE±std / RMSE±std)
NNUserNgbr (single-domain) | 0.671±0.024 / 1.002±0.048 | 0.505±0.008 / 0.792±0.020 | 0.419±0.006 / 0.735±0.012
NNUserNgbr (cross-domain) | 0.587±0.022 / 0.897±0.044 | 0.450±0.007 / 0.739±0.018 | 0.369±0.005 / 0.678±0.010
NNUserNgbr-transClosure | 0.111±0.009 / 0.361±0.034 | 0.199±0.002 / 0.434±0.005 | 0.197±0.003 / 0.464±0.006
PreF with NNUserNgbr-transClosure | 0.180±0.226 / 0.480±0.465 | 0.209±0.122 / 0.510±0.296 | 0.140±0.060 / 0.405±0.161
PostF with NNUserNgbr-transClosure | 0.103±0.014 / 0.295±0.030 | 0.175±0.003 / 0.396±0.007 | 0.150±0.002 / 0.378±0.006

Figure 97 (MAE) and Figure 98 (RMSE) illustrate the predictive performance of the proposed algorithms over the different user overlap levels. Note that in this case we show figures for both predictive metrics, since we observed a difference in the PreF performance depending on the metric used in the evaluation.

Figure 97 – Overall prediction error (MAE) for cross-domain algorithms by varying user overlap level in the Temporal and Location dimensions (source domain: Music; target domain: Book).

Figure 98 – Overall prediction error (RMSE) for cross-domain algorithms by varying user overlap level in the Temporal and Location dimensions (source domain: Music; target domain: Book).

The statistical significance tests verified that the PostF predictive errors (MAE) were lower than those of the NNUserNgbr-transClosure algorithm for the 50% and 100% user overlap levels (W = 100 and p-value = 0.003968 in both cases). When the user overlap level was 10%, the applied tests could not verify a statistical difference between the performance of the PostF and NNUserNgbr-transClosure algorithms (W = 51 and p-value = 0.35). On the other hand, the applied tests verified that the NNUserNgbr-transClosure predictive errors (MAE) were lower than the PreF ones for the 10% and 50% user overlap levels (W = 99 and p-value = 0.003968 for 10%; W = 97 and p-value = 0.005413 for 50%), whereas the opposite was observed for 100% of user overlap (W = 99 and p-value = 0.003968).

Figure 99 – Overall prediction performance (MAE) boxplots for the Book domain in the Temporal and Location dimensions with different user overlap levels (source domain: Music); panels (a), (b) and (c) correspond to 10%, 50%, and 100% of user overlap, respectively.
Taking into account the classification performance, Figures 100a, 100b, and 100c present the results of the F-metric at different top 'N' values for the 10%, 50%, and 100% user overlap levels, respectively, for the Book domain as target, considering the combination of the Temporal and Location dimensions. For all user overlap levels and top 'N' values, the PreF classification performance was worse than that of all other algorithms. On the other hand, the PostF classification performance was better than or similar to that of the NNUserNgbr-transClosure baseline algorithm for the majority of the user overlap levels (50% and 100%) at any top 'N' value.

Figure 101 shows the variation of the F-metric value across the user overlap levels with the top 'N' value fixed at five. The statistical significance tests verified that the PostF F-metric values were greater than those of the NNUserNgbr-transClosure baseline algorithm for the 50% and 100% user overlap levels (p-value = 0.005814 and W = 97 for all tests), whereas the opposite was observed for 10% of user overlap (p-value = 0.035 and W = 75). The applied tests also verified that the NNUserNgbr-transClosure F-metric values were greater than the PreF ones for all user overlap levels (p-value = 0.003913 and W = 99 for all tests).

5.2.2.3 Summary

In this section, we provide a summary of the results from the evaluation of the "book-music dataset". Figure 102 shows a dispersion diagram illustrating the predictive performance (MAE) of the algorithms by varying the target domain (Book and Music), contextual dimension and user overlap level, whereas Figure 103 shows the same for the RMSE metric. It is important to mention that these figures take into account neither the standard deviation nor the statistical significance of the results. Table 37 presents the predictive performance (MAE) achieved by the PreF and PostF algorithms in comparison to the best baseline algorithm (NNUserNgbr-transClosure), taking into account their statistical significance (in the table, "**" means that the result could not be considered statistically significant) and the different target domains, contextual dimensions and user overlap levels.

Regarding the classification performance, Figure 104 presents a dispersion diagram illustrating the F-metric performance (with N=5) of the algorithms by varying the target domain (Book and Music), contextual dimension and user overlap level. Once again, neither the standard deviation nor the statistical significance of the results is considered in that figure. Table 38 shows the classification performance improvement (F-metric with N=5) obtained by the PreF and PostF algorithms in comparison to the best baseline algorithm (NNUserNgbr-transClosure), taking into account their statistical significance (again, "**" marks results that could not be considered statistically significant) and the different target domains, contextual dimensions and user overlap levels.

As can be seen, at least one proposed algorithm (PreF or PostF) achieved the best predictive performance among the algorithms (or was similar to the best one) in almost all scenarios (with distinct target domains, contextual dimensions, and user overlap levels).
Figure 100 – F-metric performance versus top 'N' items for the Book domain in the Temporal and Location dimensions with different user overlap levels (source domain: Music); panels (a), (b) and (c) correspond to 10%, 50%, and 100% of user overlap, respectively.

Figure 101 – Overall classification performance (F-metric at 5) for the algorithms by varying user overlap level in the Temporal and Location dimensions (target domain: Book; source domain: Music).

By considering the classification metric, the PostF algorithm achieved the best performance among the algorithms (or was similar to the best one) in the majority of the scenarios.

Most of the findings mentioned in the summary of the evaluation results for the "book-television dataset" (see Section 5.2.1.3) also hold for this summary ("book-music dataset"). Therefore, we only highlight the main differences found here in comparison to those findings:

• The addition of user ratings from an auxiliary (source) domain also improved the predictive performance of the NNUserNgbr algorithm, but in this dataset this occurred in fewer scenarios than in the "book-television dataset". The same can be observed for the classification performance of that algorithm. As in the "book-television dataset", the improvement occurred even when the source domain had fewer ratings than the target domain.

• As in the "book-television dataset", the proposed algorithms (PreF and PostF) had better predictive and classification performances in the Temporal dimension than in the other dimensions. The same findings mentioned for the predictive and classification performances of the PostF algorithm, as well as for the PreF predictive performance, can be observed in this summary. On the other hand, with respect to the classification performance, the PreF algorithm outperformed the NNUserNgbr-transClosure one in fewer scenarios (user overlap levels and target domains) than in that dataset: it outperformed the NNUserNgbr-transClosure only when Music was the target domain with 100% of user overlap.

• As in the "book-television dataset" results regarding the combination of contextual dimensions (Temporal and Location), the PostF predictive and classification performances for that combination were close to its own performances using only the Temporal dimension as the single source of contextual information, whereas the PreF predictive and classification performances were similar to its own performances using only the Location dimension. In addition, the predictive and classification performances of the PreF algorithm decreased with the addition of contextual information from the other contextual dimension. The PostF predictive performance increased and, in contrast to the results from that dataset, its classification performance also increased with the addition of contextual information from the other contextual dimension.

Table 37 – Overall predictive performance (MAE) of the proposed algorithms in comparison to the best baseline one by varying target domain (Book and Music), contextual dimension and user overlap level.
Contextual dimension | Target domain | User overlap level | PreF improvement | PostF improvement
Temporal | Music | 10% | 39.7% | 16.2%
Temporal | Book | 10% | 16.1%** | 12.4%
Temporal | Music | 50% | 62.6% | 12.8%
Temporal | Book | 50% | 56% | 10.8%
Temporal | Music | 100% | 55.8% | 11.9%
Temporal | Book | 100% | 55.4% | 13.3%
Location | Music | 10% | 24.5%** | 16.1%**
Location | Book | 10% | -2.9%** | 8.9%**
Location | Music | 50% | 39.2% | 6.7%
Location | Book | 50% | 25.5%** | 3.7%
Location | Music | 100% | 57.4% | 2.9%
Location | Book | 100% | 52% | 3.8%
Companion | Music | 10% | -113.5% | -85.7%**
Companion | Book | 10% | -484% | 10.3%**
Companion | Music | 50% | -39.1% | 2.9%**
Companion | Book | 50% | -139.3% | 8.7%**
Companion | Music | 100% | -46.5% | 11.2%
Companion | Book | 100% | -136% | 5.4%
Temporal and Location | Music | 10% | -0.2%** | -0.9%**
Temporal and Location | Book | 10% | -62.1% | 7.3%**
Temporal and Location | Music | 50% | -2.3% | 28.6%
Temporal and Location | Book | 50% | -4.9% | 12.1%
Temporal and Location | Music | 100% | 45.9% | 23.1%
Temporal and Location | Book | 100% | 29.3% | 24%

Figure 102 – Predictive performance (MAE) for the algorithms by varying target domain (Book and Music), contextual dimension and user overlap level (dispersion diagram).

Figure 103 – Predictive performance (RMSE) for the algorithms by varying target domain (Book and Music), contextual dimension and user overlap level (dispersion diagram).

Figure 104 – Classification performance (F-metric with N=5) for the algorithms by varying target domain (Book and Music), contextual dimension and user overlap level (dispersion diagram).

Table 38 – Overall classification performance (F-metric with N=5) of the proposed algorithms in comparison to the best baseline one by varying target domain (Book and Music), contextual dimension and user overlap level.

Contextual dimension | Target domain | User overlap level | PreF improvement | PostF improvement
Temporal | Music | 10% | -86.6% | 0.5%**
Temporal | Book | 10% | -121.6% | -4.6%**
Temporal | Music | 50% | -22.4% | 22.1%
Temporal | Book | 50% | -41.1% | 6.9%
Temporal | Music | 100% | 13% | 37.9%
Temporal | Book | 100% | -13.8% | 13.9%
Location | Music | 10% | -157.4% | -11.3%
Location | Book | 10% | -234% | -15.9%
Location | Music | 50% | -311% | 17.1%
Location | Book | 50% | -420.5% | -3.4%
Location | Music | 100% | -223.9% | 33.3%
Location | Book | 100% | -455% | 6%
Companion | Music | 10% | -58.7% | -23.3%
Companion | Book | 10% | -126.4% | -51.9%
Companion | Music | 50% | -92% | -54.1%
Companion | Book | 50% | -197% | -39.6%
Companion | Music | 100% | -29.7% | -35.2%
Companion | Book | 100% | -162.2% | -34.3%
Temporal and Location | Music | 10% | -168% | 9.1%
Temporal and Location | Book | 10% | -292.3% | -7.1%
Temporal and Location | Music | 50% | -339.3% | 23.3%
Temporal and Location | Book | 50% | -408% | 12%
Temporal and Location | Music | 100% | -270.7% | 42.1%
Temporal and Location | Book | 100% | -444% | 17.2%

5.2.3 Discussion

Given the evaluation results presented in the previous sections, we can say that the use of context-aware techniques has proven to be a good approach to improve cross-domain recommendation quality in comparison to traditional cross-domain recommender systems based on collaborative filtering, which do not take contextual information into account. This finding was observed in the presented experiments, in which we evaluated two CD-CARS algorithms on two different datasets ("Book-television" and "Book-music"), varying their target domains (Television, Music and Book), contextual dimensions (Temporal, Location and Companion), and user overlap levels (10%, 50%, and 100%).

As we could see in the experiments, Temporal was the contextual dimension in which the proposed algorithms performed best for all datasets, target domains and user overlap levels. This may have happened due to the large amount of contextual information available in that dimension (100% of the ratings had temporal information) in comparison to the other ones (Location with approximately half of the ratings, and Companion with approximately 20% of the ratings, as described in Section 4.1.3).
This fact contrasts with the information gain analysis presented in Section 4.1.2, where the Location dimension with the City attribute had the greatest value for all target domains. Therefore, more studies and experiments may be conducted in the future to determine the best contextual dimensions, attributes and values before evaluating the proposed algorithms, especially for the combination of contextual dimensions. In such studies we could also verify the recommendation quality of the proposed algorithms while reducing the amount of temporal information present in the user ratings ("contextual sensitivity").

In addition, the quality of the contextual information may also have influenced the recommendation quality of the proposed algorithms. As we have seen in Section 4.1.1.3, the Companion dimension has contextual information of poor quality. The recommendation quality was more impaired for the PreF algorithm, which filters out ratings from the target domain that do not belong to the recommendation context, than for the PostF one, which uses the same set of ratings as the baseline algorithm for prediction calculations and only at the end of the recommendation process ignores predictions (instead of the initial ratings).

The combination of two contextual dimensions (Temporal and Location) in the recommendation process generated mixed results for the proposed algorithms. Independently of the dataset used, while the PreF had worse results with that combination than with a single contextual dimension (Temporal or Location), the PostF recommendation quality improved in some situations, as could be seen in the result summaries (Section 5.2.1.3 and Section 5.2.2.3). Again, this may be explained by the nature of the PreF algorithm, which tends to be more susceptible to problems when only a small number of ratings remain after the contextual specialization produced by combining two contextual dimensions.

It is important to remember that all contextual information used in the CD-CARS was obtained implicitly or by inference (see Section 4.1.1). Thus, there is no assurance that the acquired contextual information reflects the actual context of the ratings. For instance, a user could watch a movie on Saturday and rate it only on Sunday, when the rating timestamp is recorded; the actual temporal information of that rating would then be compromised. However, even considering this issue, the proposed algorithms performed well in the Temporal dimension.

Besides, the Location context is extracted from the users' accounts through their original IDs (i.e. a static, single location is obtained from each user's website account). Consequently, this contextual information is of little use when a user receives recommendations for a location different from the one on record (e.g. a user whose ratings were all given in the United States and who receives recommendations in Brazil), especially for the PreF algorithm. In that case, the recommendation would be fully based on the user similarities from the source domain, since the user would not have any rating in the target domain (pre-filtered to a location for which the user has no information), and it would therefore not be possible to compute the similarity between that user and other users with ratings in the target domain.
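As a concrete illustration of the pre-filtering behaviour described above, the sketch below discards ratings whose context does not match the recommendation context before the base cross-domain CF algorithm is applied. It is illustrative only — the actual PreF algorithm is the one specified earlier in the thesis — and the data structures and names used here are assumptions.

```python
def contextual_pre_filter(ratings, target_context):
    """Keep only the ratings whose context matches the recommendation context.

    `ratings` is an iterable of (user, item, context, rating) tuples, where
    `context` is a dict over contextual dimensions (e.g. {"time": "weekend",
    "location": "Recife"}); `target_context` holds the dimension values to
    filter on. The reduced (user, item, rating) triples are then handed to the
    base cross-domain CF algorithm as an ordinary two-dimensional input.
    """
    filtered = []
    for user, item, context, rating in ratings:
        if all(context.get(dim) == value for dim, value in target_context.items()):
            filtered.append((user, item, rating))
    return filtered

# Illustrative call for a recommendation requested in the "weekend" context:
# reduced_target = contextual_pre_filter(music_ratings, {"time": "weekend"})
```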
For the PostF algorithm, on the other hand, no recommendation would be possible without the association rules, since the user would not have any contextual preferences for that location.

Taking into account the PostF and PreF recommendation performances, we observed distinct results depending on the evaluation metric adopted. For example, considering the Temporal dimension, the PostF classification performance (F-metric) was better than the PreF one, whereas the opposite was observed when they were evaluated by means of the predictive error metrics (MAE and RMSE). In the other contextual dimensions, independently of the dataset, the PostF had a better recommendation quality than the PreF algorithm, which had worse results than the baseline algorithms. In fact, besides filtering ratings in a preliminary step before its model training, the PreF also differs from the PostF algorithm in the use of information about item categories, since the PostF uses a category preference tensor in its recommendation process, whereas the PreF does not. To alleviate this disparity, we could, for example, combine both algorithms in order to obtain the best of their features in a single hybrid algorithm (as described in Section 3.3.1.4).

By varying the target domain in the two datasets used in the experiments, we studied the impact of the density of the target domain data in comparison to the density of the source domain data. In both cases, the addition of ratings from a source domain improved the recommendation quality of the cross-domain based algorithms, independently of its amount of ratings relative to the target domain. In addition, even for less related domains (Book and Music), we could see an improvement in the recommendation quality of the cross-domain based algorithms.

Considering the distinct user overlap levels (10%, 50% and 100%), the experiments showed that the proposed algorithms achieved better recommendation quality as the user overlap level increased. For the PreF algorithm, more user overlap may mean more ratings in the filtered contexts, expanding the similarities among users in those contexts, whereas for the PostF, a higher user overlap level may expand the category preference tensor with more contextual information about the users' item category preferences. On the other hand, the baseline algorithms, especially the NNUserNgbr-transClosure, had a similar performance regardless of the user overlap level.

As mentioned in Section 5.1.1, the proposed algorithms used the baseline ones with the same recommendation settings (e.g. n=475 for the CF-based algorithm used as base). However, note that more tests could be done in order to reach an "optimal" setting for each algorithm, especially for the PreF algorithm, which uses a contextual sub-dataset with a smaller amount of data than the other algorithms. Besides, the PostF's threshold can be adjusted to an "optimal" value for other datasets, and its minimal rating value for an item to be considered "good" can also be changed depending on the dataset, as described in Section 5.1.1. For the two datasets used in the experiments, we set this value to four on a five-star scale.
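A simplified, assumed sketch of this post-filtering step is shown below: a nested dictionary stands in for the category preference tensor, the four-star "good" threshold is the one mentioned above, and the preference threshold value is arbitrary; the actual PostF algorithm is the one specified earlier in the thesis.

```python
def contextual_post_filter(predictions, item_categories, category_pref,
                           user, context, pref_threshold=0.5):
    """Filter context-free predictions using contextual category preferences.

    `predictions` maps item -> predicted rating produced by the base algorithm;
    `item_categories` maps item -> set of item categories;
    `category_pref[user][context][category]` holds the fraction of the user's
    ratings for that category, in that context, that were "good" (>= 4 stars) --
    a simplified stand-in for the category preference tensor.
    Items whose categories all fall below `pref_threshold` in the current
    context are discarded before the top-N list is produced.
    """
    prefs = category_pref.get(user, {}).get(context, {})
    kept = {item: score for item, score in predictions.items()
            if any(prefs.get(c, 0.0) >= pref_threshold
                   for c in item_categories.get(item, set()))}
    return sorted(kept.items(), key=lambda kv: kv[1], reverse=True)

# Illustrative call for user "u1" in the "weekend" context:
# ranked = contextual_post_filter(base_predictions, categories, pref_tensor,
#                                 "u1", "weekend", pref_threshold=0.5)
```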
However, for other datasets the ratings in the contextual user-rating tensors may have different scales or forms in distinct domains. For example, music ratings could be represented in a binary form such as "Like" and "Dislike", while movie and book ratings could be represented, respectively, on five-star and ten-star scales. As mentioned before, the base recommendation algorithms have to deal with this issue; for instance, an algorithm could normalize the different rating scales across distinct domains (SANTOS et al., 2012).

Due to the lack of publicly available real datasets with cross-domain and contextual information, a feasible alternative for our experiments, without having to produce synthetic contextual data, was to extract contextual information implicitly or by inference for the three contextual dimensions used in the experiments. However, other contextual dimensions can still be extracted from the user reviews, such as the Task dimension. In addition, other contextual attributes of the same contextual dimensions can be used to verify the recommendation quality of the proposed algorithms (e.g. by using "countries" instead of "cities" in the Location dimension).

Finally, another important aspect of recommender systems is execution performance. Although we did not evaluate the proposed CD-CARS with respect to this aspect, intuitively we can say that the PreF algorithm may be capable of recommending items while consuming fewer resources (e.g. time and memory) than the baseline algorithms, since the PreF applies these algorithms to a smaller set of ratings. On the other hand, the PostF algorithm may demand more resources, since it initially uses the baseline algorithms as a base and then applies an additional step to them using an additional data structure (the category preference tensor).

5.3 Final Remarks

In this chapter, we presented and discussed experimental evaluations of the two proposed CD-CARS algorithms in comparison to cross-domain CF-based ones. The experimental evaluation considered two distinct datasets (described in Section 4.1.3) with three different contextual dimensions (Temporal, Location and Companion), target domains (Television, Music and Book) and user overlap levels (10%, 50%, and 100%). The algorithms were evaluated with respect to their predictive and classification performances, which were analyzed by means of statistical significance tests. Finally, the conclusions and future work of this thesis are described in the next chapter.

6 Conclusion

In this thesis, we have found that context-aware techniques can be used to improve the accuracy of cross-domain recommendations. A traditional cross-domain CF-based algorithm provided better recommendations when used in combination with the implemented CD-CARS algorithms (Pre-Filtering and Post-Filtering). By considering contextual information from three dimensions (Temporal, Location and Companion), experimental evaluations conducted on two real datasets, one with two more related domains (Book and Television) and another with two less related domains (Book and Music), showed that generating predictions by exploiting knowledge from a source domain improved predictive and classification performances in the target domain. For both datasets, we ran experiments swapping the source and target domains and, regardless of which domain was evaluated as source or target, the proposed algorithms achieved better results than the baseline ones, especially when using contextual information from the Temporal and Location dimensions.
For the Companion contextual dimension, only one of the implemented algorithms had a good predictive performance, whereas its classification performance was not as good. As discussed in Chapter 5, the low quality and quantity of contextual information from that dimension may have influenced the negative classification performance, mainly for the Pre-Filtering algorithm, which also had a poor predictive performance when taking the Companion dimension into account. With respect to the variation of the user overlap level (10%, 50% and 100%), we conclude that it influenced the proposed algorithms, which, in general, become more accurate as the user overlap level increases.

Finally, through a novel approach, we expect the findings from this study to contribute to the cross-domain RS area and to foster future research in cross-domain context-aware recommendations. In the following sections, we describe the contributions, limitations and future work of this thesis.

6.1 Contributions

One of the contributions of this thesis is the novel study on the successful integration of two emergent and relevant approaches in the recommender system (RS) area: cross-domain and context-awareness. This integration can lead to further research in the area in order to improve the quality of recommender systems. While the proposed CD-CARS algorithms mainly address the accuracy aspect of RS quality, the cross-domain collaborative filtering algorithms, which are adopted in combination with the proposed ones, may address other aspects such as cold start and sparsity. Therefore, the proposed CD-CARS takes the best aspects of those RS approaches into account. Other contributions of this thesis are:

• The formalization of the cross-domain context-aware recommendation problem from the survey of two relevant research fields: cross-domain and context-aware RS. For that, we considered user ratings as a function of three dimensions (ADOMAVICIUS; TUZHILIN, 2015): User, Item and Context. Thus, the user ratings can be stored in multidimensional user-rating-context tensors for each item domain (e.g. books, movies, music, among others). In addition, it is necessary that there is user and contextual overlap among distinct domains. Lastly, the proposed contextual feature modelling is based on the “Key-Value” model, since it is simple and relatively easy to implement and use (VIEIRA; TEDESCO; SALGADO, 2009)(BETTINI et al., 2010);

• Proposal of novel CD-CARS algorithms based on three distinct and systematic paradigms of context-aware recommendation (Pre-Filtering, Post-Filtering and Modelling), which were chosen rather than ad-hoc context-aware approaches. One of the advantages of the proposed CD-CARS algorithms is the possibility of using traditional single-domain and cross-domain CF-based algorithms as a base algorithm, which is used in combination with the proposed ones. In addition, the proposed algorithms can be directly combined, such as Pre-Filtering with Post-Filtering, or Modelling with Post-Filtering, generating hybrid versions of the proposed CD-CARS algorithms;

• Provision of systematic CD-CARS algorithms that can be used to recommend items in several domains (e.g. books, music, movies, etc.) in a simple way, since little information about users and items is required. This allows generating cross-selling or bundle recommendations for items from multiple domains (e.g. the recommendation of a piece of music accompanied by a movie to watch or a book to read);
• Provision of two real datasets for evaluating CD-CARS (available at https://github.com/douglasveras/cd-cars-datasets), taking into account different domains and contextual information: one for evaluating CD-CARS algorithms on two more related domains (Book and Television) and another considering two less related domains (Book and Music). These datasets were adapted and extracted from (LESKOVEC; ADAMIC; HUBERMAN, 2007), which contains ratings (five-star scale), product metadata and review information about different Amazon products (https://snap.stanford.edu/data/amazon-meta.html). In addition to these data, we included contextual information regarding three contextual dimensions: Temporal, Location and Companion, inferred, respectively, from the ratings’ dates, users’ static addresses (obtained from their Amazon accounts), and users’ rating reviews.

6.2 Limitations

The main limitations of this thesis are:

• Absence of a mechanism in the proposed CD-CARS to handle ratings from the contextual user-rating tensors that have different scales or forms in distinct domains. For example, music ratings could be represented in a binary form such as “Like” or “Dislike”, while movie and book ratings may be represented, respectively, by five-star or ten-star scales. As mentioned in the CD-CARS proposal, the proposed CD-CARS algorithms leave the responsibility of dealing with this issue to the cross-domain algorithms used as base. For instance, the base algorithm could normalize the different rating scales among distinct domains (SANTOS et al., 2012). Another solution is to normalize the ratings in these domains before the recommendation process.

• A deeper experimentation with the combination of different contextual dimensions (Temporal, Location and Companion) could have been made. As previously mentioned, we only performed experiments combining the Temporal and Location dimensions, since the quality of the Companion dimension is low.

• Lack of a concrete CD-CARS realization for evaluating the satisfaction of real users and its capabilities in terms of execution time and memory requirements.

6.3 Lines for Further Work

The CD-CARS proposed in this thesis allows further investigation in multiple research lines, such as:

1. Improvement of the implemented algorithms and implementation of other proposed CD-CARS algorithms, such as Modelling or the combination between Pre-Filtering and Post-Filtering, as well as other state-of-the-art CD-CFRS algorithms, which could be based on different cross-domain approaches in order to expand the findings of this thesis (e.g. Linking and transferring knowledge instead of Aggregating knowledge, adopted in this thesis). For instance, we could compare the proposed CD-CARS with algorithms from the Linking and transferring knowledge approach, such as CodeBook Transfer (CBT) (LI; YANG; XUE, 2009a), (GAO et al., 2013), among others;

2. Exploration of further contextual information in different domains:

a) Other algorithms could be used, or proposed, to infer more precise contextual information from user reviews (e.g. the use of supervised text mining techniques for a better inference quality of the Companion contextual dimension) (CHEN; CHEN, 2015)(DOMINGUES et al., 2014)(LAHLOU et al., 2013), which may lead to a better recommendation quality and a higher calculated information gain;
b) Creation of techniques for the inference of other contextual dimensions (e.g. Task, Mood, etc.) from user reviews, independently of the item domain (e.g. music, books, games, etc.);

c) Building mechanisms to explicitly collect contextual information from user ratings over multiple domains;

d) Making the contextual modelling (adopted in the CD-CARS) more representative in order to describe, for example, semantic relations between contexts and domains.

3. CD-CARS evaluation:

a) Building a concrete CD-CARS for evaluating the satisfaction of real users, taking into account the use of real resources (e.g. time, memory, etc.);

b) Development of a benchmark for a rapid and efficient evaluation of the CD-CARS;

c) Improving the evaluation methodology used in this thesis by taking into account different evaluation metrics (e.g. Breese score, Normalized Discounted Cumulative Gain, etc.) and different partitionings of training and test sets, for example;

d) Investigating and providing data mining techniques in order to select the most relevant contextual dimensions, attributes and values (or their combinations) before performing recommendation or evaluation, since the verification of all possible situations is costly;

e) Combining other domains (e.g. Music and Television) or contextual dimensions (e.g. Location and Companion), as well as evaluating the algorithms’ performances by considering different user overlap levels (e.g. 0%, 25% and 75%);

f) Verifying the impact of the amount of ratings with contextual information in distinct contextual dimensions (“contextual sensitivity”), for example, by reducing the ratings from the Temporal dimension so that it has a number of ratings similar to the Location dimension;

g) Evaluating the use of association rules by the PostF algorithm in the specific situations where these rules are required, for instance, when a user receives a recommendation in the target domain and the category preference tensor (used by the PostF) does not yet have information about his/her rated item categories in that domain. In this case, the association rules are used to enhance the category preference tensor, as described in Section 3.3.1.2.

4. Development of CD-CARS applications. Since the proposed algorithms are not domain-specific, a myriad of cross-domain (or cross-selling) applications can be developed for multiple domains (e.g. a recommender system on TV that, besides TV shows, could also recommend books, websites, or other information relevant to the TV programs) (FERRAZ; SILVA; SILVA, 2015). Besides, these applications could be developed to be accessed in a ubiquitous way, depending on the users’ contexts and their domains of interest (e.g. when a user watches TV in his/her bedroom, the CD-CARS application could recommend movies, whereas when the user is in the living room with his/her friends, the same application could recommend music for them).

References

ABBAR, S.; BOUZEGHOUB, M.; LOPEZ, S. Context-aware recommender systems: A service-oriented approach. In: VLDB PersDB workshop. [S.l.: s.n.], 2009. p. 1–6. Cited on page 36.

ABEL, F. et al. Analyzing cross-system user modeling on the social web. In: Web Engineering.
[S.l.]: Springer, 2011. p. 28–43. Cited on page 42. ABEL, F. et al. Cross-system user modeling and personalization on the social web. User Modeling and User-Adapted Interaction, Springer, v. 23, n. 2-3, p. 169–209, 2013. Cited 2 many times on page 46 and 47. ABOWD, G. D. et al. Towards a better understanding of context and context-awareness. In: SPRINGER. Handheld and ubiquitous computing. [S.l.], 1999. p. 304–307. Cited on page 34. ADOMAVICIUS, G. et al. Incorporating contextual information in recommender systems using a multidimensional approach. ACM Transactions on Information Systems (TOIS), ACM, v. 23, n. 1, p. 103–145, 2005. Cited 7 many times on page 27, 47, 51, 54, 70, 80, and 81. ADOMAVICIUS, G.; TUZHILIN, A. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. Knowledge and Data Engineering, IEEE Transactions on, IEEE, v. 17, n. 6, p. 734–749, 2005. Cited 4 many times on page 23, 25, 32, and 33. ADOMAVICIUS, G.; TUZHILIN, A. Context-aware recommender systems. In: Recommender systems handbook (Second Edition). [S.l.]: Springer, 2015. p. 191–226. Cited 20 many times on page 9, 26, 27, 29, 30, 47, 48, 49, 51, 52, 53, 54, 55, 56, 66, 68, 70, 74, 82, and 223. AGRAWAL, R.; IMIELIŃSKI, T.; SWAMI, A. Mining association rules between sets of items in large databases. In: ACM. ACM SIGMOD Record. [S.l.], 1993. v. 22, n. 2, p. 207–216. Cited 2 many times on page 79 and 121. ALHAMID, M. et al. Recam: a collaborative context-aware framework for multimedia recommendations in an ambient intelligence environment. Multimedia Systems, Springer Berlin Heidelberg, online, p. 1–15, 2015. ISSN 0942-4962. Disponível em: . Cited on page 55. AMATRIAIN, X. et al. Data mining methods for recommender systems. In: Recommender Systems Handbook. [S.l.]: Springer, 2011. p. 39–71. Cited on page 60. ANAND, S. S.; MOBASHER, B. Contextual recommendation. [S.l.]: Springer, 2006. Cited on page 53. ANSARI, A.; ESSEGAIER, S.; KOHLI, R. Internet recommendation systems. Journal of Marketing research, American Marketing Association, v. 37, n. 3, p. 363–375, 2000. Cited on page 54. http://dx.doi.org/10.1007/s00530-015-0469-2 228 References AZAK, M. CrosSing: A framework to develop knowledge-based recommenders in cross domains. Dissertação (Mestrado) — MIDDLE EAST TECHNICAL UNIVERSITY, 2010. Cited 2 many times on page 25 and 44. BALTRUNAS, L. et al. Incarmusic: Context-aware music recommendations in a car. In: SPRINGER. EC-Web. [S.l.], 2011. v. 11, p. 89–100. Cited on page 47. BALTRUNAS, L.; LUDWIG, B.; RICCI, F. Matrix factorization techniques for context aware recommendation. In: ACM. Proceedings of the fifth ACM conference on Recommender systems. [S.l.], 2011. p. 301–304. Cited 2 many times on page 54 and 81. BALTRUNAS, L.; MAKCINSKAS, T.; RICCI, F. Group recommendations with rank aggregation and collaborative filtering. In: Proceedings of the fourth ACM conference on Recommender systems. [S.l.: s.n.], 2010. p. 119–126. ISBN 9781605589060. Cited on page 37. BAUMAN, K.; TUZHILIN, A. Discovering contextual information from user reviews for recommendation purposes. In: CBRecSys. [S.l.: s.n.], 2014. p. 1–8. Cited 4 many times on page 98, 99, 100, and 101. BAZIRE, M.; BRÉZILLON, P. Understanding context before using it. In: Modeling and using context. [S.l.]: Springer, 2005. p. 29–40. Cited on page 48. BELL, R.; KOREN, Y.; VOLINSKY, C. Modeling relationships at multiple scales to improve accuracy of large recommender systems. In: ACM. 
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining. [S.l.], 2007. p. 95–104. Cited on page 87. BENNETT, J.; LANNING, S. The netflix prize. In: Proceedings of KDD cup and workshop. [S.l.: s.n.], 2007. v. 2007, p. 35–36. Cited 2 many times on page 32 and 38. BERKOVSKY, S.; KUFLIK, T.; RICCI, F. Cross-domain mediation in collaborative filtering. In: User Modeling 2007. [S.l.]: Springer, 2007. p. 355–359. Cited 4 many times on page 42, 44, 57, and 58. BERKOVSKY, S.; KUFLIK, T.; RICCI, F. Mediation of user models for enhanced personalization in recommender systems. User Modeling and User-Adapted Interaction, Springer, v. 18, n. 3, p. 245–286, 2008. Cited on page 47. BERNERS-LEE, T.; HENDLER, J. The semantic web. a new form of web content that is meaningful to computers will unleash a revolution of new possibilities. Scientific American Magazine, v. 1, p. 34–43, maio 2001. Cited on page 35. BETTINI, C. et al. A survey of context modelling and reasoning techniques. Pervasive and Mobile Computing, Elsevier, v. 6, n. 2, p. 161–180, 2010. Cited 3 many times on page 49, 90, and 223. BEZERRA, B. L.; CARVALHO, F. d. A. de. A symbolic approach for content-based information filtering. Information Processing Letters, Elsevier, v. 92, n. 1, p. 45–52, 2004. Cited on page 86. BEZERRA, B. L. D.; CARVALHO, F. D. A. T. D. Symbolic data analysis tools for recommendation systems. Knowledge and Information Systems, Springer, v. 26, n. 3, p. 385–418, 2011. Cited on page 86. References 229 BLANCO-FERNÁNDEZ, Y. et al. Tripfromtv+: Exploiting social networks to arrange cut-price touristic packages. In: IEEE. IEEE International Conference on Consumer Electronics (ICCE). [S.l.], 2011. p. 223–224. Cited 3 many times on page 62, 63, and 67. BLANCO-FERNÁNDEZ, Y. et al. Exploiting digital tv users’ preferences in a tourism recommender system based on semantic reasoning. Consumer Electronics, IEEE Transactions on, IEEE, v. 56, n. 2, p. 904–912, 2010. Cited 3 many times on page 62, 63, and 67. BLANCO-FERNÁNDEZ, Y. et al. Tripfromtv+: targeting personalized tourism to interactive digital tv viewers by social networking and semantic reasoning. IEEE Transactions on Consumer Electronics, IEEE, v. 57, n. 2, p. 953–961, 2011. Cited 7 many times on page 29, 35, 61, 62, 63, 64, and 67. BLANCO-FERNÁNDEZ, Y.; PAZOS-ARIAS, J. An MHP framework to provide intelligent personalized recommendations about digital TV contents. Software: Practice and Experience, v. 38, n. October 2007, p. 925–960, 2008. Cited on page 35. BLANCO-FERNáNDEZ, Y. et al. Exploiting synergies between semantic reasoning and personalization strategies in intelligent recommender systems: A case study. Journal of Systems and Software, Elsevier Inc., v. 81, n. 12, p. 2371–2385, dez. 2008. ISSN 01641212. Cited on page 37. BLEI, D. M.; NG, A. Y.; JORDAN, M. I. Latent dirichlet allocation. the Journal of machine Learning research, JMLR. org, v. 3, p. 993–1022, 2003. Cited on page 99. BOUNEFFOUF, D. Situation-aware approach to improve context-based recommender system. arXiv preprint arXiv:1303.0481, 2013. Cited on page 52. BOURKE, S.; MCCARTHY, K.; SMYTH, B. Power to the people: exploring neighbourhood formations in social recommender system. In: Proceedings of the fifth ACM conference on Recommender systems. [S.l.: s.n.], 2011. p. 337–340. ISBN 9781450306836. Cited on page 34. BOYTSOV, A. et al. Situation awareness meets ontologies: A context spaces case study. In: SPRINGER. 
International and Interdisciplinary Conference on Modeling and Using Context. [S.l.], 2015. p. 3–17. Cited on page 52. BRAUNHOFER, M.; KAMINSKAS, M.; RICCI, F. Location-aware music recommendation. International Journal of Multimedia Information Retrieval, Springer, v. 2, n. 1, p. 31–44, 2013. Cited 5 many times on page 61, 62, 63, 65, and 67. BREESE, J. S.; HECKERMAN, D.; KADIE, C. Empirical analysis of predictive algorithms for collaborative filtering. In: MORGAN KAUFMANN PUBLISHERS INC. Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence. [S.l.], 1998. p. 43–52. Cited on page 37. BRÉZILLON, P. Context modeling: Task model and practice model. In: Modeling and Using Context. [S.l.]: Springer, 2007. p. 122–135. Cited 2 many times on page 49 and 52. BRUSILOVSKY, P.; KOBSA, A.; NEJDL, W. The adaptive web: methods and strategies of web personalization. [S.l.]: Springer, 2007. v. 4321. Cited on page 35. 230 References BURKE, R. Hybrid recommender systems: Survey and experiments. User modeling and user-adapted interaction, Springer, v. 12, n. 4, p. 331–370, 2002. Cited on page 32. BURKE, R. Hybrid web recommender systems. In: The adaptive web. [S.l.]: Springer, 2007. p. 377–408. Cited 2 many times on page 32 and 33. CAMPOS, P. G.; DÍEZ, F.; CANTADOR, I. Time-aware recommender systems: a comprehensive survey and analysis of existing evaluation protocols. User Modeling and User-Adapted Interaction, Springer, v. 24, n. 1-2, p. 67–119, 2014. Cited on page 56. CANTADOR, I.; CREMONESI, P. Tutorial on cross-domain recommender systems. In: ACM. Proceedings of the 8th ACM Conference on Recommender systems. [S.l.], 2014. p. 401–402. Cited on page 68. CANTADOR, I. et al. Cross-domain recommender systems. In: Recommender Systems Handbook. [S.l.]: Springer, 2015. p. 919–959. Cited 13 many times on page 9, 26, 30, 38, 39, 40, 41, 43, 44, 45, 46, 47, and 57. CAO, B.; LIU, N. N.; YANG, Q. Transfer learning for collective link prediction in multiple heterogenous domains. In: Proceedings of the 27th International Conference on Machine Learning (ICML-10). [S.l.: s.n.], 2010. p. 159–166. Cited 2 many times on page 38 and 47. CARMAGNOLA, F.; CENA, F. User identification for cross-system personalisation. Information Sciences, Elsevier, v. 179, n. 1, p. 16–32, 2009. Cited on page 40. CARMAGNOLA, F.; CENA, F.; GENA, C. User model interoperability: a survey. User Modeling and User-Adapted Interaction, Springer, v. 21, n. 3, p. 285–331, 2011. Cited on page 40. CHAARI, T. et al. A comprehensive approach to model and use context for adapting applications in pervasive environments. Journal of Systems and Software, Elsevier, v. 80, n. 12, p. 1973–1992, 2007. Cited on page 49. CHATTERJEE, S.; HADI, A. S. Regression analysis by example. [S.l.]: John Wiley & Sons, 2015. Cited on page 52. CHEN, G.; CHEN, L. Augmenting service recommender systems by incorporating contextual opinions from user reviews. User Modeling and User-Adapted Interaction, Springer, v. 25, n. 3, p. 295–329, 2015. Cited on page 225. CHUNG, R.; SUNDARAM, D.; SRINIVASAN, A. Integrated personal recommender systems. In: ACM. Proceedings of the ninth international conference on Electronic commerce. [S.l.], 2007. p. 65–74. Cited on page 44. CHURCH, K. et al. Mobile information access: A study of emerging search behavior on the mobile internet. ACM Transactions on the Web (TWEB), ACM, v. 1, n. 1, p. 4, 2007. Cited on page 47. COLOMBO-MENDOZA, L. O. et al. 
Recommetz: A context-aware knowledge-based mobile recommender system for movie showtimes. Expert Systems with Applications, Elsevier, v. 42, n. 3, p. 1202–1222, 2015. Cited on page 51. References 231 CREMONESI, P.; GARZOTTO, F.; TURRIN, R. Investigating the Persuasion Potential of Recommender Systems from a Quality Perspective. ACM Transactions on Interactive Intelligent Systems, v. 2, n. 2, p. 1–41, jun. 2012. ISSN 21606455. Cited on page 33. CREMONESI, P.; KOREN, Y.; TURRIN, R. Performance of recommender algorithms on top-n recommendation tasks. In: ACM. Proceedings of the fourth ACM conference on Recommender systems. [S.l.], 2010. p. 39–46. Cited 3 many times on page 59, 129, and 130. CREMONESI, P.; TRIPODI, A.; TURRIN, R. Cross-domain recommender systems. In: IEEE. IEEE 11th International Conference on Data Mining Workshops (ICDMW). [S.l.], 2011. p. 496–503. Cited 20 many times on page 9, 24, 25, 41, 42, 43, 47, 58, 59, 61, 68, 84, 86, 88, 89, 90, 123, 124, 129, and 130. CREMONESI, P.; TURRIN, R. Controlling Consistency in Top-N Recommender Systems. In: IEEE International Conference on Data Mining Workshops. [S.l.]: Ieee, 2010. p. 919–926. ISBN 978-1-4244-9244-2. Cited on page 37. DAS, A. S. et al. Google news personalization: scalable online collaborative filtering. In: ACM. Proceedings of the 16th international conference on World Wide Web. [S.l.], 2007. p. 271–280. Cited on page 32. DEY, A. K.; ABOWD, G. D.; SALBER, D. A conceptual framework and a toolkit for supporting the rapid prototyping of context-aware applications. Human-computer interaction, L. Erlbaum Associates Inc., v. 16, n. 2, p. 97–166, 2001. Cited on page 48. DIDAY, E.; BOCK, H.-H. Analysis of symbolic data: Exploratory methods for extracting statistical information from complex data. [S.l.]: Springer-Verlag, 2000. Cited on page 86. DOMINGUES, M. A. et al. Exploiting text mining techniques for contextual recommendations. In: IEEE. Web Intelligence (WI) and Intelligent Agent Technologies (IAT), 2014 IEEE/WIC/ACM International Joint Conferences on. [S.l.], 2014. v. 2, p. 210–217. Cited on page 225. DOURISH, P. What we talk about when we talk about context. Personal and ubiquitous computing, Springer, v. 8, n. 1, p. 19–30, 2004. Cited on page 53. ENRICH, M.; BRAUNHOFER, M.; RICCI, F. Cold-start management with cross-domain collaborative filtering and tags. In: E-Commerce and Web Technologies. [S.l.]: Springer, 2013. p. 101–112. Cited 2 many times on page 39 and 44. EYKE, J. W. Temporal Problems, with a Focus on Mood, in Music Recommendation Within Last. FM. Tese (Doutorado) — University of Sheffield, Department of Information Studies, 2009. Cited 2 many times on page 32 and 38. FERNÁNDEZ-TOBÍAS, I. et al. Cross-domain recommender systems: A survey of the state of the art. In: Spanish Conference on Information Retrieval. [S.l.: s.n.], 2012. Cited 10 many times on page 24, 25, 26, 30, 38, 43, 61, 68, 90, and 131. FERRAZ, C. A.; SILVA, D. V. e; SILVA, J. S. da. A collaborative tv-internet application model to enrich tv viewing experience in a pervasive way. In: IEEE. IEEE International Conference on Pervasive Computing and Communication Workshops (PerCom Workshops). [S.l.], 2015. p. 148–153. Cited on page 226. 232 References FREYNE, J.; BERKOVSKY, S. Evaluating recommender systems for supportive technologies. In: User Modeling and Adaptation for Daily Routines. [S.l.]: Springer, 2013. p. 195–217. Cited on page 45. GAO, S. et al. Cross-domain recommendation via cluster-level latent factor model. 
In: Machine Learning and Knowledge Discovery in Databases. [S.l.]: Springer, 2013. p. 161–176. Cited 3 many times on page 39, 44, and 224. GIVON, S.; LAVRENKO, V. Predicting social-tags for cold start book recommendations. In: ACM. Proceedings of the third ACM conference on Recommender systems. [S.l.], 2009. p. 333–336. Cited on page 44. GOGA, O. et al. Exploiting innocuous activity for correlating users across sites. In: INTERNATIONAL WORLD WIDE WEB CONFERENCES STEERING COMMITTEE. Proceedings of the 22nd international conference on World Wide Web. [S.l.], 2013. p. 447–458. Cited on page 46. GU, T.; PUNG, H. K.; ZHANG, D. Q. A service-oriented middleware for building context-aware services. Journal of Network and computer applications, Elsevier, v. 28, n. 1, p. 1–18, 2005. Cited on page 49. GUPTA, K. M. Taxonomic conversational case-based reasoning. In: Case-Based Reasoning Research and Development. [S.l.]: Springer, 2001. p. 219–233. Cited on page 64. GUYON, I.; ELISSEEFF, A. An introduction to variable and feature selection. The Journal of Machine Learning Research, JMLR. org, v. 3, p. 1157–1182, 2003. Cited on page 52. HALL, M. et al. The weka data mining software: an update. ACM SIGKDD explorations newsletter, ACM, v. 11, n. 1, p. 10–18, 2009. Cited 2 many times on page 103 and 126. HAN, X. et al. Alike people, alike interests? inferring interest similarity in online social networks. Decision Support Systems, Elsevier, v. 69, p. 92–106, 2015. Cited on page 34. HENRICKSEN, K.; INDULSKA, J. Developing context-aware pervasive computing applications: Models and approach. Pervasive and mobile computing, Elsevier, v. 2, n. 1, p. 37–64, 2006. Cited on page 49. HERLOCKER, J. L. et al. Evaluating collaborative filtering recommender systems. ACM Transactions on Information Systems (TOIS), ACM, v. 22, n. 1, p. 5–53, 2004. Cited on page 37. HIDASI, B.; TIKK, D. Fast als-based tensor factorization for context-aware recommendation from implicit feedback. In: Machine Learning and Knowledge Discovery in Databases. [S.l.]: Springer, 2012. p. 67–82. Cited 2 many times on page 54 and 82. HILL, W. et al. Recommending and evaluating choices in a virtual community of use. In: ACM PRESS/ADDISON-WESLEY PUBLISHING CO. Proceedings of the SIGCHI conference on Human factors in computing systems. [S.l.], 1995. p. 194–201. Cited on page 23. HIPP, J.; GÜNTZER, U.; NAKHAEIZADEH, G. Algorithms for association rule mining—a general survey and comparison. ACM sigkdd explorations newsletter, ACM, v. 2, n. 1, p. 58–64, 2000. Cited on page 79. References 233 HOPFGARTNER, F.; JOSE, J. Semantic user profiling techniques for personalised multimedia recommendation. Multimedia systems, v. 16, p. 255–274, 2010. Cited on page 37. HU, L. et al. Personalized recommendation via cross-domain triadic factorization. In: INTERNATIONAL WORLD WIDE WEB CONFERENCES STEERING COMMITTEE. Proceedings of the 22nd international conference on World Wide Web. [S.l.], 2013. p. 595–606. Cited on page 39. HU, Y.; KOREN, Y.; VOLINSKY, C. Collaborative filtering for implicit feedback datasets. In: IEEE. Eighth IEEE International Conference on Data Mining (ICDM). [S.l.], 2008. p. 263–272. Cited on page 36. JADIDI, O.; FIROUZI, F.; BAGLIERY, E. Topsis method for supplier selection problem. World Academy of Science, Engineering and Technology, Citeseer, v. 47, p. 956–958, 2010. Cited on page 64. JAIN, A. K. Data clustering: 50 years beyond k-means. Pattern recognition letters, Elsevier, v. 31, n. 8, p. 651–666, 2010. Cited on page 99. 
JAIN, P.; KUMARAGURU, P.; JOSHI, A. @ i seek’fb. me’: Identifying users across multiple online social networks. In: INTERNATIONAL WORLD WIDE WEB CONFERENCES STEERING COMMITTEE. Proceedings of the 22nd international conference on World Wide Web companion. [S.l.], 2013. p. 1259–1268. Cited on page 46. JI, K.; SHEN, H. Making recommendations from top-n user-item subgroups. Neurocomputing, Elsevier, v. 165, p. 228–237, 2015. Cited 5 many times on page 61, 62, 63, 66, and 67. JOJIC, O.; SHUKLA, M.; BHOSAREKAR, N. A probabilistic definition of item similarity. In: Proceedings of the fifth ACM conference on Recommender systems. New York, New York, USA: ACM Press, 2011. p. 229–236. ISBN 9781450306836. Cited on page 37. KAMAHARA, J. et al. A Community-Based Recommendation System to Reveal Unexpected Interests. In: 11th International Multimedia Modelling Conference. [S.l.]: IEEE, 2005. p. 433–438. ISBN 0-7695-2164-9. Cited on page 34. KAMINSKAS, M. et al. Knowledge-based identification of music suited for places of interest. Information Technology & Tourism, Springer, v. 14, n. 1, p. 73–95, 2014. Cited 9 many times on page 29, 30, 51, 55, 61, 62, 63, 65, and 67. KAMINSKAS, M.; RICCI, F. Contextual music information retrieval and recommendation: State of the art and challenges. Computer Science Review, Elsevier, v. 6, n. 2, p. 89–119, 2012. Cited on page 47. KARATZOGLOU, A. et al. Multiverse recommendation: n-dimensional tensor factorization for context-aware collaborative filtering. In: ACM. Proceedings of the fourth ACM conference on Recommender systems. [S.l.], 2010. p. 79–86. Cited 2 many times on page 54 and 82. KIM, S.; YOON, Y. Recommendation system for sharing economy based on multidimensional trust model. Multimedia Tools and Applications, Springer, p. 1–14, 2014. Cited 3 many times on page 54, 82, and 88. 234 References KOREN, Y. Factorization meets the neighborhood: a multifaceted collaborative filtering model. In: ACM. Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining. [S.l.], 2008. p. 426–434. Cited 2 many times on page 53 and 87. LAHLOU, F. Z. et al. A text classification based method for context extraction from online reviews. In: IEEE. 8th International Conference on Intelligent Systems: Theories and Applications (SITA). [S.l.], 2013. p. 1–5. Cited on page 225. LAROSE, D. T. k-nearest neighbor algorithm. In: . Discovering Knowledge in Data. John Wiley and Sons, Inc., 2005. p. 90–106. ISBN 9780471687542. Disponível em: . Cited on page 60. LEE, H.; KWON, J. Personalized tv contents recommender system using collaborative context tagging-based user’s preference prediction technique. International Journal of Multimedia & Ubiquitous Engineering, v. 9, n. 5, p. 231–240, 2014. Cited on page 51. LEE, H. J.; PARK, S. J. Moners: A news recommender for the mobile web. Expert Systems with Applications, Elsevier, v. 32, n. 1, p. 143–150, 2007. Cited on page 47. LEE, W.; YANG, T.-H. Personalizing information appliances: a multi-agent framework for TV programme recommendations. Expert Systems with Applications, v. 25, n. 3, p. 331–341, out. 2003. ISSN 09574174. Cited on page 37. LEKAKOS, G.; CARAVELAS, P. A hybrid approach for movie recommendation. Multimedia Tools and Applications, v. 36, n. December 2006, p. 55–70, 2008. Cited on page 37. LEKAKOS, G.; GIAGLIS, G. A Lifestyle‚ÄêBased Approach for Delivering Personalized Advertisements in Digital Interactive Television. Journal of Computer-Mediated Communication, v. 6, n. 1, p. 00–00, 2004. 
Cited on page 37. LESKOVEC, J.; ADAMIC, L. A.; HUBERMAN, B. A. The dynamics of viral marketing. ACM Transactions on the Web (TWEB), ACM, v. 1, n. 1, p. 5, 2007. Cited 3 many times on page 61, 92, and 223. LI, B.; YANG, Q.; XUE, X. Can movies and books collaborate? cross-domain collaborative filtering for sparsity reduction. In: IJCAI. [S.l.: s.n.], 2009. v. 9, p. 2052–2057. Cited 2 many times on page 66 and 224. LI, B.; YANG, Q.; XUE, X. Transfer learning for collaborative filtering via a rating-matrix generative model. In: ACM. Proceedings of the 26th Annual International Conference on Machine Learning. [S.l.], 2009. p. 617–624. Cited 3 many times on page 44, 46, and 47. LI, L. et al. A contextual-bandit approach to personalized news article recommendation. In: ACM. Proceedings of the 19th international conference on World wide web. [S.l.], 2010. p. 661–670. Cited on page 66. LIAW, A.; WIENER, M. Classification and regression by randomforest. R news, v. 2, n. 3, p. 18–22, 2002. Cited on page 60. LINDEN, G.; SMITH, B.; YORK, J. Amazon.com recommendations: Item-to-item collaborative filtering. Internet Computing, IEEE, IEEE, v. 7, n. 1, p. 76–80, 2003. Cited on page 32. http://dx.doi.org/10.1002/0471687545.ch5 References 235 LIU, H.; MOTODA, H. Feature Selection for Knowledge Discovery and Data Mining. [S.l.]: Springer Science & Business Media, 1998. Cited on page 73. LIU, H.; MOTODA, H. Feature selection for knowledge discovery and data mining. [S.l.]: Springer Science & Business Media, 2012. v. 454. Cited on page 52. LONI, B. et al. Cross-domain collaborative filtering with factorization machines. In: Advances in Information Retrieval. [S.l.]: Springer, 2014. p. 656–661. Cited 3 many times on page 39, 58, and 60. LóPEZ-NORES, M. et al. MiSPOT: dynamic product placement for digital TV through MPEG-4 processing and semantic reasoning. Knowledge and Information Systems, v. 22, n. 1, p. 101–128, mar. 2009. ISSN 0219-1377. Cited on page 37. LOVÁSZ, L. et al. Random walks on graphs: A survey. Combinatorics, Paul Erdos is Eighty, v. 2, p. 353–398, 1996. Cited on page 59. LOW, Y.; AGARWAL, D.; SMOLA, A. J. Multiple domain user personalization. In: ACM. Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining. [S.l.], 2011. p. 123–131. Cited on page 40. MAHMOOD, T.; RICCI, F.; VENTURINI, A. Improving recommendation effectiveness: Adapting a dialogue strategy in online travel planning. Information Technology & Tourism, Cognizant Communication Corporation, v. 11, n. 4, p. 285–302, 2009. Cited on page 47. MARILLY, E. et al. Community-based applications. Bell Labs Technical Journal, v. 15, n. 4, p. 93–109, mar. 2011. ISSN 10897089. Cited on page 35. MCJONES, P. Eachmovie collaborative filtering data set. DEC Systems Research Center, v. 249, 1997. Cited on page 57. MILLER, G. A. Wordnet: a lexical database for english. Communications of the ACM, ACM, v. 38, n. 11, p. 39–41, 1995. Cited on page 100. MOE, H. H.; AUNG, W. T. Building ontologies for cross-domain recommendation on facial skin problem and related cosmetics. International Journal of Information Technology and Computer Science (IJITCS), v. 6, n. 6, p. 33, 2014. Cited 4 many times on page 61, 62, 63, and 67. MOE, H. H.; AUNG, W. T. Context aware cross-domain based recommendation. In: International Conference on Advances in Engineering and Technology. [S.l.: s.n.], 2014. Cited 7 many times on page 29, 55, 61, 62, 63, 64, and 67. MOE, H. H.; AUNG, W. T. et al. 
Cross-domain recommendations for personalized semantic services. International Journal of Computer Applications Technology and Research, v. 2, n. 1, p. 72–76, 2013. Cited 4 many times on page 61, 62, 63, and 67. MOON, A. et al. Designing CAMUS based context-awareness for pervasive home environments. In: International Conference on Hybrid Information Technology. [S.l.: s.n.], 2006. v. 1, p. 666–672. ISBN 0769526748. Cited on page 54. MOON, A. et al. Two-step recommendation based personalization for future services. In: International Conference on Advanced Communication Technology. [S.l.: s.n.], 2009. v. 03, p. 2268–2272. ISBN 9788955191394. Cited 2 many times on page 34 and 55. 236 References MORENO, O. et al. Talmud: transfer learning for multiple domains. In: ACM. Proceedings of the 21st ACM international conference on Information and knowledge management. [S.l.], 2012. p. 425–434. Cited 2 many times on page 40 and 41. MUKHERJEE, D. et al. A context-aware recommendation system considering both user preferences and learned behavior. In: 7th International Conference on Information Technology in Asia. [S.l.: s.n.], 2011. p. 1–7. ISBN 9781612841304. Cited on page 36. NAKATSUJI, M. et al. Recommendations over domain specific user graphs. In: ECAI. [S.l.: s.n.], 2010. p. 607–612. Cited 2 many times on page 58 and 59. NETO, B.; FREITAS, R. de. Um processo de software e um modelo ontológico para apoio ao desenvolvimento de aplicações sensíveis a contexto. Tese (Doutorado) — Universidade de São Paulo, 2007. Cited on page 51. OH, S. et al. Comparison of techniques for time aware tv channel recommendation. In: IEEE. Soft Computing and Intelligent Systems (SCIS), 2014 Joint 7th International Conference on and Advanced Intelligent Systems (ISIS), 15th International Symposium on. [S.l.], 2014. p. 989–992. Cited on page 51. OKU, K. et al. Context-aware svm for context-dependent information recommendation. In: IEEE COMPUTER SOCIETY. Proceedings of the 7th international Conference on Mobile Data Management. [S.l.], 2006. p. 109. Cited on page 54. O’SULLIVAN, D.; SMYTH, B.; WILSON, D. Improving the quality of the personalized electronic program guide. User Modeling and User-Adapted Interaction, v. 14, p. 5–36, 2004. Cited on page 37. OWEN, S. et al. Mahout in action. [S.l.]: Manning, 2011. Cited 3 many times on page 109, 111, and 126. PALMISANO, C.; TUZHILIN, A.; GORGOGLIONE, M. Using context to improve predictive modeling of customers in personalization applications. Knowledge and Data Engineering, IEEE Transactions on, IEEE, v. 20, n. 11, p. 1535–1549, 2008. Cited 3 many times on page 48, 51, and 53. PAN, W. et al. Transfer learning in collaborative filtering for sparsity reduction. In: AAAI. [S.l.: s.n.], 2010. v. 10, p. 230–235. Cited 2 many times on page 44 and 47. PAN, W.; XIANG, E. W.; YANG, Q. Transfer learning in collaborative filtering with uncertain ratings. In: AAAI. [S.l.: s.n.], 2012. Cited on page 39. PAN, W.; YANG, Q. Transfer learning in heterogeneous collaborative filtering domains. Artificial intelligence, Elsevier, v. 197, p. 39–55, 2013. Cited 2 many times on page 39 and 45. PANNIELLO, U.; TUZHILIN, A.; GORGOGLIONE, M. Comparing context-aware recommender systems in terms of accuracy and diversity. User Modeling and User-Adapted Interaction, Springer, v. 24, n. 1-2, p. 35–65, 2014. Cited on page 56. PANNIELLO, U. et al. Experimental comparison of pre-vs. post-filtering approaches in context-aware recommender systems. In: ACM. 
Proceedings of the third ACM conference on Recommender systems. [S.l.], 2009. p. 265–268. Cited on page 54. References 237 PARAMESWARAN, A.; VENETIS, P.; GARCIA-MOLINA, H. Recommendation systems with complex constraints: A course recommendation perspective. ACM Transactions on Information Systems (TOIS), ACM, v. 29, n. 4, p. 20, 2011. Cited on page 64. PARK, D. H. et al. A literature review and classification of recommender systems research. Expert Systems with Applications, Elsevier, v. 39, n. 11, p. 10059–10072, 2012. Cited 2 many times on page 32 and 34. PAZZANI, M. J. A framework for collaborative, content-based and demographic filtering. Artificial Intelligence Review, Springer, v. 13, n. 5-6, p. 393–408, 1999. Cited on page 33. PESSEMIER, T. D.; DOOMS, S.; MARTENS, L. Context-aware recommendations through context and activity recognition in a mobile environment. Multimedia Tools and Applications, Springer, v. 72, n. 3, p. 2925–2948, 2014. Cited on page 47. PHAM, X. H.; JUNG, J. J.; VU, S.-B. P. L. A. Exploiting social contexts for movie recommendation. Malaysian Journal of Computer Science, v. 27, n. 1, p. 68–79, 2014. Cited on page 51. QUEIROZ, S. R. d. M.; CARVALHO, F. d. A. de. Making collaborative group recommendations based on modal symbolic data. In: SPRINGER. Brazilian Symposium on Artificial Intelligence. [S.l.], 2004. p. 307–316. Cited on page 35. R Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria, 2015. Disponível em: . Cited on page 131. REICHLING, T.; WULF, V. Expert recommender systems in practice: evaluating semi-automatic profile generation. In: ACM. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. [S.l.], 2009. p. 59–68. Cited on page 36. RENDLE, S. Factorization machines with libfm. ACM Transactions on Intelligent Systems and Technology (TIST), ACM, v. 3, n. 3, p. 57, 2012. Cited on page 60. RESNICK, P. et al. Grouplens: an open architecture for collaborative filtering of netnews. In: ACM. Proceedings of the 1994 ACM conference on Computer supported cooperative work. [S.l.], 1994. p. 175–186. Cited on page 23. RESNICK, P.; VARIAN, H. R. Recommender systems. Communications of the ACM, ACM, v. 40, n. 3, p. 56–58, 1997. Cited on page 32. RICCI, F.; ROKACH, L.; SHAPIRA, B. Introduction to recommender systems handbook. [S.l.]: Springer, 2011. Cited 11 many times on page 23, 24, 33, 34, 35, 37, 81, 84, 85, 86, and 87. SAHEBI, S.; BRUSILOVSKY, P. Cross-domain collaborative recommendation in a cold-start context: The impact of user profile size on the quality of recommendation. In: User Modeling, Adaptation, and Personalization. [S.l.]: Springer, 2013. p. 289–295. Cited 9 many times on page 42, 44, 45, 47, 57, 58, 60, 61, and 131. SAHEBI, S.; COHEN, W. W. Community-based recommendations: a solution to the cold start problem. In: Workshop on Recommender Systems and the Social Web, RSWEB. [S.l.: s.n.], 2011. Cited on page 60. http://www.R-project.org/ 238 References SANTOS, V. dos et al. A recommender system architecture for an inter-application environment. In: IEEE. 12th International Conference on Intelligent Systems Design and Applications (ISDA). [S.l.], 2012. p. 472–477. Cited 12 many times on page 9, 25, 41, 58, 59, 61, 69, 73, 74, 90, 220, and 224. SETTEN, M. V.; POKRAEV, S.; KOOLWAAIJ, J. Context-aware recommendations in the mobile tourist application compass. In: SPRINGER. Adaptive hypermedia and adaptive web-based systems. [S.l.], 2004. p. 235–244. Cited on page 47. SHANI, G.; GUNAWARDANA, A. 
Evaluating recommendation systems. In: Recommender systems handbook. [S.l.]: Springer, 2011. p. 257–297. Cited 2 many times on page 37 and 128. SHAPIRA, B.; ROKACH, L.; FREILIKHMAN, S. Facebook single and cross domain data for recommendation systems. User Modeling and User-Adapted Interaction, Springer, v. 23, n. 2-3, p. 211–247, 2013. Cited 8 many times on page 38, 41, 44, 47, 57, 58, 60, and 61. SHARDANAND, U.; MAES, P. Social information filtering: algorithms for automating “word of mouth”. In: ACM PRESS/ADDISON-WESLEY PUBLISHING CO. Proceedings of the SIGCHI conference on Human factors in computing systems. [S.l.], 1995. p. 210–217. Cited on page 23. SHEPSTONE, S.; TAN, Z.-H.; JENSEN, S. Using audio-derived affective offset to enhance tv recommendation. Multimedia, IEEE Transactions on, v. 16, n. 7, p. 1999–2010, Nov 2014. ISSN 1520-9210. Cited 2 many times on page 47 and 52. SHI, Y.; LARSON, M.; HANJALIC, A. Tags as bridges between domains: Improving recommendation with tag-induced cross-domain collaborative filtering. User Modeling, Adaption and Personalization, Springer, v. 6787, p. 305–316, 2011. Cited 2 many times on page 44 and 47. SONG, S.; MOUSTAFA, H.; AFIFI, H. Enriched IPTV services personalization. In: IEEE International Conference on Communications. [S.l.]: Ieee, 2012. p. 1911–1916. ISBN 978-1-4577-2053-6. Cited on page 54. SOUZA, D. et al. Towards a context ontology to enhance data integration processes. In: Proceedings of the 4th Workshop on Ontologies-based Techniques for Databases (in VLDB’08). [S.l.: s.n.], 2008. Cited on page 49. STEWART, A. et al. Cross-tagging for personalized open social networking. In: ACM. Proceedings of the 20th ACM conference on Hypertext and hypermedia. [S.l.], 2009. p. 271–278. Cited 3 many times on page 40, 41, and 44. STRANG, T.; LINNHOFF-POPIEN, C. A context modeling survey. In: Workshop Proceedings. [S.l.: s.n.], 2004. Cited on page 49. SZOMSZOR, M. et al. Semantic modelling of user interests based on cross-folksonomy analysis. [S.l.]: Springer, 2008. Cited on page 42. TANG, J. et al. Cross-domain collaboration recommendation. In: ACM. Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining. [S.l.], 2012. p. 1285–1293. Cited on page 92. References 239 TANG, X.; WAN, X.; ZHANG, X. Cross-language context-aware citation recommendation in scientific articles. In: ACM. Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval. [S.l.], 2014. p. 817–826. Cited 5 many times on page 61, 62, 63, 65, and 67. TEKIN, C.; SCHAAR, M. van der. Contextual online learning for multimedia content aggregation. Multimedia, IEEE Transactions on, IEEE, v. 17, n. 4, p. 549–561, 2015. Cited 6 many times on page 61, 62, 63, 65, 66, and 67. TIROSHI, A. et al. Cross social networks interests predictions based ongraph features. In: ACM. Proceedings of the 7th ACM conference on Recommender systems. [S.l.], 2013. p. 319–322. Cited 3 many times on page 58, 59, and 60. TIROSHI, A.; KUFLIK, T. Domain ranking for cross domain collaborative filtering. In: User Modeling, Adaptation, and Personalization. [S.l.]: Springer, 2012. p. 328–333. Cited 2 many times on page 40 and 42. TREWIN, S. Knowledge-based recommender systems. Encyclopedia of library and information science, v. 69, n. Supplement 32, p. 180, 2000. Cited 2 many times on page 24 and 25. TUCKER, L. R. Some mathematical notes on three-mode factor analysis. Psychometrika, Springer, v. 31, n. 3, p. 279–311, 1966. 
Cited on page 64. UBERALL, C.; MUTTUKRISHNAN, R. Recommendation index for DVB content using service information. In: IEEE International Conference on Multimedia and Expo. [S.l.: s.n.], 2009. p. 1–4. Cited 2 many times on page 35 and 36. VÉRAS, D. et al. A literature review of recommender systems in the television domain. Expert Systems with Applications, Elsevier, v. 42, n. 22, p. 9046–9076, 2015. Cited 4 many times on page 33, 35, 36, and 54. VERAS, D. et al. Context-aware techniques for cross-domain recommender systems. In: IEEE. 2015 Brazilian Conference on Intelligent Systems (BRACIS). [S.l.], 2015. p. 282–287. Cited 2 many times on page 40 and 54. VIEIRA, V. et al. A context-oriented model for domain-independent context management. Revue d’intelligence artificielle, v. 22, n. 5, p. 609–627, 2008. Cited on page 49. VIEIRA, V.; TEDESCO, P.; SALGADO, A. C. Towards an ontology for context representation in groupware. In: Groupware: Design, Implementation, and Use. [S.l.]: Springer, 2005. p. 367–375. Cited on page 49. VIEIRA, V.; TEDESCO, P.; SALGADO, A. C. Modelos e processos para o desenvolvimento de sistemas sensíveis ao contexto. In: Jornadas de Atualização em Informática. [S.l.]: André Ponce de Leon F. de Carvalho, Tomasz Kowaltowski.(Org.), 2009. p. 381–431. Cited 6 many times on page 17, 48, 49, 50, 90, and 223. VILDJIOUNAITE, E. et al. Unobtrusive dynamic modelling of tv programme preferences in a finnish household. Multimedia systems, v. 15, p. 143–157, 2009. Cited on page 55. 240 References WANG, F.; LI, D.; XU, M. A location-aware tv show recommendation with localized sementaic analysis. Multimedia Systems, Springer Berlin Heidelberg, online, p. 1–8, 2015. ISSN 0942-4962. Disponível em: . Cited 2 many times on page 52 and 55. WINOTO, P.; TANG, T. If you like the devil wears prada the book, will you also enjoy the devil wears prada the movie? a study of cross-domain recommendations. New Generation Computing, Springer, v. 26, n. 3, p. 209–225, 2008. Cited 5 many times on page 24, 25, 41, 58, and 61. WINTER, J. C. D.; DODOU, D. Five-point likert items: t test versus mann-whitney- wilcoxon. Practical Assessment, Research & Evaluation, Dr. Lawrence M. Rudner, v. 15, n. 11, p. 1–12, 2010. Cited on page 131. YUAN, Z. et al. Structural context-aware cross media recommendation. In: Advances in Multimedia Information Processing–PCM 2012. [S.l.]: Springer, 2012. p. 790–800. Cited 5 many times on page 62, 63, 64, 66, and 67. ZHANG, H.; ZHENG, S. Personalized TV program recommendation based on TV-anytime metadata. In: IEEE International Symposium on Consumer Electronics. [S.l.: s.n.], 2005. di, p. 242–246. ISBN 0780389204. Cited on page 37. ZHANG, J.; YUAN, Z.; YU, K. Cross media recommendation in digital library. In: The Emergence of Digital Libraries–Research and Practices. [S.l.]: Springer, 2014. p. 208–217. Cited 4 many times on page 61, 62, 63, and 67. ZHAO, L. et al. Active transfer learning for cross-system recommendation. In: CITESEER. AAAI. [S.l.], 2013. Cited on page 47. ZHIWEN, Y.; XINGSHE, Z. Design, implementation, and evaluation of an agent-based adaptive program personalization system. In: Fifth International Symposium on Multimedia Software Engineering. [S.l.: s.n.], 2003. p. 140–147. ISBN 0769520316. Cited on page 37. ZHUANG, F. et al. Cross-domain learning from multiple sources: a consensus regularization perspective. Knowledge and Data Engineering, IEEE Transactions on, IEEE, v. 22, n. 12, p. 1664–1678, 2010. Cited on page 44. 
http://dx.doi.org/10.1007/s00530-015-0451-z