Latent based temporal optimization approach for improving the performance of collaborative filtering

Ismail Ahmed Al-Qasem Al-Hadi1, Nurfadhlina Mohd Sharef2, Md Nasir Sulaiman2, Norwati Mustapha2 and Mehrbakhsh Nilashi3
1 Faculty of Ocean Engineering Technology and Informatics, Universiti Malaysia Terengganu, Kuala Nerus, Terengganu, Malaysia
2 Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Serdang, Selangor, Malaysia
3 Faculty of Computing, Universiti Teknologi Malaysia, Skudai, Johor, Malaysia

Submitted 17 August 2020; Accepted 16 November 2020; Published 21 December 2020
Corresponding author: Nurfadhlina Mohd Sharef, nurfadhlina@upm.edu.my
Academic editor: Faizal Khan
DOI 10.7717/peerj-cs.331
Copyright 2020 Al-Hadi et al. Distributed under Creative Commons CC-BY 4.0. OPEN ACCESS

ABSTRACT
Recommendation systems suggest particular products to customers based on their past ratings, preferences, and interests. These systems typically utilize collaborative filtering (CF) to analyze customers' ratings for products within the rating matrix. CF suffers from the sparsity problem because a large proportion of the rating grades is not accurately determined. Various prediction approaches have been used to solve this problem by learning its latent and temporal factors. A few other challenges, such as latent feedback learning, customers' drifting interests, overfitting, and the popularity decay of products over time, have also been addressed. Existing works have typically deployed either a short or a long temporal representation to address these issues. Although each effort improves on the accuracy of its respective benchmark, an integrative solution that addresses all of these problems without trading off accuracy is still needed.
Thus, this paper presents a Latent-based Temporal Optimization (LTO) approach to improve the prediction accuracy of CF by learning the past attitudes of users and their interests over time. Experimental results show that the LTO approach efficiently improves the prediction accuracy of CF compared to the benchmark schemes.

Subjects: Artificial Intelligence, Data Mining and Machine Learning, Data Science
Keywords: Temporal factorization, Recommender systems, Collaborative filtering, Drift, Decay, Matrix factorization

INTRODUCTION
Recommendation systems are among the most powerful methods for suggesting products to customers based on their interests and online purchases (Jonnalagedda et al., 2016; Lin, Li & Lian, 2020; Nilashi, bin Ibrahim & Ithnin, 2014; Nilashi et al., 2015; Zhang et al., 2020b). For the personalization of recommendations, one of the most prevalently used methods is collaborative filtering (CF) (Nilashi, bin Ibrahim & Ithnin, 2014; Sardianos, Ballas Papadatos & Varlamis, 2019; Nilashi et al., 2015; Wu et al., 2019). In CF, the personalized prediction of products depends on the latent features of users in a rating matrix. However, the CF prediction accuracy decreases if the rating matrix is sparse (Zhang et al., 2020a; Li & Chi, 2018; Idrissi & Zellou, 2020).

How to cite this article: Al-Hadi IAA-Q, Sharef NM, Sulaiman MN, Mustapha N, Nilashi M. 2020. Latent based temporal optimization approach for improving the performance of collaborative filtering. PeerJ Comput. Sci. 6:e331

Several types of factorization techniques, such as baseline,
singular value decomposition (SVD), matrix factorization (MF), and the neighbors-based baseline, have been exploited to address the problem of data sparsity (Mirbakhsh & Ling, 2013; Al-Hadi et al., 2017b) by predicting the missing rating scores in the rating matrix. Similarly, various factorization-based techniques, including the use of latent factors (Vo, Hong & Jung, 2020; Nguyen & Do, 2018) and baseline factors (Koenigstein, Dror & Koren, 2011), such as SVD (Wang et al., 2019), have been proposed to improve the recommendation accuracy. Nevertheless, an unaddressed problem is that a part of the rating scores is misplaced from its original cells while streaming into memory. This misplacement decreases the accuracy of the latent feedback. A method based on ensemble divide and conquer (Al-Hadi et al., 2016) was adopted to solve this misplacement problem while also addressing customers' preference drift and popularity decay. Integrating temporal preferences with factorization methods to solve the sparsity issue has yielded better performance than the basic factorization approaches (Al-Hadi et al., 2017b; Li, Xu & Cao, 2016; Nilashi et al., 2019; Nilashi, bin Ibrahim & Ithnin, 2014). The temporal dynamics approach (Koren, 2009) separates the time period of preferences into a static number of bins and extracts a universal weight using the stochastic gradient descent method to reduce overfitting. Nonetheless, the universal weight learned by the temporal dynamics approach is limited in how it personalizes and represents the fluctuating temporal preferences.
The temporal interaction approach (Ye & Eskenazi, 2014) enhanced the effectiveness of CF recommender systems by combining the latent factors, short-term preferences, and long-term preferences. The shrunk neighbor approach is applied to obtain clients' short-term feedback (Koren, 2008). This approach detects overfitting when the predicted scores exceed the fluctuating scale of the rating scores. For example, in the rating matrix from the MovieLens dataset, the rating scale is in the range of 0–5, whereas the anticipated rating scores can be [5.75, 6.11, 5.9, 7], all larger than any actual score. Compared to other temporal approaches (e.g., the short-term based latent technique (Yang et al., 2012)), the temporal interaction approach (Ye & Eskenazi, 2014) efficiently improves the prediction performance of CF. Nevertheless, problems such as drifting customer preferences and popularity decay (e.g., the deterioration of the marketability of goods) still pose a significant challenge (Ye & Eskenazi, 2014). The long temporal-based factorization approach (Al-Hadi et al., 2018b) addresses the popularity decay issue, while the short temporal-based factorization approach (Al-Hadi et al., 2017a) addresses the drift issue not solved by previous short-term based approaches. These temporal approaches improve the performance of CF, but they are characterized by low accuracy. In view of the aforementioned, this paper presents a latent-based temporal optimization (LTO) approach to solve the significant limitations of these temporal approaches. As optimization algorithms have proven successful in various areas such as healthcare (Zainal et al., 2020) and document processing (Al-Badarneh & Amer, 2016), we extend our earlier work (Al-Hadi et al., 2018a) and provide a detailed analysis of the proposed approach. The contributions of this paper are summarised below.
• A comprehensive review of CF-based recommender system techniques.
• A proposed LTO approach that minimizes overfitting by learning to integrate the long and short-term features with the baseline and factorization features.
• An LTO approach that learns the drift in the users' interests through an improved rating score prediction. This is achieved by integrating the long and short-term features of users and items with their baseline and factorization features.
• An LTO approach that solves the sparsity issue by combining the learning outputs for overfitting, drift, and decay.
• A comparison of LTO's performance with other factorization and temporal-based factorization approaches.
In summary, the proposed approach has superior performance, as it improves the prediction accuracy of the CF technique by learning accurate latent effects of the temporal preferences of users. The novel features of the LTO approach are as follows:
• It provides a personalized temporal weighting which is incorporated into matrix factorization to reduce the data sparsity error.
• It combines time convergence and personalized duration to accommodate consumers' preference drift in the personalized recommendation system.
• It utilizes the bacterial foraging optimization algorithm (BFOA) to accurately learn the personalized temporal weights by regularizing the overfitted predicted scores in the rating matrix and to track the factors of drift and decay.
The rest of this paper is sectioned as follows: 'Related Works' reviews the past works related to factorization approaches and temporal preferences. In 'Latent-based Temporal Optimization Approach', LTO is elaborated, followed by the experimental setup in 'Experimental Settings'. 'Experimental Results' discusses the experimental results. The final section ('Conclusion') provides a summary and indicates possible future works.
RELATED WORKS
Collaborative Filtering
CF is a technique developed to make automated predictions (filtering) about the interests of a customer by gathering preferences or rating scores from several other customers (collaborating). The primary idea of the CF approach is that if a user (say X) shares an attitude with another user (Y) on a subject, X is more likely to share Y's attitude on a different issue than other randomly chosen users are. CF is one of the most widely implemented techniques in the design of recommendation systems due to its low computational requirements (Jonnalagedda et al., 2016; Sardianos, Ballas Papadatos & Varlamis, 2019; Alhijawi & Kilani, 2020). It uses similarity measures to find similar users or items and calculates predicted rating scores according to the ratings of similar users. In addition, CF provides customized recommendations using the similarity values of customers and their common preferences, while the score of the active customer is placed in the rating score matrix. Personalized recommendation suggests products to customers based on their tastes and constitutes a well-established methodology with a wide range of applications. In the CF technique, a forecast is achieved in three steps. The first step estimates the similarity values between the common clients and the target customer using similarity functions such as the Cosine function (Nilashi et al., 2019; Alhijawi & Kilani, 2020). In the second step, the rating scores supplied by the target client and the similarity values are applied to estimate the expected score of the product using the prediction function. The final step estimates the precision of the forecast by applying the root mean squared error (RMSE) function (Nilashi et al., 2019).
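The three CF steps above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the tiny rating matrix, the helper names, and the convention that a 0 marks an unrated item are all assumptions made for the example.

```python
import math

def cosine(a, b):
    # Cosine similarity over the items co-rated by both users (0 = unrated).
    pairs = [(x, y) for x, y in zip(a, b) if x > 0 and y > 0]
    if not pairs:
        return 0.0
    num = sum(x * y for x, y in pairs)
    den = (math.sqrt(sum(x * x for x, _ in pairs))
           * math.sqrt(sum(y * y for _, y in pairs)))
    return num / den if den else 0.0

def predict(active, neighbors, item):
    # Step 2: similarity-weighted average of the neighbors' scores for `item`.
    sims = [(cosine(active, n), n[item]) for n in neighbors if n[item] > 0]
    den = sum(s for s, _ in sims)
    return sum(s * r for s, r in sims) / den if den else 0.0

def rmse(predicted, actual):
    # Step 3: root mean squared error between predicted and actual scores.
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual))
                     / len(predicted))

active = [5, 3, 0, 4]                      # 0 marks the sparse (unknown) score
neighbors = [[4, 3, 5, 4], [5, 2, 4, 5]]
print(round(predict(active, neighbors, 2), 2))  # predicted score for item 2
```

With these toy ratings, the two neighbors are both highly similar to the active user, so the prediction for the unrated item falls between their scores of 5 and 4.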
CF suffers from the data sparsity problem, which occurs due to the high proportion of undetermined scores in the users' voting matrix. This problem is addressed using several prediction methods such as the neighbors-based baseline (Bell & Koren, 2007) and matrix factorization (Koren, 2009; Nguyen & Do, 2018). However, these factorization-based methods do not address temporal issues such as the drift in users' preferences and the popularity decay of products, which results in low prediction accuracy. One of the most effective approaches for solving the data sparsity issue is MF (Koenigstein, Dror & Koren, 2011; Al-Hadi et al., 2017b). A few MF methods use mathematical formulae to combine the hidden feedback of customers and products. The hidden feedback of customers, the hidden feedback of products, and the baseline properties are incorporated in the formulae. Equation (1) forecasts the missing scores in the rating matrix:

$\hat{r}_{ui} = \mu + B_u + B_i + p_u q_i^T,$  (1)

where $\hat{r}_{ui}$ is the predicted value for the sparse score, $\mu$ is the global mean of all rating scores, $p_u$ is the latent-feedback matrix of customers, $q_i^T$ is the transposed latent-feedback matrix of products, and $B_u$ and $B_i$ are the observed deviations of customer u and product i, respectively. To predict the sparse rating scores, $\mu$, $B_u$, $B_i$, $p_u$, and $q_i^T$ are integrated in numerous mathematical equations, such as those in temporal approaches (Koren, 2009; Ye & Eskenazi, 2014) and factorization methods (Al-Hadi et al., 2016; Han et al., 2018; Yuan, Zahir & Yang, 2019). For instance, the neighbors-based baseline method (Bell & Koren, 2007) combines the baseline factor with the distance between the rating scores and the baseline values of the neighbors who supplied rating scores for each product, as presented in Eq. (2):

$\hat{r}_{ui} = B_u + \frac{\sum_{x \in N_i} sim_x\,(r_{xi} - B_{xi})}{\sum_{x \in N_i} sim_x},$  (2)

where $sim_x$ is the similarity of customer x to the target customer, $N_i$ is the set of customers who rated product i, $r_{xi}$ is the rating score provided by user x for item i, and $B_{xi}$ is the baseline value.
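Eq. (1) can be sketched directly; the factor values below are invented for illustration, whereas in practice $\mu$, $B_u$, and $B_i$ come from the baseline model and $p_u$, $q_i$ from a factorization such as SVD.

```python
import numpy as np

# Hedged sketch of Eq. (1): r_hat = mu + B_u + B_i + p_u . q_i^T
mu = 3.6                                # global mean of all observed ratings
B_u, B_i = 0.3, -0.2                    # observed deviations of user u, item i
p_u = np.array([0.12, -0.40, 0.25])     # latent-feedback vector of user u
q_i = np.array([0.50, -0.10, 0.30])     # latent-feedback vector of item i

r_hat = mu + B_u + B_i + p_u @ q_i      # predicted value for the sparse cell
print(round(float(r_hat), 3))           # → 3.875
```

The dot product $p_u q_i^T$ contributes the personalized latent interaction on top of the global and per-user/per-item biases.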
Currently, temporal recommendation methods are used to suggest products to customers at an appropriate time. They are applied in many prediction techniques to make an accurate forecast, since time is an important factor in making final decisions.

Temporal-based approaches
Time is a very important factor in learning customers' interests and tracking products' popularity decay. Temporal preferences combined with matrix factorization have been used to develop efficient collaborative-based schemes that address the issues of sparsity, drift, and decay. For example, the temporal dynamics approach (Koren, 2009) utilizes the factorization factors, bins (static temporal periods), and a global weight to learn the temporal preferences and minimize the overfitted predicted scores. However, it neglects the fact that users' preferences change over time; hence, the learned overall weights are not accurate from a personalization standpoint.

The long-term preferences
Long-term preferences differ from short-term preferences with regard to how they are applied. The preferences recorded within a session (i.e., a week, month, or season) are considered short-term preferences. On the other hand, the baseline factors and the long-term preferences are exploited in the long-term approach (Ye & Eskenazi, 2014). This is expressed in Eq. (3), where $\tau_s$, $\tau_e$, and $\tau_{ui}$ are the first, last, and current time that a product i is rated by a customer u, respectively. This approach addresses the drift in customers' preferences over the long term, but it does not address the popularity decay of products:

$\hat{r}_{ui} = \mu + B_u\,\frac{\tau_e - \tau_{ui}}{\tau_e - \tau_s} + B_i\,\frac{\tau_{ui} - \tau_s}{\tau_e - \tau_s} + p_u q_i^T.$  (3)
The long temporal-based factorization approach (Al-Hadi et al., 2018b) extends this formulation with personalized long-term weights and cluster-based regularization, as shown in Eq. (4):

$\hat{r}_{ui} = \left(\mu + B_u\omega_u + B_i\omega_i + p_u q_i^T\right)^2 + G_{ix}\left[(B_u\omega_u)^2 + \|p_u\|^2 + (B_i\omega_i)^2 + \|q_i^T\|^2\right],$  (4)

where $G_{ix}$ is the weight of cluster x for item i that is updated by BFOA; $p_u$ and $\|p_u\|$ are the latent factor and the norm of the latent factor of customer u, while $q_i^T$ and $\|q_i^T\|$ are the latent factor and the norm of the latent factor of product i, respectively. Moreover, the personal long-term factors are defined by $\omega_u$ and $\omega_i$ in Eqs. (5) and (6), respectively:

$\omega_u = \exp\left(-\frac{\tau_e^u - \tau_s^u}{\tau_e^u}\right),$  (5)

$\omega_i = \exp\left(-\frac{\tau_e^i - \tau_s^i}{\tau_e^i}\right),$  (6)

where $\tau_e^u$ and $\tau_s^u$ are the last and first times customer u provided a rating score, and $\tau_s^i$ and $\tau_e^i$ are the first and last times the group of customers provided scores for product i, respectively. Nevertheless, the long temporal approaches (Al-Hadi et al., 2018b; Ye & Eskenazi, 2014) have not addressed issues such as drift and popularity decay by considering the short-term preferences, which lowers the prediction performance of the CF technique. The long temporal-based factorization approach (Al-Hadi et al., 2018b) learns the long-term preferences by integrating genres with factorization features to address the sparsity and decay issues. However, this approach falls short of incorporating the drift in customers' preferences, which lowers the prediction accuracy of CF.

The short-term preferences
The temporal dynamics approach (Koren, 2009) is used for predicting missing ratings by integrating the temporal weights with the different factorization factors. This approach minimizes the overfitted predicted scores during the optimization process using a global weight. However, it does not properly characterize personalized feedback. The short-term based latent method (Yang et al., 2012) learns the short-term preferences from the hidden feedback of neighbors' preferences during a session.
However, this approach is not a lasting solution, especially with respect to long-term preferences, drift, and popularity decay. Similarly, the temporal integration approach (Ye & Eskenazi, 2014) integrates the long and short-term preferences with the baseline features to solve the drift issue. This approach is also limited in personalization, in understanding the drift in users' preferences, and in handling items' popularity decay over time. The short-term based baseline (Ye & Eskenazi, 2014) incorporates the baseline values of the neighbors during a session with other factorization factors, as shown in Eq. (7):

$\hat{r}_{ui}(t) = B_{ui} + \frac{\sum_{j \in \nu(u,t)}\left[(r_{uj} - B_{uj})\,w_{ij}\right]}{\sqrt{|\nu(u,t)|}} + p_u q_i^T,$  (7)

where $w_{ij}$ is the applied weight that decreases the overfitted predicted values, $\nu(u,t)$ is the set of products rated by customer u during time interval t (e.g., the month of July), and $\sum_{j \in \nu(u,t)}[(r_{uj} - B_{uj})\,w_{ij}]$ represents the overall difference between the rating scores given by customer u for a set of products during time t and the baseline values. Given the high ratio of sparse values in the rating matrix, the short-term methods are not efficient in learning short-term preferences. The short temporal-based factorization method (Al-Hadi et al., 2017a) learns products' and customers' preferences to address the drift issue and improve the prediction accuracy of CF. However, product popularity decay is ignored in this approach. Short-term preferences are represented using the temporal convergence among the customers. These are exploited to minimize the overfitting of the predicted rating scores, as shown in Eq. (8), where $\gamma_x^u$ is the temporal weight that is optimized according to the location of cluster number x, which represents the short-term period. However, the prediction accuracy of CF decreases due to the inability of the short-term methods to cover the drift and decay problems during the period. As such, the long and short-term preferences must be integrated to address all the issues in the recommendation system.
$\hat{r}_{ui} = \left(\mu + \gamma_x^u B_u + B_i + p_u q_i^T\right)^2 + \gamma_x^u\left[B_u^2 + \|p_u\|^2 + B_i^2 + \|q_i^T\|^2\right].$  (8)

In summary, the existing temporal-based approaches have addressed several limitations of recommender systems such as sparsity (Zhang et al., 2020a; Idrissi & Zellou, 2020; Chu et al., 2020), the drift issue (Rabiu et al., 2020; Al-Hadi et al., 2017a), and the time decay issue (Koren, 2009; Ye & Eskenazi, 2014; Al-Hadi et al., 2018b). Each approach reviewed in this article has one or two research gaps, e.g., learning the personalized features, the drift preferences, or the popularity decay. There is currently no approach that considers all these issues (Table 1). Therefore, this work introduces the LTO approach for learning the features related to all these issues.

Table 1: Comparison of temporal-based approaches according to the solved issues.

Temporal-based approach                                    | Short-term | Long-term | Sparsity | Drift | Decay
Neighbors-based Baseline (Bell & Koren, 2007)              |            |           | X        |       |
Temporal Dynamics (Koren, 2009)                            | X          | X         | X        |       | X
Ensemble Divide and Conquer (Al-Hadi et al., 2016)         |            |           | X        |       |
Short-Term based Latent (Yang et al., 2012)                | X          |           |          |       |
Temporal Integration (Ye & Eskenazi, 2014)                 | X          | X         | X        |       |
Long Temporal-based Factorization (Al-Hadi et al., 2018b)  |            | X         | X        |       | X
Short Temporal-based Factorization (Al-Hadi et al., 2017a) | X          |           | X        | X     |

LATENT-BASED TEMPORAL OPTIMIZATION APPROACH
The LTO approach addresses both long and short temporal preferences by using factorization to solve the issues of preference drift and popularity decay (Algorithm 1). LTO applies the RMSE, Cosine, and prediction functions to assess the temporal preference representation. The key empirical settings of the temporal-based factorization method and the proposed solution framework are presented in Fig. 1. BFOA is exploited to capture the preferences of a short duration. By applying k-means, the timestamp convergence deals with short durations in the time matrix.
The number of clusters k is determined based on the number of short durations in the entire period. Generally, the bacteria cannot track the drift and the time decay perfectly during the short term without considering the long term. Therefore, integrating the long and short durations is the appropriate way to overcome the limitations related to drift and time decay. Figure 2 presents an example of how the bacteria members are created by applying the k-means method. In this example, the number of clusters is set to 2 for each of the users' and items' features. Based on the time convergence between the products (columns) and the customers (rows), four bacteria members are shown in Fig. 2. The standard BFOA is utilized in the LTO process to detect the temporal conduct of customers and products. The BFOA members initialize the short-term weights with random values. The LTO approach changes these weights dynamically throughout the lifecycle of BFOA based on their positive effect in the learning stages. The weights of the bacteria members $\gamma_x^u$ and $\gamma_y^i$ are updated dynamically throughout the learning iterations, which provides a novel way of tracking users' interests in the items. LTO uses Eq. (9) to reduce the overfitted predicted scores throughout the learning iterations:

$D_{ui} = \left(\mu + \gamma_x^u B_u\omega_u + \gamma_y^i B_i\omega_i + p_u q_i^T\right)^2,$  (9)

where $\gamma_x^u$ and $\gamma_y^i$ indicate the short temporal weights indexed by clusters x and y; user u is indexed by cluster x and item i is indexed by cluster y. The values of $\gamma_x^u$ and $\gamma_y^i$ are updated in each iteration according to their positive effect on the prediction accuracy of the CF method.
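The bacteria-member initialization illustrated in Fig. 2 can be sketched as follows. The timestamps, the one-dimensional k-means helper, and the weight layout are illustrative assumptions; the paper clusters a full time matrix, and the random weights stand in for a BFOA population.

```python
import numpy as np

def kmeans_1d(values, k, iters=20, seed=0):
    # Minimal 1-D k-means: assign each timestamp to its nearest center,
    # then recompute centers, for a fixed number of iterations.
    rng = np.random.default_rng(seed)
    centers = rng.choice(values, size=k, replace=False).astype(float)
    for _ in range(iters):
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = values[labels == c].mean()
    return labels

# Invented day indices: two users are active early, two late; one item is
# rated early, two late.
user_times = np.array([1.0, 2.0, 50.0, 52.0])
item_times = np.array([3.0, 49.0, 51.0])

user_cluster = kmeans_1d(user_times, 2)   # temporal index of users (k = 2)
item_cluster = kmeans_1d(item_times, 2)   # temporal index of items (k = 2)

# One short-term weight per (user-cluster, item-cluster) pair: the four
# "bacteria members" of the Fig. 2 example, randomly initialized as in BFOA.
weights = np.random.default_rng(0).uniform(0.0, 1.0, size=(2, 2))
print(user_cluster, item_cluster, weights.shape)
```

Each bacterium then perturbs its 2x2 weight grid during the BFOA lifecycle, and a user-item pair looks up the weight of its (x, y) cluster pair.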
Algorithm 1: Latent-based Temporal Optimization

Data preparation based on personalization:
  RatingMatrix, TimeMatrix ← assign the active user ← dataset
Learning the latent and baseline vectors:
  μ, B_u, B_i, p_u, q_i^T ← baseline model and SVD ← RatingMatrix
  ‖p_u‖ = sqrt(Σ_{u=1..m} p_u²), ‖q_i^T‖ = sqrt(Σ_{i=1..n} (q_i^T)²)
  ω_u = exp(−(τ_e^u − τ_s^u)/τ_e^u) ← τ_s^u, τ_e^u ← TimeMatrix
  ω_i = exp(−(τ_e^i − τ_s^i)/τ_e^i) ← τ_s^i, τ_e^i ← TimeMatrix
Assign the number of short durations:
  #D ← number of days ← total duration of the dataset
  #OneWeek ← #D/7, #TwoWeeks ← #D/14, #OneMonth ← #D/30, #OneSeason ← #D/90, #OneYear ← #D/365
Learning the short-term features of users (e.g., by #OneMonth):
  x ← #OneMonth
  temporal index of users ← k-means(TimeMatrix, x) → [γ_1^x, γ_2^x, ..., γ_u^x]
Learning the short-term features of items (e.g., by #OneMonth):
  y ← #OneMonth
  temporal index of items ← k-means(TimeMatrix, y) → [γ_1^y, γ_2^y, ..., γ_i^y]
Create the bacteria:
  combine the temporal features of users and items in one vector [γ_1^x, ..., γ_u^x, γ_1^y, ..., γ_i^y]
Assign the variables of the training process:
  assign RMSE_optimum and the iteration number; initialize random values for S bacteria
Repeat (predicting the sparse scores in RatingMatrix):
  updated bacteria ← train the short-term weights [γ_1^x, ..., γ_u^x, γ_1^y, ..., γ_i^y] using BFOA
  D_ui, F_u, G_i ← Eqs. (9), (10), (11) ← updated bacteria, latent and baseline vectors, RatingMatrix
  rating matrix with predictions ← r̂_ui = D_ui + F_u + G_i
  similarity values ← Cosine function ← CF technique
  predicted values for items ← prediction function ← similarity values
  RMSE value ← RMSE function ← predicted values for items and the scores of the active user
Until RMSE ≤ RMSE_optimum or the iteration loop is complete

Figure 1: Latent-based Temporal Optimization framework.
Figure 2: An example of bacteria members initialization.

The vectors $\omega_u$ and $\omega_i$ are the long-temporal independent weights of customer u and product i, while $B_u$, $B_i$, $p_u$, and $q_i^T$ represent the baseline and factorization variables. The second contribution of LTO is tracking users' drifting interests. This is learned by focusing on the time associated with users' interests, as represented by Eq. (10):

$F_u = \gamma_x^u\left[(B_u\omega_u)^2 + \|p_u\|^2\right],$  (10)

where $\|p_u\|^2$ represents the norm value of the user's latent factor and $\gamma_x^u$ is updated according to the positive effects of the changing users' interests throughout the learning process. The third contribution of LTO is tracking the popularity decay of items throughout the learning process by focusing on the time popularity of items, as shown in Eq. (11):

$G_i = \gamma_y^i\left[(B_i\omega_i)^2 + \|q_i^T\|^2\right],$  (11)

where $\|q_i^T\|^2$ is the norm factorization variable of items and $\gamma_y^i$ is updated according to the improvement achieved through the learning iterations, which affects the baseline values and the norm factorization features of items. Furthermore, BFOA learns the significance of each short-term period by applying the RMSE (which acts as the fitness value). These contributions are combined in Eq. (12) to predict the unknown values within the rating matrix:

$\hat{r}_{ui} = D_{ui} + F_u + G_i.$  (12)

The BFOA operates in three stages: chemotaxis, reproduction, and elimination and dispersal.
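Before the BFOA stages are detailed, Eqs. (9)-(12) can be sketched end to end. All values here are illustrative assumptions: the long-term weights follow Eqs. (5)-(6), while the short-term cluster weights are fixed numbers standing in for the BFOA-tuned values.

```python
import numpy as np

def long_term_weight(t_first, t_last):
    # Eqs. (5)-(6): exp(-(tau_e - tau_s) / tau_e)
    return np.exp(-(t_last - t_first) / t_last)

# Invented baseline and latent factors (see Eq. (1) for their meaning).
mu, B_u, B_i = 3.6, 0.3, -0.2
p_u = np.array([0.12, -0.40, 0.25])
q_i = np.array([0.50, -0.10, 0.30])

w_u = long_term_weight(t_first=10.0, t_last=200.0)   # user's rating span (days)
w_i = long_term_weight(t_first=50.0, t_last=180.0)   # item's rating span (days)
g_u, g_i = 0.8, 0.6   # short-term cluster weights (BFOA-tuned in LTO)

D = (mu + g_u * B_u * w_u + g_i * B_i * w_i + p_u @ q_i) ** 2   # Eq. (9)
F = g_u * ((B_u * w_u) ** 2 + p_u @ p_u)                        # Eq. (10): drift term
G = g_i * ((B_i * w_i) ** 2 + q_i @ q_i)                        # Eq. (11): decay term
r_hat = D + F + G                                               # Eq. (12)
print(round(float(r_hat), 3))
```

In the full approach, BFOA perturbs g_u and g_i (one weight per temporal cluster) and keeps the values that lower the RMSE fitness.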
The first stage involves the bacteria seeking the closest nutrition source. This is accomplished by swimming, tumbling, or alternating between swimming and tumbling to change direction during the generation. In this process, the flagella of the bacteria make clockwise rotations to choose another path so that rich nutrients in the surroundings can be obtained. The tumbling stage is expressed in Eq. (13):

$\tau^i(j+1,k,l) = \tau^i(j,k,l) + C_i\,\frac{\Delta_i}{\sqrt{\Delta_i^T\Delta_i}},$  (13)

where $\tau^i$ holds the short-term features of one bacterium; the variables i, j, k, and l symbolize the i-th bacterium at the j-th chemotactic, k-th reproduction, and l-th elimination and dispersal steps; $C_i$ is the walk length in an irregular direction; $\Delta_i$ is a random vector whose elements lie in [−1, 1]; and $\Delta_i/\sqrt{\Delta_i^T\Delta_i}$ is the unit walk in the irregular direction. Swimming follows tumbling whenever the flagella of the bacteria make a counterclockwise rotation to move in a particular direction. The bacteria continue swimming in the same direction if the nutrients are rich, and the alternation between tumbling and swimming is repeated until the chemotactic stage is complete. The swarming function utilizes a sensor to provide signals in a nutrient-rich environment. When the signal indicates poor nutrients or a dangerous location, the bacteria shift from the center to the outward direction in a moving ring of members. If the nutrient has a high level of succinate, the bacteria subsequently neglect the aspartate attractant and concentrate in groups. The bacteria provide an attraction signal to all members so that they swim together, moving in a concentric pattern with high density. The outward movement of the ring and the native releases of attractant constitute the spatial order (Kim & Abraham, 2007). The swarming stage is represented mathematically as shown in Eq. (14):

$J_{cc}\left(\tau^i(j+1,k,l)\right) = \sum_{i=1}^{S} -d_{attr}\exp\left(-w_{attr}\,\beta\right) + \sum_{i=1}^{S} h_{rep}\exp\left(-w_{rep}\,\beta\right),$  (14)
where S denotes the number of bacteria and β is the summation over the short-term features that can be learned by bacterium i, as shown in Eq. (15). The attractant depth $d_{attr}$ denotes the magnitude of excretion by a cell, the attractant width $w_{attr}$ denotes how the chemical cohesion signal spreads, and the repellent height $h_{rep}$ and width $w_{rep}$ determine the size of the optimization space where the cell is related to the dispersal of the chemical signal:

$\beta = \sum_{m=1}^{P}\left(\tau_m - \tau_m^i\right)^2,$  (15)

where P denotes the number of short-term features, $\tau_m$ denotes short-term feature number m learned during the chemotaxis process, while $\tau_m^i$ is short-term feature number m learned during the chemotaxis process by bacterium i. In the reproduction stage, the health of the bacteria is calculated according to the fitness value of each bacterium, and the bacteria are sorted in ascending order in an array. The fitness value, provided by RMSE, is extracted from the optimization area in the recommendation system (which is based on collaborative filtering). The lower half of the bacteria, with poor foraging, die, while each bacterium in the upper half (with better foraging) is copied into two parts with the same values (Al-Hadi et al., 2017a). This procedure keeps the bacterial population constant. The bacterial health can be calculated using Eq. (16):

$J_{health}^i = \sum_{j=1}^{N_c+1} J(i,j,k,l),$  (16)

where $J_{health}^i$ is the health score of the short-term preference that can be learned by bacterium i, and j, k, and l are the numbers of chemotactic, reproduction, and elimination and dispersal steps, respectively. The third stage (elimination and dispersal) provides for the possibility of the ever-changing status of a few bacteria. Here, a random vector is generated and the bacteria are arranged in ascending order, organized based on their health values.
Moreover, randomly generated locations are used to change the locations of the bacteria in the optimization domain. These locations are recognized as the most prominent available locations. After the generations, the best result in each repetition is accepted as the final (correct) result. In this work, the BFOA is integrated with the k-means clustering algorithm and the matrix factorization approach. The k-means clustering algorithm is used to control the big optimization space based on the members' personal features: it reduces the large number of members to a small number of clusters, and these clusters are then controlled using a weight for each cluster to control the optimization domain. Applying natural selection over the repeated generations, the BFOA decreases the number of poor foraging members and increases the number of rich foraging strategies (Al-Hadi, Hashim & Shamsuddin, 2011). After several generations, the poor foraging members are removed or transformed into skilled members.

EXPERIMENTAL SETTINGS
The CF technique is used to predict the interest of an active user, taking into account the calculated similarity values between the rating scores of the common users (neighbors) and the active user. However, sparse rating scores in the rating matrix negatively affect the prediction accuracy of the CF technique. Thus, this research work aims to improve the prediction of the sparse rating scores in the rating matrix of each active user; data sparsity is the key issue this research addresses. The factorization approaches (including temporal-based factorization) are used to predict the missing rating scores in the rating matrix, which improves the prediction accuracy of CF.
Prediction accuracy is measured with the RMSE function: lower RMSE values indicate more accurate predictions of the missing rating scores in the rating matrix, and hence a more accurate recommendation list for the active user.

Datasets

To demonstrate the performance of LTO, three real-world datasets are used: MovieLens, Netflix, and Epinions. Several experimental studies have utilized MovieLens [34], Netflix Prize [19], and Epinions [35] to evaluate the performance of recommendation systems. A brief description of the three datasets is given in Table 2. The customers of these datasets assign a rating score from 1 to 5 to a movie or product, where 1 to 2 indicate a disliked product and 3 to 5 indicate a liked product. In the concluding experiments, the sparsity level of each dataset is considered to show its effect on prediction performance. The sparsity level is computed by Eq. (17) (Abdelwahab et al., 2012), where #Rating is the number of scores provided by users (from 1 to 5) and #Total is the product of #Customers and #Products:

SparsityLevel = 1 - \frac{\#Rating}{\#Total}. \quad (17)

Normalization

Data normalization is a data transformation that reprocesses the data to enhance the precision and effectiveness of mining methods and distance calculations (Al-Hadi et al., 2016). In recommender systems, the scores of customers for products lie within [0–5]. However, this range may result in low prediction accuracy, so the rating scores are normalized to the range [0–1] to reduce the prediction error. Table 3 shows the original scores and the corresponding normalized scores.

k-means and BFOA setting

Table 4 shows the number of clusters for the k-means clustering method and the short-term periods for the three datasets.
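Eq. (17) and the normalization of Table 3 amount to two one-line computations. A minimal sketch (the function names are illustrative assumptions); on the MovieLens counts of Table 2 it yields a sparsity of roughly 0.94, matching the reported level up to rounding:

```python
def sparsity_level(n_ratings, n_customers, n_products):
    """Eq. (17): 1 - #Rating / #Total, where #Total = #Customers * #Products."""
    return 1 - n_ratings / (n_customers * n_products)

def normalize(score):
    """Map an original rating in [0-5] to the [0-1] scale of Table 3."""
    return score / 5.0
```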
MovieLens contains data for four periods (one day, one week, two weeks, and one month), while Netflix contains an additional two periods (one season and one year). For instance, the entire period of Netflix is about 2190 days, which can be divided by 90 days to obtain 24 seasons. Here, the cluster number (24) is assigned to represent the users' activities throughout the temporal convergence of seasons: the k-means algorithm divides the activities of users in the time matrix into 24 clusters, and similarly the interest-time for items in the time matrix is divided into 24 clusters. However, when the number of clusters is greater than the number of customers in some rating matrices (e.g., the two-week period), Netflix is not appropriate for grouping by the k-means algorithm. The periods of one month and one season are applied for Epinions; the one-week period is not considered, as it is not suitable for the Epinions temporal feature (Al-Hadi et al., 2018b).

Table 2: Experimental datasets.

|                 | MovieLens | Netflix Prize | Epinions Trustlet |
|-----------------|-----------|---------------|-------------------|
| #Customers      | 943       | 480,189       | 4,718             |
| #Products       | 1,682     | 17,770        | 36,165            |
| #Rating         | 100,000   | 100,480,507   | 346,035           |
| Sparsity level  | 0.93      | 0.98          | 0.99              |
| Date            | 1997–1998 | 1999–2005     | 1999              |
| #Periods        | 7 months  | 6 years       | 11 months         |
| Temporal vector | #seconds  | #days         | #months           |
| Products        | Movie     | Movie         | Product           |

Table 3: The normalization of the rating scores.

| Type       | Range | Rating scores            |
|------------|-------|--------------------------|
| Original   | [0–5] | 0, 1, 2, 3, 4, 5         |
| Normalized | [0–1] | 0, 0.2, 0.4, 0.6, 0.8, 1 |

Table 4: The k-means clustering method and short-term periods for different datasets.

| Dataset   | #Days | Period     | #Clusters (k) | Successful clustering |
|-----------|-------|------------|---------------|-----------------------|
| MovieLens | 210   | One month  | 7             | X                     |
|           |       | Two weeks  | 15            | X                     |
|           |       | One week   | 30            | X                     |
|           |       | One day    | 210           |                       |
| Epinions  | 330   | One month  | 11            | X                     |
|           |       | Two weeks  | 23            | X                     |
|           |       | One week   | 47            |                       |
| Netflix   | 2190  | One year   | 6             | X                     |
|           |       | One season | 24            | X                     |
|           |       | One month  | 73            | X                     |
|           |       | Two weeks  | 156           |                       |
The two-week period for Netflix is likewise inappropriate for the k-means clustering algorithm (Al-Hadi et al., 2017a). Therefore, three temporal periods are used for the Netflix dataset (one year, one season, and one month). The BFOA factors and their values are determined according to empirical tuning of the LTO approach and are listed in Table 5. In addition, a value is selected from the numbers of clusters using the P parameter, as shown in Table 4.

Table 5: The parameter values of BFOA.

| Parameter                 | Value | Parameter                            | Value |
|---------------------------|-------|--------------------------------------|-------|
| No. of bacteria groups S  | 6     | Elimination-dispersal steps l        | 4     |
| Length of a swim          | 4     | w_rep                                | 5     |
| Run length unit C_i       | 0.1   | w_attr                               | 0.2   |
| No. of iterations         | 20    | Probability of elimination-dispersal | 0.25  |
| Optimum RMSE              | 0.01  | d_attr                               | 0.1   |
| No. of chemotactic steps j | 6    | h_rep                                | 0.1   |
| Reproduction steps k      | 4     |                                      |       |

RMSE value reduction

The LTO approach extracts the short-term features by deploying BFOA, which uses the temporal convergence in the time matrix of a short duration. For instance, the number of months in each active user's time matrix is used to divide the time matrix into k weighted clusters. The BFOA trains the clusters' weights to reduce the overfitted predicted scores according to the smallest RMSE, and the weights of a short duration are optimized based on their positive effects on the factorization factors. In this way, the prediction accuracy of the CF technique is improved. The factorization factors are extracted from the rating matrix, and fixed values are provided for all iterations of the optimization process. The swarming action of bacteria provides sensor values that are integrated with the RMSE to guide the bacteria toward rich nutrients or away from detrimental areas. Short-term periods are determined from the timestamps of all preferences in each experimental dataset, and their effects are shown under two scoring scales.
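The mapping from a short-term period to a cluster count k in Table 4 is simply the dataset's time span divided into equal period-length slices. A sketch under that reading (the helper name is an assumption):

```python
def clusters_for_period(total_days, period_days):
    """Number of k-means clusters for a short-term period: the dataset's
    time span divided into equal period-length slices (Table 4)."""
    return total_days // period_days
```

For example, Netflix's 2190 days divided into 90-day seasons gives the 24 clusters cited in the text, and MovieLens's 210 days divided into 30-day months gives 7.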
Table 6 gives an example of drift learning based on minimizing overfitted values. In this example, each row contains an active user's ID and the RMSE values for 10 iterations. The lowest RMSE value is selected by ensemble selection and saved in the last column of the table, and the optimum temporal weights are saved according to the selected RMSE value. The average RMSE values over the 10 iterations (column by column) are shown in the last row; the value in the last cell is the lowest average RMSE of the test-set members (0.875), which is used for comparing the prediction performance of LTO with the benchmark schemes.

LTO learns the temporal weight of each user from the activity of the users who rated the set of items over the long term; similarly, it learns the temporal weight of each item from the activity of the set of users who rated that item over the long term. The temporal weights of the long duration are incorporated in the baseline model to determine the interests of customers and the popularity of products. In addition, the short-duration weights are learned by the LTO approach through minimized overfitting. LTO learns the drift and time-decay features during the optimization process, and it improves the performance of the CF technique throughout the iteration loop by learning accurate predictions of the sparse rating scores in the rating matrix, which reduces the RMSE values. In the next section, the effect of the LTO approach in learning the temporal features is examined under the scoring scales [0–5] and [0–1].

Table 6: An example of how LTO reduces RMSE values in MovieLens under scoring [0–5].
| User      | Iter. 1 | 2     | 3     | 4     | 5     | 6     | 7     | 8     | 9     | 10    | Min RMSE |
|-----------|---------|-------|-------|-------|-------|-------|-------|-------|-------|-------|----------|
| 1         | 0.827   | 0.814 | 0.801 | 0.797 | 0.793 | 0.786 | 0.782 | 0.782 | 0.780 | 0.777 | 0.777    |
| 2         | 0.796   | 0.758 | 0.758 | 0.755 | 0.751 | 0.751 | 0.752 | 0.750 | 0.751 | 0.754 | 0.750    |
| 3         | 1.274   | 1.255 | 1.245 | 1.231 | 1.231 | 1.229 | 1.243 | 1.226 | 1.229 | 1.222 | 1.222    |
| 4         | 0.704   | 0.645 | 0.640 | 0.635 | 0.632 | 0.629 | 0.628 | 0.631 | 0.627 | 0.628 | 0.627    |
| 5         | 1.128   | 1.116 | 1.058 | 1.036 | 1.018 | 1.018 | 1.009 | 1.005 | 1.002 | 1.000 | 1.000    |
| Avg. RMSE | 0.946   | 0.918 | 0.900 | 0.891 | 0.885 | 0.882 | 0.883 | 0.879 | 0.878 | 0.876 | 0.875    |

Table 7: Average personal vectors of the test-set matrices.

|                                       | MovieLens | Netflix | Epinions |
|---------------------------------------|-----------|---------|----------|
| Number of rating matrices             | 31        | 20      | 5        |
| Avg. no. of customers                 | 915       | 953     | 107      |
| Avg. no. of products                  | 105       | 128     | 103      |
| Avg. no. of rating scores > 0         | 17,491    | 11,662  | 517      |
| Avg. no. of total rating scores >= 0  | 97,855    | 125,183 | 10,481   |
| Avg. sparsity level                   | 79.18%    | 89.12%  | 94.59%   |

EXPERIMENTAL RESULTS

This section discusses the performance of the benchmark approaches and the proposed approach in improving the CF prediction performance under the two scoring scales [0–5] and [0–1]. The efficacy of LTO in resolving the decay and drift issues by reducing the RMSE values is also discussed. Table 7 shows the personal vectors of the test-set matrices that can affect the experimental results. For MovieLens, 31 rating matrices are selected for the test set by the sequence 30, 60, 90, ..., 930. Each matrix in the test set has different numbers of rows and columns, and the sparsity levels differ between matrices, so each rating matrix yields unique results. The averages over these matrices and their factors are used for performance evaluation in the experiments.

LTO approach under the scoring [0–5]

The LTO approach is applied to five short-term periods (a week, two weeks, a month, a season, and a year), according to the tested dataset, under the rating scale [0–5]. Figures 3–5 demonstrate the prediction accuracy over the iterations on the datasets.
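The ensemble selection that fills the last column and last row of Table 6 can be sketched as follows (a minimal sketch; the dict-based interface is an assumption). Run on the five users of Table 6, it reproduces the reported average of 0.875:

```python
def ensemble_select(rmse_per_user):
    """Keep, for each user, the lowest RMSE across iterations (the
    'Min RMSE' column of Table 6) and average these minima to get the
    figure used for comparison with the benchmarks."""
    minima = {user: min(vals) for user, vals in rmse_per_user.items()}
    return minima, sum(minima.values()) / len(minima)
```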
Figure 3 shows that the two-week period in MovieLens yields higher prediction accuracy during iterations 3–20 than one week and one month. Users' activities during the long- and short-duration preferences are best learned within the two-week period, which makes it an accurate short-term preference. In Fig. 4, the one-month period in the Epinions dataset yields higher prediction accuracy during iterations 4–20 than the one-season period; here, one month is the accurate short-term period, with the best short- and long-term performance compared to one season.

[Figure 3: LTO prediction accuracy improvement of CF for MovieLens.]

The one-year period in Netflix provides greater prediction performance from iterations 1 to 5 in Fig. 5. Across iterations 5 to 20, the one-year period is more precise than one month but less precise than one season. The temporal attitudes are learned by the LTO approach over 20 iterations, which helps to achieve precise predictions by optimizing the temporal weights of each duration. The periods of Netflix are 6 years, 24 seasons, or 73 months. Experimental results show that LTO with the one-season period achieved the highest prediction performance compared to the year and month periods. This is because the duration of a season is intermediate between a month and a year, and various customers' activities are performed therein.

LTO approach under the scoring [0–1]

This subsection demonstrates the effect of normalization on the performance of LTO in reducing the RMSE under the scoring [0–1]. Figures 6, 7 and 8 track the effects of the temporal vectors in improving the prediction accuracy of CF using the LTO approach.
Figure 6 indicates that the RMSE of MovieLens for one week is better than that for one month under the rating scale [0–1]. Additionally, the RMSE for the two-week period is the best compared to those of one week and one month, which emphasizes the significance of the two-week period in learning the drift of customers' interests and the time decay of products' popularity. Figure 7 shows the prediction accuracy using Epinions: the one-month period has a significant effect on reducing the RMSE in iterations 3–20 compared to the one-season period. In Fig. 8, the effect of a season using Netflix is equivalent to the effect of a year but greater than the effect of a month during iterations 1 to 13; during iterations 13 to 20, the season gives the sharpest prediction accuracy compared to the one-year and one-month periods.

[Figure 4: LTO prediction accuracy improvement of CF for Epinions.]
[Figure 5: LTO prediction accuracy improvement of CF for Netflix.]

Figures 3–8 show the potential of BFOA in learning the temporal features by swarming in the dimensional time-space. The effects of the equivalent time periods show that the customers' interests and the products' popularity changed during these periods. The proposed approach is evaluated against the current factorization and temporal approaches in the next subsection.

[Figure 6: LTO prediction accuracy improvement of CF for MovieLens.]
[Figure 7: LTO improves the prediction accuracy of CF for Epinions.]
[Figure 8: LTO prediction accuracy improvement of CF for Netflix.]

Comparison of the performances of CF, MF, and Temporal-based approaches

In this subsection, the LTO approach is evaluated by comparing its effectiveness in reducing RMSE values against the benchmark approaches. Both LTO and the benchmarks are used to predict the sparse scores, and all of them lower the RMSE values; note that the lower the RMSE value, the higher the prediction accuracy of the CF approach. In Table 8, seven approaches are implemented to benchmark the prediction performance of LTO. The improvement in the prediction performance of the CF technique is reported for the two scoring scales [0–5] and [0–1]. First, the scores from 0 to 5 are provided by the users of the three experimental datasets, and the benchmarks are categorized into three parts. The first part contains the prediction accuracy (RMSE) of the CF technique for the rating matrix of the active user without predicting the sparse rating scores. The second part contains the prediction accuracy of CF with two factorization approaches: the Neighbors-based Baseline (Bell & Koren, 2007) and the Ensemble Divide and Conquer (Al-Hadi et al., 2016). These approaches are used to solve the sparsity issue as well as to learn accurate factorization features. From the evaluations, the prediction performance of Ensemble Divide and Conquer is better than that of CF and the Neighbors-based Baseline. However, the approaches in the second category are weak at learning the overfitted predicted scores and at handling temporal issues (such as drift and decay).
For the third category, five temporal approaches are used to address five issues: sparsity, accurate learning of latent features, overfitting, drift, and decay. Temporal Dynamics (Koren, 2009) has good prediction performance on these issues but is weak at learning personalized features because it uses equal time slices. Temporal Integration performs better on Netflix than Temporal Dynamics and the Short-Term based Latent approach; however, Temporal Integration is weak with respect to drift and decay. Short Temporal-based Factorization (Al-Hadi et al., 2017a) addresses all issues except popularity decay, and it improves the prediction performance of the CF technique compared to the above approaches on MovieLens and Netflix. Its performance is lower than that of the Ensemble Divide and Conquer approach on the Epinions dataset because the recorded timestamps are registered using the number of months only, which represents weak temporal features.

Table 8: The RMSE of several prediction approaches using three datasets.
| Approach | MovieLens [0–5] | Epinions [0–5] | Netflix [0–5] | MovieLens [0–1] | Epinions [0–1] | Netflix [0–1] |
|---|---|---|---|---|---|---|
| CF | 0.9573 | 1.0536 | 0.9983 | 0.1915 | 0.2107 | 0.1997 |
| Neighbors-based Baseline (Bell & Koren, 2007) | 0.9613 | 1.0562 | 0.9982 | 0.1923 | 0.2112 | 0.1996 |
| Ensemble Divide and Conquer (Al-Hadi et al., 2016) | 0.9481 | 1.0351 | 0.973 | 0.1896 | 0.2069 | 0.1948 |
| Temporal Dynamics (Koren, 2009) | 0.9514 | 1.0486 | 1.0173 | 0.1903 | 0.2097 | 0.2035 |
| Short-Term based Latent (Yang et al., 2012) | 0.9613 | 1.0562 | 0.9982 | 0.1923 | 0.2110 | 0.1996 |
| Temporal Integration (Ye & Eskenazi, 2014) | 0.9557 | 1.0563 | 0.9982 | 0.1912 | 0.2112 | 0.1996 |
| Short Temporal-based Factorization (Al-Hadi et al., 2017a) | 0.8716 | 1.0492 | 0.9704 | 0.1771 | 0.2088 | 0.1900 |
| LTO | 0.7933 | 0.9887 | 0.8136 | 0.1642 | 0.199 | 0.1564 |

Distinctively, the LTO approach addresses all the issues, including the limitations of the benchmark schemes. It improves the prediction performance of the CF technique through a combination of various factorization and temporal features, and it tracks the drift of users and the decay of items throughout the learning process. Table 8 shows that LTO exhibits superior prediction performance compared to all benchmarks. The normalization of the rating scores reduced the RMSE values by almost 80% due to the percentage difference between the rating scales [0–5] and [0–1]. For example, the percentage difference of the LTO approach on MovieLens is calculated using Eq. (18):

PercentageDifference = \frac{Scale_1 - Scale_2}{Scale_1}, \quad (18)

where Scale_1 is the RMSE under the scoring [0–5] and Scale_2 is the RMSE under the scoring [0–1]. The percentage difference between 0.7933 and 0.1642 is 79.3%. Figure 9 indicates the high prediction accuracy achieved by the LTO approach for the three datasets compared with the benchmark methods; the graphs also show the positive impact of normalization in reducing the RMSE by around 80%.
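The Eq. (18) computation can be sketched in one line (the function name is an assumption); it reproduces the 79.3% reduction reported for LTO on MovieLens:

```python
def percentage_difference(rmse_scale1, rmse_scale2):
    """Eq. (18): relative RMSE reduction between the [0-5] scale
    (rmse_scale1) and the [0-1] scale (rmse_scale2)."""
    return (rmse_scale1 - rmse_scale2) / rmse_scale1
```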
Comparison of the output prediction scores of CF and LTO

The CF technique utilizes the similarity function to calculate the similarity between the active user and the common users (neighbors). In the second stage, CF utilizes the prediction function, based on the similarity values, to recommend items to the active user. However, CF's predicted scores are not accurate because of the sparse values in the rating matrix; this is solved using the LTO approach. Table 9 shows an example of the improved accuracy of the predicted scores achieved by the LTO approach; in this example, the short duration is one year. Active users rate items from 1 to 5, where 1 and 2 indicate disliked items and 3, 4, and 5 indicate liked items. CF predicts rating scores from 2.4 to 3.3, which labels items as recommended or not recommended for the active user (denoted R and N, respectively, in Table 9). In contrast, the LTO approach predicts rating scores from 0.4 to 5.0, which provides more accurate predictions than CF. Figure 10 visualizes the output in Table 9 and indicates the high prediction performance of the LTO approach.

[Figure 9: The normalization effects on the prediction accuracy of CF. LTO provides the highest prediction accuracy for CF under scoring [0–1].]

Table 9: Feedback prediction scores by the CF and LTO approaches.

| Item | i1 | i2 | i3 | i4 | i5 | i6 | i7 | i8 | i9 | i10 | i11 | i12 | i13 | i14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Active user | 1 | 4 | 1 | 3 | 1 | 2 | 2 | 1 | 5 | 2 | 5 | 2 | 5 | 2 |
| CF | 2.9 | 2.9 | 2.9 | 3.1 | 3.3 | 2.7 | 2.8 | 2.5 | 2.9 | 2.4 | 2.9 | 2.4 | 2.9 | 2.9 |
| CF (R/N) | R | R | R | R | R | R | R | R | R | N | R | N | R | R |
| LTO | 0.4 | 4.3 | 2.4 | 2.3 | 2.2 | 2.5 | 1.8 | 2.4 | 3.9 | 2.3 | 5 | 3 | 3.7 | 3.3 |
| LTO (R/N) | N | R | N | N | N | R | N | N | R | N | R | R | R | R |

[Figure 10: Feedback prediction scores by the CF and LTO approaches.]
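The R/N labelling in Table 9 is consistent with a simple midpoint cutoff: since ratings of 1–2 are disliked and 3–5 liked, a threshold of 2.5 (an assumption inferred from the table, not stated by the authors) reproduces both the CF and LTO rows:

```python
def recommend_labels(predicted, threshold=2.5):
    """Label each predicted score 'R' (recommend) when it reaches the
    threshold, else 'N'; the 2.5 midpoint is an assumption inferred
    from Table 9."""
    return ['R' if score >= threshold else 'N' for score in predicted]
```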
CONCLUSION

The performance of CF is affected by several factors, including changes in customers' tastes, time decay in the popularity of products, data sparsity, and overfitting in the predicted rating scores. Prior research attempted to enhance CF's prediction function by integrating the long- and short-term preferences via the temporal interaction method (Ye & Eskenazi, 2014) with the factorization factors, but the resulting improvement was modest. The goal of the long temporal-based factorization approach (Al-Hadi et al., 2018b) is to solve the popularity-decay problem and capture the drifting tastes of clients over the long term, while the main focus of the short temporal-based factorization approach is to model the behaviors of customers and solve the drift issue in the short term. Nonetheless, both have limitations in predicting popularity decay as well as the drift in customers' preferences over time. To address these problems, the LTO approach presented in this paper integrates both short- and long-term preferences. It utilizes k-means and the BFOA method, deriving the fitness value by combining the swarming signal and RMSE values; the swarming function represents the short-term preferences based on the sensitivity of bacteria to rich nutrients or dangerous signals. According to the empirical findings, the LTO approach achieves higher prediction precision than the benchmark approaches. This is attributed to the temporal-based factorization approach and its ability to enhance the accuracy of the CF technique by learning the temporal behaviors in both long and short preferences.
Possible extensions of this work include integrating the LTO approach with other factorization features, such as neighbors' latent feedback. This would help address issues such as the cold start when recommending new items to active users. In addition, the genre features of movies could be integrated with the factors utilized by the LTO approach to address the new-item challenge in the MovieLens and Yahoo! Music datasets.

ADDITIONAL INFORMATION AND DECLARATIONS

Funding
This publication is funded by the Asian Office of Airforce Research and Development (AOARD) through a project on Deep Recurrent Q Learning for Recommendation System. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Grant Disclosures
The following grant information was disclosed by the authors: The Asian Office of Airforce Research and Development (AOARD) through a project on Deep Recurrent Q Learning for Recommendation System.

Competing Interests
The authors declare there are no competing interests.

Author Contributions
• Ismail Ahmed Al-Qasem Al-Hadi conceived and designed the experiments, performed the experiments, analyzed the data, performed the computation work, prepared figures and/or tables, authored or reviewed drafts of the paper, developed the proposed solutions, and approved the final draft.
• Nurfadhlina Mohd Sharef conceived and designed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, supervised and led the project, and approved the final draft.
• Md Nasir Sulaiman and Norwati Mustapha conceived and designed the experiments, analyzed the data, authored or reviewed drafts of the paper, and approved the final draft.
• Mehrbakhsh Nilashi analyzed the data, authored or reviewed drafts of the paper, and approved the final draft.

Data Availability
The following information was supplied regarding data availability: Codes are available in the Supplemental Files.

Supplemental Information
Supplemental information for this article can be found online at http://dx.doi.org/10.7717/peerj-cs.331#supplemental-information.

REFERENCES

Abdelwahab A, Sekiya H, Matsuba I, Horiuchi Y, Kuroiwa S. 2012. Feature optimization approach for improving the collaborative filtering performance using particle swarm optimization. Journal of Computational Information Systems 8(1):435–450.

Al-Badarneh Amer, Ali Mostafa SMG. 2016. An improved classifier for Arabic text. Journal of Convergence Information Technology 11:69–84.

Al-Hadi IAA-Q, Hashim SZM, Shamsuddin SMH. 2011. Bacterial Foraging Optimization Algorithm for neural network learning enhancement. In: 2011 11th international conference on hybrid intelligent systems (HIS). Piscataway: IEEE, 200–205.

Al-Hadi IAA-Q, Sharef NM, Nasir SM, Norwati M. 2018a. Temporal based factorization approach for solving drift and decay in sparse scoring matrix. In: International conference on soft computing and data mining. Cham: Springer, 340–350.

Al-Hadi IAA-Q, Sharef NM, Sulaiman MN, Mustapha N. 2016. Ensemble divide and conquer approach to solve the rating scores' deviation in recommendation system. Journal of Computational Science 12(6):265–275 DOI 10.3844/jcssp.2016.265.275.

Al-Hadi IAA-Q, Sharef NM, Sulaiman MN, Mustapha N. 2017a. Bacterial foraging optimization algorithm with temporal features to solve data sparsity in recommendation system. In: Proceedings of the second international conference on internet of things, data and cloud computing. 1–6.

Al-Hadi IAA-Q, Sharef NM, Sulaiman MN, Mustapha N. 2017b. Review of the temporal recommendation system with matrix factorization.
International Journal of Innovative Computing, Information and Control 13(5):1579–1594.

Al-Hadi IAA-Q, Sharef NM, Sulaiman MN, Mustapha N. 2018b. Temporal-based approach to solve item decay problem in recommendation system. Advanced Science Letters 24(2):1421–1426 DOI 10.1166/asl.2018.10762.

Alhijawi B, Kilani Y. 2020. A collaborative filtering recommender system using genetic algorithm. Information Processing & Management 57(6):102310 DOI 10.1016/j.ipm.2020.102310.

Bell RM, Koren Y. 2007. Lessons from the Netflix prize challenge. ACM SIGKDD Explorations Newsletter 9(2):75–79 DOI 10.1145/1345448.1345465.

Chu PM, Mao Y-S, Lee S-J, Hou C-L. 2020. Leveraging user comments for recommendation in e-commerce. Applied Sciences 10(7):1–18 DOI 10.3390/app10072540.

Han H, Huang M, Zhang Y, Bhatti UA. 2018. An extended-tag-induced matrix factorization technique for recommender systems. Information 9(6):143 DOI 10.3390/info9060143.

Idrissi N, Zellou A. 2020. A systematic literature review of sparsity issues in recommender systems. Social Network Analysis and Mining 10(1):15 DOI 10.1007/s13278-020-0626-2.

Jonnalagedda N, Gauch S, Labille K, Alfarhood S. 2016. Incorporating popularity in a personalized news recommender system. PeerJ Computer Science 2:e63 DOI 10.7717/peerj-cs.63.

Kim D-H, Abraham A. 2007. A hybrid genetic algorithm and bacterial foraging approach for global optimization and robust tuning of PID controller with disturbance rejection. In: Hybrid evolutionary algorithms. Berlin, Heidelberg: Springer, 171–199.

Koenigstein N, Dror G, Koren Y. 2011. Yahoo!
music recommendations: modeling music ratings with temporal dynamics and item taxonomy. In: Proceedings of the fifth ACM conference on recommender systems. New York: ACM, 165–172.

Koren Y. 2008. Factorization meets the neighborhood: a multifaceted collaborative filtering model. In: Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining. New York: ACM, 426–434.

Koren Y. 2009. Collaborative filtering with temporal dynamics. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining. New York: ACM, 447–456.

Li F, Xu G, Cao L. 2016. Two-level matrix factorization for recommender systems. Neural Computing and Applications 27(8):2267–2278 DOI 10.1007/s00521-015-2060-3.

Li G, Chi M. 2018. Expert CF: solving data matrix sparsity and computation complexity problems. Transactions on Machine Learning and Artificial Intelligence 6(2):36–36.

Lin J, Li Y, Lian J. 2020. A novel recommendation system via L0-regularized convex optimization. Neural Computing and Applications 32(6):1649–1663 DOI 10.1007/s00521-019-04213-w.

Mirbakhsh N, Ling CX. 2013. Clustering-based factorized collaborative filtering. In: Proceedings of the 7th ACM conference on recommender systems. New York: ACM, 315–318.

Nguyen L, Do M-PT. 2018. A novel collaborative filtering algorithm by bit mining frequent itemsets. PeerJ Preprints 6:e26444v1.
Nilashi M, Ahani A, Esfahani MD, Yadegaridehkordi E, Samad S, Ibrahim O, Sharef NM, Akbari E. 2019. Preference learning for eco-friendly hotels recommendation: a multi-criteria collaborative filtering approach. Journal of Cleaner Production 215:767–783 DOI 10.1016/j.jclepro.2019.01.012.

Nilashi M, bin Ibrahim O, Ithnin N. 2014. Hybrid recommendation approaches for multi-criteria collaborative filtering. Expert Systems with Applications 41(8):3879–3900 DOI 10.1016/j.eswa.2013.12.023.

Nilashi M, Jannach D, bin Ibrahim O, Ithnin N. 2015. Clustering- and regression-based multi-criteria collaborative filtering with incremental updates. Information Sciences 293:235–250 DOI 10.1016/j.ins.2014.09.012.

Rabiu I, Salim N, Da'u A, Osman A. 2020. Recommender system based on temporal models: a systematic review. Applied Sciences 10(7):2204 DOI 10.3390/app10072204.

Sardianos C, Ballas Papadatos G, Varlamis I. 2019. Optimizing parallel collaborative filtering approaches for improving recommendation systems performance. Information 10(5):155 DOI 10.3390/info10050155.

Vo ND, Hong M, Jung JJ. 2020. Implicit stochastic gradient descent method for cross-domain recommendation system. Sensors 20(9):2510 DOI 10.3390/s20092510.

Wang J, Han P, Miao Y, Zhang F. 2019. A collaborative filtering algorithm based on SVD and trust factor. In: 2019 international conference on computer, network, communication and information systems (CNCI 2019). Atlantis Press.

Wu X, Yuan X, Duan C, Wu J. 2019.
A novel collaborative filtering algorithm of machine learning by integrating restricted Boltzmann machine and trust information. Neural Computing and Applications 31(9):4685–4692 DOI 10.1007/s00521-018-3509-y.

Yang D, Chen T, Zhang W, Yu Y. 2012. Collaborative filtering with short term preferences mining. In: Proceedings of the 35th international ACM SIGIR conference on research and development in information retrieval. New York: ACM, 1043–1044.

Ye F, Eskenazi J. 2014. Feature-based matrix factorization via long- and short-term interaction. In: Knowledge engineering and management. Berlin: Springer, 473–484.

Yuan Y, Zahir A, Yang J. 2019. Modeling implicit trust in matrix factorization-based collaborative filtering. Applied Sciences 9(20):4378 DOI 10.3390/app9204378.

Zainal N, Al-Hadi IAA-Q, Ghaleb SM, Hussain H, Ismail W, Aldailamy AY. 2020. Predicting MIRA patients' performance using virtual rehabilitation programme by decision tree modelling. In: Recent advances in intelligent systems and smart applications. Cham: Springer, 451–462.

Zhang F, Qi S, Liu Q, Mao M, Zeng A. 2020a. Alleviating the data sparsity problem of recommender systems by clustering nodes in bipartite networks. Expert Systems with Applications 149:1–10.

Zhang L, Wei Q, Zhang L, Wang B, Ho W-H. 2020b. Diversity balancing for two-stage collaborative filtering in recommender systems. Applied Sciences 10(4):1257 DOI 10.3390/app10041257.