key: cord-0028662-4b0kdtmf
authors: Hao, Qingbo; Zhu, Ke; Wang, Chundong; Wang, Peng; Mo, Xiuliang; Liu, Zhen
title: CFDIL: a context-aware feature deep interaction learning for app recommendation
date: 2022-03-16
journal: Soft comput
DOI: 10.1007/s00500-022-06925-z
sha: c47c066abbe4488b330d7ca0c389e78a15bbf2d1
doc_id: 28662
cord_uid: 4b0kdtmf

The rapid development of the mobile Internet has spawned a wide variety of mobile applications (apps). The large number of apps makes it difficult for users to choose apps conveniently, causing the app overload problem. As the most effective tool to solve the problem of app overload, app recommendation has attracted extensive attention from researchers. Traditional recommendation methods usually use historical usage data to explore users’ preferences and then make recommendations. Although traditional methods have achieved some success, the performance of app recommendation still needs to be improved for the following two reasons. On the one hand, it is difficult to construct recommendation models from sparse user–app interaction data. On the other hand, contextual information has a large impact on users’ preferences, but it is often overlooked by traditional methods. To overcome these problems, we propose a context-aware feature deep interaction learning (CFDIL) method that explores users’ preferences and performs app recommendation by learning potential user–app relationships in different contexts. The novelty of CFDIL is as follows: (1) CFDIL incorporates contextual features into the modeling of users’ preferences by constructing novel user and app feature portraits. (2) The problem of data sparsity is effectively solved by the use of dense user and app feature portraits, as well as tensor operations on the label sets. (3) CFDIL trains a new deep network structure, which can make accurate app recommendations using the contextual information and attribute information of users and apps. We applied CFDIL to three real datasets and conducted extensive experiments, which show that CFDIL outperforms the benchmark methods.

With the vigorous development of the mobile Internet, the number of mobile applications (apps) has increased dramatically, which provides great convenience to people's work and daily life. According to statistics, as of 2020, more than 3.3 million and 2.1 million apps had been published on Google Play and the App Store, respectively. Facing this huge number of apps, users are often unable to find the apps that really meet their needs, i.e., users cannot cope with app overload. Therefore, it is of great significance to help users accurately select apps that meet their personal needs.

As the most effective tool to solve the overload problem, recommender systems are widely used in news media, online shopping and social networking sites. Traditional recommendation methods leverage users' historical usage information to explore their preferences and then make recommendations. Current mainstream methods use not only users' historical feedback on apps, such as ratings, comments and frequency of use, but also additional information about users or apps, such as users' own attributes and apps' version and category information. Building the recommendation model on this information can further improve recommendation performance.

In recent years, deep neural network techniques have achieved great success in computer vision and natural language processing. Many researchers have attempted to capture the relationships between users and items using deep neural networks to improve the performance of recommendation models. Although progress has been made, most deep recommendation models mine users' preferences only through user-item interactions. The effects of spatiotemporal information on users' preferences have not been considered, i.e., the spatiotemporal regularities of users and apps have not been studied sufficiently.

In summary, existing recommendation methods have the following two problems: (1) Context information has a strong influence on users' selection of apps. However, existing methods consider only additional information about users and apps, not context information. (2) In the real world, user-app interaction data are very sparse, and it is difficult to build a recommendation model from sparse data.

In order to address these challenges, we propose a context-aware feature deep interaction learning method for app recommendation, called CFDIL.

CFDIL constructs feature portraits of users and apps, respectively, using the context information and attribute information of users or apps. These feature portraits describe the attributes and interaction features of users or apps, which helps in mining users' preferences in specific contexts. By introducing a convolutional neural network and a factorization machine to extract in-depth features of users and apps, CFDIL builds a deep learning framework on these feature portraits to explore the potential interaction features of users and apps in specific contexts. Moreover, the feature portraits enrich the characterization data of users and apps, and CFDIL processes the sparse label set using tensor decomposition, which avoids the adverse effects of sparse data on our model and improves the performance of the recommendation method. The main contributions of this paper are as follows.

1. CFDIL proposes a method to construct contextual features of users and apps, and integrates the contextual features with the features of users or apps to form a feature portrait, respectively. Feature portraits can accurately describe the characteristics of users, apps and user-app interactions. Incorporating contextual information into the feature portrait provides powerful information support for mining users' preferences when using apps in specific contexts.

2. CFDIL trains a novel deep network framework for feature portraits. By introducing convolutional neural networks (CNN) and a factorization machine (FM), CFDIL effectively extracts the deep features of users and apps to explore their potential interactions under specific contextual conditions, which can provide more accurate recommendations for users.

3. The feature portraits of users and apps effectively enrich the representation data of users and apps, and make the model input denser. At the same time, CFDIL uses a tensor factorization technique to process the label set for model back-propagation and weight updates. The use of dense feature portraits and labels effectively avoids the adverse effects of sparse data on model learning.

4. We deployed CFDIL on three real datasets and carried out a large number of experiments. The experimental results show that CFDIL achieves state-of-the-art performance.

In this section, we introduce the related work of this paper. First, we introduce the classification of traditional app recommendation methods and then introduce the app recommendation methods based on deep learning.

The traditional app recommendation methods mainly include the following categories.

-Collaborative filtering-based recommendation method (CF-based method). The CF-based method is a similarity-oriented recommendation method. It is based on the assumption that a target user has preferences similar to those of users with similar historical item experiences. Therefore, the most important part of the CF-based method is calculating the similarity between users or apps. The basic approach is to calculate the similarity of users or apps from the user-app interaction matrix. For instance, Kim et al. (2013) identify the most similar social members of target users based on semantic relations between apps, and then make app recommendations. Yankov et al. (2013) identify and analyze the relationships between apps in the app ecosystem. Xia et al. (2014) leverage app description text to calculate the similarity between apps, and then make app recommendations. Similar to Xia et al. (2014), Hao et al. (2016) also use app descriptions to calculate the similarity between apps. Liu and Wu (2016) leverage users' app usage logs to design a latent factor-based collaborative filtering method. Hu et al. (2018) leverage the idea of user-based collaborative filtering to make recommendations. The CF-based recommendation method has been widely used in many fields, including app recommendation, because of its simple logic and easy implementation. However, in the real world, the user-app interaction matrix is extremely sparse and therefore cannot adequately reflect users' preferences, which leads to poor performance. To address the problem of data sparsity in app recommendation, model-based app recommendation methods have been proposed.

-Model-based app recommendation method. The representative method among model-based recommendation methods is the matrix factorization-based method (MF-based method). The basic idea of this method is as follows. First, the MF-based method constructs an original user-app interaction matrix. The elements of the original matrix are the feedback of a specific user on a specific app, such as rating or usage frequency. This matrix is sparse, like the user-app matrix in the CF-based method. Then, the sparse original user-app matrix is processed into a non-empty matrix by a matrix factorization technique. The elements of this non-empty matrix are considered to be a user's preference for an app. For example, Liu et al. (2013) give an introduction to MF-based app recommendation. Lin et al. (2014) leverage probabilistic matrix factorization to explore users' preferences. Zhu et al. (2014a) leverage the latent Dirichlet allocation (LDA) model to map the contextual information of interactions between users and apps into a low-dimensional space and then make recommendations. This method works in the same way as MF-based recommendation. Yao et al. (2017) construct a user-app version rating matrix, and use matrix factorization to explore users' preferences. Although the MF-based method can resist the adverse effects of data sparsity on recommendation, the low-dimensional vectors generated during matrix factorization have no clear physical meaning, which leads to poor interpretability. This also leads to a lack of personalization in the MF-based method. At the same time, because apps are easy to develop, new apps are constantly released and usually need to be added to the user-app interaction matrix for retraining, which limits the scalability of the MF-based method.

-The additional information-based recommendation method. There is a great deal of additional information in the interactions between users and apps, for example, the contextual information of users using an app (Zhu et al. 2014a, b; Pu et al. 2018; Wang et al. 2016), user comments on apps (Zheng et al. 2014; Fu et al. 2013), app version information (Yao et al. 2017), app permission information (Liu et al. 2015) and app description information (Chen et al. 2015). This rich additional information can effectively supplement the user-app interaction matrix, so as to improve the performance of app recommendation.

In recent years, deep learning has achieved great success in image recognition, natural language processing, speech recognition and other fields because of its excellent nonlinear expression ability, which allows it to automatically learn the potential relationships among features. Deep learning emphasizes learning from massive data, which addresses the difficulty that traditional machine learning algorithms have in dealing with high-dimensional, heterogeneous and noisy data. At the same time, researchers have also explored the application of deep learning in recommender systems. For example, because deep learning has a powerful ability to mine potential interaction features, Cheng et al. (2016), Guo et al. (2017) and Shan et al. (2016) build on the idea of feature interaction to explore combinations of different features between users and items, in order to mine users' preferences. These methods differ as follows. Cheng et al. (2016) explore the influence of combining depth and width between users and apps on target users' choices. Guo et al. (2017) use a factorization machine to mine the low-order interaction information between user and app features, and use a DNN to mine the deep interaction information between features, so as to mine users' preferences. Unlike the above two methods, Shan et al. (2016) directly embed the features of users and apps and feed them to an MLP network, which removes much of the manual feature engineering. Harada et al. (2019) propose CNCF to recommend game apps to users, which leverages contextual information to enhance recommendation performance. Kim et al. (2016) leverage a CNN model to extract users' preferences from their comments on items. Xu et al. (2019) incorporate contextual information into a deep learning model to explore users' preferences. Bobadilla et al. (2020) propose a deep neural architecture based on classification. Notably, its collaborative filtering method can be generalized to most existing recommender systems. Liang et al. (2020) explore the features of user-app interactions, modeling the interactions of features from different views through an attention mechanism.

In this section, we give the general definition of the problem to be studied in this paper at first, and then give the motivation of CFDIL.

The goal of app recommendation is to recommend to target users apps that meet their preferences under a specific contextual condition. Without loss of generality, for a given user set $U$ and an app set $A$, our task is to generate an app recommendation list $R_u$ for each user $u \in U$, where $R_u = \{a \mid u, a \in A, u \in U\}$. Each app $a$ in the recommendation list $R_u$ conforms to the preferences of user $u$.

The recommendation is based on the probability that the target user prefers an app under a specific context. Therefore, to recommend apps to a target user $u$, we first calculate the preference probability of $u$ for every app in $A$ under the specific context conditions, and then select the top-k apps with the highest preference probability for recommendation. In this process, predicting the target user's preference probability for a given app is particularly critical, because it directly determines the accuracy of the recommendation.

Next, we give the motivation of the proposed CFDIL.

Most existing app recommendation methods use user-app interaction vectors to explore users' preferences, and then make recommendations. By examining real datasets, we found that user-app interactions are highly clustered in the spatiotemporal dimensions. However, existing recommendation methods often ignore this point, so we integrate spatiotemporal information into the feature matrix to improve the accuracy of recommendation. In this paper, we refer to the combination of time and location information as contextual information.

We visualize the spatial characteristics of user-app interactions in a given time slot as a heat map, and use a scatter chart to show their temporal characteristics. Figure 1(a) shows the heat map of the number of times a user used apps in a certain time slot, and Fig. 1(b) shows the heat map of the number of times an app was used in a certain time slot. Figure 2(a) shows the number of times a user used apps in different time slots, and Fig. 2(b) shows the number of times an app was used in different time slots. It is easy to see that the temporal and spatial interactions between users and apps are always clustered, which indicates that contextual information is strongly related to users and apps. We present several examples to explain context information. A user's daily life is regular in the long term, whether he/she is a worker, retiree or teenager. Such regularity is reflected in the contexts in which users use apps. For example, a worker often signs in with an app at the workplace and uses food-ordering apps at lunchtime; a retiree often reads news in an app after breakfast. An app, in turn, always appears in contexts that accord with its function, i.e., apps also have their regularities. For example, food-ordering apps are often used at lunchtime, and Metro apps are often used in subway stations. Therefore, we construct feature portraits for users and apps to capture their temporal and spatial regularities. We believe that introducing context information into our model will help improve the accuracy of preference prediction.

The existing app recommendation methods are based on user-app interaction vector for preference modeling. It should be noted that the user-app interaction information in the real world is extremely sparse, and the recommendation model is difficult to effectively explore user preferences by using sparse data. Inspired by the above discussion, the incorporation of contextual information can enrich and supplement the feature matrices of users and apps to alleviate the problem of data sparseness. Based on the aforementioned consideration, we proposed a context-aware feature deep interaction learning (CFDIL) method to improve the app recommendation performance. 

Table 1 Notations and descriptions
$P^a_t$: The feature portrait of app $a$ in time slot $t$
$Lab_{u,a,t}$: Number of times user $u$ interacts with app $a$ in time slot $t$
$F^a_t$: The field set of $P^a_t$
$R_u$: The app recommendation list for user $u$

Fig. 3 The overall framework of CFDIL. CFDIL consists of four main parts: information extraction and integration, label processing, multi-order interaction learning of feature portraits, making app recommendation

To facilitate the clear presentation of this paper, we give the notations and descriptions of the symbols used in CFDIL, as shown in Table 1 .

In this section, we first introduce the framework of CFDIL and then introduce each module of CFDIL in detail.

The overall flow of CFDIL is shown in Fig. 3 , which consists of four main parts.

1. Information extraction and integration. In this part, CFDIL first extracts and constructs the feature portraits of users and apps, which both contain two parts: (1) the own attribute information of users and apps and (2) the contextual information of users and apps (spatial characteristic in a certain time slot).

2. Label processing. CFDIL constructs a three-dimensional tensor with user, app and time as coordinates. The elements of the tensor are the numbers of interactions between users and apps in a certain time slot. To further explore users' preferences for untouched apps and to solve the problem of label data sparsity, this part performs tensor factorization on the labels.

3. Multi-order interaction learning of feature portraits. CFDIL constructs new networks from a factorization machine (FM) and a convolutional neural network (CNN) to learn multi-order and deep potential interaction features of users and apps. Multi-order learning is performed on the user and app feature portraits to effectively explore users' preferences.

4. Making app recommendation. Based on the trained model, CFDIL can accurately predict target users' preferences in specific contexts and complete app recommendations.

Next, we will describe the key technologies used in each stage in detail.

This part is to construct users and apps feature portraits, and mainly contains two sub-steps, information extraction and information integration. Figure 4 shows the overall construction process of app portraits and user portraits. Next, we describe these two steps in detail, respectively.

We mainly extract two kinds of information from users and apps: attribute information and contextual information. The detailed extraction method is as follows.

1. According to Chen et al. (2015), the attribute information of users and apps has a great impact on users' preferences. Therefore, we select users' attribute information (gender, age and device model) and apps' attribute information (category and developer), a total of five kinds of attribute information. In order to facilitate the training of the subsequent model, we represent the attribute information of users and apps as vectors of the same dimension and length. In this paper, we map the attribute information of users (Table 3) or apps (Table 4) into an attribute vector $VA \in \mathbb{R}^{200 \times 1}$, respectively.

2. In addition to their own attribute information, we also extract the contextual information of users and apps. The contextual information mainly refers to time and location information, which can be found in Table 5. We use the following steps to extract the contextual information of users and apps.

-We first split the day into 7 time periods $T$ = {00:00-06:00, 06:00-09:00, 09:00-11:00, 11:00-13:00, 13:00-17:00, 17:00-19:00, 19:00-24:00}. We split the geographic region that contains all interactions between users and apps into a matrix $VC \in \mathbb{R}^{200 \times 199}$ according to longitude and latitude, i.e., we map this geographic region into a matrix of 200 × 199 geographic cells.

-For each app and user in a certain time slot, we construct an app contextual matrix and a user contextual matrix, respectively. For an app, we count the historical number of times the app has been used by all users in each geographic cell, and fill the value into the corresponding cell of $VC$. For a user, we use the same method to construct the user contextual matrix. The difference is that the elements of the user contextual matrix are the historical numbers of times the user used apps.
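The counting procedure above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the `bounds` bounding box, and the choice of latitude for rows and longitude for columns are all assumptions, since the text only fixes the 7 time slots and the 200 × 199 grid.

```python
import numpy as np

# Upper bounds (hour of day) of the 7 time slots listed in the text.
SLOT_ENDS = [6, 9, 11, 13, 17, 19, 24]
GRID_ROWS, GRID_COLS = 200, 199  # shape of the contextual matrix


def time_slot(hour):
    """Return the index (0..6) of the time slot containing `hour` (0..23)."""
    for idx, end in enumerate(SLOT_ENDS):
        if hour < end:
            return idx
    raise ValueError("hour must be in [0, 24)")


def grid_cell(lon, lat, bounds):
    """Map a coordinate to a (row, col) cell.

    `bounds` is a hypothetical (lon_min, lon_max, lat_min, lat_max) box
    covering all interactions; rows follow latitude and columns follow
    longitude (an assumption made for this sketch).
    """
    lon_min, lon_max, lat_min, lat_max = bounds
    row = int((lat - lat_min) / (lat_max - lat_min) * GRID_ROWS)
    col = int((lon - lon_min) / (lon_max - lon_min) * GRID_COLS)
    # Points on the upper boundary fall into the last cell.
    return min(row, GRID_ROWS - 1), min(col, GRID_COLS - 1)


def contextual_matrix(logs, slot, bounds):
    """Count usage events per geographic cell for one time slot.

    `logs` is an iterable of (hour, lon, lat) tuples that has already been
    filtered to one user (user matrix) or one app (app matrix).
    """
    vc = np.zeros((GRID_ROWS, GRID_COLS))
    for hour, lon, lat in logs:
        if time_slot(hour) == slot:
            vc[grid_cell(lon, lat, bounds)] += 1
    return vc
```

Filtering the logs by user gives a user contextual matrix, and filtering by app gives an app contextual matrix; the counting itself is identical in both cases.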

For each app and user in each time slot, we integrate the attribute information and contextual information of users and apps, respectively, to construct the portrait of apps and users.

For an app $a$, we concatenate the app's contextual matrix $VC$ with its attribute vector $VA$ to form an app feature portrait $P^a_t \in \mathbb{R}^{200 \times 200}$, $t \in T$, $a \in A$, as shown in Fig. 4. Similarly, for a user $u$, we use the user's contextual matrix $VC$ and the user's attribute vector $VA$ to construct the user's feature portrait $P^u_t \in \mathbb{R}^{200 \times 200}$, $t \in T$, $u \in U$.
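The shape arithmetic of this integration step (a 200 × 199 contextual matrix plus a 200 × 1 attribute vector giving a 200 × 200 portrait) can be checked with a short sketch. The helper name `build_portrait` is hypothetical, and placing the attribute vector as the last column is an assumption; the text does not say on which side it is attached.

```python
import numpy as np


def build_portrait(v_c, v_a):
    """Concatenate a contextual matrix and an attribute vector column-wise.

    Assumes the attribute vector becomes the last column of the portrait.
    """
    if v_c.shape[0] != v_a.shape[0]:
        raise ValueError("contextual matrix and attribute vector need equal row counts")
    return np.hstack([v_c, v_a])


# Toy inputs with the dimensions given in the text.
rng = np.random.default_rng(0)
v_c = rng.poisson(0.05, size=(200, 199)).astype(float)  # sparse usage counts
v_a = rng.random((200, 1))                              # encoded attributes
portrait = build_portrait(v_c, v_a)                     # shape (200, 200)
```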

The data used to train the app recommendation model are triples of the form

$$\{P^u_t, P^a_t, Lab_{u,a,t}\}$$

where $P^u_t$ is the feature portrait of user $u$ in time slot $t$, $P^a_t$ is the feature portrait of app $a$ in time slot $t$, and $Lab_{u,a,t}$ is the historical number of interactions between user $u$ and app $a$ in $t$. Generally speaking, $Lab_{u,a,t}$ is used as the label for model training. However, there are two problems with using $Lab_{u,a,t}$ directly to train the app recommendation model:

1. $Lab_{u,a,t}$ only represents the selection result of user $u$ in a given context, and is not equivalent to the user's preference. For example, $Lab_{u,a,t} = 0$ means that $u$ did not use $a$ during $t$, but this does not mean that $u$ does not like $a$, because user $u$ may not be aware of the existence of app $a$.

2. Out of a large number of apps, people use only a few. The vast majority of the $Lab_{u,a,t}$ values are 0, i.e., the label data are extremely sparse. Sparse label data prevent the recommendation model from fully perceiving the positive feedback labels, which weakens the generalization ability of the model.

To solve the above two problems, we leverage tensor factorization technique to process Lab u,a,t . The detailed method is as follows:

1. We construct a tensor $LA$ with user, app and time slot as coordinates. The elements of $LA$ are $Lab_{u,a,t}$, the historical number of times user $u$ interacts with app $a$ in time slot $t$.

2. We use the same method as Zhu et al. (2021) to decompose $LA$. The principle of the decomposition process is shown in Fig. 5. After decomposition, we obtain a new tensor $LA^*$ with the same size as the original tensor $LA$. The difference is that the data in $LA^*$ are non-sparse.

3. We use the elements of tensor $LA^*$ to update the value of $Lab_{u,a,t}$, and then use the updated value of $Lab_{u,a,t}$ to represent $u$'s preference label for $a$ in $t$.
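To make the idea of densifying a sparse label tensor concrete, the sketch below applies a generic CP (CANDECOMP/PARAFAC) decomposition fitted by alternating least squares. This technique is swapped in for illustration only: the paper follows Zhu et al. (2021), whose procedure is not described in this text and may differ. The function names, the rank, and the iteration count are assumptions.

```python
import numpy as np


def khatri_rao(u, v):
    """Column-wise Kronecker product; rows are indexed by (row of u, row of v)."""
    return (u[:, None, :] * v[None, :, :]).reshape(-1, u.shape[1])


def cp_als(x, rank, n_iter=50, seed=0):
    """Fit a rank-`rank` CP model to a 3-way tensor and return its dense reconstruction."""
    rng = np.random.default_rng(seed)
    n_u, n_a, n_t = x.shape
    a = rng.random((n_u, rank))  # user factors
    b = rng.random((n_a, rank))  # app factors
    c = rng.random((n_t, rank))  # time-slot factors
    # Unfold the tensor along each of its three modes.
    x1 = x.reshape(n_u, -1)
    x2 = x.transpose(1, 0, 2).reshape(n_a, -1)
    x3 = x.transpose(2, 0, 1).reshape(n_t, -1)
    for _ in range(n_iter):
        # Each update is a least-squares solve with the other two factors fixed.
        a = x1 @ khatri_rao(b, c) @ np.linalg.pinv((b.T @ b) * (c.T @ c))
        b = x2 @ khatri_rao(a, c) @ np.linalg.pinv((a.T @ a) * (c.T @ c))
        c = x3 @ khatri_rao(a, b) @ np.linalg.pinv((a.T @ a) * (b.T @ b))
    return np.einsum("ir,jr,kr->ijk", a, b, c)


# Toy label tensor: 4 users, 5 apps, 7 time slots, three observed interactions.
lab = np.zeros((4, 5, 7))
lab[0, 1, 3] = 5
lab[2, 1, 3] = 2
lab[2, 4, 0] = 1
lab_star = cp_als(lab, rank=2)  # same shape as `lab`, used as updated labels
```

In this sketch, `lab_star[u, a, t]` plays the role of the updated label for the triple (u, a, t).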

CFDIL constructs user portraits $P^u_t$ and app portraits $P^a_t$, and processes the sparse label data $Lab_{u,a,t}$. In this part, we describe in detail how CFDIL uses $P^u_t$, $P^a_t$ and $Lab_{u,a,t}$ for multi-order interactive learning. CFDIL mainly contains two parts, the FM part and the CNN part. The overall framework of CFDIL is shown in Fig. 6.

CFDIL uses FM to learn low-order feature interactions between users and apps, as shown in the FM part in Fig. 6.

Fig. 6 Details about multi-order interaction learning of feature portraits. CFDIL uses FM to learn low-order interaction features between users and apps, and uses CNN to learn the higher-order interaction features between users and apps

The input data of FM are two-tuples formally expressed as $\{F^u_t \cup F^a_t, Lab_{u,a,t}\}$, where $F^u_t$ and $F^a_t$ are the field sets of $P^u_t$ and $P^a_t$, $N$ is the number of fields of $P^u_t$, and $M$ is the number of fields of $P^a_t$. In this paper, $N$ is 4 and $F^u_t$ is {gender, age, device model, user contextual matrix}; $M$ is 3 and $F^a_t$ is {category, developer, app contextual matrix}. $Lab_{u,a,t}$ is the label of $F^u_t \cup F^a_t$, which is obtained in Sect. 4.3. We use an objective function to learn the low-order interaction features of users and apps, which is expressed as Eq. 1:

$$\hat{y}_f = w_0 + \sum_{i=1}^{M+N} w_i x_i + \sum_{i=1}^{M+N} \sum_{j=i+1}^{M+N} \langle V_i, V_j \rangle x_i x_j \quad (1)$$

where $x_i$ and $x_j$ are the fields of the app and user portraits, $w_0 + \sum_{i=1}^{M+N} w_i x_i$ is the linear regression part, and $\sum_{i=1}^{M+N} \sum_{j=i+1}^{M+N} \langle V_i, V_j \rangle x_i x_j$ describes the second-order combinatorial features, which are used to learn the interaction features between users and apps in different contexts. Since the FM algorithm uses all univariate and pairwise feature interactions, it can effectively learn the low-order feature interactions between users and apps. In Eq. 1, $\langle \cdot, \cdot \rangle$ denotes the dot product of two vectors of dimension $k$, which is expressed as Eq. 2:

$$\langle V_i, V_j \rangle = \sum_{f=1}^{k} v_{i,f} v_{j,f} \quad (2)$$

where $v_{i,f}$ and $v_{j,f}$ are the elements of $V_i$ and $V_j$, respectively, and $k$ is the dimension of the implicit parameter. In order to reduce the time complexity of the model, the quadratic term of FM can be simplified according to Eq. 3:

$$\sum_{i=1}^{M+N} \sum_{j=i+1}^{M+N} \langle V_i, V_j \rangle x_i x_j = \frac{1}{2} \sum_{f=1}^{k} \left[ \left( \sum_{i=1}^{M+N} v_{i,f} x_i \right)^2 - \sum_{i=1}^{M+N} v_{i,f}^2 x_i^2 \right] \quad (3)$$

i is a constant value, so we only need the nonzero terms of x i to train the objective function of Eq. 3. Therefore, the time complexity of the objective function of FM is O(k(M + N )), and FM model can quickly extract the low-order interactive features.
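The linear-time claim can be checked numerically. The sketch below computes the FM second-order term twice, once with the explicit double sum over feature pairs and once with the standard linear-time identity for factorization machines, and confirms that both agree. It treats each field as a single scalar feature value, which is a simplification made for this example; the function names and toy sizes are assumptions.

```python
import numpy as np


def fm_pairwise_naive(x, v):
    """Second-order FM term as an explicit sum over all pairs i < j."""
    n = len(x)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += float(v[i] @ v[j]) * x[i] * x[j]
    return total


def fm_pairwise_fast(x, v):
    """Same term in O(k * n) time using the sum-of-squares identity."""
    s = v.T @ x                    # shape (k,): sum_i v[i, f] * x[i]
    s_sq = (v ** 2).T @ (x ** 2)   # shape (k,): sum_i v[i, f]^2 * x[i]^2
    return 0.5 * float(np.sum(s ** 2 - s_sq))


def fm_predict(x, w0, w, v):
    """FM output: bias + linear part + pairwise interactions."""
    return w0 + float(w @ x) + fm_pairwise_fast(x, v)


# Toy check with M + N = 7 fields and k = 4 latent dimensions.
rng = np.random.default_rng(0)
x = rng.random(7)
v = rng.normal(size=(7, 4))
assert np.isclose(fm_pairwise_naive(x, v), fm_pairwise_fast(x, v))
```

The fast form needs only one pass over the features per latent dimension, which is where the O(k(M + N)) cost comes from.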

We transform the recommendation task into estimating the probability of a target user's preference for an app, i.e., into a regression problem, so we select the squared loss function $loss(\hat{y}_f, y_f) = (\hat{y}_f - y_f)^2$, where $\hat{y}_f$ is the FM prediction and $y_f$ is the label.

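In addition, we use stochastic gradient descent to train the parameters, and the gradient formula of the parameters is shown in Eq. 4:

$$\frac{\partial \hat{y}_f}{\partial \theta} = \begin{cases} 1, & \theta = w_0 \\ x_i, & \theta = w_i \\ x_i \sum_{j=1}^{M+N} v_{j,f} x_j - v_{i,f} x_i^2, & \theta = v_{i,f} \end{cases} \quad (4)$$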

In summary, we choose FM to explore the low-order interactive features between users and apps, because FM has the following advantages. (1) The complexity of the FM model is linear, so it can be trained efficiently on sparse user-app interaction data. (2) The unique structure of FM makes the interactive learning of low-order features between users and apps more reasonable.

The CNN model is mainly used to learn the higher-order interaction features of user portraits $P^u_t$ and app portraits $P^a_t$, as shown in the CNN part in Fig. 6. We choose CNN to extract the high-order interactive features between users and apps, because CNN has the following three advantages. (1) The data of $P^u_t$ and $P^a_t$ are in matrix format, the same as single-channel image data, and CNNs have a great advantage in processing image-format data. (2) CNN can effectively explore deep features in image-format data by using convolutional operations. (3) CNN uses a shared convolution kernel mechanism to reduce the complexity of model training. As a result, CNN can efficiently process the high-dimensional user and app portrait data.

The input data of CNN are two-tuples consisting of the stacked portraits and the label $Lab_{u,a,t}$, where the stacked portraits are formed by vertically concatenating $P^u_t$ and $P^a_t$ from the same time slot $t$. $Lab_{u,a,t}$ is the label of the stacked portraits, and its value is the one described in Sect. 4.3.

The input layer of CNN is the start of the interactive feature learning of $P^u_t$ and $P^a_t$. The weights are learned through the hidden layers, and the nonlinear discrimination ability of the network is enhanced with the help of the activation function. By learning parameters and passing information layer by layer, CNN can effectively learn the high-order interactive features of $P^u_t$ and $P^a_t$. The main structure of the CNN used in our experiments is:

Input layer -> convolutional layer -> activation layer -> ... -> pooling layer -> ... -> fully connected layer.

It should be noted that we perform an average pooling operation near the middle convolutional layers, which reduces the features by half. The learning process of CNN is shown in Eq. 5:

$$x^{(l+1)} = \sigma\left(W^{(l)} x^{(l)} + b^{(l)}\right) \quad (5)$$

where $l$ denotes the $l$th layer of the neural network, $x^{(l)}$ is the input of the $l$th layer, $W^{(l)}$ and $b^{(l)}$ are the weights and biases of the $l$th layer, $\sigma$ is the activation function, and $x^{(l+1)}$ is the output of the $l$th layer, which is also used as the input of the $(l+1)$th layer. In order to avoid inefficient error back-propagation and the vanishing gradient problem, ReLU is used as the activation function in our model. Since we treat the user-app recommendation problem as a regression problem, we adopt the mean square error (MSE) as the loss function of the convolutional neural network, expressed by Eq. 6:

$$loss(\hat{y}_c, y_c) = \frac{1}{n} \sum_{i=1}^{n} \left(\hat{y}_{c,i} - y_{c,i}\right)^2 \quad (6)$$

where $\hat{y}_{c,i}$ is the CNN prediction for the $i$th training sample, $y_{c,i}$ is the corresponding label, and $n$ is the number of training samples.
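The layer sequence described above (stacked input, convolution, ReLU, average pooling, fully connected output, MSE loss) can be illustrated with a small NumPy forward pass. This is a sketch under stated assumptions, not the network used in the paper: the single 3 × 3 kernel, the single convolutional layer, the toy 6 × 6 portraits and all function names are chosen only for the example, and the paper tunes layer counts and kernel sizes on the validation set.

```python
import numpy as np


def conv2d_valid(x, kernel):
    """Single-channel 'valid' cross-correlation of image x with kernel."""
    kh, kw = kernel.shape
    h, w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out


def relu(x):
    return np.maximum(x, 0.0)


def avg_pool2(x):
    """2 x 2 average pooling; halves each spatial dimension."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def cnn_forward(p_user, p_app, kernel, w_fc, b_fc):
    """Stack the two portraits vertically and run conv -> ReLU -> pool -> FC."""
    x = np.vstack([p_user, p_app])
    x = avg_pool2(relu(conv2d_valid(x, kernel)))
    return float(w_fc @ x.ravel() + b_fc)


def mse(pred, true):
    """Mean squared error between predictions and labels."""
    pred, true = np.asarray(pred, dtype=float), np.asarray(true, dtype=float)
    return float(np.mean((pred - true) ** 2))


# Toy run: 6 x 6 portraits -> 12 x 6 input -> 10 x 4 feature map -> 5 x 2 pooled.
rng = np.random.default_rng(0)
p_user, p_app = rng.random((6, 6)), rng.random((6, 6))
score = cnn_forward(p_user, p_app, rng.normal(size=(3, 3)), rng.normal(size=10), 0.0)
```

With the 200 × 200 portraits from the text, the stacked input would be 400 × 200; a practical implementation would use a deep learning framework rather than explicit Python loops.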

CFDIL uses FM to explore the low-order interaction features of users and apps from the feature fields of $\{F^u_t \cup F^a_t, Lab_{u,a,t}\}$, and uses CNN to explore the high-order interaction features of users and apps from the vertically stacked portraits $P^u_t$ and $P^a_t$ with label $Lab_{u,a,t}$. We take the weighted sum of the learning results of FM and CNN according to Eq. 7 as the final probability result of CFDIL:

$$\hat{y} = \lambda_0 \hat{y}_f + \lambda_1 \hat{y}_c \quad (7)$$

where $\hat{y}_f$ and $\hat{y}_c$ are the outputs of FM and CNN, respectively, and $\lambda_0$ and $\lambda_1$ are model parameters satisfying $\lambda_0 + \lambda_1 = 1$.

In the subsequent experiments in this paper, setting the values of $\lambda_0$ and $\lambda_1$ to 0.65 and 0.35, respectively, gives the best recommendation performance. After the CFDIL model training is completed, we follow the steps below to recommend apps to users in specific contexts. (1) We construct a feature portrait $P^u_t$ for a target user $u$ in time slot $t$. (2) We construct feature portraits $P^a_t$ of all candidate apps in time slot $t$. (3) We input $P^u_t$ and $P^a_t$ into the trained CFDIL network and obtain the recommendation probability of each app for user $u$. (4) We sort all candidate apps according to the recommendation probability. The top $N$ apps in this ranking form the final recommendation list.
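The fusion and ranking steps can be sketched in a few lines. The function names and the toy scores are hypothetical, and assigning 0.65 to the FM output and 0.35 to the CNN output is an assumption based on the order in which the two parts are mentioned; the text does not state which weight belongs to which part.

```python
def fuse(y_fm, y_cnn, w_fm=0.65, w_cnn=0.35):
    """Weighted sum of the FM and CNN outputs; the weights must sum to 1."""
    if abs(w_fm + w_cnn - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return w_fm * y_fm + w_cnn * y_cnn


def recommend_top_n(scores, n):
    """Rank candidate apps by fused score and keep the best n.

    `scores` maps an app id to a (y_fm, y_cnn) pair for one user and time slot.
    """
    fused = {app: fuse(y_fm, y_cnn) for app, (y_fm, y_cnn) in scores.items()}
    return sorted(fused, key=fused.get, reverse=True)[:n]


# Toy example with three candidate apps.
candidates = {"a": (0.9, 0.1), "b": (0.2, 0.9), "c": (0.5, 0.5)}
top2 = recommend_top_n(candidates, 2)
```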

In this section, we deploy CFDIL on real-world datasets to verify its recommendation performance. First, we give the default experimental settings and then evaluate the performance of CFDIL from different perspectives.

In this section, we first introduce the dataset of our experiment, and then give the evaluation metrics. Finally, we introduce the baseline methods for comparison.

The experimental data are log files of users using apps in real-world scenarios in three cities: Beijing, Shanghai and Guangzhou. The datasets contain a total of 8198 users, 2671 apps, and 405,837 user-app usage log records. Each app is used at least 10 times, and each user has at least 10 app usage logs. Each user has an average of 49.5 logs. The detailed information of the dataset is shown in Table 2. For example, a user (User_id: 2007, Gender: Male, Age: 23, Device: XiaoMi) uses a game app (App_id: 147, Category: Game, Developer: Tencent) in a specific context (Time_stamp: 2018-07-15 20:14, longitude: 113.287, latitude: 23.139). All this information is contained in our dataset, as shown in Tables 3, 4 and 5.

In our experiments, we split the experimental data into a training set, a validation set and a test set in the ratio 7:2:1. The training set is used to train CFDIL. The validation set is used to adjust the hyper-parameters of the model, including the number of layers in CFDIL's neural network and the size and number of convolution kernels. The test set is used to verify the recommendation performance and generalization ability of CFDIL. During testing, we take the apps that users actually used without the help of a recommender system as the ground truth for evaluating recommendation performance.

We recommend an app list for target users and evaluate it with two metrics: precision and recall. Because we recommend a top-N list of length N, we use Precision@N and Recall@N to measure the proposed method. However, precision and recall usually trade off against each other: as the list grows, recall tends to rise while precision tends to fall. To consider the two metrics jointly, we also use the $F_\alpha$-measure@N to measure recommendation quality. Precision@N and Recall@N are defined as follows:

$$\mathrm{Precision@}N = \frac{|TP|}{N} \quad (8)$$

$$\mathrm{Recall@}N = \frac{|TP|}{M} \quad (9)$$

where TP is the intersection between the recommendation list and the ground truth, N is the length of the recommendation list, and M is the length of the ground truth. The $F_\alpha$-measure@N is defined as follows:

where precision and recall are the results in Eqs. 8 and 9, respectively; α is used to balance recall and precision. Here, we choose 1 as the value of α, which means that recall and precision are equally important.
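The three metrics can be computed as in the sketch below. Precision@N and Recall@N follow the definitions in the text; for the F-measure the code assumes the common weighted form $(1+\alpha^2)PR/(\alpha^2 P + R)$, which may differ from the authors' exact expression but reduces to the harmonic mean of precision and recall at α = 1. The function name is hypothetical.

```python
def precision_recall_f(recommended, ground_truth, alpha=1.0):
    """Return (Precision@N, Recall@N, F_alpha@N) for one recommendation list."""
    hits = len(set(recommended) & set(ground_truth))  # |TP|
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(ground_truth) if ground_truth else 0.0
    # Assumption: standard weighted F-measure; alpha = 1 weighs both equally.
    denom = alpha ** 2 * precision + recall
    f_alpha = (1 + alpha ** 2) * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f_alpha


p, r, f = precision_recall_f(["a", "b", "c", "d"], ["b", "d", "e"])
print(p, r, f)
```

Here two of the four recommended apps appear in a ground truth of three apps, so precision is 1/2, recall is 2/3, and the F-measure at α = 1 is 4/7.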

To find the best hyper-parameters of CFDIL, we use the mean absolute error (MAE) and the root mean squared error (RMSE) to tune the model parameters. MAE and RMSE are two indicators that are widely used to measure the accuracy of recommender systems. They are defined as follows:

$$\mathrm{MAE} = \frac{1}{T}\sum_{i=1}^{T}\left|\hat{y}_i - y_i\right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{T}\sum_{i=1}^{T}\left(\hat{y}_i - y_i\right)^2}$$

where T is the number of records in the validation set, $\hat{y}_i$ is the ith value predicted by the model, and $y_i$ is the corresponding true value.
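As a small worked example of the two error measures, the following sketch computes MAE and RMSE over a list of predictions and true values. The input values are illustrative only and do not come from the experiments.

```python
import math


def mae(y_pred, y_true):
    """Mean absolute error over T paired values."""
    return sum(abs(p - t) for p, t in zip(y_pred, y_true)) / len(y_true)


def rmse(y_pred, y_true):
    """Root mean squared error over T paired values."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / len(y_true))


y_pred = [0.9, 0.2, 0.4]  # illustrative predictions
y_true = [1.0, 0.0, 0.0]  # illustrative labels
print(mae(y_pred, y_true), rmse(y_pred, y_true))
```

RMSE penalizes the single large error (0.4) more heavily than MAE does, which is why the two are usually reported together.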

We use the following methods as baselines against which to measure the performance of CFDIL. UCF (User-based Collaborative Filtering). UCF is a collaborative filtering method based on the similarity of user vectors, which is normally calculated from users' rating data. We adapt UCF to our problem by using the usage frequency of each app to explore users' preferences and generate the recommendation list. Specifically, we use frequency information to find users who are similar to the target user, and then recommend to the target user the apps that these similar users have used but the target user has not.

ICF (Item-based Collaborative Filtering). ICF is a collaborative filtering method based on the similarity of app vectors. We adapt ICF to our problem as follows. First, we convert the dataset into a user-app usage matrix whose entries are app usage frequencies. Then, we extract the app vectors from the matrix and use cosine similarity to build the app similarity model. Finally, we use the app similarity model to generate an app recommendation list for target users.
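The ICF adaptation described above can be sketched as follows. The scoring rule (similarity to each used app, weighted by the user's usage frequency) is an assumption about how the similarity model is turned into a ranking, and the function names and toy matrix are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def icf_recommend(usage, user_index, n):
    """usage[u][a] is how often user u used app a; return top-n unused app indices."""
    n_apps = len(usage[0])
    # App vectors are the columns of the user-app usage matrix.
    columns = [[row[a] for row in usage] for a in range(n_apps)]
    user_row = usage[user_index]
    scores = {}
    for cand in range(n_apps):
        if user_row[cand] > 0:
            continue  # only recommend apps the user has not used
        scores[cand] = sum(
            cosine(columns[cand], columns[used]) * user_row[used]
            for used in range(n_apps)
            if user_row[used] > 0
        )
    return sorted(scores, key=scores.get, reverse=True)[:n]


usage = [
    [5, 3, 0, 0],
    [4, 0, 0, 1],
    [0, 2, 4, 0],
    [5, 4, 1, 0],
]
print(icf_recommend(usage, user_index=1, n=2))
```

In this toy matrix, user 1 mostly uses app 0, and app 1 is used by the same users as app 0, so app 1 ranks ahead of app 2.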

MF (Matrix Factorization). MF is a classic model-based method in the recommendation domain. We factorize the user-app usage matrix to obtain a dense (non-empty) matrix that preserves the information of the original matrix, and use its entries to rank apps.

TF (Tensor Factorization). The aim of the model is to compute the factor matrices for users $U \in \mathbb{R}^{n \times d}$, items $A \in \mathbb{R}^{r \times d}$ and contexts $C \in \mathbb{R}^{l \times d}$ from historical usage data. The TF method is also an MF-based recommendation method.

Similar to the MF method, we use the frequency of an app according to certain contextual information to denote the app usage information of a target user and construct a user-app-context tensor. Then, we use TF method to obtain a recommendation list. Here, the elements in the tensor are the app usage information at a certain time.
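To illustrate how three factor matrices yield a score for every (user, app, context) triple, the sketch below reconstructs one tensor entry. It assumes a CP-style model in which the entry is the sum over the d latent dimensions of the product of the three factors; the text does not state which tensor decomposition is used, and the factor values are made up for illustration.

```python
def cp_entry(U, A, C, u, a, c):
    """Reconstructed tensor value for user u, app a, context c (CP-style assumption)."""
    d = len(U[u])
    return sum(U[u][k] * A[a][k] * C[c][k] for k in range(d))


# Illustrative factors with d = 2 (not learned from any data).
U = [[1.0, 0.5], [0.2, 0.9]]  # 2 users
A = [[0.8, 0.1], [0.3, 0.7]]  # 2 apps
C = [[1.0, 0.2], [0.4, 1.0]]  # 2 contexts

print(cp_entry(U, A, C, u=0, a=1, c=1))
```

Because every entry is computed from dense factors, a triple that never occurred in the logs still receives a nonzero score, which is what allows ranking unseen apps.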

DNN (Deep Neural Networks). DNN is a classic deep learning method. It has strong nonlinear expressive ability and can learn the complex latent interactions between users and apps. We apply DNN to app recommendation as follows. First, the portrait information of users and apps is used as the input of the DNN. Then, the interaction information of a target user and a target app is used as the training label. Finally, a DNN-based app recommendation model is trained.

CNN (Convolutional Neural Networks). CNN is also a classic deep learning method. We apply the CNN model to app recommendation. To train the CNN-based app recommendation model, we take the portraits of users and apps as the input of the CNN model and take the interaction information of users and apps in a specific context as its output.

DeepFM. DeepFM is a deep learning-based recommendation model. Its core idea is feature crossing, i.e., leveraging combinations of different features of users and items to predict the recommendation probability. It combines two components. One is the FM (factorization machine) part, which mainly learns the low-order feature interactions between users and items. The other is the deep part, which learns the high-order feature interactions between users and items. We apply DeepFM to app recommendation by inputting the feature information of users and apps and training the model with the interaction information as the output label.

Figure 7 shows the precision, recall and F-measure values of the recommendation results of CFDIL and the benchmark methods on the three city datasets for different recommendation list lengths. We can draw the following conclusions from the results.

1. The recommendation performance of UCF and ICF is unacceptable. This is because the interaction data between users and apps is sparse, which hinders these two methods from constructing effective user and app vectors, and sparse vectors lead to poor performance. Besides, these two methods do not consider the impact of contextual information on users' choice of apps.

2. The recommendation performance of MF and TF is better than that of UCF and ICF, but their results are still poor. MF and TF are relatively good at combating sparsity, which helps them deal with sparse user and app vectors more effectively. However, since these two models are inherently not personalized, their recommendations generalize poorly. In addition, TF outperforms MF because TF considers context information.

3. Deep learning-based models show good recommendation performance. Among them, CNN performs better than DNN. Both CNN and DNN use deep models to extract the deep latent interactions between users and apps. DNN learns these interactions through fully connected layers, while CNN extracts them with convolution kernels. DNN's fully connected structure lets a lot of unnecessary information from the user and app matrices enter the interaction process, and this excess information interferes with the ability of DNN to mine users' preferences. CNN uses convolution kernels to filter out the interfering information, so that the model can mine users' preferences more efficiently. The performance of DeepFM is better than that of DNN and CNN. In addition to using the deep part to learn high-order feature interactions, DeepFM also uses the FM part to learn low-order interaction information between users and apps. Thus, DeepFM eliminates the disadvantages of the above two methods.

4. The recommendation performance of CFDIL is better than that of all benchmark methods. The reasons are as follows: (1) CFDIL considers contextual information in the matrix construction process, which helps the model to judge users' preferences. (2) The FM part of CFDIL fully expresses the low-order interaction features between users and apps. (3) The CNN part of CFDIL effectively explores the interaction features between users and apps under contextual conditions.

Fig. 7 Comparison of recommendation performance between CFDIL and the benchmark methods. There are seven benchmark methods: ICF, UCF, MF, TF, DNN, CNN, and DeepFM

In this section, we mainly show some exploration of the CNN parameters in CFDIL. We leverage datasets from three cities, Beijing, Shanghai and Guangzhou, to determine the number of layers of CFDIL and the number of convolution kernels in each layer. Inspired by He et al. (2016), the size of the convolution kernels used in the convolution layers of CFDIL is 3 × 3. The number of convolution kernels is set according to the following principle: when the portrait size is halved, the number of convolution kernels should be doubled to ensure the complexity of learning.
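The halve-then-double principle can be sketched as a small helper. The list of per-layer portrait sizes is an assumed input, since the text does not say at which layers the portrait is downsampled; `kernel_schedule` is a hypothetical name.

```python
def kernel_schedule(initial_kernels, portrait_sizes):
    """Kernel count per layer: double it whenever the portrait side length halves."""
    counts = [initial_kernels]
    for prev, cur in zip(portrait_sizes, portrait_sizes[1:]):
        halved = cur * 2 <= prev  # side length dropped to half (or less)
        counts.append(counts[-1] * 2 if halved else counts[-1])
    return counts


# Assumed example: a 32 x 32 portrait that is downsampled twice over five layers.
print(kernel_schedule(128, [32, 32, 16, 16, 8]))
```

Doubling the number of kernels when the spatial size halves keeps the per-layer computation roughly constant, which is the convention used in the residual networks of He et al. (2016).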

We apply CFDIL to the datasets of the three cities, varying the number of convolutional layers over {2, 4, 6, 8, 10, 12, 14} and the initial number of convolution kernels over {8, 16, 32, 64, 128, 256, 512}. We use MAE and RMSE to evaluate model performance and determine the hyper-parameters of CFDIL. The hyper-parameters that minimize MAE and RMSE are considered the best.

We use the Beijing dataset as a representative to show the experimental results in Fig. 8. As can be seen from Fig. 8, CFDIL performs best when the number of convolutional layers is 8 and the initial number of convolution kernels is 128.
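A search over the two hyper-parameter grids can be sketched as below. The `evaluate` callback would train a model and return its validation (MAE, RMSE); here `toy_evaluate` is a synthetic surrogate, deliberately constructed so that its minimum sits at the configuration reported above, and it contains no measured values. Ranking by the (MAE, RMSE) pair is also an assumption, since the text does not say how the two errors are combined.

```python
import math
from itertools import product


def grid_search(evaluate, layer_options, kernel_options):
    """Return the (layers, kernels) pair with the lowest (MAE, RMSE) and its score."""
    best_config, best_score = None, None
    for layers, kernels in product(layer_options, kernel_options):
        score = evaluate(layers, kernels)  # tuple (mae, rmse), compared in that order
        if best_score is None or score < best_score:
            best_config, best_score = (layers, kernels), score
    return best_config, best_score


def toy_evaluate(layers, kernels):
    """Synthetic stand-in for training plus validation; not real results."""
    mae = 0.10 + 0.01 * abs(layers - 8) + 0.01 * abs(math.log2(kernels) - 7)
    return mae, 1.2 * mae


best_config, best_score = grid_search(
    toy_evaluate,
    layer_options=[2, 4, 6, 8, 10, 12, 14],
    kernel_options=[8, 16, 32, 64, 128, 256, 512],
)
print(best_config)
```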

In addition, we use Fig. 9 to show the performance of CFDIL with different number of convolutional layers when 

We use only the feature matrices of users and apps as the input of CFDIL and train a new model named CFDIL-I. The difference between CFDIL-I and CFDIL is that CFDIL-I does not consider the contextual matrices of users and apps. We then compare the Precision@N of CFDIL and CFDIL-I to judge the validity of the proposed contextual matrices of users and apps. Figure 11 shows the Precision@N results of CFDIL and CFDIL-I. It can be seen from the figure that the Precision@N of CFDIL is significantly better than that of CFDIL-I for all recommendation list lengths. This is because CFDIL constructs user and app portraits that fully consider contextual information, which allows it to explore users' preferences more accurately.

First, we use the portraits of users and apps as input and remove the CNN part of CFDIL to get a new model, named CFDIL-C. Then we compare the Precision@N of CFDIL and CFDIL-C on the datasets of the three cities to judge the effectiveness of the CNN part of CFDIL. Figure 12 shows the experimental results. As can be seen, the Precision@N of CFDIL is better than that of CFDIL-C. The main reason is as follows. CNN has a strong ability to extract features from two-dimensional data, which has been verified in the field of image processing. The user and app portraits we construct are mainly composed of two-dimensional geographic information generated by the interaction between users and apps in a specific time slot. This data structure is consistent with that of image data, so CNN can effectively extract interaction features from it through its convolution kernels. The experimental results show that CNN can effectively extract the portrait features of users and apps, and efficiently mine users' preferences under specific contextual conditions.

First, we use portraits of users and apps as input and eliminate the FM part in CFDIL to get a new model, named CFDIL-F. Then, we compare the Precision@N of CFDIL and CFDIL-F to judge the effectiveness of the FM part of CFDIL.

The experimental results show that CFDIL is more effective than CFDIL-F, which indicates that the FM part of CFDIL also plays an important role in mining users' preferences. The network structure of our FM part is similar to the FM part of the DeepFM model, which has been successful in click-through rate (CTR) prediction. The difference between the two lies in the type of problem to be solved. DeepFM is applied to click-through rate prediction, which is a classification problem. The FM part we designed mainly helps CFDIL predict users' preference probability, which is a regression problem. The role of the FM part is to help the model capture the low-order feature crosses of users and apps effectively, so that the model can jointly consider the low-order cross features and the high-order latent interaction features of users and apps.
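For readers unfamiliar with how a factorization machine captures low-order feature crosses, the sketch below computes the standard second-order FM term in two ways: the direct sum over feature pairs, and the well-known linear-time reformulation. This is the textbook FM formulation, given here as background; the FM part of CFDIL may differ in its details, and the feature and factor values are illustrative.

```python
def fm_pairwise_naive(x, V):
    """Sum over pairs i < j of <V[i], V[j]> * x[i] * x[j]."""
    n, k = len(x), len(V[0])
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(V[i][f] * V[j][f] for f in range(k))
            total += dot * x[i] * x[j]
    return total


def fm_second_order(x, V):
    """Same quantity in O(n * k): 0.5 * sum_f ((sum_i v_if x_i)^2 - sum_i (v_if x_i)^2)."""
    n, k = len(x), len(V[0])
    total = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        total += 0.5 * (s * s - s_sq)
    return total


x = [1.0, 0.0, 2.0, 0.5]  # illustrative feature vector
V = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]  # illustrative latent factors
print(fm_pairwise_naive(x, V), fm_second_order(x, V))
```

Each feature pair interacts through the inner product of two latent vectors, so crosses between features that rarely co-occur can still be estimated, which suits sparse user-app data.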

The tensor factorization model in CFDIL is used to process label data. We eliminate the tensor factorization model in CFDIL to get a new model, named CFDIL-T. The difference between CFDIL and CFDIL-T is that the label data of CFDIL-T is extremely unbalanced and sparse. We compare the Precision@N of CFDIL and CFDIL-T to judge the validity of the tensor model in CFDIL for label processing.

The experimental results are shown in Fig. 14. It can be seen from Fig. 14 that CFDIL performs better than CFDIL-T. The experimental results show that the proposed tensor model in CFDIL can effectively handle sparse label data.

The sparse label data of user-app interactions make the training data extremely unbalanced. As a result, CFDIL-T cannot receive enough positive user-app feedback during training. The tensor model added in CFDIL makes the label data in the user-app-context tensor smoother. It decomposes the label data, so that even if a user has never used an app (the original label is 0), the label of the corresponding user and app obtains a nonzero value. The nonzero elements in the user-app-context tensor represent the probability that the user will use the app in a spe- 

In this paper, we proposed CFDIL, a recommendation framework for mobile apps based on contextual feature profiling. As far as we know, this is the first attempt to use contextual feature portraits to explore deep user-app interactions. CFDIL uses contextual feature matrices and the features of users and apps to form feature portraits. Based on these portraits, a deep network framework is trained to provide more accurate recommendations for users, using decomposers and convolutional neural networks to mine the multi-order interactions between users and apps under specific contextual conditions. We conducted extensive experiments on real-world datasets to demonstrate the effectiveness of CFDIL and the value of each of its components.

CFDIL constructs separate feature portraits of users and apps from context information and the attribute information of users or apps. These feature portraits capture the spatio-temporal patterns and attribute features of users and apps. A series of experiments showed that these feature portraits are very effective for exploring users' preferences in specific contexts. In CFDIL, the FM network learns shallow features of users and apps, while the CNN is better suited to mining deep interaction features, which improves the performance of the recommendation model. The introduction of feature portraits, combined with the TF processing of labels, alleviates the problem of data sparsity to some extent. In addition, CFDIL mines the features of real long-term interactions through feature portraits, which can help avoid the problem of unfair recommendations (D'Angelo et al. 2019).

However, CFDIL still has the following shortcomings. (1) CFDIL depends on feature portraits of users and apps to make recommendations, but these portraits are macroscopic and stable and cannot reflect sudden events. For example, people had to work from home after the sudden outbreak of COVID-19, which these portraits cannot reflect. Because users have never performed this activity in this context before, the performance of CFDIL degrades. As working from home becomes more frequent, however, this feature gradually emerges in the portraits and recommendation performance improves. How to tackle this hysteresis in CFDIL will be a direction of our future research. Our preliminary idea is to introduce an attention mechanism that emphasizes the importance of recent behavior, so as to shorten the hysteresis of CFDIL as much as possible. However, an attention mechanism will also emphasize sudden events, which may harm the robustness of the model. Therefore, it is important to find a balance between the two. (2) We learn user-app interactions from context information, but these interactions are diverse in nature rather than of a single kind. For example, users choose to play games in their free time (active choice), whereas clocking in at work is obligatory (passive choice). Active choices better reflect users' preferences, while passive choices are more stable. CFDIL does not draw this distinction and treats both kinds of interaction equally. These issues require further research and exploration.

Classification-based deep neural network architecture for collaborative filtering recommender systems

SimApp: a framework for detecting similar mobile applications by online kernel learning

Wide & deep learning for recommender systems

Detecting unfair recommendations in trust-based pervasive environments

Why people hate your app: making sense of user feedback in a mobile app store

DeepFM: a factorization-machine based neural network for CTR prediction

Global and personal app networks: characterizing social relations among mobile apps

Context-regularized neural collaborative filtering for game app recommendation

Deep residual learning for image recognition

A user similarity-based top-n recommendation approach for mobile in-application advertising

Convolutional matrix factorization for document context-aware recommendation

Recommendation algorithm of the app store by using semantic relations between apps

Multi-view factorization machines for mobile app recommendation based on hierarchical attention

Personalized news recommendation via implicit social experts

Personalized mobile app recommendation: Reconciling app functionality and user privacy preference

Large-scale recommender system with compact latent factor model

A survey of context-aware mobile recommendations

A sequential recommendation for mobile apps: what will user click next app?

Deep crossing: web-scale modeling without manually crafted combinatorial features

A contextual collaborative approach for app usage forecasting

Multi-objective mobile app recommendation: a system-level collaboration approach

Leveraging app usage contexts for app recommendation: a neural approach

Interoperability ranking for mobile applications

Version-aware rating prediction for mobile app recommendation

Personalized recommendation based on review topics

Mining mobile user preferences for personalized context-aware recommendation

Popularity modeling for mobile apps: a sequential approach

Incorporating contextual information into personalized mobile applications recommendation

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations

Author Contributions All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by QH, KZ and CW. QH and KZ contributed equally to this article. The first draft of the manuscript was written by QH and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Funding The authors have not disclosed any funding.

Data availability Enquiries about data availability should be directed to the authors.

Conflict of interest The authors declare that they have no conflict of interest.

Human participants or animals This article does not contain any studies with human participants or animals performed by any of the authors.