Hybrid recommendations for mobile commerce based on mobile phone features

Chuen-He Liou and Duen-Ren Liu
Institute of Information Management, National Chiao Tung University, Hsinchu, Taiwan
Email: dliu@iim.nctu.edu.tw

Expert Systems, May 2012, Vol. 29, No. 2. DOI: 10.1111/j.1468-0394.2010.00566.x. © 2010 Blackwell Publishing Ltd.

Abstract: Mobile data communications have evolved as the number of third generation (3G) subscribers has increased. The evolution has triggered an increase in the use of mobile devices, such as mobile phones, to conduct mobile commerce and mobile shopping on the mobile web. There are fewer products to browse on the mobile web; hence, one-to-one marketing with product recommendations is important. Typical collaborative filtering (CF) recommendation systems make recommendations to potential customers based on the purchase behaviour of customers with similar preferences. However, this method may suffer from the so-called sparsity problem, which means there may not be sufficient similar users because the user-item rating matrix is sparse. In mobile shopping environments, the features of users’ mobile phones provide different functionalities for using mobile services; thus, the features may be used to identify users with similar purchase behaviour. In this paper, we propose a mobile phone feature (MPF)-based hybrid method to resolve the sparsity issue of the typical CF method in mobile environments. We use the features of mobile phones to identify users’ characteristics and then cluster users into groups with similar interests. The hybrid method combines the MPF-based method and a preference-based method that uses association rule mining to extract recommendation rules from user groups and make recommendations. Our experiment results show that the proposed hybrid method performs better than other recommendation methods.

Keywords: mobile web, one-to-one marketing, product recommendation, collaborative filtering, mobile phone features, association rules

1. Introduction

In the last decade, mobile communications have evolved from 2G/2.5G to 3G/3.5G. As a result, the data transfer rate has been progressively upgraded from 64 Kbps (2.5G/GPRS) to 384 Kbps (3G/WCDMA) and 3.5 Mbps (3.5G/HSDPA), which is comparable to the wired Internet. The evolution has triggered an increase in the use of mobile devices, such as mobile phones, to conduct mobile commerce (m-commerce) on the mobile web (Chae & Kim, 2003; Venkatesh et al., 2003; Ngai & Gunasekaran, 2007). M-commerce covers a large number of services, one of which is mobile shopping. Retailers have also increased their investment in mobile shopping channels to deliver dedicated products, content and promotions to customers.

Recommender systems have emerged in m-commerce and e-commerce applications to provide individual product recommendations for each customer. Recommender systems assist businesses in implementing one-to-one marketing strategies, relying on customer purchase history to determine preferences and identify products that a customer may purchase. Recommender systems increase the probability of cross-selling, establish customer loyalty and fulfill customer needs by discovering products that customers may be interested in purchasing (Schafer et al., 2001).

Recommender systems are widely used to recommend various items, such as consumer products, movies and music, to customers based on their interests (Hill et al., 1995; Shardanand & Maes, 1995). Generally, recommender systems can be classified as collaborative or content-based filtering techniques.
Collaborative filtering (CF), which has been used successfully in various applications, utilizes preference ratings given by customers with similar interests to make recommendations to a target customer (Resnick et al., 1994; Linden et al., 2003; Lee, 2004; Cho et al., 2005; Liu & Shih, 2005). In contrast, content-based filtering derives recommendations by matching customer profiles with content features (Mooney & Roy, 2000; Martinez et al., 2007).

A number of product recommendation systems have been developed for m-commerce on the mobile web (Kim et al., 2004; Choi et al., 2007; Lee & Park, 2007). For example, VISCOR is a mobile recommender that combines collaborative and content-based filtering to provide better wallpaper recommendations (Kim et al., 2004). MCORE considers users’ context data to recommend mobile services (Choi et al., 2007). In addition, mobility information about user locations obtained from global positioning systems (GPS) is usually used in m-commerce. A number of mobile recommendation systems use customers’ mobility patterns to make recommendations (Brunato & Battiti, 2003; Yang et al., 2008). Mobile phone features (MPF) such as Bluetooth and card slots have been used as product attributes to recommend mobile phone products. iTVMobi recommends mobile phone products based on the users’ preferences for MPF (Virvou & Savvopoulos, 2007). Existing works use MPF as product attributes to recommend mobile phone products, instead of using the features of mobile phones as the users’ characteristics (profiles) to recommend products.

The typical CF method relies on finding users with similar interests to make recommendations. However, it may suffer from the so-called sparsity problem because users only rate a few items. As a result, the user-item rating matrix is very sparse, so the recommendation quality is poor due to the difficulty of finding users with similar interests.
In mobile shopping environments, active users may only browse/purchase a few items on the mobile web; thus, it is difficult to find users with similar interests based on the product preferences derived from users’ browsing/purchasing histories.

In this study, we propose an MPF-based hybrid method to resolve the sparsity issue of the typical CF method used in mobile environments. The MPF-based method uses the features of users’ mobile phones as user profiles to cluster users into groups with similar characteristics and then makes recommendations. The MPF indicate users’ motivations for using mobile services; thus, they can be used to identify users with similar product preferences. For example, the profiles of businessmen or sales representatives who own mobile phones with intelligence and GPS features may indicate a strong interest in high-tech 3C (Computer, Communication and Consumer) products. Thus, we consider MPF as user characteristics to help find users with similar interests. However, some users who own mobile phones with similar features may not have similar product preferences. Hence, we still need to refer to users’ product preferences when making recommendations. We therefore propose a hybrid method that combines the MPF-based method and the preference-based method to improve recommendation quality by considering both MPF and product preferences. Similar to the MPF-based method, the preference-based method makes recommendations based on user groups that are clustered according to the users’ product preferences. Experiments were conducted to compare the performance of the proposed hybrid method with that of the MPF-based, preference-based and typical CF methods. The results show that the hybrid method outperforms the other methods.

The remainder of this paper is organized as follows. In Section 2, we illustrate the background of related methods. In Section 3, we describe the proposed MPF-based, preference-based and hybrid recommendation methods. In Section 4, we present the evaluation metrics and the experiment results. Then, in Section 5, we summarize our findings and draw some conclusions.

2. Background

Our proposed method is based on MPF, and uses association rule-based and most frequent item-based recommendation methods. In this section, we briefly introduce the concepts and methods that are used in our research. This section also illustrates the typical CF method that is compared with our approach in the experiment evaluation.

2.1. MPF

Mobile phones have evolved from the traditional voice communication model to advanced digital convergence platforms with various features, such as Bluetooth technology, cameras, card slots, flash lights, as well as java, MP3, radio, touch screen, video and Wi-Fi functions (Ojanpera, 2006). These features enable users to access related mobile services, for example, downloading MP3 files, uploading photos to blogs, video streaming and on-line shopping (Ko et al., 2007). Ling et al. (2006) investigated the impact of MPF on user satisfaction and analysed the feature preferences of diverse ethnic groups as well as preferences based on gender. Virvou and Savvopoulos (2007) developed an intelligent application called iTVMobi, which recommends mobile phone products on an interactive television. The system uses K-means clustering to group users based on their preferences for the attributes of mobile phones. The system then applies an association rule-based approach to recommend mobile phones based on the users’ preferences.

Existing works use MPF as product attributes to recommend mobile phone products, instead of using the features of mobile phones as user characteristics (profiles) to recommend products.
The features of different types of mobile phones can be obtained from the respective websites. In this study, we log users’ mobile phone types when they browse products on a mobile shopping website. Then, we derive the phone features preferred by each user and use them to compile MPF-based user profiles.

2.2. Users clustering

Clustering techniques, which are usually used to segment users (Punj & Stewart, 1983; Chen et al., 1996), seek to maximize the variance among groups while minimizing the variance within groups. Many clustering algorithms have been developed, such as K-means, hierarchical and fuzzy c-means algorithms (Omran et al., 2007). K-means clustering (MacQueen, 1967) is a similarity grouping method widely used to partition a dataset into k groups. The K-means algorithm assigns instances to clusters based on the minimum distance principle: an instance is assigned to the cluster whose centre is closest to it among all k clusters.

2.3. Association rule-based recommendation method

Association rule mining tries to find the associations between two sets of products in a transaction database. Agrawal et al. (1993) formalized the problem of finding association rules that satisfy the minimum support and the minimum confidence requirements. For example, assume that a set of purchase transactions includes a set of product items $I$. An association rule is an implication of the form $X \Rightarrow Y$, where $X \subseteq I$, $Y \subseteq I$ and $X \cap Y = \emptyset$. $X$ is the antecedent (body) and $Y$ is the consequent (head) of the rule. Two measures, support and confidence, are used to indicate the quality of an association rule. The support of a rule is the percentage of transactions that contain both $X$ and $Y$, whereas the confidence of a rule is the fraction of the transactions containing $X$ that also contain $Y$.

Sarwar et al. (2000) described the association rule-based recommendation method as follows.
For each customer, a customer transaction is created to record all the products he/she purchased previously. An association rule mining algorithm is then applied to find all the recommendation rules that satisfy the given minimum support and minimum confidence. The top-N products to be recommended to a customer $u$ are then determined as follows. Let $X_u$ be the set of products previously purchased by $u$. The method first finds all the recommendation rules $X \Rightarrow Y$ in the rule set. If $X \subseteq X_u$, then all products in $Y - X_u$ are deemed candidate products for recommendation to the customer $u$. The candidate products are then sorted and ranked according to the associated confidence of the recommendation rules, and the top-N candidate products are selected as the top-N recommended products.

2.4. Most frequent item-based recommendation method

The most frequent item-based recommendation method (Sarwar et al., 2000) counts the purchase frequency of each product by scanning the products purchased/browsed by the users in a cluster. Next, all the products are sorted by the purchase frequency in descending order. Finally, the method recommends the top-N products that have not been purchased/browsed by the target customer.

2.5. Typical CF method

CF (Resnick et al., 1994; Shardanand & Maes, 1995) utilizes the nearest-neighbour principle to recommend products to a target audience. The neighbours are identified by computing the similarity of customers’ purchase behaviour or tastes.
The similarity is measured by Pearson’s correlation coefficient, which is defined as follows:

$$\mathrm{corr}_P(C_i, C_j) = \frac{\sum_{s \in I} (r_{C_i,s} - \bar{r}_{C_i})(r_{C_j,s} - \bar{r}_{C_j})}{\sqrt{\sum_{s \in I} (r_{C_i,s} - \bar{r}_{C_i})^2 \sum_{s \in I} (r_{C_j,s} - \bar{r}_{C_j})^2}} \qquad (1)$$

where $\bar{r}_{C_i}$ and $\bar{r}_{C_j}$ denote the average number of products purchased by customers $C_i$ and $C_j$, respectively; $I$ denotes the set of products; and $r_{C_i,s}$ and $r_{C_j,s}$ indicate, respectively, whether customers $C_i$ and $C_j$ purchased product item $s$.

The typical CF method utilizes k-nearest neighbours (k-NN) to recommend N products to a target user (Sarwar et al., 2000). The k-NN are the customers whose purchase behaviour or tastes are most similar to those of the target user, as measured by equation (1). After the neighbourhood has been formed, the N recommended products are determined by the k-NN as follows. The frequency count of products is calculated by scanning the data about the products purchased/browsed by the k-NN. The products are then sorted based on the frequency count, and the N most frequent products that have not been purchased by the target customer are selected as the top-N recommendations.

3. Proposed MPF-based hybrid recommendation method

In this section, we describe the proposed hybrid recommendation method, which combines an MPF-based method and a preference-based method, as shown in Figure 1. First, the MPF-based method extracts the features of users’ mobile phones from the respective phone websites, as shown on the left-hand side of the figure. The features of users’ mobile phones are taken as user profiles to identify users with similar characteristics. The system then applies the K-means clustering method to cluster users into groups based on the similarity of users’ MPF.
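The clustering step can be illustrated with a small sketch. It is a minimal illustration under stated assumptions, not the authors’ implementation: the function names `pearson` and `kmeans_pearson` are hypothetical, the first k profiles seed the centroids, a constant vector is given a correlation of 0, and Pearson’s correlation (the measure named in Section 3.1) serves as the similarity. The example profiles follow the column layout of Table 2.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length vectors.

    Assumption: returns 0.0 when either vector is constant, because the
    coefficient is undefined in that case.
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def kmeans_pearson(profiles, k, max_iter=100):
    """Cluster feature vectors and return one cluster index per profile.

    Assumption: the first k profiles seed the centroids, and each profile
    is assigned to the centroid with the highest Pearson correlation.
    """
    centroids = [list(p) for p in profiles[:k]]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            max(range(k), key=lambda c: pearson(p, centroids[c]))
            for p in profiles
        ]
        if new_labels == labels:  # assignments are stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:  # keep the previous centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Columns: Bluetooth, Camera H, Camera M, Camera L, Card slot,
# Flash light, Java, MP3, Radio, Video (the layout of Table 2).
profiles = [
    [0, 0, 0, 1, 0, 0, 1, 1, 0, 1],
    [1, 1, 0, 0, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 0, 1, 0, 1, 1, 0, 1],
    [1, 1, 0, 0, 1, 1, 1, 1, 1, 1],
]
labels = kmeans_pearson(profiles, k=2)
```

Seeding with the first k profiles keeps the sketch deterministic; a practical system would more likely use random or distance-based seeding and repeat the run several times.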
Next, the association rules and frequently browsed products are extracted from each cluster. The system then recommends products based on the association rules and frequently browsed products. However, there may be very few products recommended according to the association rules because of the limited number of products that can be browsed on the mobile web. If the association rule-based recommendations are not sufficient, the most frequent item-based recommendations are used to recommend products to users. Similar to the MPF-based method, the preference-based method, shown on the right-hand side of Figure 1, clusters users by the K-means clustering method based on Pearson’s correlation coefficient of users’ product preferences. It then recommends products based on the association rules and the most frequent items. Finally, the hybrid recommendation scheme combines the MPF-based recommendations and preference-based recommendations, with the hybrid ratio determined by the preliminary analytical data, to recommend products. We discuss the recommendation phase of the MPF-based and preference-based schemes in Section 3.2 and that of the hybrid scheme in Section 3.3.

3.1. Data pre-processing and clustering

We obtained the features of each mobile phone from one of the mobile phone websites. A mobile phone has more than 100 features, so it is hard to analyse all of them. Therefore, we selected the features based on the following three criteria: (1) features listed in the advertisements of a mobile phone retailer, since these advertisements often list the features that are important for users’ preferences and comparison; (2) features without too many missing values, since features with too many missing values are not suitable for analysis; and (3) features whose values can discriminate between mobile phones.
Table 1 lists the selected features, including Bluetooth technology, cameras, card slots, flash lights, as well as java, MP3, radio and video functions. The price feature is complicated to analyse, since the prices of mobile phones may vary under the different subscription fees offered by various service providers. Thus, we do not select the price feature. Price is nevertheless considered implicitly through the eight selected features, because mobile phones with more features are often more expensive. The display feature is not listed in the advertisements of the mobile phone retailer and is a combination of three discrete data type features: screen size, colour and material. The values of these display features are missing and are difficult to collect. Thus, we do not select the display feature.

Figure 1: An overview of the proposed hybrid recommendation scheme.

We calculate the similarity of users based on the selected features. The camera quality feature is a discrete data type, and the other seven features are Boolean data types. The camera resolution values (3.2, 2.0 and 1.3 mega-pixels) need to be normalized to the semantic values of high, medium and low, as shown in equation (2) (Lin et al., 2003). We then use three Boolean values to represent the camera resolution: (1, 0, 0) represents high quality, (0, 1, 0) represents medium quality, and (0, 0, 1) represents low quality.

$$Z_{camera} = \frac{X_{camera} - M(X_{camera})}{\sigma_{X_{camera}}} \qquad (2)$$

where $X_{camera}$ is the camera quality; and $M(X_{camera})$ and $\sigma_{X_{camera}}$ are, respectively, the mean value and the standard deviation of the camera quality. Next, we identify all the users’ mobile phones and expand the phones’ features to form a user-mobile phone feature matrix, as shown in Table 2.
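The camera normalization in equation (2) and the three-value encoding can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `camera_levels` is hypothetical, the population standard deviation is used for the denominator (the text does not say which estimator was applied), and the cut-off of 0.8 is the threshold the paper gives for building Table 2.

```python
from statistics import mean, pstdev


def camera_levels(megapixels, cut=0.8):
    """Encode camera resolutions as (high, medium, low) Boolean triples.

    Each value is standardized as in equation (2) and then compared with
    the cut-off: z > cut is high, z < -cut is low, otherwise medium.
    Assumption: population standard deviation (pstdev) as the denominator.
    """
    mu = mean(megapixels)
    sigma = pstdev(megapixels)
    encoded = []
    for x in megapixels:
        # Guard against a zero deviation when all phones share one resolution.
        z = (x - mu) / sigma if sigma else 0.0
        if z > cut:
            encoded.append((1, 0, 0))   # high quality
        elif z < -cut:
            encoded.append((0, 0, 1))   # low quality
        else:
            encoded.append((0, 1, 0))   # medium quality
    return encoded


# The three resolutions mentioned in the text, one phone each.
levels = camera_levels([3.2, 2.0, 1.3])
```

Because the mean and standard deviation are taken over all logged phones, the level assigned to a given resolution depends on the user population, not only on the resolution itself.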
In the matrix, the values of the camera resolution mega-pixels are transformed into semantic values based on equation (2), with $Z_{camera} < -0.8$, $-0.8 \le Z_{camera} \le 0.8$ and $Z_{camera} > 0.8$ representing low-level, medium-level and high-level quality cameras, respectively. We then use the matrix to cluster the users into groups. The MPF-based method clusters users by the K-means clustering method with Pearson’s correlation coefficient based on the users’ preferred MPF.

User product preference clustering is more intuitive than user MPF clustering, as it clusters users directly based on the user-product preference matrix. The preference-based method clusters users by the K-means clustering method with Pearson’s correlation coefficient based on users’ product preferences.

3.2. The MPF-based and preference-based recommendation phase

After clustering users into groups based on similar MPF or product preferences, the association rules and the most frequent items in each group (cluster) are generated for the next step of the recommendation phase. The steps of the MPF-based and preference-based recommendation phase are shown in Figure 2 and described as follows. Let $X_u$ represent the set of products browsed previously by a user $u$. For each association rule $X^k \rightarrow Y^k$, if $X^k \subseteq X_u$, then all products in $Y^k - X_u$, denoted by $Y_u^k$, are regarded as candidate products for recommendation to the user $u$. Let $Y_u^{AR}$ be the set of all candidate products generated from all association rules that satisfy $X^k \subseteq X_u$. The products in $Y_u^{AR}$ are ranked according to $c(Y_u^k)$, that is, the associated confidence of the association rule (AR) $X^k \rightarrow Y^k$.

We compare the number of candidate products $|Y_u^{AR}|$ with $N$, the number of top-N recommendations.
If the former is greater than the latter, the system recommends the top-N products among the products in $Y_u^{AR}$. On the other hand, if the number of candidate products $|Y_u^{AR}|$ is less than the number of top-N recommendations ($|Y_u^{AR}| < N$), the remaining $N - |Y_u^{AR}|$ products for recommendation are selected from $Y_u^{MF}$. The selected products are the most frequent items ranked according to the frequency count of products browsed by the users in the target user’s cluster. Then, products in $Y_u^{MF}$ that have not been browsed by the user $u$ and have not been included in $Y_u^{AR}$ are added to the recommended product list so that the number of top-N recommendations is sufficient.

Table 1: Mobile phone features

No  Feature         Data type  Value
0   Bluetooth       Boolean    (0, 1)
1   Camera quality  Discrete   (Low, Medium, High)
2   Card slot       Boolean    (0, 1)
3   Flash light     Boolean    (0, 1)
4   Java            Boolean    (0, 1)
5   MP3             Boolean    (0, 1)
6   Radio           Boolean    (0, 1)
7   Video           Boolean    (0, 1)

Table 2: User-mobile phone feature matrix

User ID  Phone type           Bluetooth  Camera (H M L)  Card slot  Flash light  Java  MP3  Radio  Video
1        MOTO V191            0          0 0 1           0          0            1     1    0      1
2        Nokia N70            1          1 0 0           1          1            1     1    1      1
3        SAMSUNG SGH-Z238     1          0 1 0           1          0            1     1    0      1
4        Sony Ericsson K800i  1          1 0 0           1          1            1     1    1      1

Figure 2: The MPF-based and preference-based recommendation phase.

3.3. The hybrid recommendation phase

The hybrid recommendation phase combines the MPF-based method and the preference-based method, as shown in Figure 3. Similar to the MPF-based method, the hybrid method first recommends products based on the association rules (AR) and then recommends products based on the most frequent item (MF) count.

Let $X_i^M \rightarrow Y_i^M$ and $X_j^P \rightarrow Y_j^P$ be the association rules extracted from an MPF-based cluster (M) and a preference-based cluster (P), respectively; and let their associated confidence scores be $c_i^M$ and $c_j^P$, respectively. In addition, let $X_u$ represent the set of products previously browsed by the target user $u$; and let $Y_u^{AR}$ be the set of all candidate products generated from all association rules that satisfy $X_i^M \subseteq X_u$ or $X_j^P \subseteq X_u$. The products in $Y_u^{AR}$ are ranked according to the weighted sum of their confidence scores:

$$c_H = w_M \times c_i^M + w_P \times c_j^P \qquad (3)$$

where $w_M$ and $w_P$ are the weights assigned to the MPF-based approach and the preference-based approach, respectively.

Similar to the MPF-based method and the preference-based method, if the number of candidate products $|Y_u^{AR}|$ is less than the number of top-N recommendations ($|Y_u^{AR}|$